Published 2017-04-29.
Time to read: 4 minutes.
This is a high-level proposal, not documentation for any specific system that exists today. I would be happy to discuss possible implementation details with interested parties, including the creation of a prototype.
Intents and Subgoals
Some of today’s interactive voice response systems, such as Amazon’s Lex and Alexa, and Google’s Dialogflow (formerly api.ai), use intents as a means of gathering the required parameters to fulfill a voice command. The next big step in voice interfaces to computation would be the ability to have an exploratory conversation, in which one or more subgoals may or may not emerge.
Subgoals might be fulfilled by intents. Continuous tracking of viable subgoals could be accomplished by a constraint-based solver that uses rules to participate in an exploratory conversation by recognizing and prioritizing potential subgoals, thereby activating and deactivating intents during the conversation. Once a user confirms that a potential subgoal is desirable, the system might consider that goal to be the only goal worth pursuing, or it might continuously elicit more information from the user regarding other potential subgoals. User-supplied information might be applied to a model associated with a subgoal/intent, and/or it might be retained as a user preference.
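To make this concrete, here is a minimal Python sketch of that idea. It is illustrative only: `Subgoal`, `SubgoalTracker`, `mentions_food`, and the scoring threshold are all invented names, and simple additive scores stand in for a real constraint-based solver.

```python
# Hypothetical sketch of subgoal tracking; not a real constraint solver.
from dataclasses import dataclass

@dataclass
class Subgoal:
    name: str             # e.g. "OrderDinner"
    intent: str           # intent that fulfills this subgoal once confirmed
    score: float = 0.0    # running estimate of the subgoal's viability

def mentions_food(utterance: str, subgoal: Subgoal) -> float:
    """Toy rule: food-related words make OrderDinner more plausible."""
    food_words = ("hungry", "dinner", "eat")
    if subgoal.name == "OrderDinner" and any(w in utterance.lower() for w in food_words):
        return 0.5
    return 0.0

class SubgoalTracker:
    """Re-scores subgoals after each utterance, activating and
    deactivating intents as subgoals cross the viability threshold."""

    def __init__(self, subgoals, rules, threshold=1.0):
        self.subgoals = subgoals
        self.rules = rules
        self.threshold = threshold

    def observe(self, utterance: str) -> list[str]:
        for subgoal in self.subgoals:
            for rule in self.rules:
                subgoal.score += rule(utterance, subgoal)
        # Only intents whose subgoal is currently viable are active.
        return [s.intent for s in self.subgoals if s.score >= self.threshold]

tracker = SubgoalTracker([Subgoal("OrderDinner", intent="OrderDinner")], [mentions_food])
print(tracker.observe("I'm getting hungry"))   # []  (score is only 0.5)
print(tracker.observe("let's order dinner"))   # ['OrderDinner']  (score reaches 1.0)
```

Each utterance nudges the subgoal scores, so intents become active only while their subgoal remains viable, which is the activation/deactivation behavior described above.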
Anthropomorphism
If an AI system is a good conversationalist, even if it has no physical presence, people will attribute human traits, emotions, and intentions to it. This should not be discouraged. Rather, AI systems should aspire to model our higher selves. The movie Her explored this in a charming way.
Breaking It Down
An interactive voice response system recognizes an intent by a key word or phrase uttered by the user. For example, if the user says “Computer, order dinner”, and the system was previously ‘taught’ to understand the word “order” as a key word for launching the OrderDinner intent, the system would then elicit the kind of food the user wanted, how many people would be eating, etc., and then order the desired dinner.
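Here is a small Python sketch of that keyword-triggered recognition with slot filling. The `Intent` class, its keywords, and its prompts are hypothetical and do not correspond to the actual Lex or Dialogflow APIs.

```python
# Hypothetical sketch of intent recognition and slot elicitation.

class Intent:
    def __init__(self, name, keywords, slots):
        self.name = name
        self.keywords = keywords    # words that launch the intent
        self.slots = slots          # parameters to elicit before fulfilling
        self.values = {}            # slot values gathered so far

    def matches(self, utterance):
        return any(k in utterance.lower() for k in self.keywords)

    def next_prompt(self):
        """Return the next question to ask, or None once all slots are filled."""
        for slot, prompt in self.slots.items():
            if slot not in self.values:
                return prompt
        return None

order_dinner = Intent(
    name="OrderDinner",
    keywords=["order"],
    slots={
        "cuisine": "What kind of food would you like?",
        "party_size": "How many people will be eating?",
    },
)

if order_dinner.matches("Computer, order dinner"):
    print(order_dinner.next_prompt())   # "What kind of food would you like?"
```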
Anthropomorphically: if intents are the only programming technique employed, interactive voice systems come across as slaves.
Intents are useful for recognizing commands and gathering enough associated data to carry out the command. Intents are not useful for open-ended conversations where there is no explicit goal, or where potential subgoals emerge as the result of a dialog. Once a subgoal is identified and confirmed, however, processing an intent is an efficient mechanism for fulfilling the emergent subgoal.
Designers of chatbots and video games are familiar with goal-seeking behavior. Exploratory conversation is how people interact with each other IRL (in real life). People use a variety of strategies for initiating and participating in exploratory conversation. For example, small talk at a gathering is a useful skill for getting to know other people, and a socially adept practitioner of small talk can innocuously gather a lot of information this way. Another strategy is to make a controversial statement and use the resulting banter to learn about the other participants and the subgoals they might pursue. A wide variety of such strategies exist and could be utilized by conversational systems.
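One way a conversational system could employ such strategies is to treat them as interchangeable plug-ins. The sketch below is purely illustrative: `SmallTalk` and `ProvocativeStatement` are invented stand-ins, and a real system would choose a strategy from context and user history rather than at random.

```python
# Hypothetical sketch: conversation-opening strategies as plug-ins.
import random

class Strategy:
    def opener(self) -> str:
        raise NotImplementedError

class SmallTalk(Strategy):
    def opener(self) -> str:
        return "How has your week been going?"

class ProvocativeStatement(Strategy):
    def opener(self) -> str:
        return "Voice assistants will replace keyboards within five years."

def start_conversation(strategies: list[Strategy]) -> str:
    # Random choice stands in for a context-aware strategy selector.
    return random.choice(strategies).opener()

print(start_conversation([SmallTalk(), ProvocativeStatement()]))
```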
Anthropomorphically: with exploratory conversation capability, interactive voice systems can be presented socially as co-operative entities.
Agenda and Strategy
Unlike people, computers can only do a finite number of tasks, and voice recognition systems are programmed with a finite number of intents. I define the potential agenda of a chatbot or video game to be the entire scope of its pre-programmed intents. Agendas may be fully disclosed, they might be obvious, or they might be unveiled over time. Ethical considerations apply to the design and implementation of conversational AI systems.
Users should be apprised of the agenda of every autonomous computer entity they encounter. To mitigate potential problems, an industry code of conduct should be established. The European Union will likely be the first government to require published standards and possibly a certification process.
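A minimal sketch of what such disclosure could look like, assuming the agenda is simply the set of pre-programmed intents (the `Agent` class and its descriptions are hypothetical):

```python
# Hypothetical sketch: an agent's agenda is the full set of its
# pre-programmed intents, exposed so users can inspect it on request.

class Agent:
    def __init__(self, intents: dict[str, str]):
        self.intents = intents  # intent name -> plain-language description

    def disclose_agenda(self) -> str:
        lines = ["I can help with the following:"]
        lines += [f"- {desc}" for desc in self.intents.values()]
        return "\n".join(lines)

agent = Agent({
    "OrderDinner": "ordering dinner for delivery",
    "BookTable": "reserving a table at a restaurant",
})
print(agent.disclose_agenda())
```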
I define strategy to mean the autonomous modification of goal-seeking behavior when a criterion is met. For example, an insurance chatbot might begin to solicit sensitive information only after it has reason to believe that a modicum of trust has been established with the prospect. How a strategy is executed is quite significant; by this I mean the degree of diplomacy. Some circumstances might sacrifice diplomacy to save lives, but under normal circumstances the AI entity should treat everyone with respect and kindness. Again, the European Union is likely to be the first government to require published standards and possibly a certification process.
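Here is a minimal sketch of that insurance-chatbot example, assuming an invented trust score and threshold as the criterion that triggers the strategy change:

```python
# Hypothetical sketch: sensitive questions gated behind a trust criterion.
# The trust signals and threshold are invented for illustration.

class InsuranceBot:
    TRUST_THRESHOLD = 3  # arbitrary criterion for switching strategies

    def __init__(self):
        self.trust = 0

    def record_positive_signal(self):
        """E.g. the prospect answered a low-stakes question willingly."""
        self.trust += 1

    def next_question(self) -> str:
        if self.trust >= self.TRUST_THRESHOLD:
            # Criterion met: shift strategy to solicit sensitive data,
            # still phrased diplomatically.
            return "If you're comfortable sharing, may I ask about any pre-existing conditions?"
        return "What do you like most about your current coverage?"

bot = InsuranceBot()
print(bot.next_question())   # low-stakes question while trust is low
for _ in range(3):
    bot.record_positive_signal()
print(bot.next_question())   # sensitive question once the criterion is met
```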
Update May 5, 2017
Today, a week after I first published this article, I received the following email from api.ai.
Two points worth mentioning:
- I have no apps running on api.ai that use their Small Talk feature.
- The Small Talk feature, even after the changes mentioned below, does not approach the capability I proposed above.
Nonetheless, it was great to get the information.
Hi Michael,
We noticed you are using the Small Talk customization feature in your agent, and that's why we are getting in touch.
The Small Talk customization has been updated.
The new version will be even faster and more accurate. It also includes some additional intents that you can customize. The update also includes changes in actions and parameters returned with the responses.
If you use them in your business logic, please review the changes here. If you’re ready, you can apply change right now here. Otherwise, the changes will be applied automatically on May 29, 2017.
If you’d like a more flexible way to implement small talk, then get your source agent and implement the use cases you care about in your timeline.
Regards,
API.AI Team
About the Author
Mike Slinn has been pursuing original AI research under the EmpathyWorks™ name (augmenting machine learning with simulations and event-driven architecture) since 2007.