On stage at re:Mars this week, Amazon showed off an in-development Alexa feature meant to mimic the flow of natural language. Conversation between two humans rarely follows a predefined structure. It goes to strange and unexpected places. One topic segues into another as participants inject their lived experience.
In one demo, a conversation about trees turns into a conversation about hiking and parks. In the context of the company’s AI, Alexa senior vice president and chief science officer Rohit Prasad calls the phenomenon “conversational exploration.” It’s not exactly a proper name for a proper feature. There’s no flipping a switch to suddenly enable conversations overnight. Rather, it’s part of an evolving notion of how Alexa can interact with users in a more human – or perhaps more humane – way.
Smart assistants like Alexa have traditionally offered a much simpler question-and-answer model. Ask Alexa for the weather, and Alexa tells you the weather in a predetermined area. Ask her for the A’s score (or, honestly, probably don’t), and Alexa will tell you the A’s score. It’s a simple interaction, akin to typing a question into a search engine. But, again, real-world conversations rarely turn out that way.
“There’s a whole series of questions that Alexa gets that are very informative. When those questions come up, you can imagine they’re not one-time questions,” Prasad told TechCrunch in a chat at the event. “It’s really something the customer wants to know more about. What we’re concerned about right now is what’s happening with inflation. We’re getting a ton of inquiries to Alexa like that, and it gives you that kind of exploration experience.”
Such conversational features, however, are something a home assistant like Alexa works up to over time. Eight years after Amazon launched it, the assistant is still learning, collecting data and determining the best ways to interact with consumers. Even when something gets to the point where Amazon is ready to show it off on stage, adjustments are still needed.
“Alexa has to be an expert on many topics,” Prasad explained. “It’s the big paradigm shift, and that kind of expertise takes time to achieve. It’s going to be a journey, and with our customer interactions, it won’t be like Alexa knows everything from day one. But these questions can evolve into other explorations where you end up doing something you didn’t think you were.”
Seeing the word “Empathy” in big, bold letters on the stage behind Prasad turned heads – but maybe not as much as what came next.
There are simple scenarios where the concept of empathy could or should come into play in conversations between humans and smart assistants. Take, for example, the ability to read social cues. It is a skill we acquire through experience: the ability to read the sometimes subtle language of faces and bodies. Emotional intelligence for Alexa is something Prasad has been discussing for years. It starts with changing the assistant’s tone so that it responds in a way that expresses happiness or disappointment.
The other side of the coin is determining the emotion of a human speaker, a concept the company has been working to perfect for several years. It’s work that has manifested itself in a variety of ways, including the 2020 debut of the company’s controversial Halo wearable, which includes a feature called Tone that claimed to “analyze energy and positivity in a customer’s voice so they can understand how they sound to others and improve their communication and relationships.”
“I think empathy and affect are well-known ways to interact, in terms of building relationships,” Prasad said. “Alexa can’t be deaf to your emotional state. If you walked in and you’re not in a good mood, it’s hard to say what you should do. Someone who knows you well will react differently. That’s a very high bar for AI, but it’s something you can’t ignore.”
The executive notes that Alexa has already become something of a companion for some users, especially older ones. A more conversational approach would likely only reinforce that role. In demos of Astro this week, the company frequently referred to the home robot as performing an almost pet-like function in the home. However, such notions have their limits.
“That shouldn’t hide the fact that it’s an AI,” Prasad added. “When it comes to the point [where] it’s indistinguishable – which we are very far from – it should still be very transparent.”
A later video demonstrated impressive new text-to-speech technology that uses as little as one minute of audio to create a convincing approximation of a person speaking. In it, a grandmother’s voice reads “The Wizard of Oz” to her grandson. The idea of memorializing loved ones through machine learning isn’t entirely new. Companies like MyHeritage use the technology to animate images of deceased relatives, for example. But these scenarios invariably – and understandably – raise concerns.
Prasad was quick to point out that the demo was more of a proof of concept, highlighting the underlying voice technologies.
“It was more about technology,” he explained. “We are a very customer-obsessed science-based company. We want our science to mean something to customers. Unlike a lot of things where generation and synthesis were used without the right gates, this looks like what customers would like. We need to give them the right set of controls, including who owns the voice.”
With that in mind, there’s no timeline for such a feature – if, indeed, such a feature will ever exist on Alexa. The executive notes, however, that the technology that would power it is fully operational in Amazon labs. If it did arrive, it would again require some of the aforementioned transparency.
“Unlike deepfakes, if you’re transparent about how it’s being used and there’s a clear decision-maker and the customer is in control of their data and what they want it to be used for, I think that’s the right set of steps,” Prasad explained. “It wasn’t about ‘dead grandmother.’ Grandma is alive in this one, just to be very clear about that.”
Asked what Alexa might look like in 10 to 15 years, Prasad explains that it’s all about choice, although it’s less about imbuing Alexa with individual and unique personalities than about offering users a flexible computing platform.
“It should be able to accomplish anything you want,” he said. “It’s not just by voice; it’s intelligence at the right time, that’s where ambient intelligence comes in. It should proactively help you in some cases and anticipate your need. This is where we go deeper into conversational exploration. Everything you’re looking for – imagine how much time you spend booking a holiday [when you don’t] have a travel agent. Imagine how much time you spend buying the camera or TV you want. Anything that requires you to spend time searching should become much faster.”