Making Alexa learn Hindi: ‘We built a rich vocabulary to understand context’
Four years is a long time. That is how long Amazon’s Alexa has been in India, and while it was not the first, it brought voice-based virtual assistants into the reckoning for a much larger demographic of users. Google Assistant had managed that to a limited extent because of the massive Android phone user base, but Apple’s Siri and Microsoft’s now largely abandoned Cortana could not. These assistants had considerable platforms – Siri gets Apple’s device ecosystem including the iPhone, while Cortana was front and centre on millions of Windows PCs for years.
In an exclusive conversation with HT, Puneesh Kumar, who is country manager for Alexa, Amazon India, says there has been a 350% increase in active users in India over the last two years. “In 2021 particularly, the number of requests to Alexa grew about 68%,” he says. He shares more numbers. Every day, Alexa users request 21.6 lakh songs, make 1.7 lakh queries, including about cricket, place 8.6 lakh requests within the Amazon Shopping app and send 2.6 lakh smart home control commands. That’s every day.
Estimates released in 2021 by the Mobile Marketing Association (MMA) and digital agency Isobar indicate voice search queries are growing at 270% per year in India. In 2020, Dentsu Aegis Network India’s Recogn had released the Voice Technology in India report, which indicated users think Google Assistant is smart with replies, but Alexa is more reliable, including with managing smart home devices.
Making space in classrooms
Things have gone much beyond music in the last four years in India. Education has seen ingenious uses of voice assistants, more so in smaller towns, where children don’t always have access to the same tools and teaching quality as in larger cities. Kumar talks about unique instances where Alexa was deployed in schools in smaller towns in India. That got Amazon thinking. “We contemplated how can we use Alexa in schools in remote areas,” says Kumar. 
A schoolteacher in the Warud district of Maharashtra, Amol Bhuyar, bought an Echo smart speaker and then, with the help of the school headmistress, gathered enough funds to make a mannequin inside which the Echo was installed. It ran on a power bank and connected to the internet using a mobile hotspot. The mannequin, dressed in jeans and a striped black and white T-shirt, stands a bit taller than most students. The result – students developed fluency in English (the teacher had configured the Echo to listen and converse in English), something that may not have been easy otherwise. It opened a world of conversations about subjects including history, geography, general knowledge, and science.
“Interacting with Alexa in English was helping them improve their conversation skills. Many of these schools did not have an English teacher. This also increased in kids having more confidence in asking questions,” says Kumar, talking about how Amazon worked with the Bastar District administration to add Echo smart speakers in classrooms in 40 schools in the tribal belt of Lohandiguda.
The example set by Bhuyar, the schoolteacher from Warud, was also implemented last year at a primary school in Cheriyakkara in the Kasaragod district of Kerala. This mannequin is the same height as most children in the class and wears the school dress too – a smart green kurta with Chinese collar, and white trousers. Kumar says teachers often say children feel shy asking certain questions, but they don’t hesitate asking Alexa about those matters. It’s that feeling of friendship being fostered between technology and children.
India was one of Alexa’s first global markets
The last four years have seen Alexa evolve tremendously as a voice assistant in the Indian context. “When we first launched Alexa in India, we had only launched in the US, UK and Germany. That too with English. 
For us, it was all around smart control, smart speakers, nascent category, and early adopters,” Kumar points out. Amazon had a first-mover advantage in many respects. While Assistant was part of Android phones, it was only after Amazon’s aggressive push with Alexa in 2018 and beyond that Google reacted with the Nest smart speakers. Apple responded much later with Siri localisation for India.
Music remains a key element for voice assistants and early adoption. Most popular is kids’ music (40% of total music requests every day), followed by devotional (25%) and regional (15%), while Bollywood and international music make up the remaining 20%. While Google Assistant (and Apple Siri to a certain extent) are still largely associated with phones, Amazon never had that limitation to overcome. Alexa in homes is more of a social tech experience, unlike a phone, for example, which is likely to be an individual device. Every member of the household can interact in their own way with Alexa, which is available across smart devices.
Four years on: Expanding multilingual prowess
India has driven the language evolution for Alexa. First was the understanding of a mix of English and Hindi, which we often revert to – this is something Amazon’s data also indicates. “In the initial days, you had to go to settings, you had to change your language, whereas now everything is seamless,” says Kumar, talking about Alexa’s ‘Hinglish’ understanding skills.
How complicated was it to get Alexa to learn Hindi? “It was actually quite difficult. Because as we classify languages, Hindi comes in the form of a complex language,” Kumar points out. The challenge was with artificial intelligence (AI) processing the language. Amazon used a skill called Cleo and asked customers to help teach Alexa the Hindi language – thousands of Echo users answered the call. “How do you say, ‘Weather kya hai’? Will you say, ‘Taapmaan kya hai’? or ‘Mausam kaisa hai’? With AI, Alexa started understanding that all of this means ‘what’s the weather’. 
We did support for Hindi and English building a bilingual model,” he points out. The deep neural networks and multi-dialect training improved Alexa’s understanding of Hindi. “We also did build a rich vocabulary to understand context,” Kumar adds. Amazon’s engineers found that unlike Latin script, Devanagari had the advantage of phonetically consistent representation of words. Translation was done for millions of pieces of content, including the music and video catalogue. “Our teams also worked on the cultural and sub-cultural context within Hindi, especially in Uttar Pradesh and other parts of India where Hindi is used more, just so that the context switching was much clearer,” he says.
More than 60% of Alexa users in India have set it to the multilingual mode, rather than just English or Hindi. Amazon gets thousands of requests made to Alexa every day where people are saying “Alexa, mujhe joote dikhao” (Alexa, show me some shoes), “Mera samaan kahan hai?” (where is my stuff?), even “What are the cashbacks on my bill” and “Meri car lock kardo” (lock my car). This is perhaps the best illustration of how Alexa’s use cases have expanded to include shopping, smart mobility, and digital payments.
False positives and negatives still worry scientists
The weakest link that voice assistants must deal with is the often-limited patience of the human in the equation. If you have interacted with Alexa, Siri, or Google Assistant, you may have noticed that they often do not immediately understand certain words or names. “The problem is, the word, phonetically can get misunderstood with other names that may be very similar sounding. We’ve tried a lot to look at false rejects and false accepts,” admits Kumar. He says that from the standpoint of natural language understanding (NLU), they have been able to reduce false accepts and rejects by 25%. Kumar believes that the more people use Alexa, the more training data they can work with. 
“With more people, we get millions and millions of interactions every week, and all of that is helping Alexa get smarter in understanding when people are probably not talking to her or not using a name or what they mean in a particular context,” he says.
Privacy and voice assistants: Fears and controls
There is lingering concern about voice assistants, around the data they collect, how much of our conversations they hear and how our data is stored. Amazon has often talked about the privacy controls that give users control over their voice history and data. There is now a hub in the Privacy Settings in the Alexa app on your phone. The options include the ability to hear and delete voice recordings. “You can set how long the voice recording should be stored, and you have a lot of control over what you can do with your own data,” says Kumar. Echo devices also have a manual microphone-off button, which electronically disconnects the mic.
Alexa from the device standpoint: Undeniable advantage
Alexa, as an assistant, will only be as good as the device it is a part of. That is where the ecosystem gives it a big advantage. Apple’s Siri is available on the iPhone, iPad, Mac and the HomePod smart speakers, for instance. Google Assistant still relies largely on the Android phone user base for numbers, with availability also on the Nest smart speaker line-up and Android TV. For Alexa, the canvas is much wider: the Echo line-up of smart speakers, Echo Show smart displays, Fire TV devices, the Amazon app on Android phones and a long list of third-party brands that have Alexa built-in and Works With Alexa labelling on certain products. These include laptops made by HP, TVs by OnePlus, Samsung and LG, smartwatches by Amazfit, Boat and Noise, audio products by Bang & Olufsen, Marshall and Bose, as well as smart home products by Qubo (security cameras), Ecobee (smart switches) and Syska (smart plugs). 
Amazon’s sales data shows the Fire TV Sticks and the Echo Dot are the best-selling devices in the category. “We’ve also seen an overlap of customers between Fire TV and Echo speakers. There are bundles, so people are buying Alexa device, but they are also getting a smart bulb. And now that they have got the smart bulb, they are trying to control it from different rooms or they are trying to build a smart home around it,” observes Kumar.
ABOUT THE AUTHOR
Vishal Mathur is Technology Editor for Hindustan Times. When not making sense of technology, he often searches for an elusive analog space in a digital world.