“Alexa, can you tell me the impact of the wholesale shift to voice search and voice communication over the Internet?”
Amazon’s wildly popular personal assistant, Alexa, probably cannot answer that question for you. And even she doesn’t perceive how she is making us dumber and taking our choices away.
The world is surely moving from text to voice as the primary interface on the Internet. The rapid rise of Amazon’s Echo (and its smaller version, the Echo Dot) personal assistant device was the biggest story of the 2016 holiday shopping season. As of September 2017, Amazon had sold 15 million Echos and Google had sold five million of its own personal assistant device, the Google Home. This is impressive for a category that had not existed just a year earlier.
Even more significantly, we are switching to voice as the means of communicating with our smartphones. More than 20% of mobile searches were conducted via voice in 2016, according to Google: roughly a 35-fold increase in voice search since 2008. Google also found that about two-thirds of its users conduct voice search via mobile phone several times per day and that roughly half of its users use voice and text search interchangeably.
Such growth has been enabled by dramatic improvements in voice recognition, through use of powerful artificial intelligence systems that utilize machine learning. We are now in a positive feedback loop for voice: As more people talk to their smartphones or home assistants, more data become available to companies such as Amazon, Google, and Apple to feed to their personal assistant systems. As of May 2017, Google’s speech recognition error rate was 4.9%, down from 23% in 2013.
Businesses have recognized the shift in accuracy and customer engagement, and are piling in. Amazon now boasts more than 15,000 Alexa “skills,” which are capabilities that allow customers to make personalized requests. For example, travel search providers let you plan vacations via Alexa using voice commands; Pizza Hut lets you order pizza; Nissan and Hyundai let Alexa owners start their cars’ engines and set their temperatures; Capital One lets customers check their bank balances; and Campbell Soup Company supplies recipe ideas.
The shift to voice search and voice communication will surely make many things more convenient for us, but will dramatically reduce our online choices. The reason for this is simple: When results are spoken back to us, we will receive only a few options, because humans cannot absorb 10 results in succession and adequately choose between them. We can’t remember them all. This switch in information density has profound implications, and voice search can subvert our purchasing choices in subtle ways.
Prior to the advent of the Internet, when we looked at the yellow pages, we had many pages of options. When we searched online, we had even more options but tended to only react to those on the first page. Increasingly, those first-page results are sold to the highest bidder. On mobile phones, the searches mean even fewer options, and the paid ones utterly dominate the screen.
In the results of a voice search, we are usually down to only two or three options. People just can’t remember more information presented to them vocally. So your search for “best hotel in San Francisco” will yield only a few results. The response to “I want to find a pizza place in Palo Alto” might not include the best pizza joint in town, because it has not bought its spot in the search results.
Most worryingly, this will further consolidate power in the hands of the big providers, such as Amazon, Google, and Apple.
When we ask Alexa to add olive oil to our shopping cart, we are ceding our choice to Amazon. Maybe we prefer Californian olive oil, because we know it is less likely to be adulterated. Or maybe we would rather buy the lower-priced of two favorite brands. With voice, which olive oil goes into the cart becomes Amazon’s decision. Unsurprisingly, research firm L2 found that Amazon is more likely to put its own proprietary label products into your shopping cart.
In theory, we could ask for more voice results to get richer searches. Or perhaps voice assistant systems will eventually be improved to include capacities such as following up to ask us whether we want, for example, a particular type of pizza.
But even if that happens, the world of voice is taking us back a century in terms of information density. Talking to a voice assistant is a lot like asking a friend for restaurant recommendations, except that friend is a giant technology company that makes its money from the recommendations it provides us. That doesn’t sound very friendly.
Vivek Wadhwa is a distinguished fellow at Carnegie Mellon University’s College of Engineering and Alex Salkever is an author, public speaker, and former vice president of marketing at Mozilla. Together they authored The Driver in the Driverless Car: How Our Technology Choices Will Create the Future.