Amazon’s New Alexa Answers Crowdsourcing Platform Risks Abuse

Did Albert Einstein wear socks? How do you prevent tears when cutting an onion? Did Burt Reynolds marry Sally Field? What makes wasabi green? The average person might not know the answer to these questions, but Amazon Alexa, through the new Alexa Answers portal that was announced Thursday, might. Well, more accurately, an Alexa user could.

An online community where anyone who logs in can suggest answers to user-supplied questions posed to the voice-activated Alexa A.I. assistant, Alexa Answers is designed to answer the tough questions that can’t already be answered by the voice-enabled assistant. Once the answers are submitted, they are vetted for accuracy, scored, and if they are good enough, make their way back to Alexa users.

But is crowdsourcing Alexa’s smarts a good idea? From a Microsoft chatbot subverted by racist trolls to Yahoo Answers, a similar service to Alexa Answers that has become notoriously rife with bad information, the past few years have been littered with cases of user-generated data systems gone bad. So it’s not hard to imagine the worst-case scenario: an Alexa-backed smart speaker blithely spouting fake news, dangerous conspiracy theories, or white supremacist talking points.

Describing Alexa Answers to Fast Company, Bill Barton, Amazon’s Vice President of Alexa Information, struck an optimistic tone. “We’re leaning into the positive energy and good faith of the contributors,” he said. “And we use machine learning and algorithms to weed out the noisy few, the bad few.”

Experts on data use and its impacts are markedly less cheery.

“We have plenty of examples of why this is not going to play out well,” says Dr. Chris Gillard, who studies the data policies of Amazon and other tech companies at Macomb Community College near Detroit. Crowdsourcing data, and then using that data in training the Alexa algorithm, he says, presents “pitfalls that Amazon seem intent on stepping right into.”

The race to beat Google

Better assistants and smart speakers drive sales of accessories like voice-activated lights. But Google’s decades in the search business seem to have given it an advantage over Amazon when it comes to understanding queries and returning data. Google’s smart speaker has steadily gained market share against the Echo, and Google Assistant has almost uniformly outperformed Alexa in comparison tests.

In fact, almost all of the questions above, from Einstein’s socks to wasabi’s color, were awaiting answers on Alexa Answers but can already be answered by Google Assistant. Google’s answers come from its search engine’s results, featured snippets, and knowledge graph. Amazon is trying to use crowd-supplied answers to catch up in this space.

“Amazon’s not Google,” says Dr. Nicholas Agar, a technology ethicist at Victoria University of Wellington, New Zealand. “They don’t have Google’s [data] power, so they need us.”

Beyond just providing missing answers to individual questions, data from Alexa Answers will be used to further train the artificial intelligence systems behind the voice assistant. “Alexa Answers is not only another way to expand Alexa’s knowledge,” an Amazon spokesperson tells Fortune, “but also… makes her more helpful and informative for other customers.” In its initial announcement of Alexa Answers, Amazon referred to this as Alexa “getting smarter.”

Money for nothing, facts for free

As important as Alexa Answers might be for Amazon, contributors won’t get any financial compensation for helping out. The system will have human editors who are presumably paid for their work, but contributed answers will be rewarded only through a system of points and ranks, a practice known in industry parlance as ‘gamification.’

Agar believes this will be effective, because Amazon is leveraging people’s natural helpfulness. But he also thinks a corporation leveraging those instincts should give us pause. “There’s a difference between the casual inquiry of a human being, and Amazon relying on those answers,” he says. “I think it’s an ethical red flag.”

Gillard also thinks Amazon should pay people to provide answers, whether by using its own workers or by partnering with an established fact-checking group.

Amazon certainly has the infrastructure to do it. The ecommerce giant already runs Mechanical Turk, a ‘gig’ platform that pays “Turkers” for performing small, repetitive tasks. That platform would seem well suited to supplementing Alexa’s training.

But Gillard believes that relying on a ‘community’ model insulates Amazon if Alexa starts spouting bad or offensive answers, based on crowd input. “I think not paying people lets you say, well, it was sort of the wisdom of the crowd,” he says. “If you pay people, you’re going to be accused of bias.”

A gamified incentive system, though, is not without its own risk. In 2013, Yahoo Answers disabled part of its user voting system. That’s allegedly because some participants created fake accounts to upvote their own (not necessarily accurate) answers. (Source: Quora. Also, this is a good example of how crowd-sourcing information impacts reliability.)

Troll stoppers

The biggest question facing Alexa Answers is whether Amazon can effectively prevent abuse of its new platform. Amazon declined to answer questions from Fortune about the precise role of human editors in the system. But their presence alone represents an acceptance that automated systems in their current state can’t reliably detect offensive content, or evaluate the accuracy of facts.

Amazon has never grappled with these challenges as directly as companies like Facebook and Twitter, though according to some critics, it has failed even to consistently detect fake reviews in its own store. Barton told Fast Company that Amazon will try to keep political questions out of the system, a subtle task Gillard says will likely fall to humans. “A.I. can’t do those things,” he says. “It can’t do context.”

Automated systems can easily detect and block individual offensive terms, but even that has its downsides. In a test, this reporter attempted to reference the ’90s rock band Porno for Pyros when suggesting an Alexa Answer. The answer was rejected, not because of inaccuracy, but because of the word ‘porno.’ According to a notification, “Alexa wouldn’t say that.”

Not everything has an answer

Barton told Fast Company that “we’d love it if Alexa can answer any question people ask her,” but that’s clearly impossible. Alexa cannot be expected to know, for instance, what the meaning of life is, and crowdsourcing answers to questions that are enigmas could make the entire system more fragile. In a 2018 study, researchers found that search queries with limited relevant data, which they called “data voids,” were easier for malicious actors to spoof with fake or misleading results.

And trolls aren’t the only risk to Alexa’s mental hygiene. Even well-intentioned questions can wind up nonsensical, if Alexa doesn’t properly interpret the questioner’s speech. For example, the question “What is a piglet titus?” appeared on Alexa Answers Friday morning. It seems likely the user actually asked “What is Epiglottitis?” (Answer: a rare throat condition). If enough users tried to answer the nonsense question—perhaps Winnie the Pooh fans, or users hungry for points—it could muddy the data pool, instead of improving it.

It’s unclear how Alexa’s overall performance might be impacted by messy or malicious data; those answers are a ways away yet. But it’s fair to wonder whether, after all the stumbles of similar systems, Amazon is taking the risks of crowdsourced answers seriously.