Facebook creates the most ‘human’ chatbot yet

Chatbots have enormous potential applications in customer service and sales. And in the battle to build a better chatbot, Facebook just scored a big win.

Researchers at the social media giant have created a chatbot that can hold extended, open-ended conversations that are more humanlike than any existing software, the company said Wednesday.

“This is the first time a chatbot has learned to blend key conversational skills—including the ability to assume a persona, discuss nearly any topic, and show empathy,” Facebook said in a blog post.

In fact, judges recruited through Amazon’s Mechanical Turk service for freelancers said they liked talking to Facebook’s chatbot almost as much as speaking to a real human being. When short conversations with the chatbot were pitted against similar human-to-human dialogues, evaluators preferred the chatbot 49% of the time. “This indicates that we are pretty close to human-level performance,” Stephen Roller, one of the Facebook engineers who worked on the project, said.

Most commercially available chatbots, such as familiar digital assistants like Amazon’s Alexa or Apple’s Siri, are designed to be adept at dialogues around a set of specific tasks: telling you the weather forecast or giving you directions to the nearest post office. These are the sorts of “skills” Amazon is constantly adding to Alexa, for instance.

The type of chatbot the Facebook researchers built is different. Called an “open domain chatbot,” it is designed to hold a conversation on any topic. “It can talk to you about literally anything, whether that is what you had for breakfast this morning or what cereals are healthiest to feed your kids to your favorite sports team,” Roller said.

The chatbot, which Facebook calls Blender because of the way it can “blend” various skills needed for successful conversations, was also tested against dialogues produced by a previous record-holding chatbot called Meena that was created earlier this year by Facebook’s cross–Silicon Valley rival Google.

Blender blew Meena out of the water, with 67% of evaluators judging Blender more human-sounding and 75% saying they’d rather have a long conversation with Blender than Meena.

Although powerful chatbots have obvious commercial applications, Facebook said it had no immediate plans to turn Blender into a product. The company does, however, already have chatbot interfaces that third parties can use inside its Messenger messaging application, and its WhatsApp messaging platform also allows businesses to use chatbots. “This is purely research at this point,” Emily Dinan, another research engineer who worked on the project, said. “We are not currently looking to productionize it.”

Facebook’s researchers cautioned that Blender still “has many weaknesses compared to humans,” including instances in which it contradicts prior statements, repeats itself or even invents factually inaccurate information, issues that are likely to become more apparent the longer a conversation lasts. The benchmark studies were conducted using dialogues that consisted of 14 “turns,” or back-and-forths between interlocutors.

The chatbot can remember information only stretching back several conversational turns, and so is more likely to repeat itself in longer conversations, Dinan said.
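The limited memory Dinan describes can be pictured as a sliding window over the conversation. The Python sketch below is an illustration only, not Facebook’s code: the function name `visible_history` and the window size of four turns are assumptions chosen for the example. Anything said before the window is never shown to the model, which is one way repetition can creep into longer chats.

```python
from typing import List


def visible_history(turns: List[str], max_turns: int = 4) -> List[str]:
    """Return only the most recent `max_turns` turns of a dialogue.

    Hypothetical helper: the window size is an assumption for illustration.
    """
    if max_turns <= 0:
        return []
    return turns[-max_turns:]


dialogue = [
    "Human: I have a dog named Rex.",
    "Bot: Rex is a great name!",
    "Human: He loves the park.",
    "Bot: Parks are fun for dogs.",
    "Human: Do you like hiking?",
    "Bot: I do, especially in the fall.",
]

# With a four-turn window, the first two turns (including the dog's name)
# are no longer visible when the next reply is written.
window = visible_history(dialogue, max_turns=4)
```

In this toy example the dog’s name has already dropped out of view by the seventh turn, so a model limited in this way could plausibly ask about it again.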

The Blender chatbot has so far been trained to operate only in English, and Dinan acknowledged that other languages may make it harder to create a chatbot that can navigate formal and informal registers and honorifics with the same fluency as humans.

Most recent breakthroughs in natural language processing, the kind of A.I. that can analyze and manipulate language, have come from very large algorithms that learn the relationships between words by being trained on massive sets of text examples.

The new Facebook chatbot is no exception. It uses a neural network, a kind of artificial intelligence software loosely modeled on the way the human brain works, with 9 billion variables, known as parameters. The network is so large that it cannot fit on a single computing device. Instead, its workload has to be distributed among several machines that process information in parallel. (The company also created a smaller version with 2.7 billion variables.)
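A rough back-of-the-envelope calculation shows why a network of this size strains a single device. The sketch below is not from Facebook: it assumes each variable (a model parameter) is stored in 2 bytes, as in half-precision arithmetic, and that a device has 16 GB of memory. Both figures are illustrative assumptions, and the calculation ignores the extra working memory a model needs while running.

```python
import math


def model_size_gb(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    """Approximate memory needed just to store the parameters, in gigabytes."""
    return num_parameters * bytes_per_parameter / 1e9


def devices_needed(size_gb: float, device_memory_gb: float) -> int:
    """Smallest number of devices whose combined memory holds the parameters."""
    return math.ceil(size_gb / device_memory_gb)


DEVICE_MEMORY_GB = 16.0  # assumed device size, not a figure from the article

large_gb = model_size_gb(9e9)    # the 9 billion variable model
small_gb = model_size_gb(2.7e9)  # the 2.7 billion variable model

large_devices = devices_needed(large_gb, DEVICE_MEMORY_GB)
small_devices = devices_needed(small_gb, DEVICE_MEMORY_GB)
```

Under those assumptions the larger model needs about 18 GB for its parameters alone, more than one 16 GB device can hold, while the smaller version needs about 5.4 GB and fits on one.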

The chatbot, which uses a software design pioneered by Google in 2017, was also trained on a very large number of examples. In this case, Facebook used 1.5 billion examples of dialogue taken from Reddit.com discussion groups to provide its algorithm with an initial grounding in how conversational language works.

But crucially, Facebook says that its real innovation was then fine-tuning the software on four smaller data sets. One, called the Wizard of Wikipedia, trains the chatbot to convey factual information from the online encyclopedia, display expert knowledge, and answer specific factual questions. Another, called PersonaChat, teaches the algorithm how to emulate a certain character and add information about that character’s personal background into a dialogue. A third, called Empathetic Dialogues, as the name suggests, helps the algorithm learn how to recognize emotions and respond empathetically.

After the chatbot has been trained on each of these data sets individually, its abilities are perfected using a fourth, new data set dubbed Blended Skill Talk, which teaches it to integrate all three skills from the previous training. This enables the chatbot to recognize and adapt to changes in its human interlocutor’s tone, such as switching from joking to serious. It also learns when it’s most appropriate to mention that Little Rock is the capital of Arkansas and when it would be better to talk about how much it likes German shepherds or say how sorry it is that your goldfish died.

Previously, tech companies have sometimes run into trouble when making open domain chatbots publicly available. Microsoft researchers released an open domain chatbot called Tay in 2016. The researchers thought the chatbot would perfect its conversational skills through interaction with users. Instead, users soon succeeded in teaching the chatbot to make racist remarks.

Dinan said that with Blender, the researchers were well aware that it might have learned racist or sexist language, particularly from its pre-training on Reddit dialogues. But she said that the researchers found that fine-tuning the model on the Blended Skill Talk data reduced the risk of the chatbot making offensive comments. She said that Facebook had also researched systems that would automatically detect offensive language and screen it out, which could be applied to Blender’s output in the future.
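The idea of screening a chatbot’s output can be sketched as a check that sits between the model and the user. The Python example below is a simplified stand-in, not Facebook’s system: it swaps in a plain word list where a real detector would likely be a trained classifier, and the names `is_offensive`, `safe_reply`, `BLOCKED_TERMS` and the fallback message are all hypothetical.

```python
from typing import Callable

# Placeholder terms for illustration; a real system would not rely on a word list.
BLOCKED_TERMS = {"badword", "slur"}
FALLBACK = "I'd rather not talk about that. What else is on your mind?"


def is_offensive(text: str) -> bool:
    """Flag text containing any blocked term (a stand-in for a trained detector)."""
    words = {w.strip(".,!?;:\"'").lower() for w in text.split()}
    return not words.isdisjoint(BLOCKED_TERMS)


def safe_reply(generate: Callable[[str], str], message: str) -> str:
    """Generate a reply, then replace it with a fallback if it is flagged."""
    reply = generate(message)
    return FALLBACK if is_offensive(reply) else reply


# Two toy "chatbots" standing in for a real model.
polite = safe_reply(lambda m: "Nice to meet you.", "hi")
screened = safe_reply(lambda m: "You are a badword!", "hi")
```

Here the polite reply passes through unchanged, while the flagged one is swapped for the fallback before the user ever sees it.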
