How Facebook Uses Artificial Intelligence to Teach Computers to Read
Each day on Facebook, millions of people comment about baby photos, discuss presidential hopeful Donald Trump’s latest musings, or share their thoughts on the latest blockbuster movie.
Put simply, Facebook users love to communicate on the social network, and that communication translates to a lot of text. All that text means that Facebook has tons of data revealing how people converse on the social network.
But Facebook (FB) doesn’t just want to store all that text in its vast data centers. It wants to analyze the way people communicate with one another so that it can add new features to its services or even automatically remove offensive posts that might occur when a high-profile celebrity like Justin Bieber posts a selfie.
To more easily sift through all this language data, Facebook has turned to artificial intelligence technologies. On Wednesday, the social network revealed its new AI-powered software system called Deep Text, which Facebook says can analyze thousands of posts a second in over 20 different languages while understanding the gist of what each post is attempting to convey.
Through this computerized version of speed-reading and basic comprehension, Facebook can build new tools and features like so-called chatbots that can potentially interact with a user when that person writes to his or her friends.
Deep Text was created by Facebook’s recently formed Applied Machine Learning team that developed the company’s artificial intelligence tool FBLearner Flow. FBLearner Flow is essentially Facebook’s core artificial intelligence tool that powers numerous data-intensive products, like Facebook’s news feed or language translation services.
For Deep Text to sift through hundreds of thousands of words per second and understand the basics of each post it ingests, it relies on an artificial intelligence technique called deep learning. Deep learning has gained popularity in recent years, with companies like Google (GOOG) and Nvidia (NVDA) using the technique to train computers to recognize objects in images.
With deep learning software, which requires enormous amounts of data and lots of computing power to operate, computers can be trained to recognize a cat in a photo without human help. Using neural networks tailored for deep learning, which are basically software systems designed to loosely mimic the way the human brain learns, computers can essentially break down an image of a cat into its bare elements. The computers can then figure out how all those elements of the image are related to each other and how they form a cat when pieced together.
Although deep learning has proven its worth at teaching computers to see, so to speak, the technique hasn’t had much success when attempting to learn text and speech, explained Hussein Mehanna, Facebook’s engineering director of its core machine learning groups.
But it so happens that two of Facebook’s top AI specialists, Ronan Collobert and Yann LeCun, have been researching how to apply deep learning to text recognition, which led Facebook’s AI team to incorporate those techniques inside the social network itself.
In traditional natural language processing, the branch of AI used to teach computers to learn text, engineers have to do a lot of work prepping the text data into the right format before the computers can start learning. It’s a lot of engineering legwork to fix misspellings, clean up the text, and in Facebook’s case, have linguists on staff to prep each text in various languages. Additionally, any mistake an engineer makes in this pre-processing stage becomes magnified when the engineers run the algorithms that train the computers, Mehanna noted.
With Deep Text, Facebook can feed its computers raw text straight from a user’s comments and posts, and the machines should be able to discern relationships between words on their own by breaking down the texts into individual letters and even exclamation marks.
“When you let the machine learn from characters, the machine will learn to overcome misspellings by itself,” Mehanna said. “You don’t have to include that as a factor, you just let the machine figure it out.”
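To see why working from characters can make misspellings less of a problem, here is a minimal sketch. It is not Facebook's method, which the company has not detailed here. It is a simple stand-in that compares words by the short character sequences they share, and the function names `char_ngrams` and `similarity` are made up for illustration.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, padded with boundary markers."""
    padded = f"<{word.lower()}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarity(a, b, n=3):
    """Jaccard overlap between the character n-gram sets of two words."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    union = grams_a | grams_b
    if not union:
        # Both words were too short to produce any n-grams.
        return 0.0
    return len(grams_a & grams_b) / len(union)


# A misspelling still shares many character sequences with the correct word,
# while an unrelated word shares none.
print(similarity("tomorrow", "tommorow"))
print(similarity("tomorrow", "bicycle"))
```

The misspelled pair scores well above the unrelated pair, even though a word-level comparison would treat "tommorow" as a completely unknown word. A deep learning system learns far richer character patterns than this, but the intuition is similar.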
When a person writes something, that person is not simply “writing random text,” Mehanna stressed, explaining there are rules and patterns in language that the computers learn to discover on their own.
If the Deep Text software gets fed enough sentences containing the words “taxi” and “ride,” for example, it learns that those two words are related to each other. No human was needed to tell the computer that those two words are connected.
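The "taxi" and "ride" idea can be illustrated with a rough sketch that simply counts how often two words show up in the same sentence. This is an assumption-laden toy, not how Deep Text works internally: the sample sentences and the helper `cooccurrence_counts` are invented for illustration, and a real system would learn from vastly more text.

```python
from collections import Counter
from itertools import combinations

# Invented sample sentences standing in for raw posts.
sentences = [
    "i need a taxi ride to the airport",
    "the taxi ride was cheap",
    "my bike ride was long",
    "we shared a taxi ride home",
]


def cooccurrence_counts(sentences):
    """Count, for every pair of words, how many sentences contain both."""
    counts = Counter()
    for sentence in sentences:
        # Sort the unique words so each pair is always stored in the same order.
        words = sorted(set(sentence.split()))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts


counts = cooccurrence_counts(sentences)
print(counts[("ride", "taxi")])  # sentences containing both words
print(counts[("bike", "taxi")])  # sentences containing both words
```

No one labeled any of these sentences. The link between "taxi" and "ride" falls out of the counts alone, which is the sense in which this kind of learning is unsupervised.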
However, humans aren’t completely out of the picture for Facebook’s Deep Text software. Although deep learning can discover patterns and relationships between words on its own in an unsupervised manner, humans are still needed to refine the data further.
With deep learning, the computers learn on their own that “taxi” and “ride” are typically grouped together, but they still need human help to learn additional context. For example, a human needs to train a computer to recognize the difference between the sentences “I need a taxi” and “I just got out of a taxi.” The Deep Text software system is essentially a combination of unsupervised learning that trains computers to recognize basic relationships between words, and supervised learning, which further trains the computers to recognize more complex patterns in text.
For example, if someone writes a Facebook blog post about watching a play that took place at a church, the computer needs to recognize that the play is the most important topic in the post, not the church.
“You have to teach machines what matters,” Mehanna said.
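The supervised step, in which humans supply labeled examples so the machine can tell “I need a taxi” apart from “I just got out of a taxi,” might look something like the sketch below. It uses a small naive Bayes classifier rather than deep learning, purely to keep the example short. The labels `wants_ride` and `no_ride`, the training sentences, and the functions `train` and `classify` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical hand-labeled examples; a human decided which label each gets.
LABELED = [
    ("i need a taxi", "wants_ride"),
    ("can someone get me a taxi", "wants_ride"),
    ("i need a ride now", "wants_ride"),
    ("i just got out of a taxi", "no_ride"),
    ("just got home from my taxi ride", "no_ride"),
    ("i got out of the cab", "no_ride"),
]


def train(examples):
    """Count words per label and collect the overall vocabulary."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts, vocab = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total_examples)
        denom = sum(counts.values()) + len(vocab)
        for word in text.split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((counts[word] + 1) / denom)  # add-one smoothing
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(LABELED)
print(classify("i need a taxi to the airport", model))
print(classify("i just got out of a taxi", model))
```

Both sentences mention a taxi, so word association alone cannot separate them. It is the human-provided labels that teach the classifier that words like "need" signal a request while "just got out" signals the opposite.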
Mehanna said that Facebook is exploring how to incorporate its Deep Text software into new products. The social network is currently testing the software to recognize when a Facebook user may be interested in selling or buying a product, he said. For example, if someone were to write a post to friends that said “I want to sell my bicycle,” the Deep Text system should recognize the person’s intent to sell a product, which would then prompt a so-called software bot to jump in and ask if the person wants to sell the bike on Facebook.
Additionally, Facebook could use the tool to make it easy for people to hail taxi rides all within the social network. It’s just another example of how companies like Facebook, or even Baidu (BIDU) and Microsoft (MSFT), are using AI technologies to better understand consumers and target them with specific services.