Can A.I. help Facebook cure its disinformation problem?

In addition to testing American democracy, November’s election and the subsequent storming of the U.S. Capitol put social media to the test. Facebook and its rivals have spent years creating technology to combat the spread of disinformation, violent rhetoric, and hate speech. By some measures, the systems did better than ever at filtering out hundreds of millions of inflammatory posts. But ultimately the technology failed, allowing many similar posts to slip through.

In the days following the election, unsubstantiated claims of widespread voting irregularities were the most shared content on Facebook, according to data analytics company CrowdTangle. At the top of the list were then-President Donald Trump’s posts falsely claiming there had been thousands of “fake votes” in Nevada and that he had won Georgia. Meanwhile, the top news stories on Facebook preceding the election were from far-right news sites such as Breitbart and Newsmax that played up specious voter fraud claims. Such falsehoods set the stage for the Capitol’s storming.

No company has been as vocal a champion of using artificial intelligence to police content as Facebook. CEO Mark Zuckerberg has repeatedly said, as he did in 2018 congressional testimony, that “over the long term, building A.I. tools is going to be the scalable way to identify and root out most of this harmful content.”

Translation: The problem is so big that humans alone can’t police the service.

Facebook has invested heavily to try to make good on its tech-centric solution. And there is some evidence of progress. For instance, of all the terrorism-related content it removes, Facebook says its A.I. helps find 99.8% of those posts before users flag them. For graphic and violent content, the number is 99.5%. And for hate speech, it’s 97%. That’s significantly better than three years ago, largely because of improvements in machine learning.

But success can be subjective. Facebook has a blanket policy against nudity, for instance. Yet the company’s independent Oversight Board, a sort of appeals court for users unhappy with Facebook’s moderating decisions, recently faulted it for blocking images in breast cancer awareness campaigns. Regulators want Facebook to block terrorist videos that are being used to radicalize young recruits, but not block those same videos when used on news programs. It’s a distinction A.I. struggles to make.

The meaning of language depends on context too. Studies show humans can identify sarcasm only about 60% of the time, so expecting A.I. to do better is a stretch, says Sandra Wachter, a tech law professor at the University of Oxford’s Internet Institute.

Eric Goldman, a Santa Clara University law professor, puts it another way: “One problem A.I. can never fix is the problem of context that doesn’t come from within the four corners of the content itself.”

Not that Facebook isn’t trying. It’s currently running a competition encouraging computer scientists to develop A.I. capable of detecting hateful memes. Memes are difficult because they require understanding of both images and text, and often a large amount of cultural information. “We recognize it is a tricky problem, which is why we published the data set and challenge, because we need to see innovation across the industry,” says Cornelia Carapcea, a product manager who works on Facebook’s A.I. moderating tools.

Misinformation, the harmful content that has most preoccupied Americans lately, is a challenge for A.I. because outside information is required to verify claims. For now, that requires human fact-checkers. But once misinformation is identified, A.I. can help check its spread. Facebook has developed cutting-edge A.I. systems that identify when content is essentially identical to something that’s already been debunked, even if it has been cropped or screenshotted in an attempt to evade detection. The systems can also now spot similar images and synonymous language, which in the past may have eluded automated filters.
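
To give a rough sense of how near-duplicate matching can work in principle, here is a minimal sketch of a classic technique called average hashing. It is not Facebook's method, which the company describes only in general terms; the function names, the tiny 4x4 grayscale "images," and the `max_distance` threshold are all assumptions made for illustration.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_near_duplicate(pixels_a, pixels_b, max_distance=2):
    """Treat two images as near-duplicates if their hashes differ in few bits.

    max_distance is a hypothetical threshold chosen for this toy example.
    """
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


# Hypothetical 4x4 grayscale image standing in for an already-debunked post.
debunked = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]

# The same image, uniformly brightened, as a re-saved screenshot might be.
reposted = [[min(255, value + 30) for value in row] for row in debunked]

# A different image with a checkerboard pattern.
unrelated = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]

print(is_near_duplicate(debunked, reposted))
print(is_near_duplicate(debunked, unrelated))
```

Because each pixel is compared with the image's own average brightness, a uniform brightness change leaves the hash unchanged. Cropping, synonymous wording, and the other evasions described above require far more sophisticated models than this sketch.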

These systems helped Facebook slap warnings on over 180 million pieces of content in the U.S. between March 1, 2020, and Election Day. If that’s a sign of A.I.’s success, it is also an indication of the problem’s scale. A.I. works best when the data it’s analyzing changes little over time. That’s not the case for hate speech or disinformation. What results is a cat-and-mouse game between those disseminating malicious content and Facebook’s systems.

Some blame Facebook for raising public expectations of what A.I. can achieve. “It is in their self-interest to overstate the efficiency of the technology if it will deflect further regulation,” Santa Clara University’s Goldman says.

Others say the problem is more fundamental: Facebook makes money by keeping users on its platform so advertisers can market to them. And controversial content drives higher engagement. That means if harmful posts slip through Facebook’s dragnet, the company’s other algorithms will amplify them. “The business model is the core problem,” says Jillian York, a researcher at civil liberties nonprofit the Electronic Frontier Foundation.

In the days after the November election, with political tensions at a fever pitch, Facebook did tweak its News Feed algorithm to de-emphasize sources that were spreading misinformation and to boost news from higher-quality media outlets. But it rolled back the change weeks later.

Currently Facebook reduces the prominence of content it identifies as misinformation, shows warnings to those trying to share known misinformation, and notifies people if a story they have previously shared is later debunked. Users who repeatedly share misinformation are only rarely kicked off the service, but they “will see their overall distribution reduced and will lose the ability to advertise or monetize within a given time period,” the company says.

Facebook’s Carapcea says the company is considering similar measures for other harmful content. But humans will continue to play a big role in deciding when to apply them.

Says Carapcea: “Getting to 100% is a good North Star, but it may not ultimately be what happens here.”

A.I. in action

Facebook’s A.I. has had a mixed track record in identifying and removing harmful content before users flag it. The following shows, for various categories, the share of removed content that Facebook finds without user input:

99.8%: Terrorism content
97.1%: Hate speech
92.8%: Glorification of suicide and self-harm
90%: Election suppression, misinformation, and threats (2018 election)
48.8%: Online bullying

Source: Facebook (Q4 2020, unless otherwise noted)
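
The percentages above are shares of removed content, not of all harmful content on the service. A minimal sketch of that arithmetic follows; the function name and the counts are hypothetical and are not Facebook figures.

```python
def proactive_rate(found_by_ai, flagged_by_users):
    """Percentage of removed posts that were found before any user flagged them."""
    removed = found_by_ai + flagged_by_users
    if removed == 0:
        raise ValueError("no removed content to measure")
    return 100 * found_by_ai / removed


# Hypothetical counts: 998 removed posts found by A.I., 2 found via user flags.
rate = proactive_rate(found_by_ai=998, flagged_by_users=2)
print(f"{rate:.1f}%")
```

Harmful posts that are never removed do not appear anywhere in this calculation, so a high rate says nothing about how much content slips through.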

This article appears in the April/May issue of Fortune with the headline, “Facebook’s complicated cleanup.”