Facebook has revealed that the artificial intelligence systems it uses to police its social media sites are now good enough to automatically flag more than 94% of hate speech, as well as to catch more than 96% of content linked to organized hate groups.
This represents a rapid leap in Facebook’s capabilities—in some cases, these A.I. systems are five times better at catching content that violates the company’s policies than they were just one year ago.
And yet this technological progress isn’t likely to do much to improve Facebook’s embattled public image as long as the company continues to make exceptions to its rules for powerful politicians and popular, but extremist, media organizations.
In recent weeks, Facebook has been under fire for not doing more to slow the false claims about the election made by U.S. President Donald Trump and not banning former Trump advisor Steve Bannon after he used Facebook to distribute a podcast in which he called for the beheading of two U.S. officials whose positions have sometimes angered the president.
Facebook did belatedly label some of Trump’s posts, such as ones in which he said he had won the election, as misleading and appended a note saying that “ballot counting will continue for days or weeks” to some of them. But critics said it should have removed or blocked these posts completely. Rival social media company Twitter did temporarily block new posts from the official Trump campaign account as well as those from some Trump advisors during the run-up to the election. Facebook said Trump’s posts fell within a “newsworthiness” exemption to its normal policies.
As for Bannon’s posts, Facebook CEO Mark Zuckerberg said they had been taken down but that the right-wing firebrand had not violated the company’s rules frequently enough to warrant banning him from the platform.
Mike Schroepfer, Facebook’s chief technology officer, acknowledged that efforts to strengthen the company’s A.I. systems so they could detect—and in many cases automatically block—content that violates the company’s rules were not a complete solution to the company’s problems with harmful content.
“I’m not naive about this,” Schroepfer said. “I’m not saying technology is the solution to all these problems.” Schroepfer said the company’s efforts to police its social network rested on three legs: technology capable of identifying content that violated the company’s policies, the capability to act quickly on that information so the content has less impact, and the policies themselves. Technology could help with the first two of those, but could not determine the policies, he added.
The company has increasingly turned to automated systems to augment the 15,000 human content moderators, many of them contractors, whom it employs across the globe. This year, for the first time, Facebook began using A.I. to determine the order in which content is brought before these human moderators for a decision on whether it should remain up or be taken down. The software prioritizes content based on how severe the likely policy violation may be and how likely the piece of content is to spread across Facebook’s social networks.
Schroepfer said that the aim of the system is to limit what Facebook calls “prevalence”—a metric that roughly captures how many users are able to see or interact with a given piece of content.
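The article describes this triage only at a high level. A minimal sketch of the idea—ranking flagged posts by combining estimated severity with estimated virality, and measuring prevalence as the share of views landing on violating content—might look like the following. The field names, scores, and the multiplicative combination are illustrative assumptions, not Facebook’s actual code.

```python
# Toy sketch of severity-times-virality triage. All names and numbers here
# are hypothetical stand-ins for the model scores the article describes.

def priority(post):
    """Worse violations that spread faster should reach moderators first."""
    return post["severity"] * post["virality"]

def triage(posts):
    """Order flagged posts for human review, highest priority first."""
    return sorted(posts, key=priority, reverse=True)

def prevalence(violating_views, total_views):
    """Rough 'prevalence' measure: share of all views that hit violating content."""
    return violating_views / total_views

queue = triage([
    {"id": "a", "severity": 0.9, "virality": 0.2},  # severe but slow-spreading
    {"id": "b", "severity": 0.6, "virality": 0.9},  # moderate and going viral
    {"id": "c", "severity": 0.1, "virality": 0.1},  # probably benign
])
print([p["id"] for p in queue])  # the viral, moderately severe post ranks first
```

The key design point the article implies is that virality matters as much as severity: a moderately bad post spreading quickly can do more total harm than a worse post nobody sees.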
The company has moved rapidly to put several cutting-edge A.I. technologies pioneered by its own researchers into its content moderation systems. These include software that can translate between 100 languages without using a common intermediary. This has helped the company’s A.I. to combat hate speech and disinformation, especially in less common languages for which it has far fewer human content moderators.
Schroepfer said the company had made big strides in “similarity matching”—which tries to determine whether a new piece of content is broadly similar to another that has already been removed for violating Facebook’s policies. He gave the example of a COVID-19-related disinformation campaign: posts falsely claiming that surgical face masks contained known carcinogens were taken down after review by human fact-checkers, and a second post that used slightly different language and a similar, but not identical, face mask image was then identified and automatically blocked by an A.I. system.
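Similarity matching of this kind is commonly built on embeddings: content is mapped to a vector, and new posts whose vectors land close to those of already-removed items are flagged. The sketch below illustrates that idea with cosine similarity; the vectors, labels, and threshold are made-up examples, not Facebook’s system.

```python
# Illustrative similarity matching: compare a new post's embedding against a
# bank of embeddings from already-removed content. Everything here is a toy
# stand-in for the learned representations the article alludes to.
import math

def cosine(u, v):
    """Cosine similarity between two vectors (1.0 means identical direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embedding of a post already removed by human fact-checkers.
REMOVED_BANK = {
    "mask-carcinogen-hoax": [0.9, 0.1, 0.3],
}

def matches_removed(embedding, threshold=0.95):
    """Return the label of a known violation this content resembles, if any."""
    for label, banked in REMOVED_BANK.items():
        if cosine(embedding, banked) >= threshold:
            return label
    return None

# A near-duplicate (reworded text, similar image) lands close in embedding space.
print(matches_removed([0.88, 0.12, 0.31]))  # resembles the banked hoax
print(matches_removed([0.10, 0.90, 0.00]))  # unrelated content
```

The payoff is that human reviewers only need to judge the first instance of a hoax; close variants can then be blocked automatically without a fresh review.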
He also said that many of these systems were now “multi-modal”—able to analyze text in conjunction with images or video and sometimes also audio. And while Facebook has individual software designed to catch each specific type of malicious content—one for advertising spam and one for hate speech, for example—it also has a new system it calls Whole Post Integrity Embedding (WPie for short) that is a single piece of software that can identify a whole range of different types of policy violations, without having to be trained on a large number of examples of each violation type.
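A “multi-modal,” whole-post approach can be pictured as embedding each modality separately, fusing the results into one vector, and scoring every violation type with a single model. The encoders, weights, and violation labels below are toy stand-ins—this is a sketch of the general pattern, not of WPie itself.

```python
# Hypothetical sketch of multi-modal, whole-post scoring: embed text and image,
# fuse the embeddings, and score all policy areas with one model at once.
# The "encoders" here are deliberately trivial placeholders for learned models.

def embed_text(text):
    # Toy features (exclamation density, relative length) standing in for a
    # learned text encoder.
    return [text.count("!") / max(len(text), 1), len(text) / 280]

def embed_image(pixels):
    # Toy feature (mean brightness) standing in for a learned image encoder.
    return [sum(pixels) / (255 * len(pixels))]

def fuse(text, pixels):
    # Concatenation is the simplest fusion; production systems typically learn
    # a joint embedding instead.
    return embed_text(text) + embed_image(pixels)

VIOLATION_TYPES = ["hate_speech", "ad_spam", "graphic_violence"]

def score_all(post_vector, weights):
    """One model scores every violation type from the same fused vector."""
    return {v: sum(w * x for w, x in zip(weights[v], post_vector))
            for v in VIOLATION_TYPES}

# Illustrative linear weights per violation type.
weights = {
    "hate_speech":      [1.0, 0.1, 0.2],
    "ad_spam":          [0.5, 0.9, 0.0],
    "graphic_violence": [0.0, 0.1, 1.0],
}
scores = score_all(fuse("Buy now!!!", [200, 220, 210]), weights)
print(scores)
```

The contrast with per-violation classifiers is that a single shared representation of the whole post is reused across every policy area, which is part of why, per the article, such a system needs fewer labeled examples per violation type.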
The company has also used research competitions to help it build better content moderation A.I. Last year, it announced the results of a contest in which researchers built software to automatically identify deepfake videos—highly realistic-looking fake videos that are themselves created with a machine learning technique. It is currently running a competition to find the best algorithms for detecting hateful memes—a difficult challenge, because a successful system needs to understand how a meme’s image and text interact to create meaning, and may also require context not found within the meme itself.