
Meet the A.I. that helped Facebook remove billions of fake accounts

By Jeremy Kahn, Editor, AI
March 4, 2020, 7:00 AM ET

Facebook has lifted the curtain on a key technology that has enabled it to address one of its toughest challenges: eliminating fake accounts used for everything from spam ad campaigns to the spread of false information.

The Internet media giant revealed details on Wednesday of how it designed an artificial intelligence system and trained it to be accurate enough to automatically detect accounts that violate its policies.

Policing its vast social network has become an increasingly existential problem for the company as it faces the growing threat of regulation worldwide. The public and lawmakers have been dismayed by the role the social network has played in everything from Russian interference in the 2016 U.S. Presidential election to Myanmar’s genocide against the Rohingya Muslim population. Government officials and users have also become alarmed about hate speech, bullying, phishing, and financial fraud perpetrated on the platform.

Five years ago, Facebook relied largely on users to flag offending accounts to human reviewers. But the volume of problematic accounts Facebook has to deal with is massive: in the third quarter of 2019, the last period for which the company has released numbers, Facebook blocked some 1.7 billion offending accounts. And that doesn’t even include accounts the company prevents from ever being created in the first place, said Bochra Gharbaoui, a data science manager on Facebook’s Community Integrity team. At any time, Facebook estimates that 5% of its active accounts are fraudulent.

Relying on human reviewers has created other problems too. Facebook has used contract workers to review suspect content and behavior, but these workers are often low-paid and suffer mental health issues due to their constant exposure to disturbing posts, images, and videos.

Mark Zuckerberg, Facebook’s founder and chief executive, told U.S. lawmakers in 2018 that A.I. would help the company deal with the flood of problematic content. But it is only recently that the company’s researchers and engineers have started to make progress on fulfilling Zuckerberg’s pledge.

Thanks to A.I.-enabled tools, in the third quarter of 2019, Facebook took action against 99.7% of the fake accounts it blocked before other users flagged them to a human review team, the company said.

Facebook has a difficult needle to thread when it blocks accounts: it wants to catch and stop all policy violations, including every fake account, without inadvertently blocking legitimate users. But if its criteria for detecting violations and taking action are too loose, fake accounts will slip through, other users will be victimized, and the company could find itself at the center of another public relations debacle.

Both false positives and false negatives need to be minimized, Gharbaoui said. “This is a very hard tradeoff,” she said.
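The tradeoff Gharbaoui describes can be made concrete with a toy example (the data and threshold values here are purely illustrative, not Facebook's): sweeping a model's blocking threshold trades false positives (legitimate accounts blocked) against false negatives (fake accounts missed).

```python
# Toy illustration of the false-positive / false-negative tradeoff.
# 1 = fake account, 0 = legitimate; scores are a hypothetical model's output.
true_labels = [1, 1, 1, 0, 0, 0, 0, 0]
fake_scores = [0.9, 0.8, 0.6, 0.55, 0.3, 0.2, 0.1, 0.05]

def confusion(threshold):
    """Count false positives and false negatives at a given blocking threshold."""
    fp = sum(1 for y, s in zip(true_labels, fake_scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(true_labels, fake_scores) if y == 1 and s < threshold)
    return fp, fn

print(confusion(0.5))  # looser threshold: blocks one legitimate account -> (1, 0)
print(confusion(0.7))  # stricter threshold: misses one fake account -> (0, 1)
```

No single threshold eliminates both error types at once, which is why Facebook treats the two as a tradeoff to be tuned rather than a problem to be solved outright.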

The problem is also difficult because scam artists, fraudsters and, yes, some governments, are always trying to figure out ways around Facebook’s defenses, explained Brad Shuttleworth, a Facebook product manager for community integrity.

The machine learning technique Facebook created, which it calls “deep entity classification,” or DEC for short, could be adapted by other companies that need to moderate conversations and content, such as rival social networks, messaging apps or video game companies, said Daniel Bernhardt, engineering manager in Facebook’s Community Integrity group in London, who worked on the system. The company is publishing the general architecture of DEC and details about how it was trained, but it is not making the trained model itself available to other companies.

DEC relies on several clever bits of thinking and engineering. The first was Facebook’s recognition that trying to train an algorithm by having it review standard account features—such as the IP address used to create the account, the age of the account, the number of likes a page has, or how many other users the account was connected to—would result in a screening model that was either too easy for someone with malicious intentions to game, or that would produce too many false positives.

Facebook’s solution was to look at each account, not in isolation, but in the context of all the other accounts and pages it was linked to, extended out to two degrees of separation. And then, instead of using direct features of that individual account, such as likes or friends, it fed the system aggregate metrics, such as the median number of Facebook friends across all those first- and second-order connections. (These metrics, by themselves, don’t indicate whether an account is legitimate. They are simply a way to vastly increase the number of metrics the model is analyzing so it can build a much more detailed statistical picture of the account.) This data, which Facebook calls “deep features,” is inherently more difficult for a malicious actor to tweak, and it results in far fewer false positives and false negatives.
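A minimal sketch of the idea, assuming a toy friendship graph (the graph, function names, and the choice of median-friend-count as the aggregate are all hypothetical, standing in for Facebook's far richer feature set): gather every account within two hops, then compute a summary statistic over that neighborhood rather than a direct property of the account itself.

```python
from statistics import median

# Hypothetical friendship graph: account -> set of direct connections.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}

def neighborhood(account, degrees=2):
    """Collect all accounts within `degrees` hops, excluding the account itself."""
    frontier, seen = {account}, {account}
    for _ in range(degrees):
        frontier = {n for node in frontier for n in graph[node]} - seen
        seen |= frontier
    return seen - {account}

def deep_feature_median_friends(account):
    """One 'deep feature': median friend count across the two-hop neighborhood."""
    return median(len(graph[n]) for n in neighborhood(account))

print(deep_feature_median_friends("a"))  # median over b, c, d's friend counts -> 2
```

Because the statistic is computed over dozens or hundreds of surrounding accounts, a malicious actor would have to manipulate the whole neighborhood, not just their own profile, to shift it.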

Despite its vast size and the thousands of human reviewers it employs to screen its content, Facebook said it is prohibitively time-consuming and expensive to create a high-quality, human-labelled dataset large enough to train a machine-learning algorithm to detect each type of abuse (such as fake accounts, spammers, financial scams or compromised accounts) with the kind of 99%-plus accuracy that Facebook needs.

So Facebook’s second clever bit of engineering was to figure out how to take a small, high-quality, human-labelled dataset, which would normally be too small to train a highly accurate deep learning algorithm, and enhance it by also using a much larger, computer-labelled, but less accurate, dataset. It does this by dividing the system into two separate modules.

In the first module, Facebook takes the set of deep features for each account and runs them through a multi-layer neural network, a kind of machine learning software loosely based on the human brain. In this case, the algorithm must learn what pattern of deep features correlates with what kind of account: is it a normal account or spam account or phishing account, etc.? And it learns to do this by referring to a large set of training samples, consisting of 5 million examples of fake accounts, that have themselves been rather crudely labelled by separate pieces of existing software.

Facebook then takes that statistical pattern for each account type and feeds it into the second module, where a different kind of machine-learning algorithm, called a gradient-boosted decision tree, scores each account for the same categories—spam, fake account, phishing, bullying, etc.—but based on a much smaller set of high-quality, human-labelled training data. (In the case of fake accounts, about 100,000 human-labelled examples.) The results of this scoring then determine whether and what action Facebook will take against the account.
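The two-module idea above can be sketched with scikit-learn (the article does not name any library; the model choices, data sizes, noise rate, and synthetic features here are all illustrative assumptions, not Facebook's actual DEC configuration): a neural network learns from a large, noisily machine-labelled set, and its output scores become the inputs to a gradient-boosted tree trained on a small, cleanly labelled set.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

def make_data(n):
    """Synthetic 'deep features' with a hidden true rule (1 = fake account)."""
    X = rng.normal(size=(n, 8))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    return X, y

# Module 1: neural net trained on a LARGE set with NOISY machine labels
# (standing in for Facebook's millions of crudely labelled examples).
X_big, y_big = make_data(5000)
flip = rng.random(len(y_big)) < 0.2          # corrupt ~20% of the labels
y_noisy = np.where(flip, 1 - y_big, y_big)
nn = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
nn.fit(X_big, y_noisy)

# Module 2: gradient-boosted trees trained on a SMALL, cleanly labelled set,
# using the neural net's class probabilities as its input features.
X_small, y_small = make_data(300)
gbt = GradientBoostingClassifier(random_state=0)
gbt.fit(nn.predict_proba(X_small), y_small)

# Final scoring on fresh accounts chains the two modules together.
X_test, y_test = make_data(1000)
pred = gbt.predict(nn.predict_proba(X_test))
print("accuracy:", (pred == y_test).mean())
```

The design point is that the second, small-data model never sees the raw features; it only has to learn how to recalibrate the first model's scores, which is a much easier task for a limited supply of human labels.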

This results in a system that is more than 97% accurate in classifying accounts, far better than other methods could achieve.

The system is not designed to spot political disinformation campaigns, Shuttleworth said. Instead, Facebook has a separate “information operations” team working to combat that problem—including, in some cases, the use of differently-constructed machine learning algorithms.

Facebook is not the only company working with artificial intelligence that has found benefits from splitting a problem into two separate modules that feed one another. DeepMind, the A.I. research company owned by Google-parent Alphabet, used a similar two-step approach when it developed a system to spot over 50 sight-threatening eye conditions from eye scans. One module, which does computer vision, identifies features in the scans, while the second module makes a diagnosis based on these features. The system has the added advantage of being far more interpretable than a single black box module.


