
A New Kind of AI Spots 90% of Online Abuse

By David Z. Morris
July 30, 2016, 3:04 PM ET
Troll Road Sign, Trollstigen (The Troll Path). Photograph by Douglas Pearson, Getty Images

Researchers at Yahoo (yes, for the moment, it’s still Yahoo) have unveiled an algorithm that uses machine learning and natural language processing to detect online abuse and hate speech. Abusive behavior online has been in the limelight lately, both because it’s so inherently vile, and because it could alienate users of platforms like Twitter (TWTR) and Yahoo (YHOO), arguably threatening their bottom line, or even the entire digital economy.

Most such platforms use a combination of user reporting, keyword filtering, and monitoring by legions of trained humans to detect and block trolls and harassers. But filters are easy to work around through creative spelling (the example “kill yrslef a$$hole” pops up early in the researchers’ report).

Slurs and insults also shift rapidly, making blacklists ineffective, while some more subtle abuse can be expressed without any single objectionable word. All of that, plus the likelihood of false positives from sarcastic or satirical posts, makes the problem a thorny one for artificial intelligence.

The Yahoo researchers set their AI to look for common traits in a set of messages already flagged as abusive. The comment dataset came from Yahoo! Finance and News, which you wouldn’t think of as exactly the dank basement of the internet, but it turns out a whopping 7% of comments on Finance and 16.4% on News were deemed abusive by human screeners.

The program trained itself by scanning those comments for specific sequences of characters, which helped it catch non-standard spellings of offensive words. The program also tracked linguistic features like comment length, use of capital letters, and punctuation style. It could even parse so-called “dependencies” to find complex phrases that added up to abuse.
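
To make the character-sequence idea concrete, here is a minimal Python sketch of how such features might be computed. It is not the researchers' code: the function names, the 3-to-5 character window, and the particular surface features are all assumptions chosen for illustration.

```python
from collections import Counter


def char_ngrams(text, n_min=3, n_max=5):
    """Count every character sequence of length n_min..n_max (window sizes are assumed)."""
    lowered = text.lower()
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(lowered) - n + 1):
            counts[lowered[i:i + n]] += 1
    return counts


def surface_features(text):
    """Simple linguistic cues: comment length, share of capital letters, punctuation."""
    letters = [c for c in text if c.isalpha()]
    upper_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    return {
        "length": len(text),
        "upper_ratio": upper_ratio,
        "exclamations": text.count("!"),
    }


def ngram_overlap(a, b):
    """Jaccard overlap between the character n-gram sets of two strings."""
    set_a, set_b = set(char_ngrams(a)), set(char_ngrams(b))
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# A whole-word filter treats "idiot" and "idi0t" as unrelated strings,
# but the two still share the character sequence "idi".
print(ngram_overlap("idiot", "idi0t") > 0)
print(surface_features("HEY you!"))
```

In a real system, counts like these would be fed to a trained classifier; the sketch only shows why character-level features survive creative spelling that defeats a keyword list.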

The program was then tested by comparing its judgment to the majority opinion of human screeners. The researchers found that, at its best, their model was more accurate than prior models by a substantial margin, matching human judgment in as many as 90% of its classifications.
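
The comparison against the screeners' majority opinion can be sketched in a few lines of Python. The function names and the toy votes below are hypothetical, and the sketch assumes an odd number of screeners per comment so that no vote ends in a tie.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by most screeners (assumes no ties)."""
    return Counter(votes).most_common(1)[0][0]


def agreement_rate(predictions, human_votes):
    """Fraction of comments where the model matches the screeners' majority."""
    matches = sum(
        pred == majority_label(votes)
        for pred, votes in zip(predictions, human_votes)
    )
    return matches / len(predictions)


# Toy data: True means "abusive". Three screeners vote on each of four comments.
model_predictions = [True, False, True, False]
screener_votes = [
    [True, True, False],
    [False, False, True],
    [False, False, True],
    [False, False, False],
]
print(agreement_rate(model_predictions, screener_votes))  # 0.75
```

An agreement rate like this depends on how consistent the human screeners are with one another, so it is best read alongside how often the screeners themselves disagree.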

What’s most interesting about the results is that the model was most effective when its ‘training’ was updated with new data over time, indicating how fluid online abuse is. In fact, while larger data sets produced better results, even using a much smaller but more recent comment database led to fairly accurate results, which could be an important finding from an efficiency perspective.

The researchers have said they will soon make their datasets available through Yahoo’s Webscope program. However, that database is explicitly available for use only by non-commercial researchers—which means this work may wind up being a part of Yahoo that’s actually worth something to its new owners.
