Women have been telling Twitter for years that they endure a lot of abuse on the platform. A new study from human rights watchdog Amnesty International attempts to assess just how much. A lot, it turns out.
About 7 percent of the tweets that prominent women in government and journalism received were found to be abusive or problematic. Women of color were 34 percent more likely to be targets than white women. Black women specifically were 84 percent more likely than white women to be mentioned in problematic tweets.
After an analysis that eventually included almost 15 million tweets, Amnesty International released the findings and, in its report, described Twitter as a “toxic place for women.” The organization, which is perhaps best known for its efforts to free international political prisoners, has turned its attention to tech firms lately, and it called on the social network to “make available meaningful and comprehensive data regarding the scale and nature of abuse on their platform, as well as how they are addressing it.”
“Twitter has publicly committed to improving the collective health, openness, and civility of public conversation on our service,” Vijaya Gadde, Twitter’s head of legal, policy, and trust and safety, said in a statement in response to the report. “Twitter’s health is measured by how we help encourage more healthy debate, conversations, and critical thinking. Conversely, abuse, malicious automation, and manipulation detract from the health of Twitter. We are committed to holding ourselves publicly accountable towards progress in this regard.”
The project, called “Troll Patrol” and run together with Montreal-based AI startup Element AI, started by looking at tweets aimed at almost 800 female journalists and politicians from the U.S. and the U.K. It didn’t study men. More than 6,500 volunteers analyzed 288,000 posts and labeled the ones that contained language that was abusive or problematic (“hurtful or hostile content” that doesn’t necessarily meet the threshold for abuse).
Each tweet was analyzed by three people, according to Julien Cornebise, who runs Element’s London office, and experts on violence and abuse against women also spot-checked the volunteers’ grading. The project also wanted to use those human judgments to build and test a machine-learning algorithm that could flag abuse—in theory, the kind of thing a social network like Twitter might use to protect its users.
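The report as described here does not say how the three volunteer judgments on each tweet were combined into one label. A common approach is a simple majority vote, and the sketch below assumes that rule. The function name `majority_label` and the label strings are hypothetical and chosen only for illustration.

```python
from collections import Counter

def majority_label(labels):
    """Return the label most annotators chose, or None if the top two tie.

    Assumption: one label per annotator, combined by plain majority vote.
    The study itself may have used a different aggregation rule.
    """
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no clear majority, e.g. three different labels
    return counts[0][0]

# Hypothetical judgments from three volunteers on a single tweet.
print(majority_label(["abusive", "abusive", "neither"]))
```

Returning `None` on a tie leaves room for a second pass, such as the expert spot-checks the article mentions, instead of forcing a label.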
Cornebise’s team used machine learning to extrapolate the human-generated analysis to a full set of 14.5 million tweets mentioning the same figures. They also made sure the tweets examined by the volunteers were representative and that the findings were accurate. Then his team used the resulting data to train an abuse-detecting algorithm and compared the algorithm’s conclusions to those of the volunteers and experts. This kind of work is becoming increasingly important as companies like Facebook Inc. and YouTube use machine learning to flag content that needs moderation. In a letter responding to the Amnesty International report, Twitter called machine learning “one of the areas of greatest potential for tackling abusive users,” the group said in the report.
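The article does not say which measures the team used to compare the algorithm with the human graders. Precision and recall are two standard choices for this kind of comparison, and the sketch below assumes them. The function name `precision_recall`, the label strings, and the sample data are all hypothetical, not figures from the study.

```python
def precision_recall(human, predicted, positive="abusive"):
    """Score predicted labels against human labels for one target class.

    Precision: of the tweets the algorithm flagged, the share humans also flagged.
    Recall: of the tweets humans flagged, the share the algorithm caught.
    """
    pairs = list(zip(human, predicted))
    tp = sum(h == positive and p == positive for h, p in pairs)
    fp = sum(h != positive and p == positive for h, p in pairs)
    fn = sum(h == positive and p != positive for h, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up labels for four tweets, for illustration only.
human_labels = ["abusive", "abusive", "neither", "neither"]
model_labels = ["abusive", "neither", "abusive", "neither"]
print(precision_recall(human_labels, model_labels))
```

Low recall would mean abusive tweets slip through, while low precision would mean harmless tweets get flagged. That trade-off is one reason a tool like this might assist human moderators instead of replacing them.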
The algorithm Cornebise’s team built did pretty well, he said, but not well enough to replace humans as content moderators. Instead, machine learning can be one tool that helps the people in these jobs. Defining abuse often requires an understanding of context or how words are interpreted in certain parts of the world, judgment calls that are harder to teach an algorithm.
“Abuse is itself very contextual and perception of abuse can vary from region to region,” he said. “There is too much subtlety and context and algorithms can’t solve that yet.” Perhaps the women of Twitter could help them out.