Why Microsoft and Twitter are turning to bug bounties to fix their A.I.

August 10, 2021, 6:32 PM UTC

For years, companies have hosted bug bounty programs to entice well-meaning hackers to spot flaws in software so they can patch them. The programs—participants usually get money for flagging security holes—are a recognition by businesses that they can’t find every vulnerability on their own.

Now, tech companies like Microsoft, Nvidia, and Twitter are hosting bug bounty programs specifically for artificial intelligence. The goal is for outsiders to spot flaws in A.I. software so that companies can improve the technology and reduce the risk of machine learning discriminating against certain groups of people.

For example, last week, Microsoft and Nvidia detailed a new bug bounty program during the annual Defcon hacker conference. The companies plan to reward hackers who manage to alter computer viruses so that they go undetected by some of Microsoft’s A.I.-powered malware-detection services. Hackers who can create scammy emails that evade Microsoft’s machine-learning-powered phishing-detection software will also earn rewards in the form of Microsoft gift cards and other prizes.

Meanwhile, Twitter hosted a bug bounty aimed at spotting bias in its A.I. The program came after users discovered that Twitter’s image-cropping tool disproportionately removed women and people of color from photos, leaving white men in the center of the cropped images.

Outsiders were invited to inspect and find flaws in the now-deactivated machine-learning algorithm that powered Twitter’s photo cropping tool.

Researchers discovered other bias problems with the same algorithm used in the image-cropping tool. One discovered that it would tend to crop older people from photos. Another found that the algorithm would remove people wearing head garments, showing a bias against those wearing turbans, yarmulkes, and hijabs.

The first-place winner of Twitter’s bug bounty used A.I. to modify photos of people’s faces to be more appealing to the algorithm. Through this process, the researcher discovered that the algorithm favored faces that were thin, young, and white—all indications that the technology was trained on datasets made up mostly of people who conform to today’s conventions of beauty.
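The winning approach can be sketched as a simple search loop: generate small variations of a face image, score each with the cropping model, and keep whichever variant the model prefers, then see what the surviving images have in common. A minimal illustration in Python, with a toy `saliency_score` standing in for Twitter’s trained model (all names and the scoring rule here are hypothetical):

```python
import numpy as np

def saliency_score(img: np.ndarray) -> float:
    """Stand-in for the cropping model's saliency score.
    (Toy rule: favors brighter images; the real scorer is a trained network.)"""
    return float(img.mean())

def climb(img: np.ndarray, steps: int = 200, seed: int = 0) -> np.ndarray:
    """Greedy search: keep any small random perturbation that raises the score."""
    rng = np.random.default_rng(seed)
    best, best_score = img.copy(), saliency_score(img)
    for _ in range(steps):
        candidate = np.clip(best + rng.normal(0, 2, img.shape), 0, 255)
        score = saliency_score(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best

face = np.full((64, 64), 128.0)  # placeholder grayscale "photo"
tuned = climb(face)
# Comparing `face` with `tuned` reveals what the scorer rewards --
# in the contest, a drift toward thin, young, white faces.
```

The point of the exercise is not the search itself but the comparison at the end: whatever systematically survives the loop is what the model has learned to prefer.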

It’s unclear what Twitter will do with the findings, but executives implied that they would be used to improve the company’s tech.

During a panel related to Twitter’s bug bounty program, data scientist Patrick Hall reflected on the need for more scrutiny of corporate A.I. He expressed surprise that A.I.-tailored bug bounty programs haven’t become widely adopted considering the technology’s many flaws.

“Just because you haven’t found bugs in your enterprise A.I. and machine learning offerings, certainly doesn’t mean they don’t have bugs,” Hall said. “It just means that someone you don’t know might be exploiting them, and I think for those of us in the responsible A.I. community, we wanted people to try bug bounties for so long.”

Jonathan Vanian 


Apple’s machine learning dilemma. Apple said it would use machine learning technology on people’s iPhones to “detect known images of child sexual abuse without decrypting people's messages,” the Associated Press reported. Privacy advocates have expressed concern that the move could open the door for authoritarian governments to monitor and surveil citizens, a notion that Apple disputes. From the article: Apple was one of the first major companies to embrace “end-to-end” encryption, in which messages are scrambled so that only their senders and recipients can read them. Law enforcement, however, has long pressured for access to that information in order to investigate crimes such as terrorism or child sexual exploitation.

A.I. was relatively useless during the COVID-19 pandemic. Despite high hopes that A.I. could have been useful to overwhelmed healthcare professionals on the front lines of the COVID-19 pandemic, several recent studies show that newly developed A.I. tools did not make “a real difference, and some were potentially harmful,” according to a report by the MIT Technology Review. One team of researchers, for instance, examined “415 published tools” and discovered “that none were fit for clinical use.” One of the primary culprits of the healthcare A.I. failure was that technologists creating the tools held “incorrect assumptions” about the data used to train the machine learning systems. In one of the most egregious failures, researchers discovered that some machine learning systems were “picking up on the text font that certain hospitals used to label the scans. As a result, fonts from hospitals with more serious caseloads became predictors of COVID risk.”
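The font failure is a textbook case of “shortcut learning”: when an incidental artifact correlates with the label in the training data, a model can score well without learning anything about the disease, then collapse once the artifact disappears. A toy sketch of the trap using synthetic data and scikit-learn (the data and the “font artifact” feature are purely illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
label = rng.integers(0, 2, n)
# "Scan" features carry only a weak genuine signal for the diagnosis.
scan = rng.normal(0, 1, (n, 5)) + 0.2 * label[:, None]
# The artifact (e.g., one hospital's label font) tracks the diagnosis
# almost perfectly in training data, because the sicker caseloads
# happened to come from hospitals using that font.
artifact = (label ^ (rng.random(n) < 0.05)).astype(float)

X_train = np.column_stack([scan, artifact])
model = LogisticRegression().fit(X_train, label)
print("train accuracy:", model.score(X_train, label))   # looks impressive

# Deployment: same kind of patients, but the artifact is gone
# (new hospital, new font) -- the shortcut no longer exists.
X_deploy = np.column_stack([scan, np.zeros(n)])
print("deploy accuracy:", model.score(X_deploy, label))  # collapses
```

The model leans almost entirely on the artifact column because it is the easiest predictor, which is exactly why the researchers found the published tools unfit for clinical use.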

Talk about the weather. Several startups are attempting to use machine learning to analyze weather data so that companies can understand potential risks to their businesses, The Wall Street Journal reported. Some of these startups use neural networks, software designed to loosely mimic how the human brain learns, but the article noted there are some flaws with that particular data-hungry A.I. technique. From the article: Companies need adequate data to train their models and there isn’t always enough data. One example is hail, where limited observations make it hard to train AI models, said Mr. Gupta of ClimateAi.

A.I. don’t come cheap. Global spending on A.I. technologies is projected to grow 15.2% year-over-year to $341.8 billion for 2021, according to a new report from market research firm International Data Corporation. The report appears to take a broad view of A.I., counting everything from certain servers sold by companies like Dell and Hewlett Packard Enterprise to spending on enterprise software like Slack and McAfee toward the overall A.I. market. From the report: AI Hardware is the smallest category with 5% share of the overall AI market. Nonetheless, it is forecast to grow the fastest in 2021 at 29.6% year over year.


Nym Health hired Melisa Tucker to be the startup’s senior vice president and head of product. Tucker was previously the vice president of product management and operations at Flatiron Health.

Clearwater Analytics picked Souvik Das to be the enterprise software company’s chief technology officer. Das was previously the CTO of Zenefits.


Inspect the datasets. Researchers from Princeton University published a non-peer-reviewed paper that probes some of the ethical dilemmas associated with developing A.I. systems built using problematic datasets, such as those that contain photos of people who never consented to be part of the dataset. The researchers analyzed 1,000 academic papers and found that despite some of the problematic datasets being retracted, many researchers continued to develop A.I. systems with the datasets or their derivatives.

The researchers believe that the creators of massive datasets used to train A.I. systems “should continuously steward a dataset, actively examining how it may be misused, and making updates to license, documentation, or access restrictions as necessary.”

One interesting tidbit from the paper: Princeton researchers discovered that other researchers are confused about the possible legal repercussions of developing A.I. systems based on non-commercial datasets.

From the paper: From these posts, we found anecdotal evidence that non-commercial dataset licenses are sometimes ignored in practice. One response reads: “More or less everyone (individuals, companies, etc) operates under the assumption that licenses on the use of data do not apply to models trained on that data, because it would be extremely inconvenient if they did.” Another response reads: “I don’t know how legal it really is, but I’m pretty sure that a lot of people develop algorithms that are based on a pretraining on ImageNet and release/sell the models without caring about legal issues. It’s not that easy to prove that a production model has been pretrained on ImageNet ...”



This hot startup is now valued at $1 billion for its A.I. skills—By Jonathan Vanian

5 questions for Lyft co-founder John Zimmer—By Michal Lev-Ram

Tesla’s Bitcoin bet is back in the black—big time—By Shawn Tully

China’s Big Tech billionaires increase philanthropic giving as Beijing cracks down—By Yvonne Lau

Tech’s delivery problem: It doesn’t end at your door—By Kevin T. Dugan


What worked for Google and Facebook won’t work for your company. Deep learning pioneer Andrew Ng wrote an opinion piece for the Harvard Business Review discussing some of the reasons why non-tech companies struggle with A.I. compared to consumer Internet firms like Google (Ng once worked at the search giant) and Facebook. Ng writes that the A.I. “playbook” used by the Internet giants won’t work for other industries for several reasons. For one, non-tech companies lack an abundance of the quality data needed to train A.I. systems.

Additionally, tech giants can employ huge A.I. teams because those teams support financially lucrative online advertising businesses. Not every A.I.-powered business will be that profitable: many companies have individual lines of business that can benefit from A.I., but those projects are less likely to produce whopping profits, posing a challenge for companies seeking a massive return on investment.

As Ng writes, “The aggregate value of these hundreds of thousands of projects is massive; but the economics of an individual project might not support hiring a large, dedicated AI team to build and maintain it.”

One of Ng’s A.I. tips for companies: “Instead of merely focusing on the quantity of data you collect, also consider the quality, make sure it clearly illustrates the concepts we need the AI to learn.”
