Startup Text IQ thinks A.I. can help uncover unconscious bias

This is the web version of Eye on A.I., Fortune’s weekly newsletter covering artificial intelligence and business. To get it delivered weekly to your in-box, sign up here.

There are a lot of stories about A.I. systems picking up the human biases lurking in the data used to train them. But can A.I. also help humans uncover their own unconscious biases?

That’s what Apoorv Agarwal and Omar Haroun think. They are the co-founders of New York-based startup Text IQ. The company’s natural language processing software is primarily used by large businesses to keep track of personal identifying information in their datasets. It helps ensure companies don’t accidentally disclose this personal information in violation of legal requirements or compliance policies. Its software is also useful in cases when a company suffers a data breach and has to inform people whose personal identifying information may have been compromised.

But not too long ago, Agarwal and Haroun went through “unconscious bias training” of the kind that many company HR departments have instituted as part of their diversity and inclusion efforts. And the pair suddenly had a brainwave: they could turn Text IQ’s systems into a tool to help their customers with unconscious bias.

The system works by scanning the written assessments that managers write for employees. It analyzes the language used in those assessments, classifying how positive or negative it is. It then classifies the terms used as either pertaining to work performance or more personality-centric attributes. For instance, a manager might rate his female employees more positively on average than his male employees, but might still be guilty of unconscious bias if in the performance reviews for female employees he primarily commends them for being “bubbly,” “having a positive attitude,” and “fitting in well with the team,” while in assessments of his male direct-reports, he praises their “superb presentation skills” and “attention to detail.”
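To make the two-step idea concrete, here is a minimal sketch of that kind of analysis. It is not Text IQ's method: the tiny word lists, the function names `profile_review` and `summarize_by_group`, and the simple counting approach are all assumptions made for illustration, and a real system would rely on trained language models rather than hand-picked vocabularies.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons, invented purely for this illustration.
PERFORMANCE_TERMS = {"presentation", "detail", "deadline", "analysis", "results"}
PERSONALITY_TERMS = {"bubbly", "attitude", "friendly", "fitting", "cheerful"}
POSITIVE_TERMS = {"superb", "positive", "excellent", "great", "well"}
NEGATIVE_TERMS = {"poor", "late", "sloppy", "abrasive"}


def profile_review(text):
    """Count sentiment words and performance/personality words in one review."""
    counts = Counter()
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in PERFORMANCE_TERMS:
            counts["performance"] += 1
        if token in PERSONALITY_TERMS:
            counts["personality"] += 1
        if token in POSITIVE_TERMS:
            counts["positive"] += 1
        if token in NEGATIVE_TERMS:
            counts["negative"] += 1
    return counts


def summarize_by_group(reviews):
    """Aggregate (group, review_text) pairs into per-group summaries.

    "sentiment" is positive minus negative word count; "performance_share"
    is the fraction of framing words that are about work performance,
    or None when a group's reviews contain no framing words at all.
    """
    totals = {}
    for group, text in reviews:
        totals.setdefault(group, Counter()).update(profile_review(text))

    summary = {}
    for group, counts in totals.items():
        framed = counts["performance"] + counts["personality"]
        summary[group] = {
            "sentiment": counts["positive"] - counts["negative"],
            "performance_share": counts["performance"] / framed if framed else None,
        }
    return summary


if __name__ == "__main__":
    sample = [
        ("female", "She is bubbly, has a positive attitude, and is fitting in well with the team."),
        ("male", "Superb presentation skills and attention to detail."),
    ]
    for group, stats in summarize_by_group(sample).items():
        print(group, stats)
```

On the two sample reviews, the female group scores higher on sentiment but has a performance share of zero, while the male group's praise is entirely performance-related. That gap between how positive a review is and what it is positive about is the pattern the tool is meant to surface.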

Agarwal and Haroun told me they are still trying to figure out the best—and most ethical—way to allow customers to use this unconscious bias detection tool. “The way in which this is administered is incredibly important,” Haroun says. He says Text IQ doesn’t want to see companies using it to punish employees or managers for unconscious bias. “We don’t want a ‘gotcha moment,’” he says. Rather, Text IQ hopes companies will use it to help people uncover their own hidden biases so they have an opportunity to improve. “We are thinking about perhaps only making the detailed reports available to the person writing the performance reviews,” he says. Agarwal says that Text IQ might allow a corporate D&I team or HR department access to aggregate data for the entire company, or maybe give them access to anonymized data.

While initially intended for use in screening performance reviews, the tool could easily be adapted to help search for unconscious bias in the way hiring managers write up notes from job candidate interviews.

Text IQ has thought hard about how to address algorithmic bias too. Agarwal’s view is that human-labelled data, which is what is used most often to train A.I. systems in business, is inherently biased and error-prone. Plus, the expense and time needed to create most human-annotated datasets mean that they will almost always be smaller than unlabeled datasets. That means each piece of data carries more weight in training the A.I., and any bias or error on the part of the individual applying the labels is likely to be amplified. He says he prefers to use unsupervised learning methods, where A.I. systems learn from very large amounts of unlabeled data. While there are lots of errors and often latent societal biases in this unlabeled data too, Agarwal says he thinks the ability to train a system on a much larger dataset tends to mitigate some of the ill effects. He is also a fan of synthetic data.

With that, here’s the rest of this week’s A.I. news.

Jeremy Kahn
jeremy.kahn@fortune.com
@jeremyakahn

A.I. IN THE NEWS

Don't trust that medical imagery A.I. detecting COVID-19. That's the conclusion of a major study of 62 papers that have been published in the past year in which the authors claim to have built machine learning systems able to diagnose COVID-19 from various kinds of medical imagery (mostly X-rays and CT scans). The study, according to a story in tech publication VentureBeat, found that half of the papers had made no attempt to validate the training data that had been used or to assess how sensitive or robust their system was, and did not report the demographics of those individuals represented in the training set. “In their current reported form, none of the machine learning models included in this review are likely candidates for clinical translation for the diagnosis/prognosis of COVID-19,” the study concluded.

Myanmar's military may have created a deepfake to depict a jailed minister admitting to corruption. That is the suggestion some journalists and Myanmar experts were making after a video appeared on a military-owned television network in which Phyo Min Thein, a detained former chief minister of the Yangon region, seemingly confessed to offering bribes to ousted national leader Aung San Suu Kyi. According to a story in Asian tech publication KrASIA, many people who were familiar with Phyo Min Thein said the voice heard in the video did not seem to be his and they speculated that the Myanmar military, which toppled the civilian government in a coup in February and arrested Suu Kyi, may have used deepfake technology to alter the video. But others noted that the video is too low-quality to be able to tell for sure and some even speculated that the military had deliberately released a very low-quality version in order to avoid detection.

Speaking of deepfakes, the FBI warns they are likely to be used in disinformation campaigns or possibly espionage in the next 12 to 18 months. "Malicious actors almost certainly will leverage synthetic content for cyber and foreign influence operations in the next 12-18 months," the FBI warned in a March 10 advisory, according to Business Insider. The agency said it suspected Russian, Chinese and "Chinese-language actors" were planning to increase their use of deepfakes in "cyber and foreign influence campaigns" in the coming months.

Researcher turns down grant from Google in protest over the company's treatment of its former A.I. ethics researchers. Luke Stark, an assistant professor at Western University in Ontario, Canada, turned down a $60,000 academic grant from the Internet giant. He said he was doing so, according to a CNN story, in solidarity with Timnit Gebru and Margaret Mitchell, the two former co-heads of Google's A.I. ethics team within its A.I. research division, both of whom were forced out of the company in recent months. Other scholars have recently declined to participate in conferences that Google has organized, citing similar concerns about how Gebru, Mitchell and the rest of the A.I. ethics team have been treated.

Amazon is forcing its delivery drivers to consent to monitoring with A.I.-enabled cameras. The Everything Store has now begun installing, in all of its delivery vehicles, cameras that use A.I. to detect 16 different types of dangerous driver behavior. The company is telling drivers the cameras will capture biometric data about them, which Amazon says it will store for 30 days, and that they must consent to this or lose their jobs, according to Vice. Some drivers have refused and quit. The cameras and software Amazon is using are made by Netradyne, a company that specializes in software that helps companies manage large vehicle fleets.

British labor organization sounds alarm about A.I.-related job losses. The Trades Union Congress (TUC), one of the largest organized labor confederations in Britain, published a report raising alarm at the way A.I. is being implemented in many workplaces, with workers increasingly answering to management decisions that are made, or heavily influenced, by algorithms. The organization called for new legal protections to prevent workers from being "hired and fired" by A.I., including an obligation that employers consult with unions about the deployment of "high risk" or "intrusive A.I." at work; the legal right to a human review of decisions; a legal right to "switch off" from work and not be responsible for answering calls and emails during certain hours; and changes to U.K. law to protect against discrimination by algorithm.

EYE ON A.I. TALENT

Signal AI, a London-based company that uses A.I. to provide media and business intelligence, has appointed Josh Boaz to its advisory board, the company said in a statement. Boaz is co-founder and managing director of Direct Agents, a minority-owned, independent digital marketing agency based in New York and Los Angeles.

Counterflow AI, a Crozet, Virginia-based company that uses A.I. to monitor computer networks for cybersecurity threats and other issues, has appointed Bill Cantrell as chief executive officer, according to a company release. He had previously been the company's chief product officer.

Verusen, an Atlanta-based company that uses A.I. to help companies manage supply chains, has named Andrew Vaughan as its chief technology officer, the company said. He was previously vice president and head of engineering for Project 44, a software platform for shippers and logistics services.

Wipro, the business process outsourcing firm based in Bangalore, India, has named Subha Tatavarti as its chief technology officer, according to a story in Indian business paper Mint. She previously led product, technology development, and commercialization of enterprise infrastructure as well as security, data science and edge platforms, at Walmart.

This section has been updated to correct the location of Signal AI's headquarters. It is based in London, not New York.

EYE ON A.I. RESEARCH

Large language datasets for translation need more work. That's the conclusion of a large number of researchers who undertook a survey of five very large, open-source, multi-language datasets, encompassing more than 230 languages in total, that are often used to train A.I. systems for translation. Their findings, published in a paper on the non-peer-reviewed research repository arxiv.org, were that the datasets were very good for English and other common Western languages, such as French and German, but increasingly problematic when it came to other languages. They said that the datasets were of particularly poor quality when it came to African languages and also languages written in romanized letters that usually have their own scripts (such as Chinese, Urdu and Hindi). Errors include not just incorrect translations but texts labelled as the wrong language and content that wasn't language at all. The best of the five datasets, one called OSCAR, had about 87% correct content, while the worst, WikiMatrix, had just 24%. One dataset, CCAligned, contained seven languages for which none of the sentences at all were correctly labelled and 44 languages where less than 50% of the sentences were correctly labelled.

As the researchers note, if those using these datasets are made aware of the errors and even the relative error rates between languages in a given dataset, they can possibly develop strategies to filter, clean or otherwise compensate for the messy, incorrect data. But without doing such work, there's a real danger of training A.I. systems that will seem to work for some common languages but fail spectacularly on other ones, without those building the system realizing it.
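One simple example of such a filtering strategy is a script check: if a sentence is labelled Hindi but contains no Devanagari letters, it is probably romanized, mislabelled, or not language at all. The sketch below is an illustration only, not the researchers' procedure. The `EXPECTED_SCRIPT` table, the function names, and the 0.8 threshold are assumptions, and the check cannot catch wrong translations or text in the wrong language that happens to use the right script.

```python
import unicodedata

# Assumed mapping from a language code to the prefix that Unicode character
# names carry for that language's usual script. Illustrative, not exhaustive.
EXPECTED_SCRIPT = {
    "hi": "DEVANAGARI",
    "ur": "ARABIC",
    "zh": "CJK",
    "en": "LATIN",
    "fr": "LATIN",
}


def script_share(text, script_prefix):
    """Return the fraction of letters in `text` whose Unicode name starts
    with `script_prefix`. Text with no letters at all scores 0.0."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    hits = sum(
        1 for ch in letters
        if unicodedata.name(ch, "").startswith(script_prefix)
    )
    return hits / len(letters)


def keep_sentence(lang, text, threshold=0.8):
    """Keep a sentence only if enough of its letters are in the expected script.

    Languages without an entry in EXPECTED_SCRIPT are passed through
    unfiltered, since this sketch has no rule for them.
    """
    prefix = EXPECTED_SCRIPT.get(lang)
    if prefix is None:
        return True
    return script_share(text, prefix) >= threshold


if __name__ == "__main__":
    samples = [
        ("hi", "नमस्ते दुनिया"),      # Hindi in Devanagari script
        ("hi", "namaste duniya"),  # romanized Hindi
        ("en", "@@@ ### 404"),     # not language at all
    ]
    for lang, text in samples:
        print(lang, repr(text), keep_sentence(lang, text))
```

Running a check like this per language also gives a rough per-language rejection rate, which is the kind of relative error signal that could tell a team where a dataset needs closer manual review.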

FORTUNE ON A.I.

Google Maps wants to help users avoid getting stuck in the rain—by Danielle Abril

This startup has found a way to uncover bespoke cancer therapies—by Jeremy Kahn

3 heated and funny moments from Big Tech’s Congressional grilling today—by Danielle Abril

Amazon’s unionization vote comes at the worst time for company PR—by Aaron Pressman

BRAIN FOOD

Tell me a story. There is perhaps nothing that sets humans apart from other creatures as much as our love of story-telling. In fact, many cognitive scientists think the ability to construct a narrative is hard-wired into our brains. Stories are essential to how we convey and remember information, and for how we, as both children and adults, learn.

And it turns out that today's A.I. systems aren't all that good at narrative construction. Even some of the most advanced natural language processing software, such as OpenAI's GPT-3, can't tell a coherent story, with a clear beginning, middle, and end, of any significant length. That's partly because narratives depend a lot on implicit and explicit theories of causation. And today's A.I. is very good at figuring out correlation, but generally very bad at understanding causal relationships, including those that exist in language.

A team of four researchers from Georgia Tech, the University of Utah, and Carleton University has mapped a framework for modeling narratives that they hope could be a first step towards teaching computers how to better understand and construct narratives. In a paper, published on the research repository arxiv.org, they look at how narratives construct artificial worlds, how they move information from a narrator to a reader, and how that information helps the reader update their beliefs about the world in which the story takes place. The authors write that formalizing an understanding of narrative in this way may enable ideas from a field called model theory, which is already used extensively in computer science, to be applied to narratives.

Interestingly, two members of the research team also identify themselves as members of EleutherAI, a research collective dedicated to open-sourcing A.I. technology. EleutherAI is one of the many groups currently working on creating an open-source version of GPT-3. Perhaps a GPT-3 that can write real narrative will represent their next step.
