There’s an ugly history buried beneath A.I.

September 17, 2021, 6:30 PM UTC

Hollywood gets a failing grade for inclusion, global investors are hot for Latin American start-ups, and we’re all about to get much smarter about race. Bonus track: Our colleague Jonathan Vanian digs into the ugly colonial remnants baked into artificial intelligence.

But first, here’s your happy Hispanic Heritage Month week in review in Haiku.

I wonder what Jim
Crow’s cousin Juan would say if
he were here right now?

Or what it feels like
to celebrate Hispanic
heritage when just

getting rid of Spain
to gain your independence
hasn’t been enough

to be truly seen.
The fight for true inclusion
could use some boosters!

Hey! Drop the taco,
take off that damn sombrero
and be an ally

Wishing you a worthwhile and celebratory weekend.

Ellen McGirt

In brief

Colonialism’s painful legacy can be found in today’s powerful artificial intelligence technologies.

In countries like India, the Philippines, and Kenya, thousands of workers are annotating the data needed to make A.I. understand language, among other tasks. These workers are crucial for improving A.I. so that when you ask Siri or Alexa to tell you the weather, the digital assistants can recognize and react to the sound of your voice.

As anthropologist Mary Gray said in an interview with Fortune, these countries are home to workers who prep data for A.I., such as the so-called language models that give digital assistants their smarts. Gray is an expert in overseas data labeling and examines the overlooked global technology workforce in the book “Ghost Work,” co-authored with her Microsoft Research colleague and computer scientist Siddharth Suri.

These laborers, often employed by consulting firms or companies that specialize in data annotation, are typically well-versed in English. Their understanding of English comes in handy when they manually label audio files with the correct words so that the A.I. learns over time to recognize what people are saying.

Companies typically hire workers in these countries because they are cheaper than their counterparts in the U.S. or Europe. But while there are many countries in the world that have sluggish economies, thus providing access to low-wage workers, not every country has a population that’s likely to read and speak in English in addition to their own native tongue.

Countries like India and the Philippines, however, have long, often-ugly histories involving colonial rule by the United Kingdom and the U.S. The same is true for Kenya and many eastern African countries that were under the authoritarian rule of British colonial forces. And life for people living under their colonial overlords was not pleasant; their countries’ modern histories contain massive bloodshed and exploitation of labor that benefited the rulers.

That’s one of the reasons why it’s important for business leaders to recognize the work of skilled data labelers in these countries, Gray explained. Companies often take for granted what data labelers do, thinking that their tasks are a form of grunt work or mundane clerical drudgery. But that viewpoint demeans these workers, who are skilled in English because they have learned the dominant language of the once-ruling class.

These overseas data labelers are not doing mindless tasks. They need an understanding of linguistics to properly annotate the data that’s used to train A.I., because if the quality of the labeled data is bad, the A.I. ultimately suffers. Data labeling for A.I. language models can be challenging because of the enormous complexities of human language. With some technologists and business leaders, however, “They're assuming it's easy, because they think there's nothing nuanced about language,” Gray said.

That’s a flawed assumption, and incidents like a Google-developed hate-detection tool that learned to associate African American vernacular with toxicity prove that language is context-dependent.

“What I hope by talking about this is that it puts in front of people [the idea] that things we've dismissed as easy, are in fact, quite nuanced,” Gray said.  

Because the tech supply chain isn’t regulated, the full extent of working conditions for data labelers in places like Africa is still unknown, she said. Some companies have publicized that they are providing quality working conditions for their offshore employees, but there’s still much to learn.

Gray can’t say for sure that there is no slave labor taking place in the overseas data annotation industry because “we don't have any empirical evidence to say it's not.”

“I would love to say ‘No, that's not happening, that's not likely happening,’” Gray said. She noted that there have been reports of slavery and child labor in the chocolate industry, underscoring the challenges of knowing true working conditions.

Gray hopes that people recognize the worth and humanity of these often-unnoticed data workers. She’s met some very intelligent data labelers who have amassed encyclopedic knowledge of “a range of things” through their work annotating data, which can include photos and videos used to power self-driving cars. The problem is “we don't know exactly how to value that as a form of expertise.”

But she said she’s hopeful, because it’s still “early days” for A.I. and there’s time for people to recognize the value of these data annotators. And that may involve understanding the unique histories of their countries.

“There’s so much working against us seeing the humanity and the contributions that these people are making, quite literally because we can't see them,” Gray said.

Jonathan Vanian 

On point

It’s National Hispanic Heritage Month. Should it be? This thoughtful piece from NPR’s Vanessa Romo explains the history of the month — which starts on September 15 because it coincides with the national independence day of Guatemala, Honduras, El Salvador, Nicaragua and Costa Rica, followed shortly by Mexico, Chile and Belize. But she also poses some thorny questions, while calling out the lack of Hispanic representation in her own newsroom. “What's the harm in lumping together roughly 62 million people with complex identities under a single umbrella? Is a blanket pan-ethnic term necessary to unite and reflect a shared culture that is still largely (infuriatingly) excluded from mainstream popular culture? Or the more basic question: ¿Porque Hispanic?”

Hollywood has mucho mucho work to do. A new study from the University of Southern California’s Annenberg Inclusion Initiative, released to coincide with Hispanic Heritage Month, shows that Latinx representation in popular movies remains dismal, under-pacing both the population of the country and the city where most of these movies are made. Worse, when these characters do appear on screen, they’re often stereotypes. Across the 1,300 movies in the study, which included over 50,000 speaking characters, only 5% of all speaking characters were Hispanic/Latinx. That figure peaked at 7.2% in 2017.
Huffington Post

Facebook offers new tools for struggling small business owners of color. According to Facebook’s latest global survey of small business, Hispanic-led small businesses in the U.S. had the highest rate of closures at 24 percent, and Black-led small businesses were a close second at 22 percent. Overall, the survey shows that businesses in the U.S. run by women or people of color were at least 50 percent more likely to close or report lower sales compared to the same time last year. Facebook has announced a suite of tools to help this population of small businesses called the Facebook Fast Track Program, which will allow them to sell outstanding invoices directly to Facebook for payment. "We didn't find another company that's using its own balance sheet to provide liquidity for other companies' invoices--in this case, small businesses, who really need this right now," Rich Rao, the company’s global head of small business, tells Inc.

Latin American start-ups are having a moment. While investors continue to overlook talent in the U.S., global players are heating up the Latin American start-up ecosystem, much of it in tech aimed at the finance, banking, and real estate sectors. But it’s not about disruption, it’s about uplift. “The vast majority of the population is underserved in almost every category of consumption. Similarly, most businesses are underserved by modern software solutions,” says Shu Nyatta, who leads SoftBank’s Latin American fund. “There’s so much to build for so many people and businesses. In San Francisco, the venture ecosystem makes life a little better for individuals and businesses who are already living in the future. In LatAm, tech entrepreneurs are building the future for everyone else.”

This edition of raceAhead was edited by Wandy Felicita Ortiz.

On Background

We’re all about to get much smarter about race. In what is being called an “unusual four-book deal,” a noted historian will be taking on race, identity, slavery, and in some form, our collective political life. Martha S. Jones, a professor at Johns Hopkins, has a track record of meeting the moment with breakthrough scholarship: Her 2019 book Vanguard restored the stories of Black women in the suffragette movement at a time when the anniversary of the 19th Amendment threatened to solidify a whitewashed version of history. She tells the New York Times that because history is personal in unexpected ways, it’s important to view people's reactions with empathy. “If we take off our blinders, it’s no surprise that people coming to this history of racism, slavery and Jim Crow the first time are moved, troubled, even confused,” she said. If you’ve got the time, catch her in this conversation series hosted by Amherst College called The History of Anti-Black Racism in America.
New York Times

Local fake news derails a promising career. At 26, Michael Tubbs became the youngest mayor in Stockton, Calif. history. Soon, he became nationally known for his successful experiment in universal basic income. (I’ve just ordered his new memoir, The Deeper The Roots.) But I was not aware that his subsequent political career was eclipsed in part by a website purporting to be a local media outlet that was, in fact, an anti-Tubbs disinformation site. The site was underwritten by a man who would go on to become his political opponent. “I got this tip that Stockton had its own version of ‘Russian trolls,’” says NPR’s Yowei Shaw, in this behind-the-story interview. What she found was a pretty chilling example of what can happen in communities without legitimate news and journalism resources.
Take On Fake with Hari Sreenivasan

Who owns the shadow of an enslaved Black body? Writer Latria Graham excavates the excavators in this poignant piece on the images of enslaved people and why those images deserve to be reconsidered. Her specific tale begins with the images of Delia and Renty, two people who had been enslaved in Columbia, South Carolina, and forced to sit motionless for up to 15 minutes — in Delia’s case, nude from the waist up — to make a series of daguerreotypes. The images were later placed in a proverbial drawer at Harvard, only to emerge a century and a half later in a world where images are easily copied, shared, sold and exploited. Tamara Lanier, who claims to be a descendant of Renty,  has filed suit against Harvard, asking for their return and asking for the profits associated with the use of the images. But larger questions also loom about how images like these are interpreted and why the sinister reason they were made in the first place is not more commonly known.
The Atlantic

Mood board

Rita Moreno in "West Side Story," 1961—we're dancing around the complications of heritage, all over Hollywood.
United Artists/Getty Images
