Google’s ouster of a top A.I. researcher may have come down to this
The recent departure of a respected Google artificial intelligence researcher has raised questions about whether the company was trying to conceal ethical concerns over a key piece of A.I. technology.
The departure of the researcher, Timnit Gebru, came after Google had asked her to withdraw a research paper she had coauthored about the ethics of large language models. These models, created by sifting through huge libraries of text, help create search engines and digital assistants that can better understand and respond to users.
Google has declined to comment about Gebru’s departure, but it has referred reporters to an email to staff written by Jeff Dean, the senior vice president in charge of Google’s A.I. research division, that was leaked to the tech newsletter Platformer. In the email Dean said that the study in question, which Gebru had coauthored with four other Google scientists and a University of Washington researcher, didn’t meet the company’s standards.
That position, however, has been disputed by both Gebru and members of the A.I. ethics team she formerly co-led.
More than 5,300 people, including over 2,200 Google employees, have now signed an open letter protesting Google’s treatment of Gebru and demanding that the company explain itself.
On Wednesday, Sundar Pichai, Google’s chief executive officer, told staff he would investigate the circumstances under which Gebru left the company and would work to restore trust, according to a report from news service Axios, which obtained Pichai’s memo to Google employees.
But why might Google have been particularly upset with Gebru and her coauthors questioning the ethics of large language models? Well, as it turns out, Google has quite a lot invested in the success of this particular technology.
Beneath the hood of all large language models is a special kind of neural network, A.I. software loosely based on the human brain, that was pioneered by Google researchers in 2017. Called a Transformer, it has since been adopted industrywide for a variety of different uses in both language and vision tasks.
The statistical models that these large language algorithms build are enormous, taking in hundreds of millions, or even hundreds of billions, of variables. In this way, they get very good at being able to accurately predict a missing word in a sentence. But it turns out that along the way, they pick up other skills too, like being able to answer questions about a text, summarize key facts about a document, or figure out which pronoun refers to which person in a passage. These things sound simple, but previous language software had to be trained specifically for each one of these skills, and even then it often wasn’t that good.
The biggest of these large language models can do some other nifty things as well: GPT-3, a large language model created by San Francisco A.I. company OpenAI, encompasses some 175 billion variables and can write long passages of coherent text from a simple human prompt. Imagine writing just a headline and a first sentence for a blog post, then having GPT-3 compose the rest. OpenAI has licensed GPT-3 to a number of technology startups, plus Microsoft, to power their own services; one company uses the software to let users generate full emails from just a few bullet points.
Google has its own large language model, called BERT, that it has used to help power search results in several languages including English. Other companies are also using BERT to build their own language processing software.
BERT is optimized to run on Google’s own specialized A.I. computer processors, available exclusively to customers of its cloud computing service. So Google has a clear commercial incentive to encourage companies to use BERT. And, in general, all of the cloud computing providers are happy with the current trend toward large language models, because if a company wants to train and run one of its own, it must rent a lot of cloud computing time.
For instance, one study last year estimated that training BERT on Google’s cloud costs about $7,000. Sam Altman, the CEO of OpenAI, meanwhile, has implied that it cost many millions to train GPT-3.
And while the market for these large so-called Transformer language models is relatively small at the moment, it is poised to explode, according to Kjell Carlsson, an analyst at technology research firm Forrester. “Of all the recent A.I. developments, these large Transformer networks are the ones that are most important to the future of A.I. at the moment,” he says.
One reason is that the large language models make it far easier to build language processing tools, almost right out of the box. “With just a little bit of fine-tuning, you can have customized chatbots for everything and anything,” Carlsson says. More than that, the pretrained large language models can help write software, summarize text, or create frequently asked questions with their answers, he notes.
A widely cited 2017 report from market research firm Tractica forecast that NLP (natural language processing) software of all kinds would be a $22.3 billion annual market by 2025. And that analysis was made before large language models such as BERT and GPT-3 arrived on the scene. So this is the market opportunity that Gebru’s research criticized.
What exactly did Gebru and her colleagues say was wrong with large language models? Well, lots. For one thing, because they are trained on huge corpora of existing text, the systems tend to bake in a lot of existing human bias, particularly about gender and race. What’s more, the paper’s coauthors said, the models are so large and take in so much data, they are extremely difficult to audit and test, so some of this bias may go undetected.
The paper also pointed to the adverse environmental impact, in terms of carbon footprint, that training and running such large language models on electricity-hungry servers can have. It noted that BERT, Google’s own language model, produced, by one estimate, about 1,438 pounds of carbon dioxide, or roughly the amount emitted by a round-trip flight from New York to San Francisco.
The research also looked at the fact that money and effort spent on building ever larger language models took away from efforts to build systems that might actually “understand” language and learn more efficiently, in the way humans do.
Many of the criticisms of large language models made in the paper have been made previously. The Allen Institute for AI, for example, has published a paper looking at racist and biased language produced by GPT-2, the forerunner system to GPT-3.
In fact, the paper from OpenAI itself on GPT-3, which won an award for “best paper” at this year’s Neural Information Processing Systems Conference (NeurIPS), one of the A.I. research field’s most prestigious conferences, contained a meaty section outlining some of the same potential problems with bias and environmental harm that Gebru and her coauthors highlighted.
OpenAI, arguably, has as much financial incentive as Google, if not more, to sugarcoat any faults in GPT-3. After all, GPT-3 is OpenAI’s only commercial product at the moment. Google was making hundreds of billions of dollars just fine before BERT came along.
But then again, OpenAI still functions more like a tech startup than the megacorporation that Google’s become. It may simply be that large corporations are, by their very nature, allergic to paying big salaries to people to publicly criticize their own technology and potentially jeopardize billion-dollar market opportunities.
This story has been updated to include reports that Google CEO Sundar Pichai has promised to investigate Gebru’s departure from the company.