This is the web version of Eye on A.I., Fortune’s weekly newsletter covering artificial intelligence and business. To get it delivered weekly to your in-box, sign up here.
For the past three years, two London-based investors have compiled an extremely comprehensive summary of the current “State of A.I.” It’s the work of Ian Hogarth, who founded the concert discovery site Songkick and is now a prominent angel investor, and Nathan Benaich, a venture capitalist whose firm Air Street Capital focuses on startups built around applications of artificial intelligence.
This year’s report runs to 177 detailed PowerPoint slides. It’s a great way to take the pulse of the whole field.
The report touches on so many areas that I won’t be able to do it justice. But I will highlight a few things that struck me.
One trend emerging from the report that I touched on in this newsletter back in December: a growing dichotomy between the priorities of A.I. researchers and those of A.I. practitioners who work in other kinds of businesses, such as healthcare and finance.
The research community wants to push the boundaries of what A.I. can do. Benaich calls this group the “big-model world.” And for good reason. Many of the A.I. systems that are currently at the bleeding edge are truly gargantuan. Training a model that has hundreds of billions of parameters—as OpenAI’s GPT-3 language model does—takes mind-boggling amounts of computing power and costs millions of dollars.
Benaich and Hogarth question whether that is sustainable. “We are rapidly approaching outrageous computational, economic and environmental costs to gain incrementally smaller improvements in model performance,” the two write. They note that many machine learning researchers feel that progress in the field has stagnated and that fundamentally different approaches may be needed to bring us much closer to artificial general intelligence—systems that can perform many different kinds of tasks at human or super-human levels.
With the exception of the world’s tech giants, most companies can’t afford to live in “big-model world.” If more sophisticated A.I. actually depends on building larger and larger models, then “only a small number of actors will be able to compete,” Hogarth tells me. And while the likes of Google, Microsoft and IBM hope to sell their big models, pre-trained and pre-built, to customers of their cloud computing services, many businesses are reluctant to adopt giant pre-trained A.I. software because they don’t have enough insight into how it’s trained and how it’ll perform. Companies fear that by adopting it they may be inadvertently stepping into an ethical, reputational or regulatory morass.
Most businesses live on a different A.I. planet. Benaich calls this “the task-specific A.I. world.” These folks are looking to build A.I. systems that perform one highly specialized task exceptionally well. In building this task-specific software, even startups can compete. Benaich, for instance, points to a young London company called PolyAI that he’s invested in. It has built a chatbot-like conversational A.I. system that outperforms many of the larger language models, such as Google’s BERT, while being a fraction of the size of most cutting-edge NLP systems. (PolyAI’s system uses 59 million parameters; BERT, even in its lightweight version, uses 110 million.) This allows PolyAI’s software to be trained on just a dozen GPUs—the graphics processing chips that have become the workhorses of A.I. computing—in a single day.
Benaich and Hogarth also have a nice slide deep in the report showing that 25% of the fastest-growing Github projects in the second half of 2020 were for “machine-learning operations” (MLOps), or the engineering nitty-gritty that lets companies deploy, run and maintain A.I. software over the long haul. MLOps is now trending as a Google search term for the first time. This, Benaich and Hogarth write, “signals an industry shift from technology R&D (how to build models) to operations (how to run models).”
Here are some other key takeaways from “The State of A.I.”:
- A.I. in healthcare and medicine is booming. Research on applying A.I. methods to various biology topics has grown 50% each year since 2017. The application of computer vision to medical imagery is having a massive impact in everything from ophthalmology to mammography. Advanced A.I. techniques are also making major inroads into drug discovery and drug research.
- Privacy-preserving machine learning is going to be huge. Interest in “federated learning,” one of the most promising techniques for allowing different parties to train a shared model without pooling their raw data, has exploded: More than 1,000 research papers on the topic were published in just the first six months of 2020, compared to just 180 in all of 2018. A major consortium of German hospitals, along with Imperial College London, is testing one such system for sharing pediatric chest X-ray data.
- Demand for A.I. talent continues to far outstrip supply, despite a drop-off in job postings due to the Covid-19 pandemic. There are currently three times as many job listings for A.I.-related expertise as there are people viewing such postings, and job postings have grown 12 times faster than job views in recent years. This is true even though universities are now churning out many more graduates with machine-learning skills. Stanford University, for example, is now training twice as many students in NLP per year as it was between 2012 and 2014.
- The U.S. remains the best place in the world for A.I. talent—but a lot of that talent is foreign-born. American institutions and companies dominate the research output at top A.I. conferences. But the majority of top A.I. researchers working in the U.S. are not American—27% come from China, 11% from India and 11% from Europe. Less than a third—31%—are U.S.-born. The U.S. is also still doing a good job of holding onto foreign-born talent that comes to U.S. universities—85% of all international PhD students and 88% of Chinese PhD students stay in the U.S. to work after graduation. But all of this is threatened by the Trump Administration’s restrictive visa policies.
- Regulators are finally starting to scrutinize the use of A.I. The focus right now is on facial recognition technology, with a growing number of laws coming into effect in U.S. states and around the world limiting its use. Regulatory pressure is starting to build on algorithmic decision-making in many other contexts too, including banking and insurance.
- The U.S. military is increasingly experimenting with cutting-edge A.I. techniques and incorporating A.I. into its arsenal. This has created a huge opportunity not only for established defense contractors, but also for a host of venture capital-backed startups that are selling the Pentagon everything from autonomous drones to intelligence analysis software to systems that can automatically detect and disrupt electronic communications.
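The federated learning technique flagged in the takeaways above is easier to grasp in miniature: each party fits the model on its own data, and only the resulting weights—never the raw records—travel to a central server, which averages them. Here is a toy federated-averaging loop; the linear model, the two simulated "hospitals," and all hyperparameters are hypothetical, chosen purely for illustration.

```python
import numpy as np

def local_train(w, X, y, lr=0.1, epochs=5):
    """Gradient descent on one party's private data; X and y never leave."""
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def federated_round(w_global, clients):
    """Average locally trained weights, weighted by each client's data size."""
    updates = [(local_train(w_global, X, y), len(y)) for X, y in clients]
    total = sum(n for _, n in updates)
    return sum((n / total) * w for w, n in updates)

# Two simulated parties holding private data from the same underlying relationship.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
clients = []
for n in (80, 120):
    X = rng.standard_normal((n, 3))
    clients.append((X, X @ true_w))

w = np.zeros(3)
for _ in range(50):
    w = federated_round(w, clients)
# w now closely approximates true_w, yet no raw data was ever pooled.
```

Real systems layer secure aggregation and differential privacy on top of this averaging step, since model updates themselves can leak information about the underlying records.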
Benaich and Hogarth always make a few predictions for the coming year. Last year, they got four of six predictions right. Here are three of their eight for next year:
- Nvidia will not be able to complete its acquisition of U.K. chip design company Arm.
- Someone will build a 10 trillion parameter A.I. language model.
- One of the companies using A.I. for drug discovery will either be acquired or have an IPO in a deal that values it at over $1 billion.
We’ll check in next year to see if they’re right. Meanwhile, here’s the rest of this week’s A.I. news.
A.I. IN THE NEWS
Amazon's warehouse robots may be increasing employee injury risks. That is one of the conclusions of an investigation into rising injury rates at Amazon's warehouses conducted by the website Reveal, which is run by the Center for Investigative Reporting. The report points to data that backs up "the accounts of Amazon warehouse workers and former safety professionals who say the company has used the robots to ratchet up production quotas to the point that humans can’t keep up without hurting themselves. For each of the past four years, injury rates have been significantly higher at Amazon’s robotic warehouses than at its traditional sites." Amazon insists its robots make warehouse jobs "better and safer" for human workers.
A.I.-powered exam proctoring software doesn't work well for students of color. Black and Latinx students have complained that software used by ExamSoft to proctor the New York State bar exam includes a critical facial-detection feature that doesn't work well for non-white students. This has been a persistent problem with facial-recognition software, much of which has not been trained on enough images of non-white people. It was just one of a host of problems with "automated proctoring" software from a variety of vendors chronicled in a revealing New York Times story last week. As a whole, the software seems to perform poorly in a lot of real-world conditions, and it has features that students found distracting or that increased their test anxiety.
Toyota is working on an A.I.-enabled household cleaning robot. Cleaning robots have been making big inroads in commercial settings during the pandemic. Now comes news from Wired that Toyota is researching a multi-armed robot that could be suspended from a home's ceiling and be used to clean surfaces, including countertops and cupboard doors, as well as to put away cutlery and dishes. "Toyota does not have a timeline for commercializing its prototypes, but it is looking to be an early entrant in a potentially big market," according to the story.
IBM joins a government project to improve schizophrenia diagnoses. Big Blue is joining a $99 million five-year effort funded by the U.S. National Institutes of Health and involving researchers from Harvard Medical School, Mt. Sinai School of Medicine, Stanford University and the Northern California Institute for Research and Education that is designed to find biomarkers for schizophrenia that could lead to earlier diagnosis of the mental illness. The company will lend its expertise to A.I.-enabled analysis of brain imagery and to natural language processing that might be able to detect changes in how those who are developing schizophrenia speak or in the kinds of symptoms they describe, according to a blog post from IBM.
EYE ON A.I. TALENT
New York-based digital publisher G/O Media, whose titles include The Onion, Gizmodo, Jezebel, Lifehacker and The Root, has named Michael Dugan as its chief technology officer, the company said. Most recently Dugan was CTO at Hearst Magazines.
U.K.-based semiconductor design company Imagination Technologies Ltd., which supplies computer chip designs to Apple among other customers, has named Simon Beresford-Wylie as its new chief executive officer, eeNews Europe reported. Beresford-Wylie was previously CEO at Arqiva, a U.K. provider of communications, broadcast and media services.
Insitro, a San Francisco startup using A.I. in drug discovery, has named Roger Perlmutter, the head of R&D at pharmaceutical company Merck, to a seat on its board of directors, according to a report in the health news website Stat News.
The Patrick J. McGovern Foundation has named Vilas Dhar as its president, according to a report in trade publication Ai Authority. The Foundation, created with a $1.2 billion endowment left by the late Patrick McGovern, the founder of the publisher of Computerworld, Macworld and PCWorld magazines, is dedicated to ways to use A.I. and data science to improve society. Dhar previously founded two social impact organizations and served as the Gleitsman Fellow on Social Change at Harvard University.
Deep Genomics, an A.I.-focused therapeutics company based in Toronto, has appointed Dr. Ferdinand Massari as its chief medical officer, the company said. Massari will be based in Boston and oversee clinical development of the company's products. Massari is a veteran of several major pharmaceuticals firms, having held senior roles at Pfizer, Pharmacia and Merck. Most recently, he co-founded Kintai Therapeutics.
EYE ON A.I. RESEARCH
This past week brought two potentially significant breakthroughs in Transformers, one of the most, well, transformative neural network architectures of the past five years.
- A.I. researchers have been buzzing about a paper posted to the research repository OpenReview.net. Entitled "An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale," the paper is currently under review for the International Conference on Learning Representations (ICLR) in May, so its authors are currently anonymous. But many are saying it is likely to revolutionize the field.
In essence, the paper argues that convolutions are unnecessary. Convolutions are the mathematical filtering technique whose use in deep learning for computer vision was pioneered by Yann LeCun in the 1980s and which has underpinned almost all state-of-the-art computer vision results for the past eight years. The paper shows that simply using a Transformer can produce results that beat two state-of-the-art convolutional neural networks (CNNs), both of which are less than a year old. Oriol Vinyals, a well-known researcher at DeepMind who was part of the team that used Transformers in its breakthrough work on StarCraft, tweeted an exchange he had with Ilya Sutskever, the OpenAI chief scientist, who as a graduate student was among the team that originally used a CNN to crack the ImageNet computer vision benchmark back in 2012.
Sutskever asked Vinyals for his take on the new paper. Vinyals replied, "my take is: farewell convolutions : )"
- Transformer-based architectures tend to require large amounts of computing power to train and can be unwieldy to deploy. But researchers from DeepMind and its sister company Google, along with the University of Cambridge and the Alan Turing Institute, have now proposed a new kind of Transformer that is much less computing-intensive to train, according to a story in the publication Synced. (The research paper itself can be accessed on the free research repository arxiv.org here.)
Called a Performer, it is based on a mechanism called FAVOR+ (short for Fast Attention Via positive Orthogonal Random features). "Leveraging detailed mathematical theorems, the paper demonstrates that rather than relying solely on computational resources to boost performance, it is also possible to develop improved and efficient Transformer architectures that have significantly lower energy consumption," the Synced writer said. The researchers tested their new Performer architecture on a number of difficult benchmarks, including one that involved predicting protein-folding sequences, and found it performed better than two existing state-of-the-art Transformer models. On a test on the ImageNet64 computer vision benchmark, a Performer with just six layers matched the accuracy of an existing Transformer model with 12 layers and was twice as fast.
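The central move in the "16x16 words" paper discussed above is simpler than it sounds: chop an image into fixed-size patches, flatten each one into a vector, and feed the resulting sequence of "words" to a standard Transformer. Here is a minimal sketch of that patch-splitting step; the image size and values are made up for illustration and are not from the paper's code.

```python
import numpy as np

def patchify(image, patch=16):
    """Split an H x W x C image into flattened patch-by-patch 'tokens'."""
    H, W, C = image.shape
    gh, gw = H // patch, W // patch               # patch-grid dimensions
    x = image[:gh * patch, :gw * patch]           # drop any ragged border
    x = x.reshape(gh, patch, gw, patch, C)        # split rows and columns into blocks
    x = x.transpose(0, 2, 1, 3, 4)                # group values by patch position
    return x.reshape(gh * gw, patch * patch * C)  # one flat vector per patch

image = np.arange(64 * 64 * 3, dtype=np.float32).reshape(64, 64, 3)
tokens = patchify(image)
# A 64x64 RGB image with 16-pixel patches becomes a sequence of 16 tokens of
# 768 values each, ready for a linear projection and a Transformer encoder.
```

From there, the paper's architecture treats these patch embeddings exactly as a language model treats word embeddings, with positional information added so the model knows where each patch came from.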
Progress like this may help close the research-business divide I explored at the top of the newsletter. If massive models can be slimmed down to achieve the same performance at a fraction of the computing load and time, they could be adopted much more widely in industry.
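To see why an approach like the Performer's saves so much compute, compare ordinary attention, which builds a full n-by-n score matrix across the sequence, with a random-feature approximation that never materializes that matrix. Below is a rough numpy sketch of the positive-random-feature idea only; the orthogonal-feature construction and other FAVOR+ refinements are omitted, and all dimensions are invented for illustration.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: materializes the full n x n score matrix."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, n_features=4096, seed=0):
    """Random-feature attention: cost grows linearly with sequence length."""
    d = Q.shape[-1]
    W = np.random.default_rng(seed).standard_normal((n_features, d))
    Qs, Ks = Q / d ** 0.25, K / d ** 0.25   # fold in the 1/sqrt(d) scaling

    def phi(X):
        # Positive features with E[phi(q) . phi(k)] = exp(q . k)
        return np.exp(X @ W.T - 0.5 * (X ** 2).sum(-1, keepdims=True)) / np.sqrt(n_features)

    Qf, Kf = phi(Qs), phi(Ks)
    numerator = Qf @ (Kf.T @ V)          # (n x m)(m x d): no n x n matrix
    denominator = Qf @ Kf.sum(axis=0)    # per-row softmax normalizer
    return numerator / denominator[:, None]

rng = np.random.default_rng(1)
Q, K, V = (0.3 * rng.standard_normal((8, 16)) for _ in range(3))
exact = softmax_attention(Q, K, V)
approx = linear_attention(Q, K, V)
# approx closely tracks exact, computed without ever forming the score matrix.
```

For a toy 8-token sequence the savings are invisible, but for sequences of tens of thousands of tokens—protein chains, say—avoiding the quadratic score matrix is the difference between feasible and not.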
FORTUNE ON A.I.
What you need to know about Nvidia and VMware’s big new A.I. deal—by Jonathan Vanian
Obama: Social media is isolating and dividing Americans—by Danielle Abril
How A.I. is playing a bigger role in music streaming than you ever imagined—by Jonathan Vanian
A.I. gets down in the dirt as precision agriculture takes off—by Aaron Pressman
These deepfake videos of Putin and Kim have gone viral—by Jeremy Kahn
Deepfakes enter the 2020 election in a big way—but it's not what you might fear. The U.S. anti-corruption and good governance group RepresentUS created an ad campaign that featured deepfake videos of Russian President Vladimir Putin and North Korean leader Kim Jong-un expounding on how U.S. democracy is collapsing without them even having to resort to disinformation tactics such as, um, deepfakes. I covered the story for Fortune here.
Meanwhile, Florida Democratic Congressional candidate Phil Ehr's campaign created a deepfake video of his opponent, incumbent Congressman Matt Gaetz, in which the fake Gaetz says "Fox News sucks," "QAnon isn't real," and that he's endorsing Joe Biden for President. Ehr says he created the ad to highlight Gaetz's failure to take Russian election meddling and disinformation seriously. You can see the ad campaign here.
Plus, the nonprofit group Change the Ref created a deepfake video in which Joaquin Oliver, who was killed in the Parkland school shooting in 2018, is seemingly brought back to life by his grieving parents in order to encourage other young people to vote.
A number of A.I. ethics and security experts are dismayed at the trend: Alex Stamos, the former Facebook security chief who is now a professor at Stanford, tweeted in response to the Ehr ad that "Democrats should not normalize manipulated media in political campaigns."
It may be too late for that. A growing number of advertising agencies are offering deepfakes to clients as part of their arsenal of marketing techniques. Hollywood is looking to the technology to potentially replace traditional computer-generated images (CGI), which are much more labor-intensive, time-consuming and expensive to produce. Some have even speculated that deepfakes may one day allow entire films to be created without the need for any human actors at all. Today, though, deepfakes are not so perfect that they are completely believable. In fact, in both the "Dictators" ad campaign and the Joaquin Oliver video, more traditional CGI effects were used to touch up the deepfake-generated video frames and make them more believable.
The problem with the growing use of deepfakes may not be that they will make political disinformation more believable. It may be the opposite—that they will make everything else, including legitimate video footage, unbelievable. For most of human history, the phrase "seeing is believing" has more or less held true. The advent of software like Photoshop began to undermine that, at least for still images. Deepfakes threaten to knock out its final foundations. We may plunge ever more into a world of fragmented realities—where each person sees what they want to see and believes what they want to believe, including conspiracy theories or whatever propaganda happens to mesh with their worldview.
I was struck by a line in a New York Times story from over the weekend. The story, written by Emily Flitter, was really a profile of a town—Portsmouth, Ohio—that voted for Trump in 2016. But embedded within the piece are a number of mini-profiles of the townsfolk. And one centers on two friends who have drifted apart, in large part because one of them, Eli Eaton, has a "view of the world [that] is defined by a web of conspiracy theories. He said he believes that the Sept. 11, 2001, terrorist attacks were somehow faked and that the police officer who knelt on George Floyd’s neck is not the same man who was charged with Mr. Floyd’s murder ('look at the ear structure in particular,' he advised)."
Look at the ear structure? That kind of armchair video forensic analysis is only going to become more and more common in an era of ubiquitous deepfakes. The technology will likely add a dash of plausibility to countless conspiracy theories—and when it comes to conspiracy theories, a dash is all that's needed.