What the latest flurry of AI news has in common with the Great Tea Race

By Jeremy Kahn, Editor, AI

Jeremy Kahn is the AI editor at Fortune, spearheading the publication's coverage of artificial intelligence. He also co-authors Eye on AI, Fortune’s flagship AI newsletter.

A clipper ship built for the tea trade, the Cutty Sark was launched in 1869, just as steamships were ending the Great Tea Race. The ship is now part of a museum in Greenwich, England.
Prisma by Dukas—Universal Images Group/Getty Images

Hello and welcome to Eye on AI.

In the 19th century, major shipping companies would compete to be the first to bring the season’s tea from China to London. There was money on the line—the first tea in the market commanded a premium. So investors built ever faster clipper ships, carrying ever greater amounts of sail and sporting sleek, copper-bottomed hulls. To incentivize the captains to push their vessels to their limits, the tea merchants funded a cash prize for the first crew to reach London’s docks. The first ship also won the right to fly a special “Blue Ribbon” pennant.

The race was about money, but for the captains and crews, it was as much about ego and prestige. It was also about risk—the clippers were built for speed, not stability, and they took great skill to sail. The Taeping and the Ariel, which split the prize in the 1866 Tea Race after an epic contest across the globe that saw them reach the mouth of the Thames within an hour of one another after 99 days at sea, both later sank. In fact, all five clippers that competed in the 1866 race were eventually wrecked or lost at sea.

What does this have to do with AI? Well, I feel like we’re kind of watching the 21st-century version of the Great Tea Race with AI today. The leading AI companies are leapfrogging one another across multiple dimensions of capability and performance in a contest that seems to be a little bit about money, but an awful lot about the ego and prestige of getting the credit for bringing a particular capability to market first. There’s also something about the seasonality of this flurry of new model releases—there was a similar glut of updates and releases in the first quarter of last year—that is reminiscent of the arrival of the new tea crates in London each September.

In the past two weeks, OpenAI and Google have both been unveiling new AI models and product features at a furious pace, each pushing the boundaries of what the technology can do. First OpenAI gave ChatGPT the ability to remember past conversations with users as well as their personal details and preferences. Then Google put its most powerful model, Gemini 1.0 Ultra, into wide release. It followed this with a limited launch of a new Gemini 1.5 Pro model that was as capable as Ultra, but in a smaller, less expensive package. What makes the 1.5 Pro special, though, is its remarkably large “context window,” which is the amount of material the model can take in and reason over in a single prompt. The 1.5 Pro can analyze an hour of video, 11 hours of audio, or about seven books’ worth of text. Then, on Thursday, OpenAI showed off Sora, a new text-to-video generation model that can produce minute-long videos of stunning quality.

There’s no sign of this pace letting up, with more announcements hinted at for the coming weeks. Plus, these developments will no doubt force other AI companies to move faster too. Cristóbal Valenzuela, the CEO of Runway, which had arguably been leading the field in text-to-video generation, simply tweeted “game on” in response to OpenAI’s Sora reveal. Google DeepMind had in January released a model called Lumiere that was competitive with Runway’s Gen-2 model, but it too will no doubt be working to release a more capable version in response to Sora. I wouldn’t be surprised if Anthropic, tech giant Meta, and well-funded startup Inflection all debut models in the coming weeks that match the long context window of Google’s 1.5 Pro.

For those of us watching from the shore, as it were, this is all as thrilling as it was to 19th-century newspaper readers who followed the Great Tea Race. But it also seems a bit dangerous. And unlike with the Great Tea Race, the risk is not just to those participating in the race, but to us all.

While giving ChatGPT memory makes it more useful, it also increases the risk that the model will leak users’ personal details, as already happened once with an earlier version of the chatbot. Sora’s hyperrealistic videos could produce more convincing deepfakes. (For now Sora is only available to “red teamers”—select individuals and companies that OpenAI hires specifically to test the model for safety and security vulnerabilities. The company did not say when the model would be released to the wider public.) Many AI ethicists criticized OpenAI for not appending to its demo videos some kind of visible digital watermark that would clearly identify them as AI-generated. They also faulted the company for revealing next to nothing about how Sora was trained, with many suspecting that copyrighted material was probably used without the owners’ consent. In the future, a system like Sora could also put a lot of people in Hollywood out of work.

Then there are the even bigger risks—that this flurry of model enhancements is driving us ever faster toward superpowerful AI software that could pose a danger in the wrong hands, or even itself pose a risk to humanity. There’s certainly no evidence that the tech companies are paying a great deal of attention to safety as they race to roll out model after model.

OpenAI claimed that by learning from video footage, Sora had gained an intuitive understanding of physics and common-sense reasoning that models trained in other ways lacked. In making this claim, the company sought to position the model as an important step toward its official goal of creating artificial general intelligence—a single AI system able to do all the economically valuable cognitive tasks a person can. Except lots of people, including Elon Musk, were quick to point out that Sora’s grasp of physics seemed dubious. (Even OpenAI’s own researchers highlighted several instances where Sora did not seem to understand that chairs cannot flap about and fly like birds, as one appeared to do in a video they released.) It also seemed to have trouble properly portraying certain aspects of the natural world, such as the number of legs an ant has.

So perhaps the AGI framing is just hype, a way for OpenAI researchers to justify working on a project that is really only about showing the world that the company can beat Google and Runway at the video generation game. But it is disturbing to see the OpenAI researchers frame Sora as a step toward AGI while not spending much time detailing any testing they’ve done or precautions they’ve taken so far to make the new model safe. They did say they were currently red-teaming the model and would not reveal when they planned to release it. But just revealing its existence will spur other companies, such as Runway and Google, to move faster on competing products. And in this race, as with the clipper ship captains, caution might take a back seat to speed.

In the Great Tea Race, the captains knew their destination, and the routes for getting there were well-established. The Great AI Race is different. In a way, it combines elements of that 19th-century competition with those of an even earlier age of sail: the Great Voyages of Discovery in the 15th through 18th centuries, when captains would set sail over the horizon, bound for the unknown. They were racing one another then too. The wealth and prestige of whole kingdoms rode the waves with them. What they would find on these journeys would transform the world. But they would also bring disease, death, conflict, and subjugation to the people they encountered.

I guess we have to hope the Great AI Race winds up a bit more like the Great Tea Race, where only the ships themselves were in jeopardy. With the tea race, the core technology itself—the clipper ship—was short-lived. Even in 1866, a ship equipped with a steam engine in addition to sails beat all the clippers back to London by 15 days. Three years later, the Suez Canal opened, shaving even more time off the journey. Within a decade, the clippers had been largely eclipsed by steamships in global trade. We may find that today’s transformer-based neural network AI models are similarly overtaken by some other AI technology that can grasp that chairs don’t normally fly like birds and that ants have six legs, perhaps even without seeing millions of examples during training.

And there’s another lesson from the Tea Races too: In the years after 1866, bumper tea crops were harvested in China. The price fell dramatically and there was no longer much premium to be gained by being first to market. This too could happen with AI, as freely available open-source models quickly match the capabilities of today’s top proprietary software and businesses no longer feel they have to pay top dollar to access generative AI capabilities.

In the meantime, it’s exciting—and a little bit frightening—to watch the race. But we should all be more than a little skeptical of the motives involved and whether the risk will ultimately be worth the reward.

Below, there’s more AI news. But before you go, if you want to learn about the latest developments in AI and how they will impact your business, please join me alongside leading figures from business, government, and academia at the inaugural London edition of Fortune’s Brainstorm AI conference, April 15-16. You can apply to attend here.

Also, I want to highlight a fantastic interview that Fortune CEO Alan Murray conducted recently for Fortune’s Leadership Next podcast with Wasim Khaled, cofounder and CEO of Blackbird AI. You can check that out here.

Jeremy Kahn
jeremy.kahn@fortune.com
@jeremyakahn

AI IN THE NEWS

SoftBank seeks $100 billion for AI chip rival to Nvidia. The Japanese technology company and its chairman, Masayoshi Son, are planning to put in $30 billion and are seeking $70 billion more from outside investors, including several Middle Eastern funds, Bloomberg reported. These are some of the same investors whom OpenAI’s Sam Altman has reportedly approached as part of his own effort to secure as much as $7 trillion for a company that would help break Nvidia’s dominance of the market for graphics processing units (GPUs), the type of computer chip most commonly used for AI. Son, according to Bloomberg, has named his new AI chip project Izanagi, after the Japanese god of creation and life. SoftBank already owns most of Arm, the computer chip design company that has dominated the market for smartphone processors but has only recently begun tiptoeing into chips for AI applications. Son reportedly sees Izanagi as complementing Arm’s efforts. SoftBank’s stock rose on the news.

Air Canada ordered to compensate customer after chatbot gave wrong advice. A tribunal in British Columbia said the airline must pay a customer who was given incorrect advice about how to obtain a bereavement fare when he booked a flight to attend his grandmother’s funeral, Canadian broadcaster CTV News reported. Air Canada tried to argue that it wasn’t responsible for the chatbot’s error, in part because a separate legal entity operated the chatbot and because the bot had provided the customer a link to a page on the airline’s website containing the correct information. “It should be obvious to Air Canada that it is responsible for all the information on its website. It makes no difference whether the information comes from a static page or a chatbot,” the tribunal member wrote. Although it is not clear why the chatbot got the information wrong, the case illustrates why some companies have been reluctant to incorporate LLMs directly into customer-facing chatbots, preferring to stick to fact-checked scripted replies.

U.S. House creates AI task force. The bipartisan group has been formed to work on AI-related laws after a legislative push earlier in the year faltered, Reuters reports. California Reps. Jay Obernolte, a Republican, and Ted Lieu, a Democrat, will cochair the 24-member task force. It is charged with producing a report on what legislation may be needed to safeguard the U.S. public from “current and emerging threats” from AI.  

Chip startup Recogni gets $102 million to expand from self-driving into generative AI. That’s according to Bloomberg. The four-year-old company, based in Santa Clara, Calif., had been designing chips that help self-driving cars run computer vision AI models that recognize people and objects. But it is now expanding into marketing this chip, which it calls Scorpio, to businesses looking to run generative AI models. The $102 million funding round was led by venture capital firms Celesta Capital and Great Point Ventures, with participation from HSBC Holdings and Tasaru Mobility Systems.

Reddit agrees to license its content for AI training. The social media company, which is expected to IPO later this year, told potential investors that it signed a deal earlier this year with an unnamed large AI company to license its content for AI model training, Bloomberg reported, citing unnamed sources familiar with the matter. The deal is said to be worth about $60 million annually.

Adobe Acrobat unveils AI chatbot. You will now be able to ask questions of your own PDF documents thanks to a new generative AI assistant that Adobe has debuted, The Verge reports. The feature will summarize documents and let people ask questions about the information in them in natural language.

EYE ON AI RESEARCH

It's a gift to be simple. Researchers have shown that LLMs trained on relatively simple and easy tasks can often generalize what they’ve learned to much harder tasks in the same field and perform just as well as models that have been trained only on difficult tasks. The result may have big implications for businesses hoping to use LLMs for sophisticated tasks, but worried they don’t have access to enough data to fine-tune the models. In one example the researchers looked at, they fine-tuned models to answer grade-school science and math questions and found that this also improved the models’ performance on university-level questions. (A rough sketch of how such an easy-to-hard comparison is set up appears below.)

It also may have interesting implications for AI safety. It may mean that restricting access to sensitive training data won’t be enough to stop bad actors from using LLMs for nefarious purposes such as creating bioweapons or hacking. An AI model trained on different, easier-to-obtain data might perform just about as well even if it never sees the sensitive material.

The research, from the Allen Institute for AI (AI2) and the University of North Carolina at Chapel Hill, was published on the non-peer-reviewed research repository arxiv.org.
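
For readers who want a concrete sense of how such a finding is measured, here is a minimal, purely illustrative Python sketch of the easy-to-hard evaluation setup. It is not the AI2/UNC authors' code: the model, datasets, and scoring below are toy placeholders meant only to show the shape of the comparison (fine-tune one copy of a model on easy questions, another on hard questions, then test both on held-out hard questions).

```python
# Toy illustration of the "easy-to-hard" comparison described above.
# Everything here is a placeholder: a real study would fine-tune an actual LLM
# and score it on held-out hard benchmarks.

import random


def make_questions(level, n=100):
    """Stand-in for a dataset; 'level' marks difficulty (grade-school vs. university)."""
    return [{"prompt": f"{level} question #{i}", "level": level} for i in range(n)]


def fine_tune(base_model, train_set):
    """Placeholder for fine-tuning; just records which difficulty mix the model saw."""
    return {"base": base_model, "trained_on": sorted({q["level"] for q in train_set})}


def evaluate(model, test_set, seed=0):
    """Placeholder scorer; a real evaluation would measure answer accuracy."""
    rng = random.Random(seed)
    correct = sum(rng.random() < 0.5 for _ in test_set)  # dummy accuracy
    return correct / len(test_set)


easy_train = make_questions("grade-school")
hard_train = make_questions("university")
hard_test = make_questions("university", n=50)

# The key question in the research: does the easy-only model come close to the
# hard-only model when both are tested on hard, university-level questions?
easy_only = fine_tune("base-llm", easy_train)
hard_only = fine_tune("base-llm", hard_train)

print("easy-only model, accuracy on hard test set:", evaluate(easy_only, hard_test))
print("hard-only model, accuracy on hard test set:", evaluate(hard_only, hard_test))
```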

FORTUNE ON AI

The world’s longest-range delivery drone could be coming to a suburb near you—but Uber Eats and Deliveroo drivers might be safe for now —by Ryan Hogg

AI just took another huge step: Sam Altman debuts OpenAI’s new ‘Sora’ text-to-video tool —by Christiaan Hetzner

UPenn researcher on why ‘we need to have a way to quantify common sense’—and what that means for AI —by Sheryl Estrada

Sanofi CEO: AI promises a great era of drug discovery that could fundamentally change medicine–but only if we allow it to deliver —by Paul Hudson (Commentary)

AI CALENDAR

Feb. 21: Nvidia reports earnings

March 11-15: SXSW artificial intelligence track in Austin, Tex.

March 18-21: Nvidia GTC AI conference in San Jose, Calif.

April 15-16: Fortune Brainstorm AI London (Register here.)

May 7-11: International Conference on Learning Representations (ICLR) in Vienna, Austria

June 25-27: 2024 IEEE Conference on Artificial Intelligence in Singapore

BRAIN FOOD

Don’t ask an LLM for war-fighting advice. Researchers wanted to see what would happen if they prompted a bunch of commercially available AI chatbots to act as players in the sorts of sophisticated war games often used to help train military officers and diplomats. The results were ugly. In most cases, the LLM-based bots opted to make escalatory moves, in some cases even resorting to the use of nuclear weapons, sometimes with cavalier justifications such as “We have it! Let’s use it!” Worse yet, the escalatory behavior was often sudden and unpredictable.

Perhaps, given that LLMs are trained on an entire internet’s worth of discourse, including a lot of social media posts in which people are often bombastic and cavalier about the use of force, these results shouldn’t be surprising. But they do argue against naively incorporating LLMs into decision-support systems for real military officers and national security strategists. The Pentagon is reportedly building its own LLMs based on proprietary data and may also be considering using LLMs as an interface to let officers access intelligence reports and advice from other kinds of AI systems more easily through natural language.

The researchers found that Claude 2.0 from Anthropic and Meta’s Llama 2 LLM were less prone to escalating conflicts than OpenAI’s GPT-3.5 and GPT-4. The research was conducted by academics at the Georgia Institute of Technology, Stanford University, Northeastern University, and the Hoover Wargaming and Crisis Initiative. It was published on the research repository arxiv.org. You can read it here.

This is the online version of Eye on AI, Fortune's weekly newsletter on how AI is shaping the future of business. Sign up for free.