DeepSeek, AI agents, and avoiding a tech-created catastrophe dominated the talk at Davos

By Jeremy Kahn, Editor, AI

Jeremy Kahn is the AI editor at Fortune, spearheading the publication's coverage of artificial intelligence. He also co-authors Eye on AI, Fortune’s flagship AI newsletter.

A sign denoting the World Economic Forum (WEF) is seen at the Congress Center during the WEF annual meeting in the Alpine resort of Davos on January 20, 2025.
AI was the talk of the World Economic Forum at Davos.
Fabrice Coffrini—AFP via Getty Images

Hello and welcome to Eye on AI. In this edition…AI takes from the World Economic Forum in Davos; DeepSeek changes everything; Trump signs an executive order on AI; and AI auditing is poised to be big.

The financial markets have been roiled this week by the rave reviews that Chinese AI startup DeepSeek’s latest model received over the weekend, as AI researchers got more of a chance to play around with it.

Many investors believe the new technology upends several key assumptions about AI:

  • That the U.S. leads China on AI development
  • That proprietary models have a slight edge over open-source ones
  • That AI progress depends on access to huge numbers of the most advanced AI chips in data centers with metropolis-size energy demands

I happen to think the markets are probably being overly negative about what DeepSeek means for companies like Nvidia in particular, and I wrote about that here. My Fortune colleagues also covered just about every angle of the DeepSeek news yesterday, and I will highlight more of their coverage below.

I spent last week at the World Economic Forum in Davos, Switzerland, where you couldn’t walk more than two feet without seeing or hearing “AI.” Then there was the big AI news that bracketed the week: Donald Trump’s announcement of the Stargate project—the $500 billion data center building spree involving OpenAI—and the buzz around DeepSeek.

Here I’ll try to bring you some of the other highlights, both from panel discussions and one-on-one conversations I had.

Agents everywhere

Everyone is getting excited about AI agents, and Salesforce CEO Marc Benioff and his team are perhaps the most totemic example. Benioff told everyone he is so excited about AI agents that he almost renamed the entire company Agentforce. Salesforce has tried to make it easy for companies to spin up simple agents to automate all sorts of tasks. Adam Evans, Salesforce’s EVP for its AI platform business, told me that London’s Heathrow Airport, the world’s second busiest by international passenger traffic, has been using Salesforce agents to orchestrate tasks such as gate changes, and to run software that helps travelers navigate the airport. And within Salesforce itself, Evans says, using agents for customer service means that 83% of the roughly 40,000 customer queries the company receives each week can now be resolved without involving a human customer service rep.

And it isn’t just Salesforce. Rodrigo Liang, CEO of AI chip startup SambaNova, told me agents are “about chaining together many of these models to create complete workflows.” This transition should be good for SambaNova’s business, Liang said, because the chips it is building are optimized for running trained AI models (what is known as inference) and can do so faster, and using less power, than Nvidia’s GPUs. (The company claims it can run some workloads 100 times faster while consuming one-tenth the power.) That speed advantage, he says, matters more and more with agents: if each model in a workflow takes two seconds to return an output, and the overall workflow chains 10 models together, the whole thing takes 20 seconds, which is too long for many use cases, such as customer service responses.
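To make Liang’s arithmetic concrete, here is a minimal Python sketch of how per-model latency compounds in a sequential agent workflow. The model names and the two-second-per-call latency are illustrative assumptions, not SambaNova benchmarks.

```python
import time

def call_model(name: str, prompt: str, latency_s: float = 2.0) -> str:
    """Stand-in for a real model call; sleeps to simulate inference latency."""
    time.sleep(latency_s)
    return f"{name} output for: {prompt[:40]}"

def run_workflow(prompt: str, num_steps: int = 10) -> str:
    """Chain models sequentially: each step consumes the previous step's output."""
    result = prompt
    for i in range(num_steps):
        result = call_model(f"model_{i}", result)
    return result

start = time.perf_counter()
run_workflow("Where is my order?")
print(f"End-to-end latency: {time.perf_counter() - start:.1f}s")  # ~20s at 2s per step
```

Because the steps run one after another, total latency scales linearly with chain length, which is why faster inference matters more for agents than for single-shot queries.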

Jevons Paradox is the buzzword of the day

I also had a fascinating conversation with Jonathan Ross, CEO of chip startup Groq, which, like SambaNova, is targeting AI inference tasks. Ross told me his company plans to ship at least 400,000 of its chips this year and perhaps, if all goes according to plan, as many as 2 million. He thinks new reasoning models, whether DeepSeek’s R1 or OpenAI’s o1 and o3, which require more computing resources at inference time to produce their best answers, will provide powerful tailwinds for Groq’s chips. (Groq claims an 18x speedup in performance, with power consumption between one-tenth and one-third of what Nvidia’s GPUs consume.) As the cost of reasoning comes down, thanks in part to innovations such as DeepSeek’s, Ross also sees businesses deploying more and more AI agents.

Like apparently everyone these days, Ross mentioned Jevons Paradox—the idea that as technology makes a resource-consuming process more efficient, overall consumption of that resource goes up, not down. In this case, he predicts that efficiencies in running top AI models, whether due to model innovations like DeepSeek’s or hardware ones from companies like Groq, will mean companies will start deploying AI in more places, ultimately requiring more total computing resources.
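A quick back-of-the-envelope sketch of the dynamic Ross is describing, with purely made-up numbers: if efficiency gains cut the cost per query by 10x, but cheaper inference leads companies to run 30x more queries, total compute spending still triples.

```python
# Illustrative Jevons Paradox arithmetic; every number here is invented for the example.
cost_per_query = 1.0      # arbitrary cost units before the efficiency gain
queries = 1_000_000       # baseline query volume

efficiency_gain = 10      # cost per query falls 10x
demand_growth = 30        # cheaper inference unlocks 30x more usage

new_cost_per_query = cost_per_query / efficiency_gain
new_queries = queries * demand_growth

print(f"Baseline spend: {cost_per_query * queries:,.0f}")                 # 1,000,000
print(f"Post-efficiency spend: {new_cost_per_query * new_queries:,.0f}")  # 3,000,000
```

Whether total consumption actually rises depends on how elastic demand turns out to be; Jevons holds only when usage grows faster than efficiency improves.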

A push to think about AI risks—even as Trump scraps the Biden executive order

But moving to a world of AI agents also poses distinct risks. In a striking panel on international governance of AI, deep learning pioneer Yoshua Bengio argued that the biggest risks of catastrophic harm, including perhaps even existential risks to humanity, come from giving AI models agency. It is only when AI systems can use digital tools to take actions in the real world that they potentially pose a threat to human life. What’s more, Bengio argued, agency isn’t necessary to reap many of the benefits from AI. The AI models that can discover new life-saving drugs or materials to create better batteries or biodegradable plastics don’t require agency.

Demis Hassabis, CEO of Alphabet-owned Google DeepMind, basically agreed, saying “the agentic era is a threshold moment for AI becoming more dangerous.” But he then told Bengio it was simply too late to hope that people would eschew developing agents. “It would have been good to have had a decade or more of [non-agentic, narrow AI systems aimed at solving particular science problems] coming out while giving us time to understand these general algorithms better, but it hasn’t worked out that way,” Hassabis said.

Both Hassabis and Bengio urged the global business and political leaders at Davos to continue trying to develop an international governance regime that would impose some safety controls around the development of super powerful AI systems. But their plea came just days after President Donald Trump abolished his predecessor’s executive order on AI, which had been America’s primary effort to contain any potentially catastrophic AI risks.

Training—including some coding skills—matters

At a Fortune-hosted dinner, AI luminary Andrew Ng suggested that for businesses to succeed with AI, their workforces need better training in how to use AI tools safely and effectively. He said that to find the best return on investment from AI, it was more important to think in terms of specific tasks AI could help automate than to think about entire jobs. He also told me that while genAI models are excellent at writing code, to get the most out of them it helps a great deal if the people using them understand at least a little about how to code themselves. That’s why he thinks that even if we move into a world in which AI does much of our coding for us, students should still be taught to code.

With that, here’s more AI news.

Jeremy Kahn
jeremy.kahn@fortune.com
@jeremyakahn

AI IN THE NEWS

China’s DeepSeek R1 model changes everything. The big news this past week was the impact DeepSeek’s R1 is having on perceptions of who is ahead and who is behind in the AI race. That includes questions about the value of AI infrastructure plays, such as Nvidia, or even major energy companies supplying power to data centers, such as Constellation Energy. For a primer on what DeepSeek is and what’s so special about R1, you can turn to my colleague Nicholas Gordon’s excellent explainer here. For more on the market reaction, Fortune’s Greg McKenna had this piece. I made the case here that DeepSeek is less threatening to Nvidia, and to U.S. export controls, than many people are assuming. My colleague Sharon Goldman explained what DeepSeek may mean for the open source vs. proprietary AI model debate, while Fortune’s Marco Quiroz-Gutierrez looked at how DeepSeek is setting off a panic within Meta. Meanwhile, my colleague David Meyer looked at how DeepSeek’s R1 could wind up exporting Chinese censorship. And that’s just some of Fortune’s DeepSeek coverage. I encourage you all to check out more of it here.

Indian billionaire Mukesh Ambani plans a giant data center for AI. Ambani’s Reliance Group announced construction of what could be the world’s largest data center in Jamnagar, India, Bloomberg reports. Its planned capacity would be three gigawatts, dwarfing any existing facility (though smaller than some of the data centers planned under OpenAI’s, SoftBank’s, and Oracle’s Project Stargate). The project will deploy Nvidia GPUs and aims to deliver low-cost AI inference. Reliance plans to power the facility largely with renewable energy, although it may require some fossil fuel-based power as well.

Trump issues new AI executive order. President Trump signed a new executive order on AI that commits the U.S. to “maintaining and enhancing its leadership” in the technology. It gives AI and crypto czar David Sacks, National Security Advisor Michael Waltz, and newly appointed head of the White House Office of Science and Technology Policy Michael Kratsios 180 days to formulate an AI action plan and to revise “federal policies to remove barriers to AI leadership, ensuring U.S. systems remain free from ideological bias.”

EYE ON AI RESEARCH

What’s so innovative about DeepSeek R1? A few weeks ago, we looked at some of the innovations behind DeepSeek’s V3 model. Today, we’ll look at some of the innovations DeepSeek is reporting with its R1 reasoning model. One of the biggest advances is that DeepSeek trained one version of the model, which it called R1-Zero, using only reinforcement learning on math and coding questions, without initially giving it any supervised examples of chain-of-thought reasoning to learn from, without giving it any interim rewards for the steps in its process, and without using another AI model to assess its answers. DeepSeek found that chain-of-thought reasoning simply emerged from the model when it was trained this way. That’s a big deal.
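To give a flavor of what “only reinforcement learning” means here: the R1 technical report describes simple rule-based rewards, a correctness check on the final answer plus a format check that the reasoning sits inside think tags, rather than a learned reward model. Below is a minimal Python sketch of that idea; the regexes and scoring weights are illustrative assumptions, not DeepSeek’s actual code.

```python
import re

def rule_based_reward(completion: str, reference_answer: str) -> float:
    """Score a completion with simple rules; no learned reward model involved."""
    reward = 0.0

    # Format reward: reasoning must appear between <think> tags.
    if re.search(r"<think>.+?</think>", completion, re.DOTALL):
        reward += 0.5

    # Accuracy reward: compare the text left after stripping the think block
    # against the known answer (easy to verify for math and coding questions).
    final = re.sub(r"<think>.+?</think>", "", completion, flags=re.DOTALL).strip()
    if final == reference_answer.strip():
        reward += 1.0

    return reward

print(rule_based_reward("<think>2 + 2 makes 4</think> 4", "4"))  # 1.5
```

In the full pipeline, a policy-gradient method (DeepSeek used a variant called GRPO) then nudges the model toward completions that score well under rules like these.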

But DeepSeek then found it could improve the model, and enable it to reason about things other than math and coding, by first fine-tuning a version of V3 on examples of chain-of-thought reasoning, and by rewarding language consistency in addition to getting the answer right. Once this version of R1 was getting math and coding questions right with chain-of-thought reasoning, DeepSeek also asked it questions from other knowledge domains and used human curation to pick out the good answers. This curated dataset was then used to further train the model (through a supervised learning process overseen by V3 itself). In total, the company collected 600,000 examples of reasoning questions with good chain-of-thought answers, and 200,000 examples of good answers that did not require reasoning.

Finally, and most remarkably, DeepSeek showed that this 800,000-example training set could be used to refine almost any general large language model, such as Meta’s Llama-3.1-8B and Llama-3.3-70B-Instruct models, to function as a reasoning model. In essence, DeepSeek showed that a model can become a reasoning model after training on fewer than 1 million samples of well-curated data. And that, as Anthropic’s Jack Clark has written in his “Import AI” newsletter, has ramifications not just for business but also for AI policy, since it means it will be extremely difficult to restrict the proliferation of models with potentially quite potent reasoning skills. You can read DeepSeek’s R1 technical report here and Clark’s thoughts on the model here.
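The distillation step itself is just standard supervised fine-tuning on the curated traces. Here is a minimal sketch using Hugging Face’s trl library; the dataset file, field names, and formatting are assumptions for illustration, not DeepSeek’s released code.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical JSONL file of curated traces with "question",
# "chain_of_thought", and "answer" fields (a stand-in for the 800,000-example set).
dataset = load_dataset("json", data_files="curated_reasoning.jsonl", split="train")

def to_text(example):
    # Fold the reasoning trace into the target text so the student model
    # learns to emit its chain of thought before the final answer.
    return {
        "text": (
            f"Question: {example['question']}\n"
            f"<think>{example['chain_of_thought']}</think>\n"
            f"Answer: {example['answer']}"
        )
    }

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B",  # one of the student models named in the report
    train_dataset=dataset,
    args=SFTConfig(output_dir="r1-distilled"),
)
trainer.train()
```

Nothing in this recipe is specific to DeepSeek’s own models, which is exactly Clark’s point about proliferation.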

FORTUNE ON AI

Trump calls DeepSeek a ‘wake-up call’ for U.S. tech and welcomes China’s AI gains —by Beatrice Nolan

DeepSeek isn’t China’s only new AI model, and analysts are calling the flurry of new applications a ‘coordinated psyops’ —by Lionel Lim

Another OpenAI researcher quits—claims AI labs are taking a ‘very risky gamble’ with humanity amid the race toward AGI —by Beatrice Nolan

The lessons and laments of DeepSeek, according to VCs —by Allie Garfinkle

Commentary: China just redefined the global AI race—with massive implications for OpenAI, Nvidia, and foreign policy —by Gary Marcus

AI CALENDAR

Feb. 10-11: AI Action Summit, Paris, France

March 3-6: MWC, Barcelona

March 7-15: SXSW, Austin

March 10-13: Human [X] conference, Las Vegas

March 17-20: Nvidia GTC, San Jose

April 9-11: Google Cloud Next, Las Vegas

May 6-7: Fortune Brainstorm AI London

BRAIN FOOD

AI auditing and assurance is about to be a big business. I had a couple of conversations in Davos last week that convinced me that no matter what happens with AI regulation in the U.S., auditing companies’ responsible AI and AI governance policies is about to be big business.

One reason is the European Union’s AI Act, which requires that those using AI in high-risk use cases conduct a risk assessment and put in place risk mitigations. The other is the rollout of ISO 42001, the first ISO standard for AI management systems. Both require that independent, expert parties certify compliance.

One company that has been quick to offer these services is Schellman, an ISO certification company that made a name for itself in cybersecurity assurance and is now moving quickly into AI compliance services, too. Its CEO, Avani Desai, tells me that many of the assurance standards are really about process: Does a company have a procedure for identifying and dealing with AI hallucinations, for instance? They are not about providing hard-and-fast technical assurance that an AI system won’t hallucinate. Having that process is good. Of course, actually trying to reduce how often a model hallucinates might be better.

But either way, look for more and more companies like Schellman to jump into the AI assurance game in the coming year.

This is the online version of Eye on AI, Fortune's weekly newsletter on how AI is shaping the future of business. Sign up for free.