
AI keeps getting more powerful, making it harder to judge how smart models actually are

By Fortune Editors
August 1, 2025, 4:15 AM ET
Russell Wald, executive director at the Stanford Institute for Human-Centered AI, speaks at Fortune Brainstorm AI Singapore on July 23. (Fortune)

How do you judge an AI model when it’s already starting to perform better than human beings? That’s the challenge faced by researchers like Russell Wald, executive director of the Stanford Institute for Human-Centered Artificial Intelligence (HAI). 

“As of 2024, there are very few task categories where human ability surpasses AI, and even in these areas, the performance gap between AI and humans is shrinking rapidly,” Wald said last week in a presentation hosted at the Fortune Brainstorm AI Singapore conference. “AI is exceeding human capabilities and it’s becoming increasingly harder for us to benchmark.”

Each year, HAI releases the AI Index, which aims to provide a comprehensive, data-driven snapshot of where AI stands today. At Fortune Brainstorm AI Singapore, Wald shared a few highlights from the 2025 edition of the AI Index, such as the increasing power of today’s models, the growing dominance of industry on the AI frontier, and how China is poised to overtake the U.S.


The following transcript has been lightly edited for conciseness and clarity.

I’m Russell Wald, the executive director of the Stanford Institute for Human-Centered Artificial Intelligence, or what we call “HAI”. 

We are Stanford University’s globally recognized interdisciplinary research institute at the forefront of shaping AI development for the public good. HAI was established in 2019 with the goal of advancing AI research, education, policy and practice. And, through our convening role and rigorous study of AI, we have become the trusted partner on AI governance for decision makers in industry, government and civil society. 

I’m going to talk about what we produce at HAI, which is the AI Index, an annual data-driven analysis of trends in AI that tracks research, development, deployment and the socio-economic impact of AI across academia, government and industry.

We see AI performance consistently improve year over year. To illustrate, we asked Midjourney, a text-to-image generator, for a hyper-realistic image of Harry Potter. From February 2022 to July 2024, the quality of the generated images increased rapidly. 

In 2022, the model produced cartoonish, inaccurate renderings of Harry Potter, but by 2024, it could create startlingly realistic depictions. We have gone from what mirrors a Picasso painting to an uncanny rendering of Daniel Radcliffe, the actor who played Harry Potter in the movies. 

Because of this consistent performance growth, we are increasingly challenged when it comes to benchmarking these models. As of 2024, there are very few task categories where human ability surpasses AI, and even in these areas, the performance gap between AI and humans is shrinking rapidly. From image recognition to competition-level mathematics to PhD-level science questions, AI is exceeding human capabilities and it’s becoming increasingly harder for us to benchmark.

From healthcare to transportation, AI is rapidly moving from the lab to our daily life. In 2023, the U.S. Food and Drug Administration approved 223 AI-enabled medical devices, up from just six in 2015. 

On the roads, self-driving cars are no longer experimental. For example, Waymo, which I regularly take while living in San Francisco, is one of the largest U.S. operators and provides over 150,000 autonomous rides each week, while Baidu’s affordable Apollo Go robotaxi now has a fleet that serves numerous cities across China. 

Business use of AI increased significantly after stagnating from 2017 to 2023. The latest McKinsey report reveals that 78% of surveyed respondents say their organizations have begun to use AI in at least one business function, marking a significant increase from 55% in 2023. 

Driven by increasingly capable small models, the inference cost for a system performing at the level of [GPT 3.5] dropped over 280-fold between November 2022 and October 2024. Hardware costs have declined 30% annually, while energy efficiency has improved by 40% each year. 
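
As a rough illustration of what these figures imply, the short Python sketch below converts the 280-fold drop into a constant monthly factor and compounds the 30% annual hardware decline. The 23-month span (November 2022 to October 2024), the five-year horizon, and the function names are assumptions made for the example, not figures from the AI Index, and real costs did not necessarily fall at a constant rate.

```python
# Illustrative arithmetic only. The 280-fold and 30% figures are quoted in the
# talk; the 23-month span and the 5-year horizon are assumptions.

def monthly_factor(total_fold: float, months: int) -> float:
    """Constant per-month factor that compounds to `total_fold` over `months`."""
    return total_fold ** (1 / months)


def compound_decline(annual_rate: float, years: int) -> float:
    """Fraction of the original cost left after `years` of a fixed annual decline."""
    return (1 - annual_rate) ** years


if __name__ == "__main__":
    per_month = monthly_factor(280, 23)
    print(f"Implied monthly cost reduction factor: {per_month:.3f}x")
    print(f"Implied annualized reduction factor: {per_month ** 12:.1f}x")
    print(f"Hardware cost left after 5 years at -30%/yr: {compound_decline(0.30, 5):.1%}")
```

Under these assumptions, a steady decline compounds quickly: even the 30% annual hardware figure leaves only a small fraction of the original cost after a few years.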

Open-weight models are also closing the gap with closed models, reducing the performance [gap] from 8% to just 1.7% on some benchmarks in a single year. Together, these trends are rapidly lowering the barriers to advanced AI. 

However, even with inference and hardware costs going down, training costs remain out of reach for academia and most small players. Nearly 90% of notable AI models in 2024 came from industry, which is up from 60% in 2023. And while academia remains a top source of highly cited research, it now struggles to keep pace at the frontier. 

Model scale continues to grow rapidly. Training compute doubles every five months, datasets every eight, and power use annually. Yet performance gaps are shrinking. The score difference between the top and 10th ranked models fell from 11.9% to 5.4% in a year, and the top two models are now separated by just 0.7%. The frontier is increasingly competitive and increasingly crowded. 
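
To put the doubling times on a common footing, the following Python sketch converts each one into an approximate yearly growth factor. It assumes smooth exponential growth, which is a simplification, and the function and dictionary names are hypothetical.

```python
# Convert a doubling time in months into an implied annual growth factor,
# assuming smooth exponential growth (a simplification for illustration).

def annual_growth_factor(doubling_months: float) -> float:
    """Growth multiple over 12 months for a quantity doubling every `doubling_months`."""
    return 2 ** (12 / doubling_months)


# Doubling times as stated in the talk, in months.
doubling_times = {
    "training compute": 5,
    "dataset size": 8,
    "power use": 12,
}

if __name__ == "__main__":
    for name, months in doubling_times.items():
        print(f"{name}: about {annual_growth_factor(months):.1f}x per year")
```

On this reading, training compute grows by a bit more than five times a year, datasets by a little under three times, and power use by two times.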

In recent years, AI model performance at the frontier has converged, with multiple providers now offering highly capable models. This marks a shift from late 2022, when ChatGPT’s launch, widely seen as AI’s breakthrough into the public consciousness, coincided with the landscape dominated by just two players: OpenAI and Google. 

One of the most important things to note is that the original transformer model, which is the T in GPT and the baseline level of architecture, cost Google $930 to train in 2017, and today we’re at $200 million to train Gemini Ultra. 

Last year’s AI Index was among the first publications to highlight the lack of standard benchmarks for AI safety and responsibility evaluations. The Index has also been analyzing global public opinion. If you are from a non-Western industrialized nation, you are more likely to view AI positively than not. China has an 83% positive view, Indonesia 80%, and Thailand 77%, whereas Canada is at 40%, the U.S. 39%, and the Netherlands 36%. 

I’ll close with the geopolitical situation. The U.S. still maintains a lead in AI, followed closely by China. However, this gap is tightening. My intention is not to exacerbate the idea of an AI arms race between China and the U.S., but instead to highlight the different approaches between the most advanced frontier AI model developers. 

Over the last several years, the U.S. has relied on a few proprietary model providers. Meanwhile, China has deeply invested in its talent base, and more importantly, an open-source environment. If this trend continues at its current rate, China would surpass the U.S. in terms of model performance by the time I appear next year. 




© 2025 Fortune Media IP Limited. All Rights Reserved. Use of this site constitutes acceptance of our Terms of Use and Privacy Policy | CA Notice at Collection and Privacy Notice | Do Not Sell/Share My Personal Information
FORTUNE is a trademark of Fortune Media IP Limited, registered in the U.S. and other countries. FORTUNE may receive compensation for some links to products and services on this website. Offers may be subject to change without notice.