It’s getting harder to tell which company is winning the AI race, Hugging Face co-founder says

By Beatrice Nolan
Tech Reporter
May 7, 2025, 5:49 AM ET
Hugging Face cofounder and chief science officer Thomas Wolf on stage at Brainstorm AI.
An early pioneer of large language models, Hugging Face is best known for its vast repository of open-source and “open-weight” AI models. Photo: Fortune
  • Hugging Face’s Thomas Wolf says that it’s getting harder to tell which AI model is the best as traditional AI benchmarks become saturated. Going forward, Wolf said the AI industry could rely on two new benchmarking approaches: agency-based and use-case-specific.

Thomas Wolf, co‑founder and chief scientist at Hugging Face, thinks we may need new ways to measure AI models.

Wolf told the audience at Brainstorm AI in London that as AI models get more advanced, it’s becoming increasingly difficult to tell which one is performing the best.

“It’s getting hard to tell what the best model is,” he said, pointing to the nominal differences between recent releases from OpenAI and Google. “They all seem to be, actually, very close.”

“The world of benchmarks has evolved a lot. We used to have this very academic benchmark that we mostly measured the knowledge of the model on—I think the most famous was MMLU (Massive Multitask Language Understanding), which was basically a set of graduate‑level or PhD‑level questions that the model had to answer,” he said. “These benchmarks are mostly all saturated right now.”

Over the past year, there has been a growing chorus of voices from academia, industry, and policy claiming that common AI benchmarks, such as MMLU, GLUE, and HellaSwag, have reached saturation, can be gamed, and no longer reflect real‑world utility.

In February, researchers at the European Commission’s Joint Research Centre published a paper called “Can We Trust AI Benchmarks? An Interdisciplinary Review of Current Issues in AI Evaluation” that found “systemic flaws in current benchmarking practices,” including misaligned incentives, construct-validity failures, gaming of results, and data contamination.

Wolf said the AI industry should rely on two main types of benchmarks going into 2025: one for assessing the agency of models, where LLMs are expected to carry out tasks, and another tailored to each specific use case.

Hugging Face is already working on the latter.

The company’s new program, “YourBench,” aims to help users determine which model to use for a specific task. Users feed a few documents into the program, which automatically generates a benchmark specific to that type of work. Users can then apply the benchmark to different models to see which one is best for their use case.
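To make the idea of a use-case-specific benchmark concrete, here is a minimal Python sketch of the general approach: a small set of question-and-answer pairs is applied to several models, and the models are ranked by score. It is not Hugging Face's program or its API. The benchmark items, the scoring rule, and the stand-in functions `model_a` and `model_b` are all hypothetical, and a real tool would generate the questions from the user's documents rather than hard-code them.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical benchmark items. A real tool would derive these
# from the documents a user supplies; they are hard-coded here.
BENCHMARK: List[Tuple[str, str]] = [
    ("What is the notice period in the contract?", "30 days"),
    ("Which party pays shipping costs?", "the buyer"),
    ("What currency are invoices issued in?", "euros"),
]

Model = Callable[[str], str]


def score_model(model: Model, benchmark: List[Tuple[str, str]]) -> float:
    """Return the fraction of questions whose reference answer
    appears (case-insensitively) in the model's reply."""
    hits = sum(
        1 for question, reference in benchmark
        if reference.lower() in model(question).lower()
    )
    return hits / len(benchmark)


def rank_models(
    models: Dict[str, Model], benchmark: List[Tuple[str, str]]
) -> List[Tuple[str, float]]:
    """Score every model and return (name, score) pairs, best first."""
    scores = {name: score_model(fn, benchmark) for name, fn in models.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Stand-in "models": plain functions used only for illustration.
def model_a(question: str) -> str:
    if "notice" in question:
        return "The notice period is 30 days."
    if "shipping" in question:
        return "The buyer pays shipping."
    if "currency" in question:
        return "Invoices are issued in euros."
    return "I don't know."


def model_b(question: str) -> str:
    if "shipping" in question:
        return "The buyer covers it."
    return "Not specified."


if __name__ == "__main__":
    ranking = rank_models({"model_a": model_a, "model_b": model_b}, BENCHMARK)
    for name, score in ranking:
        print(f"{name}: {score:.2f}")
```

The substring check is a deliberately crude scoring rule chosen to keep the sketch short. Two models that look identical on a broad academic benchmark could still separate clearly on a narrow set of questions like this one.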

“Just because these models are all working the same on this academic benchmark doesn’t really mean that they’re all exactly the same,” Wolf said.

Open‑source’s ‘ChatGPT moment’

Founded by Wolf, Clément Delangue, and Julien Chaumond in 2016, Hugging Face has long been a champion of open‑source AI.

Often referred to as the GitHub of machine learning, the company provides an open‑source platform that enables developers, researchers, and enterprises to build, share, and deploy machine‑learning models, datasets, and applications at scale. Users can also browse models and datasets that others have uploaded.

Wolf told the Brainstorm AI audience that Hugging Face’s “business model is really aligned with open source” and the company’s “goal is to have the maximum number of people participating in this kind of open community and sharing models.”

Wolf predicted that open-source AI would continue to thrive, especially after the success of DeepSeek earlier this year.

After its launch in January, the Chinese-made AI model DeepSeek R1 sent shockwaves through the AI world when testers found that it matched or even outperformed American closed-source AI models.

Wolf said DeepSeek was a “ChatGPT moment” for open‑source AI.

“Just like ChatGPT was the moment the whole world discovered AI, DeepSeek was the moment the whole world discovered there was kind of this open society,” he said.

About the Author

Beatrice Nolan is a tech reporter on Fortune’s AI team, covering artificial intelligence and emerging technologies and their impact on work, industry, and culture. She's based in Fortune's London office and holds a bachelor’s degree in English from the University of York. You can reach her securely via Signal at beatricenolan.08

