
OpenAI’s deep research can complete 26% of Humanity’s Last Exam—a benchmark for the frontier of human knowledge

By Greg McKenna, News Fellow
February 12, 2025, 1:58 AM ET
Sam Altman, CEO of OpenAI, whose AI agent has set a new standard of performance on Humanity’s Last Exam. Photo: Nathan Laine/Bloomberg/Getty Images

Artificial intelligence may be more than a quarter of the way to surpassing the boundaries of human knowledge. OpenAI’s new autonomous agent, deep research, has stormed past competing models and set a new standard on Humanity’s Last Exam, a global benchmark created to determine when AI can answer questions on any topic better than a world-class expert in the field.

Deep research successfully completed 26.6% of the recently developed test, which consists of over 3,000 questions across hundreds of subjects ranging from rocket science to analytic philosophy. Powered by OpenAI’s frontier o3 model, the AI agent can synthesize a wide range of information and complete multistep research within five to 30 minutes, its creators say.

OpenAI’s o1 and DeepSeek’s R1 models, which previously sat atop the leaderboard, could only get through roughly 9% of the exam, meaning OpenAI’s new agent represents a nearly threefold jump in performance. The company said the largest gains appeared on inquiries related to chemistry, humanities and social sciences, and mathematics.

Frank Downing, a director of research at Cathie Wood’s ARK Invest, noted that OpenAI’s new agent also set a new state-of-the-art score on GAIA, a test for AI assistants that poses real-world questions that are conceptually simple for humans, but challenging for most digital agents. The new offering provides deeper research and analysis, he added, compared with a competing product launched by Google in December.

But all those accomplishments could look minuscule, Downing said, if subsequent models from OpenAI and competitors make progress on solving Humanity’s Last Exam at a pace similar to how weaker AI models conquered previous academic benchmarks.

“Humanity’s Last Exam could be saturated within the next 12 months,” he wrote in a note Monday, “effectively surpassing expert-level technical knowledge and reasoning capability.”

What is Humanity’s Last Exam?

The test is the result of an effort led by Dan Hendrycks, the director of the Center for AI Safety and an advisor for companies such as Scale AI and Elon Musk’s xAI. He had previously created another exam called Massive Multitask Language Understanding, or MMLU, which cutting-edge versions of Anthropic’s Claude, Meta’s Llama, and OpenAI’s ChatGPT have been able to mostly crack as of late last year.

Hendrycks said he was inspired to create Humanity’s Last Exam after a conversation with Musk about existing AI tests being too easy.

“Elon looked at the MMLU questions and said, ‘These are undergrad level. I want things that a world-class expert could do,’” Hendrycks told the New York Times in January.

So Hendrycks, with support from Scale AI, spearheaded a project designed to serve as “the final closed-ended academic benchmark of its kind with broad subject coverage.” His team compiled questions submitted by hundreds of college professors, prize-winning mathematicians, and other experts in their fields.

“[The exam] emphasizes world-class mathematics problems aimed at testing deep reasoning skills broadly applicable across multiple academic areas,” the team wrote in a paper debuting the test in January.

Once models start scoring over 50%, Hendrycks said, it’s safe to say humans have met their match in this regard. After that, the clock is presumably ticking until the world witnesses what is termed artificial general intelligence, or the ability of a machine to possess all the cognitive abilities of humans. OpenAI says it envisions this technology, commonly dubbed AGI, as being capable of producing novel scientific research.

“We are now confident we know how to build AGI as we have traditionally understood it,” OpenAI CEO Sam Altman said in a blog post in January.

On Sunday, Google DeepMind CEO Demis Hassabis said AGI could arrive in just five years.

“And I think society needs to get ready for that and what implications that will have,” he said in Paris ahead of the AI Action Summit hosted by the city, CNBC reported.

On that front, time seems to be of the essence.

About the Author

Greg McKenna is a news fellow at Fortune.




© 2026 Fortune Media IP Limited. All Rights Reserved.