Tech · AI

OpenAI’s deep research can complete 26% of Humanity’s Last Exam—a benchmark for the frontier of human knowledge

By Greg McKenna, News Fellow
February 12, 2025, 1:58 AM ET
Sam Altman, CEO of OpenAI, whose AI agent has set a new standard of performance on Humanity’s Last Exam. (Nathan Laine, Bloomberg/Getty Images)

Artificial intelligence may be more than a quarter of the way to surpassing the boundaries of human knowledge. OpenAI’s new autonomous agent, deep research, has stormed past competing models and set a new standard on Humanity’s Last Exam, a global benchmark created to determine when AI can answer questions on any topic better than a world-class expert in the field.

Deep research scored 26.6% on the recently developed test, which consists of more than 3,000 questions across hundreds of subjects ranging from rocket science to analytic philosophy. Powered by OpenAI’s frontier o3 model, the AI agent can synthesize a wide range of information and complete multistep research within five to 30 minutes, its creators say.

OpenAI’s o1 and DeepSeek’s R1 models, which previously sat atop the leaderboard, could only get through roughly 9% of the exam, meaning OpenAI’s new agent represents a nearly threefold jump in performance. The company said the largest gains appeared on inquiries related to chemistry, humanities and social sciences, and mathematics.
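
For readers who want to check the "nearly threefold" figure, here is a minimal sketch of the arithmetic. It assumes the previous best score was exactly 9.0%, which is a rounding of the "roughly 9%" cited above, and the function name `fold_improvement` is purely illustrative.

```python
def fold_improvement(new_score: float, old_score: float) -> float:
    """Return how many times larger new_score is than old_score."""
    if old_score <= 0:
        raise ValueError("old_score must be positive")
    return new_score / old_score


deep_research = 26.6  # percent, as reported in the article
previous_best = 9.0   # percent; assumption: "roughly 9%" rounded to 9.0

ratio = fold_improvement(deep_research, previous_best)
point_gain = deep_research - previous_best

print(f"Improvement: {ratio:.2f}x")                      # Improvement: 2.96x
print(f"Gain: {point_gain:.1f} percentage points")       # Gain: 17.6 percentage points
```

Under that assumption the ratio comes out just under 3, which is consistent with the "nearly threefold" description. A slightly higher baseline would give a somewhat smaller ratio.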

Frank Downing, a director of research at Cathie Wood’s ARK Invest, noted that OpenAI’s new agent also set a new state-of-the-art score on GAIA, a test for AI assistants that poses real-world questions that are conceptually simple for humans, but challenging for most digital agents. The new offering provides deeper research and analysis, he added, compared with a competing product launched by Google in December.

But all those accomplishments could look minuscule, Downing said, if subsequent models from OpenAI and competitors make progress on solving Humanity’s Last Exam at a pace similar to how weaker AI models conquered previous academic benchmarks.

“Humanity’s Last Exam could be saturated within the next 12 months,” he wrote in a note Monday, “effectively surpassing expert-level technical knowledge and reasoning capability.”

What is Humanity’s Last Exam?

The test is the result of an effort led by Dan Hendrycks, the director of the Center for AI Safety and an advisor for companies such as Scale AI and Elon Musk’s xAI. He had previously created another exam called Massive Multitask Language Understanding, or MMLU, which cutting-edge versions of Anthropic’s Claude, Meta’s Llama, and OpenAI’s ChatGPT have been able to mostly crack as of late last year.

Hendrycks said he was inspired to create Humanity’s Last Exam after a conversation with Musk about existing AI tests being too easy.

“Elon looked at the MMLU questions and said, ‘These are undergrad level. I want things that a world-class expert could do,’” Hendrycks told the New York Times in January.

So Hendrycks, with support from Scale AI, spearheaded a project designed to serve as “the final closed-ended academic benchmark of its kind with broad subject coverage.” His team compiled questions submitted by hundreds of college professors, prize-winning mathematicians, and other experts in their fields.

“[The exam] emphasizes world-class mathematics problems aimed at testing deep reasoning skills broadly applicable across multiple academic areas,” the team wrote in a paper debuting the test in January.

Once models start scoring over 50%, Hendrycks said, it’s safe to say humans have met their match in this regard. After that, the clock is presumably ticking until the world witnesses what is termed artificial general intelligence, or a machine that possesses all the cognitive abilities of humans. OpenAI says it envisions this technology, commonly dubbed AGI, as being capable of producing novel scientific research.

“We are now confident we know how to build AGI as we have traditionally understood it,” OpenAI CEO Sam Altman said in a blog post in January.

On Sunday, Google DeepMind CEO Demis Hassabis said AGI could arrive in just five years.

“And I think society needs to get ready for that and what implications that will have,” he said in Paris ahead of the AI Action Summit hosted by the city, CNBC reported.

On that front, time seems to be of the essence.

About the Author

Greg McKenna is a news fellow at Fortune.

© 2026 Fortune Media IP Limited. All Rights Reserved.