Google researchers claim new breakthrough in getting AI to solve tough high school math problems

By Jeremy Kahn, Editor, AI
July 25, 2024, 11:30 AM ET
Google researchers have just announced a breakthrough in creating an AI system that can solve tough math problems. Photo illustration by Getty Images

Google DeepMind says it has achieved a breakthrough in building an AI system that can handle complex mathematical problems.

The research division, which is part of Alphabet-owned Google, announced Thursday that it has created a software system that combines multiple AI models and can score well enough on the International Mathematical Olympiad (IMO), a global test of high school students’ mathematical talents, to place in the top quartile of contestants. That would be good enough to earn a silver medal in the competition.

A boastworthy milestone in the machine-versus-mathlete contest, the feat also opens new possibilities for combining different approaches to AI in order to create more capable hybrid AI systems—something that Google said could eventually make its way into commercial products like its line-up of Gemini AI tools.

The new system builds on AlphaGeometry, which the AI research lab unveiled in January and which could solve geometry problems from the IMO about as well as top high school students. It combines a new model called AlphaProof with an updated and improved AlphaGeometry 2, and it can tackle many different kinds of mathematical problems and develop sophisticated answers.

The new system answered the most difficult IMO question, which only five of the 609 human contestants in last week’s competition managed to solve. That said, the new system is not perfect: on two of the six problems in the IMO, it did not find a solution, and on one problem it took three days to reach the correct answer. Human competitors get four-and-a-half hours for each set of three questions, so they can average no more than 90 minutes per question.
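The time comparison can be made concrete with a little arithmetic. The sketch below uses only figures from the article (three questions in four-and-a-half hours for humans, three days for the system's slowest solved problem). Treating "three days" as exactly 72 hours is an assumption made for illustration, and the variable names are hypothetical.

```python
# Figures taken from the article; "three days" is assumed to mean exactly 72 hours.
SESSION_HOURS = 4.5
QUESTIONS_PER_SESSION = 3
SLOWEST_AI_SOLVE_DAYS = 3

# Average time a human contestant can spend on each question.
minutes_per_question = SESSION_HOURS * 60 / QUESTIONS_PER_SESSION

# Time the AI system needed for its slowest solved problem, in minutes.
slowest_ai_minutes = SLOWEST_AI_SOLVE_DAYS * 24 * 60

# How many human "question budgets" fit into the slowest AI solve.
ratio = slowest_ai_minutes / minutes_per_question

print(f"Human average budget: {minutes_per_question:.0f} minutes per question")
print(f"Slowest AI solve: {slowest_ai_minutes} minutes, about {ratio:.0f}x the human budget")
```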

Google DeepMind researchers said the new system is a step towards more powerful AI models that will be able to plan and reason about complex tasks, although they cautioned that the method would work best in situations where there was a clear way to determine whether an output was valid. This is the case, for example, in software coding, where the code will only compile and run if it is valid. David Silver, one of the Google DeepMind researchers who worked on the new system, said it might also work in areas where humans could provide unambiguous feedback about whether the solution the AI produced was a good one.
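The coding example hints at what an automatic validity check can look like. The following is a minimal sketch, not anything Google DeepMind described: it uses Python's built-in `compile` function as a stand-in verifier, and the helper name `is_syntactically_valid` and the sample snippets are assumptions made for illustration. Compiling only confirms that the syntax is valid, which is a far weaker guarantee than a checked mathematical proof.

```python
def is_syntactically_valid(source: str) -> bool:
    """Return True if Python can compile the source text, False otherwise.

    Hypothetical helper: this only checks syntax, not whether the code
    does what it is supposed to do.
    """
    try:
        compile(source, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False


# Two made-up candidate outputs: one valid, one with a syntax error.
candidates = ["total = 1 + 2", "total = = 1 + 2"]
results = {source: is_syntactically_valid(source) for source in candidates}

for source, ok in results.items():
    print(f"{source!r}: {'accepted' if ok else 'rejected'}")
```

The point of the sketch is that the check gives an unambiguous yes or no for every candidate, with no human judgment involved.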

Google DeepMind said it would incorporate insights from the new system into future versions of its Gemini AI models, although it did not say exactly how this would be done or how soon Gemini might see these upgraded mathematical abilities.

Silver acknowledged that in many real-world situations the validity of an answer is highly subjective, or the soundness of a solution can only be determined after a long time. He said this would make it harder to apply the methods Google DeepMind used for the IMO problems to these kinds of real-world problems.

AlphaZero to the rescue

Unlike other well-known AI models that consist of a single large neural network—a kind of AI software loosely based on the human brain—the AlphaProof system involves multiple neural networks, each performing different functions.

A large language model (LLM), in this case Google’s Gemini model, is used as one part of the process. But the LLM does not itself do the mathematical reasoning. LLMs, which underpin popular AI chatbots such as Gemini, OpenAI’s ChatGPT, Anthropic’s Claude, and Meta’s AI chatbot, have struggled to solve math problems unless given access to outside tools, such as calculators or specialized math software.

Instead, the LLM is fine-tuned to translate text-based mathematical problems into a formal mathematical language. It then passes the problem to a different AI model, Google DeepMind’s AlphaZero, which was developed in 2017 and originally used to learn to play the strategy board games chess, Go, and shogi at superhuman levels. But it turns out that AlphaZero can be used to puzzle out all kinds of problems in systems with clear rules and an easy way of keeping score.
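The translation step can be pictured with a deliberately tiny example. The sketch below states an informal sentence as a theorem in Lean 4, one formal language of this kind. It is not taken from the IMO or from Google DeepMind's system; the statement and the theorem name `add_zero_example` are assumptions chosen for illustration.

```lean
-- Informal statement: "Adding zero to any natural number leaves it unchanged."
-- Formal version in Lean 4. The name `add_zero_example` is made up for this sketch.
theorem add_zero_example (n : Nat) : n + 0 = n := by
  rfl  -- holds by unfolding the definition of addition on natural numbers
```

Once a problem is written this way, a proof checker can accept or reject each attempted proof mechanically, which is what makes the formal version useful to a system that learns by trial and error.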

In this case, the AlphaZero component is trained to suggest proof steps for the problem in Lean, a programming language and proof assistant for formal mathematics. If the proof step is valid, it will compile correctly in Lean. If it isn’t, it won’t. This gives AlphaZero a reward signal, much like points in a video game. In this way, the AlphaZero component of AlphaProof learns by trial and error to take steps that are more likely to result in valid solutions. AlphaProof was trained on about one million examples of IMO problems in the weeks leading up to the competition, Google DeepMind said, and it continued to improve while working on the IMO contest problems.
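The idea of learning from a pass-or-fail reward can be shown with a toy example. The sketch below is not AlphaZero or AlphaProof, which rely on neural networks and search. It swaps in a simple value table with occasional random exploration, and it replaces the Lean checker with a hypothetical `checker` function that accepts an arithmetic claim only if it is true. All names and candidate steps are assumptions made for illustration.

```python
import random

# Each candidate "step" is a claim (a, b, claimed_sum) meaning a + b = claimed_sum.
CANDIDATES = [(2, 3, 6), (2, 3, 5), (2, 3, 4)]


def checker(step):
    """Hypothetical stand-in for a proof checker: accept only true claims."""
    a, b, claimed_sum = step
    return a + b == claimed_sum


def train(candidates, episodes=50, epsilon=0.2, seed=0):
    """Learn by trial and error which step the checker accepts."""
    rng = random.Random(seed)
    value = {step: 0.0 for step in candidates}  # estimated reward per step
    count = {step: 0 for step in candidates}    # how often each step was tried
    for episode in range(episodes):
        if episode < len(candidates):
            step = candidates[episode]              # try every step once first
        elif rng.random() < epsilon:
            step = rng.choice(candidates)           # explore at random
        else:
            step = max(candidates, key=value.get)   # exploit the best estimate
        reward = 1.0 if checker(step) else 0.0      # pass/fail reward signal
        count[step] += 1
        value[step] += (reward - value[step]) / count[step]  # running average
    return value


learned = train(CANDIDATES)
best_step = max(learned, key=learned.get)
print("Learned values:", learned)
print("Preferred step:", best_step)
```

Because the checker's verdict is the only feedback, the learner needs no labeled answers: it simply drifts toward the steps that keep passing.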

In cases where the problem involved geometry, the problem was given instead to AlphaGeometry 2. AlphaGeometry 2 is also a hybrid system, combining an LLM component with a component that uses symbolic reasoning. The new AlphaGeometry could solve 83% of IMO geometry problems compared to just 53% for its predecessor. In one case, AlphaGeometry was able to solve a highly complex geometry problem in just 19 seconds, a feat more akin to a flash of inspiration than a brute force approach based on endless trial and error. In another case, the proof AlphaGeometry offered initially confused some mathematicians who examined it, but they determined it was actually an elegant and highly unusual way of solving the problem.

Impact on human mathematicians

Pushmeet Kohli, who heads Google DeepMind’s AI for science division, said he saw AlphaProof and AlphaGeometry 2 primarily as tools for helping mathematicians in their work. Silver said he did not see these new mathematical AIs challenging the relevance of academic mathematicians.

But Timothy Gowers, who is a director of research in mathematics at the University of Cambridge and a past winner of the Fields Medal—a prize that is awarded only once every four years to two to four mathematicians under the age of 40 who have contributed the most to the field—reviewed the proofs AlphaProof and AlphaGeometry 2 produced and said he came away impressed. “I could recognize familiar-looking arguments that had come out of the system,” he said.

He also said that some of the problems required him, as a human mathematician, “to dig quite deep” and come up with what he called “a sort of magic key” that suddenly turns a problem that looks unsolvable into one that is eminently solvable. Gowers said he was surprised that the system had discovered a few of these magic keys because his intuition is that they would be difficult to stumble upon by naïve trial and error without any underlying understanding of the mathematical principles involved. But he said he reserved judgment as to whether this meant AlphaProof had actually developed something akin to mathematical intuition. He said more research would be needed to understand exactly how the system managed to puzzle out answers to the IMO problems.

Gowers noted that IMO problems were much simpler than what research mathematicians work on. But, compared to Kohli and Silver, Gowers was far less sanguine about what the future would hold if AI models kept improving at the current clip. “I actually think that when computers become really good at finding extremely hard proofs, that’s more or less game over for mathematical research,” he said. “I’m not trying to suggest that we’re all that close to that at the moment, but I just, I’m thinking a long way ahead, but how long ahead that really is, is very hard to say.”

Correction, July 25: An earlier version of this story misidentified the name of Google DeepMind’s new AI system for solving complex mathematical proofs. It is called AlphaProof not AlphaSolver.

