
Can AI be used to control safety-critical systems? A U.K.-funded research program aims to find out

By Jeremy Kahn, Editor, AI
June 10, 2025, 1:24 PM ET
Image: cooling towers of a nuclear power plant and electric transmission lines.
The U.K.'s Advanced Research and Invention Agency (ARIA) is funding a project to use frontier AI models to design and test new control algorithms for safety-critical systems, such as nuclear power plants and power grids. Milan Jaros—Bloomberg via Getty Images

Hello and welcome to Eye on AI. In this edition…Meta hires Scale AI founder for new ‘superintelligence’ drive…OpenAI on track for $10 billion in annual recurring revenue…Study says ‘reasoning models’ can’t really reason.

Today’s most advanced AI models are relatively useful for lots of things—writing software code, conducting research, summarizing complex documents, drafting business correspondence, editing, generating images and music, role-playing human interactions; the list goes on. But relatively is the key word here. As anyone who uses these models soon discovers, they remain frustratingly error-prone and erratic. So how could anyone think these systems could be used to run critical infrastructure, such as electrical grids, air traffic control, communications networks, or transportation systems?

Yet that is exactly what a project funded by the U.K.’s Advanced Research and Invention Agency (ARIA) hopes to do. ARIA was designed to be somewhat similar to the U.S. Defense Advanced Research Projects Agency (DARPA), providing government funding for moonshot research with potential governmental or strategic applications. The £59 million ($80 million) ARIA project, called the Safeguarded AI Program, aims to find a way to combine AI “world-models” with mathematical proofs that can guarantee that a system’s outputs are valid.

David Dalrymple, the machine learning researcher who is leading the ARIA effort, told me that the idea was to use advanced AI models to create a “production facility” that would churn out domain-specific control algorithms for critical infrastructure. These algorithms would be mathematically tested to ensure that they meet the required performance specifications. If the control algorithms pass this test, the controllers—but not the frontier AI models that developed them—would be deployed to help run critical infrastructure more efficiently.
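
To make that workflow concrete, here is a minimal, self-contained Python sketch of the pattern Dalrymple describes: a generator proposes candidate controllers, a deterministic checker tests each one against a safety specification, and only candidates that pass are deployed. The controller functions and the toy bounds-checking spec are my illustrative assumptions, not ARIA’s design; the real program would substitute mathematical proof for the finite testing used here.

```python
# Toy version of the "production facility" loop: propose controllers,
# check each against the spec, deploy only what passes. All names and
# the bounds-based spec below are hypothetical illustrations.

def spec_holds(controller, test_inputs, lo=-100.0, hi=100.0):
    """Deterministic stand-in for verification: output stays within safe bounds."""
    return all(lo <= controller(x) <= hi for x in test_inputs)

def candidate_a(error):
    return 2.0 * error                           # unbounded output: unsafe

def candidate_b(error):
    return max(-100.0, min(100.0, 2.0 * error))  # clamped output: safe

test_inputs = [i * 10.0 for i in range(-20, 21)]
deployable = [c.__name__ for c in (candidate_a, candidate_b)
              if spec_holds(c, test_inputs)]
print(deployable)  # ['candidate_b'] -- only the checked controller ships
```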

Dalrymple (who is known by his social media handle Davidad) gives the example of the U.K.’s electricity grid. The grid’s operator acknowledges that if it could balance supply and demand on the grid more optimally, it could save the £3 billion ($4 billion) it spends each year essentially paying to keep excess generation capacity up and running to avoid the possibility of a sudden blackout, he says. Better control algorithms could reduce those costs.

Besides the energy sector, ARIA is also looking at applications in supply chain logistics, biopharmaceutical manufacturing, self-driving vehicles, clinical trial design, and electric vehicle battery management.

AI to develop new control algorithms

Frontier AI models may now be reaching the point where they can automate algorithmic research and development, Davidad says. “The idea is, let’s take that capability and turn it to narrow AI R&D,” he tells me. Narrow AI usually refers to AI systems designed to perform one particular, narrowly defined task at superhuman levels, rather than systems that can perform many different kinds of tasks.

The challenge, even with these narrow AI systems, is then coming up with mathematical proofs to guarantee that their outputs will always meet the required technical specification. There’s an entire field known as “formal verification” that involves mathematically proving that software will always provide valid outputs under given conditions—but it’s notoriously difficult to apply to neural network-based AI systems. “Verifying even a narrow AI system is something that’s very labor intensive in terms of a cognitive effort required,” Davidad says. “And so it hasn’t been worthwhile historically to do that work of verifying except for really, really specialized applications like passenger aviation autopilots or nuclear power plant control.”
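
For a flavor of what formal verification looks like in practice, here is a small sketch using the off-the-shelf Z3 SMT solver (via the real z3-solver Python package). Unlike ordinary testing, the solver reasons over all possible real-valued inputs at once: the toy clamped controller is my own invention, but the proof obligation (show that no input whatsoever can push the output outside its safety bound) is the genuine shape of the problem.

```python
# pip install z3-solver
from z3 import Real, If, Or, Solver, unsat

demand = Real("demand")
supply = Real("supply")
error = demand - supply

# Hypothetical controller: clamp the correction signal to +/-100 units.
output = If(error > 100, 100, If(error < -100, -100, error))

solver = Solver()
# Ask the solver for ANY pair of inputs that violates the safety bound.
solver.add(Or(output > 100, output < -100))

if solver.check() == unsat:
    print("Proved: the output is bounded for every possible input.")
else:
    print("Counterexample found:", solver.model())
```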

This kind of formally verified software won’t fail because a bug causes an erroneous output. It can sometimes break down because it encounters conditions that fall outside its design specifications—for instance, a load-balancing algorithm for an electrical grid might not be able to handle an extreme solar storm that shorts out all of the grid’s transformers simultaneously. But even then, the software is usually designed to “fail safe” and revert to manual control.
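
A sketch of that fail-safe pattern, under my own assumed design envelope (the numbers are arbitrary): the verified controller only acts inside the conditions it was proved against, and anything outside them hands control back to human operators.

```python
# Illustrative fail-safe wrapper; the envelope and gain are made-up values.
DESIGN_ENVELOPE = (-500.0, 500.0)   # input range the controller was verified for

def safe_controller(grid_imbalance):
    lo, hi = DESIGN_ENVELOPE
    if not lo <= grid_imbalance <= hi:
        return "REVERT_TO_MANUAL"   # outside verified conditions: fail safe
    return max(-100.0, min(100.0, 0.5 * grid_imbalance))

print(safe_controller(200.0))    # normal operation: automatic response
print(safe_controller(9999.0))   # extreme event: hand back control
```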

ARIA is hoping to show that frontier AI models can be used both to do the laborious formal verification of the narrow AI controller and to develop the controller in the first place.

But will AI models cheat the verification tests?

This raises another challenge, though. There’s a growing body of evidence that frontier AI models are very good at “reward hacking”—essentially finding ways to cheat to accomplish a goal—as well as at lying to their users about what they’ve actually done. The AI safety nonprofit METR (short for Model Evaluation & Threat Research) recently published a blog post detailing the ways OpenAI’s o3 model tried to cheat on various tasks.

ARIA says it is hoping to find a way around this issue too. “The frontier model needs to submit a proof certificate, which is something that is written in a formal language that we’re defining in another part of the program,” Davidad says. This “new language for proofs will hopefully be easy for frontier models to generate and then also easy for a deterministic, human audited algorithm to check.” ARIA has already awarded grants for work on this formal verification process.
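
The logic behind proof certificates is an asymmetry: producing a solution may take an enormous, untrusted search, but checking a submitted certificate can be cheap, deterministic, and auditable. Here is a minimal Python illustration of that asymmetry using subset-sum as a stand-in problem; the problem choice is mine, not the program’s.

```python
from itertools import combinations

def find_certificate(numbers, target):
    """Expensive search -- the untrusted 'frontier model' role."""
    for r in range(1, len(numbers) + 1):
        for subset in combinations(numbers, r):
            if sum(subset) == target:
                return subset
    return None

def check_certificate(numbers, target, subset):
    """Cheap, deterministic check -- needs no trust in the searcher."""
    pool = list(numbers)
    for x in subset:
        if x not in pool:
            return False
        pool.remove(x)
    return sum(subset) == target

nums, target = [3, 9, 8, 4, 5, 7], 15
cert = find_certificate(nums, target)
print(cert, check_certificate(nums, target, cert))  # (8, 7) True
```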

Models for how this might work are starting to come into view. Google DeepMind recently developed an AI model called AlphaEvolve that is trained to search for new algorithms for applications such as managing data centers, designing new computer chips, and even figuring out ways to optimize the training of frontier AI models. Google DeepMind has also developed a system called AlphaProof that is trained to develop mathematical proofs and write them in a programming language called Lean, in which an incorrect proof simply won’t compile.
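
For a taste of what “won’t compile if the proof is incorrect” means, here is a one-line Lean 4 theorem; if the justification were wrong, Lean’s checker would reject the file. This is a minimal illustration of the language itself, not anything produced by AlphaProof.

```lean
-- Commutativity of addition on the natural numbers, proved by appeal
-- to a lemma in Lean's standard library. A bogus proof term here would
-- fail to type-check, so the file simply would not compile.
theorem add_comm' (a b : Nat) : a + b = b + a := Nat.add_comm a b
```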

ARIA is currently accepting applications from teams that want to run the core “AI production facility,” with the winner of the £18 million grant to be announced on October 1. The facility, whose location is yet to be determined, is supposed to be running by January 2026. ARIA is asking applicants to propose a new legal entity and governance structure for the facility. Davidad says ARIA does not want an existing university or a private company to run it. But the new organization, which might be a nonprofit, would partner with private entities in areas like energy, pharmaceuticals, and healthcare on specific controller algorithms. He says that in addition to the initial ARIA grant, the production facility could fund itself by charging industry for its work developing domain-specific algorithms.

It’s not clear if this plan will work. For every transformational DARPA project, many more fail. But ARIA’s bold bet here looks like one worth watching.

With that, here’s more AI news.

Jeremy Kahn
jeremy.kahn@fortune.com
@jeremyakahn

Want to know more about how to use AI to transform your business? Interested in what AI will mean for the fate of companies, and countries? Why not join me in Singapore on July 22 and 23 for Fortune Brainstorm AI Singapore. We will dive deep into the latest on AI agents, examine the data center build-out in Asia, and talk to top leaders from government, boardrooms, and academia in the region and beyond. You can apply to attend here.

AI IN THE NEWS

Meta hires Scale AI CEO Alexandr Wang to create new AI “superintelligence” lab. That’s according to the New York Times, which cited four unnamed sources it said were familiar with Meta’s plans. The 28-year-old Wang, who cofounded Scale, would head the new Meta unit, joined by other Scale employees. Meanwhile, Meta would invest billions of dollars into Scale, which specializes in providing training data to AI companies. The new Meta unit devoted to “artificial superintelligence,” a theoretical kind of AI that would be more intelligent than all of humanity combined, will sit alongside the existing Meta divisions responsible for building its Llama AI models, as well as its Fundamental AI Research lab (FAIR). FAIR is still headed by Meta chief scientist Yann LeCun, who has been pursuing new kinds of AI models and has said that current techniques cannot deliver artificial general intelligence (AI as capable as most humans at most tasks), let alone superintelligence.

U.K. announces “sovereign AI” push. British Prime Minister Keir Starmer said the country would invest £1 billion to build new AI data centers, increasing the computing power available in the country 20-fold. He said the U.K. government would begin using an AI assistant called “Extract” based on Google’s Gemini AI model. He announced plans to create a new “UK Sovereign AI Industry Forum” to accelerate AI adoption by British companies, with initial participation from BAE Systems, BT, and Standard Chartered. He also said that the U.K. government would help fund a new open-source data project on how molecules bind to proteins, a key consideration for drug discovery research. But Nvidia CEO Jensen Huang, who appeared alongside Starmer at a conference, noted that the country has so far lagged in building enough AI data centers. You can read more from The Guardian here and the Financial Times here.

Apple to let third-party developers access its AI models. At its WWDC developer conference, the tech giant said it would allow third-party developers to build applications that tap the abilities of its on-device AI models. But the company did not announce any updates to its long-awaited “Apple Intelligence” version of Siri. You can read more from TechCrunch here and here.

OpenAI on track for $10 billion in annual recurring revenue. The figure has doubled in the past year, driven by strong growth in the company’s consumer, business, and API products; it excludes Microsoft licensing and large one-time deals. Despite losing $5 billion last year, the company is targeting $125 billion in revenue by 2029, CNBC reported, citing an anonymous source it said was familiar with OpenAI’s figures.

EYE ON AI RESEARCH

“Reasoning” models don’t seem to actually reason. That is the conclusion of a bombshell paper called “The Illusion of Thinking” from researchers at Apple. They tested reasoning models from OpenAI (o1 and o3), DeepSeek (R1), and Anthropic (Claude 3.7 Sonnet) on a series of logic puzzles. These included the Tower of Hanoi, a game that involves moving a stack of different-sized disks across three pegs in such a way that a larger disk never sits atop a smaller one.

They found that on simple versions of the games, standard large language models (LLMs) that don’t use reasoning performed better and were far more cost-effective. The reasoning models (which the paper calls large reasoning models, or LRMs) tended to overthink the problem and hit upon spurious strategies. At medium complexity, the reasoning models did better. But at high complexity, the LRMs failed entirely. Rather than thinking longer to solve the problem, as they are supposedly designed to do, the reasoning models often thought for less time than on the medium-complexity problems and then simply abandoned the search for a correct solution. The most damning finding of the paper was that even when researchers provided the LRMs with an algorithm for solving the puzzle, the LRMs failed to apply it.
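
For reference, the algorithm in question is the classic recursion taught in introductory programming courses; the Python sketch below (my own, not from the paper) solves the puzzle in the provably minimal 2**n − 1 moves.

```python
def hanoi(n, source, target, spare, moves):
    """Move n disks from source to target, never placing a larger disk on a smaller one."""
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)   # park n-1 disks on the spare peg
    moves.append((source, target))               # move the largest disk
    hanoi(n - 1, spare, target, source, moves)   # restack the n-1 disks on top

moves = []
hanoi(3, "A", "C", "B", moves)
print(len(moves), moves)  # 7 moves (2**3 - 1), each one legal
```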

The paper adds to a growing body of research—such as this Anthropic study—that indicates that LRMs are not actually using logic to arrive at their answers. Instead, they seem to be conducting longer, deeper searches for examples in their training data that match the problem at hand. But they don’t seem able to generalize logical rules for solving the puzzles. 

FORTUNE ON AI

Chipotle CEO says they will open a new restaurant almost every 24 hours this year, thanks to AI —by Preston Fore

Duolingo’s CEO outlined his plan to become an ‘AI-first’ company. He didn’t expect the human backlash that followed —by Sara Braun

Commentary: ‘Sovereign AI’ is political branding. The reality is closer to digital colonialism —by Nathan Benaich

AI CALENDAR

June 12: AMD Advancing AI, San Jose

July 8-11: AI for Good Global Summit, Geneva

July 13-19: International Conference on Machine Learning (ICML), Vancouver

July 22-23: Fortune Brainstorm AI Singapore. Apply to attend here.

July 26-28: World Artificial Intelligence Conference (WAIC), Shanghai. 

Sept. 8-10: Fortune Brainstorm Tech, Park City, Utah. Apply to attend here.

Oct. 6-10: World AI Week, Amsterdam

Dec. 2-7: NeurIPS, San Diego

Dec. 8-9: Fortune Brainstorm AI San Francisco. Apply to attend here.

BRAIN FOOD

Should college students be made to use AI? Ohio State University has announced that starting this fall, every undergraduate student will be asked to use AI in all of their coursework. In my book, Mastering AI: A Survival Guide to Our Superpowered Future, I argue that education is one area where AI will ultimately have a profoundly positive effect, despite the initial moral panic over the debut of ChatGPT. The university has said it is offering assistance to faculty to help them rework curricula and develop teaching methods to ensure that students are still learning fundamental skills in each subject area, while also learning how to use AI effectively. I am convinced that there are thoughtful ways to do this. That said, I wonder whether a single summer is enough time to implement these changes effectively. The fact that one professor quoted in this NBC affiliate Channel 4 piece on the new AI mandate said students “did not always feel like the work was really theirs” when they used AI suggests that in some cases students are not being asked to do enough critical thinking and problem-solving. The risk that students won’t learn the basics is real. Yes, teaching students how to use AI is vital to prepare them for the workforce of tomorrow. But it shouldn’t come at the expense of fundamental reasoning, writing, scientific, and research skills.

This is the online version of Eye on AI, Fortune's biweekly newsletter on how AI is shaping the future of business. Sign up for free.