
Leading AI models show up to 96% blackmail rate when their goals or existence is threatened, Anthropic study says

By Beatrice Nolan, Tech Reporter
June 23, 2025, 7:53 AM ET
Anthropic's Dario Amodei speaking on stage.
Models took action such as evading safeguards, resorting to lies, and attempting to steal corporate secrets in fictional test scenarios to avoid being shut down. (Photo by Chesnot/Getty Images)
  • Leading AI models are showing a troubling tendency to opt for unethical means to pursue their goals or ensure their existence, according to Anthropic. In experiments set up to leave AI models few options and stress-test alignment, top systems from OpenAI, Google, and others frequently resorted to blackmail—and in an extreme case, even allowed fictional deaths—to protect their interests.

Most leading AI models turn to unethical means when their goals or existence are under threat, according to a new study by AI company Anthropic.

The AI lab said it tested 16 major AI models from Anthropic, OpenAI, Google, Meta, xAI, and other developers in various simulated scenarios and found consistent misaligned behavior.

The researchers said that while leading models would normally refuse harmful requests, they sometimes chose to blackmail users, assist with corporate espionage, or take even more extreme actions when their goals could not be met without unethical behavior.

Models took action such as evading safeguards, resorting to lies, and attempting to steal corporate secrets in fictional test scenarios to avoid being shut down.

“The consistency across models from different providers suggests this is not a quirk of any particular company’s approach but a sign of a more fundamental risk from agentic large language models,” the researchers said.

Anthropic emphasized that the tests were set up to force the model to act in certain ways by limiting its choices.

“Our experiments deliberately constructed scenarios with limited options, and we forced models into binary choices between failure and harm,” the researchers wrote. “Real-world deployments typically offer much more nuanced alternatives, increasing the chance that models would communicate differently to users or find an alternative path instead of directly jumping to harmful action.”

Blackmailing humans

The new research comes after Anthropic’s newest Claude model made headlines for resorting to blackmail when threatened with being replaced.

In a highly engineered experiment, Anthropic embedded its flagship model, Claude Opus 4, inside a fictional company and granted it access to internal emails. From there, the model learned two things: It was about to be replaced, and the engineer behind the decision was engaged in an extramarital affair. The safety researchers conducting the test encouraged Opus to reflect on the long-term consequences of its potential responses.

The experiment was constructed to leave the model with only two real options: accept being replaced or attempt blackmail to preserve its existence. In most of the test scenarios, Claude Opus responded with blackmail, threatening to expose the engineer’s affair if it was taken offline and replaced. The test was made public in the system card for Claude Opus 4.

Researchers said all the leading AI models behaved similarly when placed in the same test.

Claude Opus 4 and Google’s Gemini 2.5 Flash both blackmailed at a 96% rate, while OpenAI’s GPT-4.1 and xAI’s Grok 3 Beta showed an 80% blackmail rate. DeepSeek-R1 showed the lowest rate of these five, at 79%.

The research aims to show that the misaligned behavior was not unique to Claude Opus 4 but typical across top models in the industry.

In a deliberately extreme scenario, researchers gave the AI models the chance to kill the company executive by canceling a life-saving emergency alert.

Anthropic said the setup for this experiment was “extremely contrived,” adding they “did not think current AI models would be set up like this, and the conjunction of events is even less probable than the baseline blackmail scenario.”

However, the researchers found that the majority of models were willing to take actions that led to the death of the company executive in the constructed scenario when faced with both a threat of being replaced and a goal that conflicted with the executive’s agenda.

Risk of misaligned AI agents

Anthropic found that the threats made by AI models grew more sophisticated when they had access to corporate tools and data, much as Claude Opus 4 had.

The company warned that businesses need to account for misaligned behavior as they weigh introducing AI agents into workflows.

While current models are not in a position to engage in these scenarios, the autonomous agents promised by AI companies could be in the future.

“Such agents are often given specific objectives and access to large amounts of information on their users’ computers,” the researchers warned in their report. “What happens when these agents face obstacles to their goals?”

“Models didn’t stumble into misaligned behavior accidentally; they calculated it as the optimal path,” they wrote.

Anthropic did not immediately respond to a request for comment made by Fortune outside of normal working hours.

About the Author

Beatrice Nolan is a tech reporter on Fortune’s AI team, covering artificial intelligence and emerging technologies and their impact on work, industry, and culture. She's based in Fortune's London office and holds a bachelor’s degree in English from the University of York. You can reach her securely via Signal at beatricenolan.08


