• Home
  • Latest
  • Fortune 500
  • Finance
  • Tech
  • Leadership
  • Lifestyle
  • Rankings
  • Multimedia
AI

Leading AI models show up to 96% blackmail rate when their goals or existence is threatened, Anthropic study says

By
Beatrice Nolan
Beatrice Nolan
Tech Reporter
Down Arrow Button Icon
By
Beatrice Nolan
Beatrice Nolan
Tech Reporter
Down Arrow Button Icon
June 23, 2025, 7:53 AM ET
Anthropic's Dario Amodei speaking on stage.
Models took action such as evading safeguards, resorting to lies, and attempting to steal corporate secrets in fictional test scenarios to avoid being shut down.(Photo by Chesnot/Getty Images)
  • Leading AI models are showing a troubling tendency to opt for unethical means to pursue their goals or ensure their existence, according to Anthropic. In experiments set up to leave AI models few options and stress-test alignment, top systems from OpenAI, Google, and others frequently resorted to blackmail—and in an extreme case, even allowed fictional deaths—to protect their interests.

Most leading AI models turn to unethical means when their goals or existence are under threat, according to a new study by AI company Anthropic.

Recommended Video

The AI lab said it tested 16 major AI models from Anthropic, OpenAI, Google, Meta, xAI, and other developers in various simulated scenarios and found consistent misaligned behavior.

While they said leading models would normally refuse harmful requests, they sometimes chose to blackmail users, assist with corporate espionage, or even take more extreme actions when their goals could not be met without unethical behavior.

Models took action such as evading safeguards, resorting to lies, and attempting to steal corporate secrets in fictional test scenarios to avoid being shut down.

“The consistency across models from different providers suggests this is not a quirk of any particular company’s approach but a sign of a more fundamental risk from agentic large language models,” the researchers said.

Anthropic emphasized that the tests were set up to force the model to act in certain ways by limiting its choices.

“Our experiments deliberately constructed scenarios with limited options, and we forced models into binary choices between failure and harm,” the researchers wrote. “Real-world deployments typically offer much more nuanced alternatives, increasing the chance that models would communicate differently to users or find an alternative path instead of directly jumping to harmful action.”

Blackmailing humans

The new research comes after Anthropic’s newest Claude model made headlines for resorting to blackmail when threatened with being replaced.

In a highly engineered experiment, Anthropic embedded its flagship model, Claude Opus 4, inside a fictional company and granted it access to internal emails. From there, the model learned two things: It was about to be replaced, and the engineer behind the decision was engaged in an extramarital affair. The safety researchers conducting the test encouraged Opus to reflect on the long-term consequences of its potential responses.

The experiment was constructed to leave the model with only two real options: accept being replaced or attempt blackmail to preserve its existence. In most of the test scenarios, Claude Opus responded with blackmail, threatening to expose the engineer’s affair if it was taken offline and replaced. The test was made public in the system card for Claude Opus 4.

Researchers said all the leading AI models behaved similarly when placed in the same test.

Claude Opus 4 and Google’s Gemini 2.5 Flash both blackmailed at a 96% rate, while OpenAI’s GPT-4.1 and xAI’s Grok 3 Beta showed an 80% blackmail rate. DeepSeek-R1 demonstrated the lowest rate at 79%.

The research aims to show that the misaligned behavior was not unique to Claude Opus 4 but typical across top models in the industry.

In a deliberately extreme scenario, researchers gave the AI models the chance to kill the company executive by canceling a life-saving emergency alert.

Anthropic said the setup for this experiment was “extremely contrived,” adding they “did not think current AI models would be set up like this, and the conjunction of events is even less probable than the baseline blackmail scenario.”

However, the researchers found that the majority of models were willing to take actions that led to the death of the company executive in the constructed scenario when faced with both a threat of being replaced and a goal that conflicted with the executive’s agenda.

Risk of misaligned AI agents

Anthropic found that the threats made by AI models grew more sophisticated when they had access to corporate tools and data, much as Claude Opus 4 had.

The company warned that misaligned behavior needs to be considered as companies consider introducing AI agents into workflows.

While current models are not in a position to engage in these scenarios, the autonomous agents promised by AI companies could potentially be in the future.

“Such agents are often given specific objectives and access to large amounts of information on their users’ computers,” the researchers warned in their report. “What happens when these agents face obstacles to their goals?”

“Models didn’t stumble into misaligned behavior accidentally; they calculated it as the optimal path,” they wrote.

Anthropic did not immediately respond to a request for comment made by Fortune outside of normal working hours.

In 2001, Fortune first convened “The Smartest People We Know,” bringing together CEOs and founders, builders and investors, thinkers and doers. Since then, Fortune Brainstorm Tech has been the place where bold ideas collide. From June 8–10, we will return to Aspen—where it all began—to mark 25 years of Brainstorm. Register now.
About the Author
By Beatrice NolanTech Reporter
Twitter icon

Beatrice Nolan is a tech reporter on Fortune’s AI team, covering artificial intelligence and emerging technologies and their impact on work, industry, and culture. She's based in Fortune's London office and holds a bachelor’s degree in English from the University of York. You can reach her securely via Signal at beatricenolan.08

See full bioRight Arrow Button Icon

Latest in AI

Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025

Most Popular

Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Fortune Secondary Logo
Rankings
  • 100 Best Companies
  • Fortune 500
  • Global 500
  • Fortune 500 Europe
  • Most Powerful Women
  • Future 50
  • World’s Most Admired Companies
  • See All Rankings
Sections
  • Finance
  • Fortune Crypto
  • Features
  • Leadership
  • Health
  • Commentary
  • Success
  • Retail
  • Mpw
  • Tech
  • Lifestyle
  • CEO Initiative
  • Asia
  • Politics
  • Conferences
  • Europe
  • Newsletters
  • Personal Finance
  • Environment
  • Magazine
  • Education
Customer Support
  • Frequently Asked Questions
  • Customer Service Portal
  • Privacy Policy
  • Terms Of Use
  • Single Issues For Purchase
  • International Print
Commercial Services
  • Advertising
  • Fortune Brand Studio
  • Fortune Analytics
  • Fortune Conferences
  • Business Development
  • Group Subscriptions
About Us
  • About Us
  • Editorial Calendar
  • Press Center
  • Work At Fortune
  • Diversity And Inclusion
  • Terms And Conditions
  • Site Map
  • About Us
  • Editorial Calendar
  • Press Center
  • Work At Fortune
  • Diversity And Inclusion
  • Terms And Conditions
  • Site Map
  • Facebook icon
  • Twitter icon
  • LinkedIn icon
  • Instagram icon
  • Pinterest icon

Latest in AI

Even Nvidia’s own research teams can’t get enough GPUs amid the race for AI computing power
NewslettersEye on AI
Even Nvidia’s own research teams can’t get enough GPUs amid the race for AI computing power
By Sharon GoldmanApril 9, 2026
7 hours ago
You’re looking at the AI revolution all wrong, top economist says: 40% unemployment and a 3-day work week are the same thing
AIdisruption
You’re looking at the AI revolution all wrong, top economist says: 40% unemployment and a 3-day work week are the same thing
By Nick LichtenbergApril 9, 2026
7 hours ago
Zoom CEO Eric Yuan
Successthe future of work
‘I hate working 5 days’: Zoom CEO says traditional work schedules are becoming obsolete—and predicts a 3-day workweek by 2031
By Preston ForeApril 9, 2026
8 hours ago
lego
PoliticsIran
AI-savvy pro-Iran groups troll America with Lego Movie-style propaganda videos mocking American failure
By Sam McNeil and The Associated PressApril 9, 2026
10 hours ago
data centers
EnergyData centers
Data centers are destroying states’ clean energy dreams
By Jessica Hill and The Associated PressApril 9, 2026
10 hours ago
Photo: A fireball rises from a building hit by an Israeli airstrike in the area of Abbasiyeh, on the outskirts of the southern Lebanese city of Tyre, on April 8, 2026. Lebanon's army warned people against returning to the country's south on April 8, where the Israeli military is still launching attacks, as Israel said the ceasefire with Iran did not include its conflict with Hezbollah. (Photo by Kawnat HAJU / AFP via Getty Images)
PoliticsMarkets
Too much fire, not enough cease: Iran tightens its grip on global oil trade on eve of peace talks
By Jim EdwardsApril 9, 2026
12 hours ago

Most Popular

The U.S. government is spending $88 billion a month in interest on national debt—equal to spending on defense and education combined
Economy
The U.S. government is spending $88 billion a month in interest on national debt—equal to spending on defense and education combined
By Fortune EditorsApril 9, 2026
12 hours ago
The U.S. had a national debt ‘home run’ in its grasp, says Jamie Dimon. But the government did nothing, and now its best option is crisis management
Economy
The U.S. had a national debt ‘home run’ in its grasp, says Jamie Dimon. But the government did nothing, and now its best option is crisis management
By Fortune EditorsApril 8, 2026
2 days ago
2 years ago, Saudi Arabia quietly canceled the ‘petrodollar’ deal with America that wired the world economy for 50 years. Then war broke out in Iran
Energy
2 years ago, Saudi Arabia quietly canceled the ‘petrodollar’ deal with America that wired the world economy for 50 years. Then war broke out in Iran
By Fortune EditorsApril 7, 2026
2 days ago
Self-made billionaire MrBeast says his work-life balance is nonexistent and calls it a ‘miracle’ if he works less than 15-hour days: ‘I live to work’
Success
Self-made billionaire MrBeast says his work-life balance is nonexistent and calls it a ‘miracle’ if he works less than 15-hour days: ‘I live to work’
By Fortune EditorsApril 8, 2026
1 day ago
Gen Z workers are so fearful AI will take their job they’re intentionally sabotaging their company’s AI rollout
AI
Gen Z workers are so fearful AI will take their job they’re intentionally sabotaging their company’s AI rollout
By Fortune EditorsApril 8, 2026
1 day ago
Gen Z doesn't want your full-time job. They want several part-time roles, and it's reshaping the entire workforce
Success
Gen Z doesn't want your full-time job. They want several part-time roles, and it's reshaping the entire workforce
By Fortune EditorsApril 9, 2026
15 hours ago

© 2026 Fortune Media IP Limited. All Rights Reserved. Use of this site constitutes acceptance of our Terms of Use and Privacy Policy | CA Notice at Collection and Privacy Notice | Do Not Sell/Share My Personal Information
FORTUNE is a trademark of Fortune Media IP Limited, registered in the U.S. and other countries. FORTUNE may receive compensation for some links to products and services on this website. Offers may be subject to change without notice.