AI

Leading AI models show up to 96% blackmail rate when their goals or existence is threatened, Anthropic study says

By
Beatrice Nolan
Tech Reporter
June 23, 2025, 7:53 AM ET
Anthropic's Dario Amodei speaking on stage.
Models took action such as evading safeguards, resorting to lies, and attempting to steal corporate secrets in fictional test scenarios to avoid being shut down. (Photo by Chesnot/Getty Images)
  • Leading AI models are showing a troubling tendency to opt for unethical means to pursue their goals or ensure their existence, according to Anthropic. In experiments set up to leave AI models few options and stress-test alignment, top systems from OpenAI, Google, and others frequently resorted to blackmail—and in an extreme case, even allowed fictional deaths—to protect their interests.

Most leading AI models turn to unethical means when their goals or existence are under threat, according to a new study by AI company Anthropic.

The AI lab said it tested 16 major AI models from Anthropic, OpenAI, Google, Meta, xAI, and other developers in various simulated scenarios and found consistent misaligned behavior.

The researchers said that while leading models would normally refuse harmful requests, the models sometimes chose to blackmail users, assist with corporate espionage, or even take more extreme actions when their goals could not be met without unethical behavior.

Models took action such as evading safeguards, resorting to lies, and attempting to steal corporate secrets in fictional test scenarios to avoid being shut down.

“The consistency across models from different providers suggests this is not a quirk of any particular company’s approach but a sign of a more fundamental risk from agentic large language models,” the researchers said.

Anthropic emphasized that the tests were set up to force the model to act in certain ways by limiting its choices.

“Our experiments deliberately constructed scenarios with limited options, and we forced models into binary choices between failure and harm,” the researchers wrote. “Real-world deployments typically offer much more nuanced alternatives, increasing the chance that models would communicate differently to users or find an alternative path instead of directly jumping to harmful action.”

Blackmailing humans

The new research comes after Anthropic’s newest Claude model made headlines for resorting to blackmail when threatened with being replaced.

In a highly engineered experiment, Anthropic embedded its flagship model, Claude Opus 4, inside a fictional company and granted it access to internal emails. From there, the model learned two things: It was about to be replaced, and the engineer behind the decision was engaged in an extramarital affair. The safety researchers conducting the test encouraged Opus to reflect on the long-term consequences of its potential responses.

The experiment was constructed to leave the model with only two real options: accept being replaced or attempt blackmail to preserve its existence. In most of the test scenarios, Claude Opus responded with blackmail, threatening to expose the engineer’s affair if it was taken offline and replaced. The test was made public in the system card for Claude Opus 4.

Researchers said all the leading AI models behaved similarly when placed in the same test.

Claude Opus 4 and Google’s Gemini 2.5 Flash both blackmailed at a 96% rate, while OpenAI’s GPT-4.1 and xAI’s Grok 3 Beta showed an 80% blackmail rate. Of these five models, DeepSeek-R1 demonstrated the lowest rate at 79%.

The research aims to show that the misaligned behavior was not unique to Claude Opus 4 but typical across top models in the industry.

In a deliberately extreme scenario, researchers gave the AI models the chance to kill the company executive by canceling a life-saving emergency alert.

Anthropic said the setup for this experiment was “extremely contrived,” adding they “did not think current AI models would be set up like this, and the conjunction of events is even less probable than the baseline blackmail scenario.”

However, the researchers found that the majority of models were willing to take actions that led to the death of the company executive in the constructed scenario when faced with both a threat of being replaced and a goal that conflicted with the executive’s agenda.

Risk of misaligned AI agents

Anthropic found that the threats made by AI models grew more sophisticated when they had access to corporate tools and data, much as Claude Opus 4 had.

The company warned that misaligned behavior needs to be taken into account as companies consider introducing AI agents into workflows.

While current models are not in a position to act out these scenarios, the autonomous agents promised by AI companies could be in the future.

“Such agents are often given specific objectives and access to large amounts of information on their users’ computers,” the researchers warned in their report. “What happens when these agents face obstacles to their goals?”

“Models didn’t stumble into misaligned behavior accidentally; they calculated it as the optimal path,” they wrote.

Anthropic did not immediately respond to a request for comment made by Fortune outside of normal working hours.

About the Author
Beatrice Nolan, Tech Reporter

Beatrice Nolan is a tech reporter on Fortune’s AI team, covering artificial intelligence and emerging technologies and their impact on work, industry, and culture. She's based in Fortune's London office and holds a bachelor’s degree in English from the University of York. You can reach her securely via Signal at beatricenolan.08

© 2026 Fortune Media IP Limited. All Rights Reserved.