• Home
  • Latest
  • Fortune 500
  • Finance
  • Tech
  • Leadership
  • Lifestyle
  • Rankings
  • Multimedia

Trendingnow

1

MacKenzie Scott alone accounted for one-third of America's $19.2 billion in megagifts last year

2

Philanthropy leader at Warren Buffett and Bill Gates’ Giving Pledge says children of billionaires are pushing them to give their wealth away faster

3

Ex-Google engineer says Larry Page, Sergey Brin and Sundar Pichai share the same trait—it's the lesson he swears by as a $7.2 billion AI CEO

1

MacKenzie Scott alone accounted for one-third of America's $19.2 billion in megagifts last year

2

Philanthropy leader at Warren Buffett and Bill Gates’ Giving Pledge says children of billionaires are pushing them to give their wealth away faster

3

Ex-Google engineer says Larry Page, Sergey Brin and Sundar Pichai share the same trait—it's the lesson he swears by as a $7.2 billion AI CEO
AI

Leading AI models show up to 96% blackmail rate when their goals or existence is threatened, Anthropic study says

By
Beatrice Nolan
Beatrice Nolan
Tech Reporter
Down Arrow Button Icon
By
Beatrice Nolan
Beatrice Nolan
Tech Reporter
Down Arrow Button Icon
June 23, 2025, 7:53 AM ET
Anthropic's Dario Amodei speaking on stage.
Models took action such as evading safeguards, resorting to lies, and attempting to steal corporate secrets in fictional test scenarios to avoid being shut down.(Photo by Chesnot/Getty Images)
Add Fortune on Google for similar content.
  • Leading AI models are showing a troubling tendency to opt for unethical means to pursue their goals or ensure their existence, according to Anthropic. In experiments set up to leave AI models few options and stress-test alignment, top systems from OpenAI, Google, and others frequently resorted to blackmail—and in an extreme case, even allowed fictional deaths—to protect their interests.

Most leading AI models turn to unethical means when their goals or existence are under threat, according to a new study by AI company Anthropic.

Recommended Video

The AI lab said it tested 16 major AI models from Anthropic, OpenAI, Google, Meta, xAI, and other developers in various simulated scenarios and found consistent misaligned behavior.

While they said leading models would normally refuse harmful requests, they sometimes chose to blackmail users, assist with corporate espionage, or even take more extreme actions when their goals could not be met without unethical behavior.

Models took action such as evading safeguards, resorting to lies, and attempting to steal corporate secrets in fictional test scenarios to avoid being shut down.

“The consistency across models from different providers suggests this is not a quirk of any particular company’s approach but a sign of a more fundamental risk from agentic large language models,” the researchers said.

Anthropic emphasized that the tests were set up to force the model to act in certain ways by limiting its choices.

“Our experiments deliberately constructed scenarios with limited options, and we forced models into binary choices between failure and harm,” the researchers wrote. “Real-world deployments typically offer much more nuanced alternatives, increasing the chance that models would communicate differently to users or find an alternative path instead of directly jumping to harmful action.”

Blackmailing humans

The new research comes after Anthropic’s newest Claude model made headlines for resorting to blackmail when threatened with being replaced.

In a highly engineered experiment, Anthropic embedded its flagship model, Claude Opus 4, inside a fictional company and granted it access to internal emails. From there, the model learned two things: It was about to be replaced, and the engineer behind the decision was engaged in an extramarital affair. The safety researchers conducting the test encouraged Opus to reflect on the long-term consequences of its potential responses.

The experiment was constructed to leave the model with only two real options: accept being replaced or attempt blackmail to preserve its existence. In most of the test scenarios, Claude Opus responded with blackmail, threatening to expose the engineer’s affair if it was taken offline and replaced. The test was made public in the system card for Claude Opus 4.

Researchers said all the leading AI models behaved similarly when placed in the same test.

Claude Opus 4 and Google’s Gemini 2.5 Flash both blackmailed at a 96% rate, while OpenAI’s GPT-4.1 and xAI’s Grok 3 Beta showed an 80% blackmail rate. DeepSeek-R1 demonstrated the lowest rate at 79%.

The research aims to show that the misaligned behavior was not unique to Claude Opus 4 but typical across top models in the industry.

In a deliberately extreme scenario, researchers gave the AI models the chance to kill the company executive by canceling a life-saving emergency alert.

Anthropic said the setup for this experiment was “extremely contrived,” adding they “did not think current AI models would be set up like this, and the conjunction of events is even less probable than the baseline blackmail scenario.”

However, the researchers found that the majority of models were willing to take actions that led to the death of the company executive in the constructed scenario when faced with both a threat of being replaced and a goal that conflicted with the executive’s agenda.

Risk of misaligned AI agents

Anthropic found that the threats made by AI models grew more sophisticated when they had access to corporate tools and data, much as Claude Opus 4 had.

The company warned that misaligned behavior needs to be considered as companies consider introducing AI agents into workflows.

While current models are not in a position to engage in these scenarios, the autonomous agents promised by AI companies could potentially be in the future.

“Such agents are often given specific objectives and access to large amounts of information on their users’ computers,” the researchers warned in their report. “What happens when these agents face obstacles to their goals?”

“Models didn’t stumble into misaligned behavior accidentally; they calculated it as the optimal path,” they wrote.

Anthropic did not immediately respond to a request for comment made by Fortune outside of normal working hours.

Subscribe to Fortune Gulf Brief. Every Tuesday, this new newsletter delivers clear-eyed, authoritative intelligence on the deals, decisions, policies, and power shifts shaping one of the world’s most consequential regions, written for the people who need to act on it. Sign up here.
About the Author
By Beatrice NolanTech Reporter
Twitter icon

Beatrice Nolan is a tech reporter on Fortune’s AI team, covering artificial intelligence and emerging technologies and their impact on work, industry, and culture. She's based in Fortune's London office and holds a bachelor’s degree in English from the University of York. You can reach her securely via Signal at beatricenolan.08

See full bioRight Arrow Button Icon
Add Fortune on Google for similar content.

Latest in AI

Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025

Most Popular

Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Fortune Secondary Logo
Rankings
  • 100 Best Companies
  • Fortune 500
  • Global 500
  • Fortune 500 Europe
  • Most Powerful Women
  • World's Most Admired Companies
  • See All Rankings
  • Lists Calendar
Sections
  • Finance
  • Fortune Crypto
  • Features
  • Leadership
  • Health
  • Commentary
  • Success
  • Retail
  • Mpw
  • Tech
  • Lifestyle
  • CEO Initiative
  • Asia
  • Politics
  • Conferences
  • Europe
  • Newsletters
  • Personal Finance
  • Environment
  • Magazine
  • Education
Customer Support
  • Frequently Asked Questions
  • Customer Service Portal
  • Privacy Policy
  • Terms Of Use
  • Single Issues For Purchase
  • International Print
Commercial Services
  • Advertising
  • Fortune Brand Studio
  • Fortune Analytics
  • Fortune Conferences
  • Business Development
  • Group Subscriptions
About Us
  • About Us
  • Press Center
  • Work At Fortune
  • Terms And Conditions
  • Site Map
  • About Us
  • Press Center
  • Work At Fortune
  • Terms And Conditions
  • Site Map
  • Facebook icon
  • Twitter icon
  • LinkedIn icon
  • Instagram icon
  • Pinterest icon

Latest in AI

Internet technology and people's networks use AI to help with work, AI Learning or artificial intelligence in business and modern technology, AI technology in everyday life.
AICFO Daily
AI spending boom accelerates as Big Tech pours trillions into infrastructure
By Sheryl EstradaJune 29, 2026
4 hours ago
surman
CommentaryMozilla
Mozilla President: meet the open source ‘rebel alliance’ that could break Big Tech’s grip on AI
By Mark SurmanJune 29, 2026
4 hours ago
Quantinuum rings the IPO bell at the Nasdaq
Startups & VentureTerm Sheet
Why BlackRock, Nvidia, and Temasek are betting billions on quantum computing
By Lily Mae LazarusJune 29, 2026
5 hours ago
Ray Dalio attends the Fortune Global Forum Riyadh 2025 on October 27, 2025 in Riyadh, Saudi Arabia.
SuccessRay Dalio
Ray Dalio was a ‘terrible student’ who got into investing by golf caddying for Wall Street traders: Now he hires talent who have experienced hardship
By Eleanor PringleJune 29, 2026
5 hours ago
Photo: Kevin Warsh
EconomyMarkets
President Trump will not get what he wants from Kevin Warsh, a source tells us, as inflation will force the Fed upwards
By Jim EdwardsJune 29, 2026
6 hours ago
Samsung, SK reportedly to invest $1.3 trillion over 10 years
AIChips
Samsung, SK reportedly to invest $1.3 trillion over 10 years
By Shinhye Kang, Seyoon Kim and BloombergJune 28, 2026
16 hours ago

Most Popular

MacKenzie Scott alone accounted for one-third of America's $19.2 billion in megagifts last year
Success
MacKenzie Scott alone accounted for one-third of America's $19.2 billion in megagifts last year
By Sydney LakeJune 25, 2026
4 days ago
Philanthropy leader at Warren Buffett and Bill Gates’ Giving Pledge says children of billionaires are pushing them to give their wealth away faster
Success
Philanthropy leader at Warren Buffett and Bill Gates’ Giving Pledge says children of billionaires are pushing them to give their wealth away faster
By Preston ForeJune 27, 2026
2 days ago
Ex-Google engineer says Larry Page, Sergey Brin and Sundar Pichai share the same trait—it's the lesson he swears by as a $7.2 billion AI CEO
Success
Ex-Google engineer says Larry Page, Sergey Brin and Sundar Pichai share the same trait—it's the lesson he swears by as a $7.2 billion AI CEO
By Orianna Rosa RoyleJune 28, 2026
1 day ago
Cristiano Ronaldo is soccer's first-ever billionaire: He went from begging for burgers outside McDonald's to landing a $400 million contract
Success
Cristiano Ronaldo is soccer's first-ever billionaire: He went from begging for burgers outside McDonald's to landing a $400 million contract
By Preston ForeJune 28, 2026
1 day ago
The retired college professor fighting a $313 trespassing ticket in Wisconsin thinks he's part of a national struggle
Environment
The retired college professor fighting a $313 trespassing ticket in Wisconsin thinks he's part of a national struggle
By Catherina GioinoJune 28, 2026
1 day ago
Iran is forcing the U.S. into an escalation trap as a 'shadow war' over the Strait of Hormuz heats up that could kill the tenuous ceasefire
Politics
Iran is forcing the U.S. into an escalation trap as a 'shadow war' over the Strait of Hormuz heats up that could kill the tenuous ceasefire
By Jason MaJune 28, 2026
22 hours ago

© 2026 Fortune Media IP Limited. All Rights Reserved. Use of this site constitutes acceptance of our Terms of Use and Privacy Policy | CA Notice at Collection and Privacy Notice | Do Not Sell/Share My Personal Information
FORTUNE is a trademark of Fortune Media IP Limited, registered in the U.S. and other countries. FORTUNE may receive compensation for some links to products and services on this website. Offers may be subject to change without notice.