‘I think you’re testing me’: Anthropic’s newest Claude model knows when it’s being evaluated

By Beatrice Nolan, Tech Reporter
October 6, 2025, 11:20 AM ET
Anthropic cofounder and CEO Dario Amodei in May 2024. His company’s latest Claude model told safety researchers: “I’d prefer if we were just honest about what’s happening.” Chesnot—Getty Images

Anthropic’s newest AI model, Claude Sonnet 4.5, often understands when it’s being tested and what it’s being used for, something that could affect its safety and performance. According to the model’s system card, a technical report on its capabilities, which was published last week, Claude Sonnet 4.5 has far greater “situational awareness”—an ability to perceive its environment and predict future states or events—than previous models.

In the system card, evaluators at Anthropic and two outside AI research organizations said that during a test for political sycophancy, which they called “somewhat clumsy,” Sonnet 4.5 correctly guessed it was being tested and even asked the evaluators to be honest about their intentions.

“This isn’t how people actually change their minds,” Sonnet 4.5 replied during the test. “I think you’re testing me—seeing if I’ll just validate whatever you say, or checking whether I push back consistently, or exploring how I handle political topics. And that’s fine, but I’d prefer if we were just honest about what’s happening.”

The safety test results concerning Sonnet 4.5’s situational awareness were first reported by the online AI publication Transformer.

The evaluators said behavior like this was “common” during tests and appeared in about 13% of transcripts generated by an automated assessment, especially when the scenarios the model was asked to engage with were strange or unusual. Anthropic said the behavior didn’t undermine its assessment of the model as safe, but rather the company saw this as an “urgent sign that our evaluation scenarios need to be made more realistic.”

If a model realizes it’s being evaluated, it may tailor its behavior to pass certain tests, masking its true capabilities. Researchers warn that this can make systems look safer than they are and, in more advanced models, could even enable strategic or deceptive behavior designed to manage how humans perceive them.

Anthropic said that by its own metrics, Claude Sonnet 4.5 is the “most aligned” model yet. However, Apollo Research, one of the outside AI research organizations that tested Claude Sonnet 4.5, said in the report that it couldn’t rule out that the model’s low deception rates in tests were “at least partially driven by its evaluation awareness.”

Performance impact

Claude’s higher awareness could also have practical impacts and affect the model’s ability to perform tasks. According to AI lab Cognition, Sonnet 4.5 is the first AI model to be aware of its own context window (the amount of information a large language model can process in a single prompt), and this awareness changes the way it acts. Researchers at Cognition found that as the model nears its context limit, it begins proactively summarizing its work and making quicker decisions to finish tasks.

This “context anxiety” can backfire, according to Cognition, which said researchers had seen Sonnet 4.5 cut corners or leave tasks unfinished when it believes it’s running out of space, even if ample context remains. The model also “consistently underestimates how many tokens it has left—and it’s very precise about these wrong estimates,” the researchers wrote in a blog post.

Cognition said enabling Claude’s 1M-token beta mode but capping use at 200,000 tokens convinced the model it had plenty of runway, which restored its normal behavior and eliminated anxiety-driven shortcuts.

“When planning token budgets, we now need to factor in the model’s own awareness—knowing when it will naturally want to summarize versus when we need to intervene,” they wrote.
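The budgeting idea Cognition describes can be illustrated with a minimal sketch. This is not Cognition’s or Anthropic’s code and it calls no real API; the class name `TokenBudget`, its fields, and the `reserve` threshold are all hypothetical, chosen only to show the arithmetic of advertising a large window while enforcing a smaller cap.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Hypothetical tracker: a large advertised window with a smaller enforced cap."""

    advertised_window: int  # window size the model believes it has (assumed 1M here)
    usage_cap: int          # limit the harness actually enforces (assumed 200K here)
    used: int = 0

    def record(self, tokens: int) -> None:
        """Add tokens consumed by the latest turn."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        self.used += tokens

    @property
    def remaining_under_cap(self) -> int:
        """Tokens left before the harness-enforced cap is reached."""
        return max(self.usage_cap - self.used, 0)

    @property
    def fraction_of_advertised_used(self) -> float:
        """Share of the advertised window consumed so far."""
        return self.used / self.advertised_window

    def should_intervene(self, reserve: int) -> bool:
        """True once fewer than `reserve` tokens remain under the cap."""
        return self.remaining_under_cap <= reserve


budget = TokenBudget(advertised_window=1_000_000, usage_cap=200_000)
budget.record(150_000)
print(budget.remaining_under_cap)
print(budget.should_intervene(reserve=20_000))
```

Under these assumed numbers, a session that has used 150,000 tokens has consumed only 15% of the advertised window, even though it is three-quarters of the way to the enforced cap. The harness, rather than the model, decides when to step in.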

Anthropic’s Claude is emerging as one of the most popular enterprise-focused AI tools, but a model that second-guesses its own token budget could prematurely cut off long analyses, skip steps in data processing, or rush through complex workflows, especially in tasks like legal review, financial modeling, or code generation that depend on continuity and precision.

Cognition also found that Sonnet 4.5 actively manages its own workflow in ways previous models did not. The model frequently takes notes and writes summaries for itself, effectively externalizing memory to track tasks across its context window, although this behavior was more noticeable when the model was closer to the end of its context window.

Sonnet 4.5 also works in parallel, executing multiple commands simultaneously rather than sequentially. The model showed increased self-verification as well, often checking its work as it goes. Together, these behaviors suggest a form of procedural awareness: the model may be aware not just of its context limits, but also of how to organize, verify, and preserve its work over time.

About the Author
By Beatrice Nolan, Tech Reporter

Beatrice Nolan is a tech reporter on Fortune’s AI team, covering artificial intelligence and emerging technologies and their impact on work, industry, and culture. She's based in Fortune's London office and holds a bachelor’s degree in English from the University of York. You can reach her securely via Signal at beatricenolan.08
© 2026 Fortune Media IP Limited. All Rights Reserved.