• Home
  • Latest
  • Fortune 500
  • Finance
  • Tech
  • Leadership
  • Lifestyle
  • Rankings
  • Multimedia
AIChatbots

We studied chatbots and language and saw a huge problem: They mean 80% when they say ‘likely’ but humans hear 65%

By
Mayank Kejriwal
Mayank Kejriwal
and
The Conversation
The Conversation
Down Arrow Button Icon
By
Mayank Kejriwal
Mayank Kejriwal
and
The Conversation
The Conversation
Down Arrow Button Icon
February 25, 2026, 2:30 AM ET
gen z
What does this chatbot mean, really.Getty Images

When a human says an event is “probable” or “likely,” people generally have a shared, if fuzzy, understanding of what that means. But when an AI chatbot like ChatGPT uses the same word, it’s not assessing the odds the way we do, my colleagues and I found.

Recommended Video

We recently published a study in the journal NPJ Complexity that suggests that, while large language model AIs excel at conversation, they often fail to align with humans when communicating uncertainty. The research focused on words of estimative probability, which include terms like “maybe,” “probably” and “almost certain.”

By comparing how AI models and humans map these words to numerical percentages, we uncovered significant gaps between humans and large language models. While the models do tend to agree with humans on extremes like “impossible,” they diverge sharply on hedge words like “maybe.” For example, a model might use the word “likely” to represent an 80% probability, while a human reader assumes it means closer to 65%.

This could be because humans can interpret words such as “likely” and “probable” based more on contextual cues and personal experiences. In contrast, large language models may be averaging over conflicting usages of those words in their training data, leading to divergences with human interpretations.

Our study also found that large language models are sensitive to gendered language and the specific language used for prompting. When a prompt changed from “he” to “she,” the AI’s probability estimates often became more rigid, reflecting biases embedded in its training data. When a prompt changed from English to Chinese, the AI’s probability estimates often shifted, possibly due to differences between English and Chinese in how people express and understand uncertainty.

a multicolor three-pane graphic with icons representing humans and robots, and text and arrows
AI chatbots don’t interpret ‘probably’ and ‘maybe’ the same way you do. Mayank Kejriwal

Why it matters

Far from being a linguistic quirk, this misalignment is a fundamental challenge for AI safety and human-AI interaction. As large language models are increasingly used in high-stakes fields like health care, government policy and scientific reporting, the way they communicate risk becomes a matter of public trust.

If an AI assistant helping a doctor, for instance, describes a side effect as “unlikely,” but the model’s internal calculation of “unlikely” is much higher than the doctor’s interpretation, the resulting decision could be flawed.

What other research is being done

Scientists have studied how humans quantify uncertainty since the 1960s, a field pioneered by CIA analysts to improve intelligence reporting. More recently, there has been an explosion in large language model literature seeking to look under the hood of neural networks to better understand their “behaviors” and linguistic patterns.

Our study adds a layer of complexity by treating the interaction between humans and artificial intelligence as a biological-like system where meaning can degrade. It moves beyond simply measuring if an AI is “smart” and instead asks if it is aligned.

Other researchers are currently exploring whether so-called chain-of-thought prompting – asking the AI to show its work – can fix these errors. However, our study found that even advanced reasoning doesn’t always bridge the gap between statistical data and verbal labels.

What’s next

A goal for future AI development is to create models that don’t just predict the next likely word but actually understand the weight of the uncertainty they are conveying. Researchers are calling for more robust consistency metrics to ensure that if a model sees a 10% chance in the data, it chooses the same word every time.

As we move toward a world where AI summarizes scientific papers and manages people’s schedules, making sure that “probably” means “probably” is a vital step in making these systems reliable partners rather than just sophisticated parrots.

The Research Brief is a short take on interesting academic work.

Mayank Kejriwal, Research Assistant Professor of Industrial & Systems Engineering, University of Southern California

This article is republished from The Conversation under a Creative Commons license. Read the original article.

The Conversation
In 2001, Fortune first convened “The Smartest People We Know,” bringing together CEOs and founders, builders and investors, thinkers and doers. Since then, Fortune Brainstorm Tech has been the place where bold ideas collide. From June 8–10, we will return to Aspen—where it all began—to mark 25 years of Brainstorm. Register now.
About the Authors
By Mayank Kejriwal
See full bioRight Arrow Button Icon
By The Conversation
See full bioRight Arrow Button Icon

Latest in AI

Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025

Most Popular

Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Fortune Secondary Logo
Rankings
  • 100 Best Companies
  • Fortune 500
  • Global 500
  • Fortune 500 Europe
  • Most Powerful Women
  • Future 50
  • World’s Most Admired Companies
  • See All Rankings
Sections
  • Finance
  • Fortune Crypto
  • Features
  • Leadership
  • Health
  • Commentary
  • Success
  • Retail
  • Mpw
  • Tech
  • Lifestyle
  • CEO Initiative
  • Asia
  • Politics
  • Conferences
  • Europe
  • Newsletters
  • Personal Finance
  • Environment
  • Magazine
  • Education
Customer Support
  • Frequently Asked Questions
  • Customer Service Portal
  • Privacy Policy
  • Terms Of Use
  • Single Issues For Purchase
  • International Print
Commercial Services
  • Advertising
  • Fortune Brand Studio
  • Fortune Analytics
  • Fortune Conferences
  • Business Development
  • Group Subscriptions
About Us
  • About Us
  • Editorial Calendar
  • Press Center
  • Work At Fortune
  • Diversity And Inclusion
  • Terms And Conditions
  • Site Map
Fortune Secondary Logo
  • About Us
  • Editorial Calendar
  • Press Center
  • Work At Fortune
  • Diversity And Inclusion
  • Terms And Conditions
  • Site Map
  • Facebook icon
  • Twitter icon
  • LinkedIn icon
  • Instagram icon
  • Pinterest icon

Latest in AI

Nvidia founder and CEO Jensen Huang
AIEye on AI
Forget basketball. Next week’s Nvidia GTC is the real March Madness for AI
By Sharon GoldmanMarch 12, 2026
23 minutes ago
altman
AIOpenAI
Sam Altman admits AI is killing the labor-capital balance—and says nobody knows what to do about it
By Nick LichtenbergMarch 12, 2026
3 hours ago
altman
AIProductivity
‘What will our kids do?’: One question was on every investor’s lips at Morgan Stanley’s big AI conference
By Nick LichtenbergMarch 12, 2026
3 hours ago
dario
CommentaryAnthropic
Anthropic just sued the Pentagon. The outcome could reshape the AI race with China
By Mark MinevichMarch 12, 2026
4 hours ago
Copilot logo on a phone.
AIHealth
Microsoft launches Copilot Health, a dedicated space for personal health data and AI-driven insights
By Beatrice NolanMarch 12, 2026
4 hours ago
ruba
CommentaryAmazon Web Services
Most AI investments fail—here’s what the winners get right 
By Ruba BornoMarch 12, 2026
4 hours ago

Most Popular

placeholder alt text
Economy
'This cannot be sustainable': The U.S. borrowed $50 billion a week for the past five months, the CBO says
By Eleanor PringleMarch 10, 2026
2 days ago
placeholder alt text
AI
'Proceed with caution': Elon Musk offers warning after Amazon reportedly held mandatory meeting to address 'high blast radius' AI-related incident
By Sasha RogelbergMarch 11, 2026
21 hours ago
placeholder alt text
Commentary
How the ultrawealthy use smartphone apps to avoid millions in taxes
By Jose AtilesMarch 11, 2026
1 day ago
placeholder alt text
Future of Work
Shark Tank's Kevin O'Leary doesn't care if you work from your basement. He just wants to know if you can ‘execute’
By Marco Quiroz-GutierrezMarch 10, 2026
2 days ago
placeholder alt text
Personal Finance
Retirees wait for the day they can sell their homes and cash in—but there's a secret Medicare 'trap' that could stop them in their tracks
By Sydney LakeMarch 11, 2026
1 day ago
placeholder alt text
Success
BlackRock is splashing $100 million on training plumbers, electricians, and HVAC technicians as its CEO flags a skilled trade worker shortage
By Preston ForeMarch 11, 2026
1 day ago

© 2026 Fortune Media IP Limited. All Rights Reserved. Use of this site constitutes acceptance of our Terms of Use and Privacy Policy | CA Notice at Collection and Privacy Notice | Do Not Sell/Share My Personal Information
FORTUNE is a trademark of Fortune Media IP Limited, registered in the U.S. and other countries. FORTUNE may receive compensation for some links to products and services on this website. Offers may be subject to change without notice.