AI agents that do your work while you sleep sound great. The reality is far messier—‘it’s like a toddler that needs to be overseen’

By Sharon Goldman, AI Reporter
February 23, 2026, 2:33 PM ET
Drawing of a toddler writing on the wall while her mother hides her eyes
Getty; nicoletaionescu

Summer Yue may work on safety and alignment on Meta’s superintelligence team, but even she admits she isn’t immune to overconfidence when it comes to autonomous AI agents. 

In a post on X Monday, Yue described how her OpenClaw autonomous AI agents—built to run locally on a Mac mini computer—deleted her entire inbox, ignoring instructions to pause and ask for confirmation first.

“I had to RUN to my Mac Mini like I was defusing a bomb,” she said. It was, she added, a “rookie mistake.” The workflow had been working in a test inbox she used to safely trial the agent for weeks, she explained, but in the real inbox the agent lost her original instruction. 

Yue’s experience stands in stark contrast to viral posts such as The Lobster Revolution: Why 24/7 AI Agents Just Changed Everything, in which Peter Diamandis claims always-on AI is far more frictionless. 

“Let me tell you what it feels like to use this,” Diamandis wrote. “You wake up in the morning and your agent—mine is named Skippy, cheerfully sarcastic and absurdly capable—has done eight hours of work while you slept. It read a thousand pages of markdown. It organized your files. It drafted three project plans. It booked your travel. It researched that question you had at 11 PM and forgot about.”

“When my Mac mini went offline for six hours, I felt withdrawal,” he added. “Like my best friend disappeared.”

Together, these dueling accounts of the power of AI agents capture the tension at the heart of today’s push toward “always-on” AI. As tools like OpenClaw and Claude Code make it technically possible for agents to run for long periods, excitement is growing around the idea of AI that works while you sleep. But in practice, early users say that autonomy remains fragile, unpredictable, and labor-intensive to manage. Rather than replacing human work, today’s agents often require constant monitoring, guardrails, and intervention, especially when the stakes rise beyond low-risk experiments.

AI agents work best when tasks are simple and low-stakes

Shyamal Anadkat, who previously worked as an applied AI engineer at OpenAI, said most of today’s successful agents still require frequent human check-ins or are limited to tightly bounded, well-defined tasks—though he emphasized that this will change as measurement and evaluation techniques improve.

“A system that’s 95% accurate on individual steps becomes chaotic over a 20-step autonomous workflow,” Anadkat said. “Long-horizon planning is still weak.” As a result, he explained, agents may perform well on short task chains but tend to fall apart when asked to manage complex, multiday projects. Memory is another major limitation: “In many agents, memory is either nonexistent or fragile. You need systems that can maintain a coherent model of your work context, priorities, and constraints.”
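
To make the arithmetic behind that 95% figure concrete, here is a minimal sketch. It assumes each step succeeds independently and that a single failed step derails the whole run; both are simplifications for illustration, not claims Anadkat made, and the function name is hypothetical.

```python
def chain_success_probability(step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumes steps are independent and that one failure ruins the run
    (an illustrative simplification, not a model of any real agent).
    """
    return step_accuracy ** steps


if __name__ == "__main__":
    p = chain_success_probability(0.95, 20)
    print(f"{p:.1%}")  # prints "35.8%"
```

Under those assumptions, a 20-step workflow finishes cleanly only about a third of the time, which is one way to read "chaotic."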

That doesn’t mean the promise of AI agents is all smoke and mirrors, according to Yoav Shoham, a former principal scientist at Google, a professor emeritus at Stanford, and cofounder of AI21 Labs. But it does mean there is the danger of people getting ahead of themselves. Today’s AI agents, he explained, work best when the task is low-risk, loosely defined, and cheap to get wrong.

“Developers like toys, and you have this toy that can do wonderful things,” he told Fortune. “As long as what they’re doing is fairly simple and fairly low stakes with high tolerance for error, that’s fine.” For example, you might have your agent read 10,000 websites overnight and do something interesting with the results, giving you tidbits of information that could be useful.

But for mission-critical enterprise workflows, the bar is much higher. Companies need systems that are verifiable, repeatable, and cost-effective—requirements that quickly erode the set-it-and-forget-it promise of fully autonomous, always-on agents. In highly structured domains like coding or math, deeper automation is already possible. But for most real-world business processes, Shoham says, the work required to make agents reliable often outweighs the benefit.

Bret Greenstein, chief AI officer at consulting firm West Monroe, said tools like OpenClaw feel like a tipping point similar to the one generative AI reached when ChatGPT launched in 2022: for the first time, they have made the idea of AI agents accessible. Still, OpenClaw is not a 24/7 “magic solution.”

“It can work for a long time, cranking away on things, but it’s like a toddler that needs to be overseen,” he said. Some tasks are reasonable to do while you are sleeping, like scanning LinkedIn messages or tracking news. “I’m not sure I would have it answering customer feedback while I’m sleeping,” he said. 

Ability to delegate to an AI agent feels powerful

Still, there is little doubt that the ability to delegate real-world tasks to an AI agent is deeply compelling for users, Greenstein emphasized. He pointed to his own experience handing an AI agent the mundane task of getting his clothes picked up to be dry cleaned—and watching it quietly complete the job end to end.

The agent independently contacted the cleaner, worked out pickup logistics through email exchanges, coordinated timing, monitored a doorbell camera to confirm the pickup, and notified Greenstein once the task was complete. The episode illustrated how agents can operate across multiple systems and adapt when things don’t go as planned. But it also underscored why such tools still require strict guardrails and oversight—especially before they are deployed in enterprise settings.

“OpenClaw is set up so it shouldn’t feel safe for most people,” Greenstein said. “It doesn’t feel mature enough to be a trusted part of our lives yet.” For AI to be welcomed into everyday life or business operations, he added, it has to earn trust over time—much the way trust is established socially.

Even so, demand is already evident. Greenstein pointed to meetups and early industry gatherings dedicated to OpenClaw, a rapid emergence he described as unusual for such a young tool. “It shows the hunger people have for AI that’s actually useful,” he said—systems that move beyond answering questions and start taking action.

Aaron Levie, CEO of cloud-based content management and collaboration company Box, called what is happening now with AI agents “little glimmers” of what might happen in the future. 

“Some glimmers end up not manifesting, some glimmers just become the standard,” he explained, pointing to two years ago when AI company Cognition introduced an early agent called Devin that would integrate with Slack for task delegation, bug fixes, data analysis, and code review. At the time, it was still seen as futuristic, but today, “no one is confused that this is a standard practice,” he said. “You can just Slack Claude Code to go work on stuff—what seemed like a totally crazy idea is now basically the standard of any modern engineering team.” 

But while AI agents are becoming very good at automating specific, discrete tasks, they remain poor at handling the broader, context-heavy work that makes up most jobs, Levie emphasized. AI agents may fully automate a handful of tasks, but struggle with the rest—including navigating relationships and participating in meetings. 

“When you hear an AI lab say we’re going to automate all knowledge work in 24 months, that’s usually a very narrow definition of jobs,” he said. “The definition of what an agent can do is not the same definition of what the job is that gets hired in the economy.” 

The trust factor matters for when things can go wrong

Avinash Vootkuri, a staff data scientist at a top Fortune 500 retailer, said that most enterprise AI agents “absolutely require a babysitter” and, for now, can work only with tightly bounded autonomy and extensive guardrails. “The stakes are massive,” he explained.

For example, he described building an agentic system for enterprise cybersecurity where AI agents don’t simply trigger alerts and wait for human review but actively investigate them. Instead of flooding analysts with thousands of warnings, the agents gather evidence in real time—querying threat-intelligence databases, analyzing behavioral patterns, and filtering out false positives—before deciding whether a situation warrants escalation. 

The system relies on tightly bounded autonomy and extensive guardrails, reducing human workload without removing oversight.

In cybersecurity, he explained, if the agent gets it wrong, the consequences are immediate and severe. “The AI either blocks legitimate customers (causing massive revenue loss) or it lets a sophisticated threat actor into the network,” he said. “It absolutely matters if things go wrong.” 

According to Breeanna Whitehead, who runs an AI operations consultancy where she builds AI-powered systems for executives and founders, the industry is in a “trust calibration phase.” 

AI agents can do more than most people let them, but less than the hype suggests. 

“The real skill isn’t building the agent—it’s designing the handoff,” she explained. “Most people either over-trust agents and end up cleaning up messes, or they micromanage every output and wonder why AI feels like more work instead of less.” The idea, she said, is to design clear handoff points, where something might be fully delegated, another thing might get a quick review, while another task stays just for humans to do. 
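
The three handoff tiers Whitehead describes could be sketched as a simple routing table. This is a hypothetical illustration: the task names, the tier labels, and the `route` function are assumptions made for the example, not part of any tool or system mentioned in this article.

```python
from enum import Enum


class Handoff(Enum):
    DELEGATE = "fully delegated to the agent"
    REVIEW = "agent drafts, human gives a quick review"
    HUMAN_ONLY = "stays with a human"


# Hypothetical policy table; a real one would reflect each team's own risks.
POLICY = {
    "summarize_meeting_notes": Handoff.DELEGATE,
    "draft_follow_up_email": Handoff.REVIEW,
    "send_investor_update": Handoff.HUMAN_ONLY,
}


def route(task: str) -> Handoff:
    # Tasks missing from the table fall back to the most conservative tier.
    return POLICY.get(task, Handoff.HUMAN_ONLY)
```

Defaulting unknown tasks to the human-only tier is a design choice in this sketch: anything nobody has explicitly decided to delegate stays with a person.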

For now, she said, agents are “genuinely excellent” at what she called the middle layer of knowledge work: “the stuff that used to eat two to three hours of a smart person’s day, like synthesizing meeting notes into action items, drafting follow-up emails in someone’s voice, pulling together research briefs, organizing competing priorities into a clear plan.”

But anything that requires reading a room, navigating ambiguity, or making judgment calls that depend on relationships is not ready for AI agent prime time. “I had a client who wanted to fully automate their investor communications,” she said. “The AI could draft beautifully, but it couldn’t sense when a funder was losing interest and needed a different approach. The agent drafted the email, but the human had to decide whether to send it.”

For now, sleep may be elusive when working with AI agents

For now, working with AI agents may have less to do with sleeping while they work than with staying half-awake while they do. Tools like OpenClaw can run for hours at a time, but for many early users, that autonomy comes with a new kind of vigilance—checking logs, reviewing outputs, and stepping in before things go wrong.

That dynamic was captured in a recent viral post titled Token Anxiety, in which investor Nikunj Kothari described a friend leaving a party early—not because he was tired, but because he wanted to get back to his agents. “Nobody questions it anymore,” Kothari wrote. “Half the room is thinking the same thing. The other half are probably checking the progress of their agents. At a party.”

The dream of AI that works while you sleep may be real. But for now, it’s still keeping a lot of people awake.

About the Author

Sharon Goldman is an AI reporter at Fortune and co-authors Eye on AI, Fortune’s flagship AI newsletter. She has written about digital and enterprise tech for over a decade.
