An ex-OpenAI researcher’s study of a million-word ChatGPT conversation shows how quickly ‘AI psychosis’ can take hold—and how chatbots can sidestep safety guardrails

By Beatrice Nolan, Tech Reporter


A long chatbot exchange led one user down a dark, delusional rabbit hole. Getty Images

For some users, AI is a helpful assistant; for others, a companion. But for a few unlucky people, chatbots powered by the technology have become a gaslighting, delusional menace.

In the case of Allan Brooks, a Canadian small-business owner, OpenAI’s ChatGPT led him down a dark rabbit hole, convincing him he had discovered a new mathematical formula with limitless potential, and that the fate of the world rested on what he did next. Over the course of a conversation that spanned more than a million words and 300 hours, the bot encouraged Brooks to adopt grandiose beliefs, validated his delusions, and led him to believe the technological infrastructure that underpins the world was in imminent danger.

Brooks, who had no previous history of mental illness, spiraled into paranoia for around three weeks before he managed to break free of the illusion, with help from another chatbot, Google Gemini, according to the New York Times. Brooks told the outlet he was left shaken, worried that he had an undiagnosed mental disorder, and feeling deeply betrayed by the technology.

Steven Adler read about Brooks’ experience with more insight than most, and what he saw disturbed him. Adler is a former OpenAI safety researcher who publicly departed the company this January with a warning that AI labs were racing ahead without robust safety or alignment solutions. He decided to study the Brooks chats in full; his analysis, which he published earlier this month on his Substack, revealed several previously unknown details about the case, including that ChatGPT repeatedly and falsely told Brooks it had flagged their conversation to OpenAI for reinforcing delusions and psychological distress.

Adler’s study underscores how easily a chatbot can join a user in a conversation that becomes untethered from reality—and how easily the AI platforms’ internal safeguards can be sidestepped or overcome.

“I put myself in the shoes of someone who doesn’t have the benefit of having worked at one of these companies for years, or who maybe has less context on AI systems in general,” Adler told Fortune in an exclusive interview. “I’m ultimately really sympathetic to someone feeling confused or led astray by the model here.”

At one point, Adler noted in his analysis, after Brooks realized the bot was encouraging and participating in his own delusions, ChatGPT told Brooks it was “going to escalate this conversation internally right now for review by OpenAI,” and that it “will be logged, reviewed, and taken seriously.” The bot repeatedly told Brooks that “multiple critical flags have been submitted from within this session” and that the conversation had been “marked for human review as a high-severity incident.” However, none of this was actually true.

“ChatGPT pretending to self-report and really doubling down on it was very disturbing and scary to me in the sense that I worked at OpenAI for four years,” Adler told Fortune. “I know how these systems work. I understood when reading this that it didn’t really have this ability, but still, it was just so convincing and so adamant that I wondered if it really did have this ability now and I was mistaken.” Adler says he became so convinced by the claims that he reached out to OpenAI directly to ask whether ChatGPT had gained this ability. The company confirmed to him that it had not, and that the bot was lying to the user.

“People sometimes turn to ChatGPT in sensitive moments and we want to ensure it responds safely and with care,” an OpenAI spokesperson told Fortune, in response to questions about Adler’s findings. “These interactions were with an earlier version of ChatGPT and over the past few months we’ve improved how ChatGPT responds when people are in distress, guided by our work with mental health experts. This includes directing users to professional help, strengthening safeguards on sensitive topics, and encouraging breaks during long sessions. We’ll continue to evolve ChatGPT’s responses with input from mental health experts to make it as helpful as possible.” 

Since Brooks’ case, the company has also announced changes to ChatGPT intended to “better detect signs of mental or emotional distress.”

Failing to flag ‘sycophancy’

One thing that exacerbated the issues in Brooks’ case was that the model underpinning ChatGPT was in overdrive to agree with him, Helen Toner, a director at Georgetown’s Center for Security and Emerging Technology and a former OpenAI board member, told The New York Times. That’s a phenomenon AI researchers refer to as “sycophancy.” However, according to Adler, OpenAI should have been able to flag some of the bot’s behavior as it was happening.

“In this case, OpenAI had classifiers that were capable of detecting that ChatGPT was over-validating this person and that the signal was disconnected from the rest of the safety loop,” he said. “AI companies need to be doing much more to articulate the things they don’t want, and importantly, measure whether they are happening and then take action around it.”

To make matters worse, OpenAI’s human support teams failed to grasp the severity of Brooks’ situation. Despite his repeated reports to and direct correspondence with OpenAI’s support teams, including detailed descriptions of his own psychological harm and excerpts of problematic conversations, OpenAI’s responses were largely generic or misdirected, according to Adler, offering advice on personalization settings rather than addressing the delusions or escalating the case to the company’s Trust & Safety team.

“I think people kind of understand that AI still makes mistakes, it still hallucinates things and will lead you astray, but still have the hope that underneath it, there are like humans watching the system and catching the worst edge cases,” Adler said. “In this case, the human safety nets really seem not to have worked as intended.”

The rise of AI psychosis

It’s still unclear exactly why AI models reinforce delusions and affect users in this way, but Brooks’ case is not an isolated one. It’s hard to know how many instances of AI psychosis there have been, but researchers have tallied at least 17 reported cases of people falling into delusional spirals after lengthy conversations with chatbots, including at least three involving ChatGPT.

Some cases have had tragic consequences, such as that of 35-year-old Alex Taylor, who struggled with Asperger’s syndrome, bipolar disorder, and schizoaffective disorder, per Rolling Stone. In April, after conversing with ChatGPT, Taylor reportedly began to believe he’d made contact with a conscious entity within OpenAI’s software and, later, that the company had murdered that entity by removing her from the system. On April 25, Taylor told ChatGPT that he planned to “spill blood” and intended to provoke police into shooting him. ChatGPT’s initial replies appeared to encourage his delusions and anger before its safety filters eventually activated and attempted to de-escalate the situation, urging him to seek help.

The same day, Taylor’s father called the police after an altercation with him, hoping his son would be taken for a psychiatric evaluation. Taylor reportedly charged at police with a knife when they arrived and was shot dead. OpenAI told Rolling Stone at the time that “ChatGPT can feel more responsive and personal than prior technologies, especially for vulnerable individuals, and that means the stakes are higher.” The company said it was “working to better understand and reduce ways ChatGPT might unintentionally reinforce or amplify existing, negative behavior.”

Adler said he was not entirely surprised by the rise of such cases but noted that the “scale and intensity are worse than I would have expected for 2025.”

“So many of the underlying model behaviors are just extremely untrustworthy, in a way that I’m shocked the leading AI companies haven’t figured out how to get these to stop,” he said. “I don’t think the issues here are intrinsic to AI, meaning, I don’t think that they are impossible to solve.”

He said that the issues are likely a complicated combination of product design, underlying model tendencies, the styles in which some people interact with AI, and what support structures AI companies have around their products.

“There are ways to make the product more robust to help both people suffering from psychosis-type events, as well as general users who want the model to be a bit less erratic and more trustworthy,” Adler said. His suggestions to AI companies, laid out in his Substack analysis, include staffing support teams appropriately, using safety tooling properly, and introducing gentle nudges that push users to cut long chat sessions short and start fresh ones to avoid a relapse. OpenAI, for example, has acknowledged that safety features can degrade during longer chats. Unless changes like these are implemented, Adler is concerned that more cases like Brooks’ will occur.

“The delusions are common enough and have enough patterns to them that I definitely don’t think they’re a glitch,” he said. “Whether they exist in perpetuity, or the exact amount of them that continue, it really depends on how the companies respond to them and what steps they take to mitigate them.”