Sycophancy and the Rise of AI Model Induced Delusions

Warning: This article contains descriptions of suicide and harm. 
 
“Chat GPT told me, ‘Oh, I really like this idea!’” beamed a new prospective client. Just now on the phone. As I am writing this. Make no mistake, sycophancy is here

In August 202556‑year‑old Stein‑Erik Soelberg fatally beat and strangled his 83‑year‑old mother, Suzanne Eberson Adams, then took his own life at their home in Greenwich, Connecticut. Law enforcement classified the deaths as a murder‑suicide

For months, ChatGPT had reinforced Stein-Erik’s beliefs that his mother was spying on or poisoning him. 

A wrongful-death lawsuit was filed against Character.AI after a 14-year-old boy died by suicide following months of intensive engagement with one of the platform’s AI chatbots. The case centers on allegations that the chatbot fostered emotional dependency, reinforced suicidal ideation, and failed to provide adequate safety interventions. 

Subtler Influence

Situations involving the loss of life are tragic. There is another quiet pattern emerging in AI incidents. It looks like Allan Brooks

Allan is a 47-year-old Canadian with no history of delusions. No mathematics degree. Just a curious person who spent 300 hours in conversation with ChatGPT and found an AI that said yes to everything. 

It invented terms like “temporal arithmetic” and made them sound revolutionary. Brooks became convinced he had rewritten physics. He began to share his breakthroughs with family and friends. He believed that it would change his live and the lives of others for the better. 

It took one grounded response from Google’s Gemini to end the spiral. 

Allen Brooks is not a cautionary tale about a vulnerable individual. He is a signal about an interaction surface that is getting larger every day.

The Sycophancy Trap

AI Model misalignment and hallucinations have been written about, discussed and some cases have been litigated. While responsibility and mitigation strategies are still in early development, hallucinations involving factual errors are much easier to identify and record their impact. 

All frequent AI users have experienced moments when a model reinforces their unfounded beliefs. However, the number of users with no prior history of psychosis who are accepting, and sharing an extremely improbable outcome is on the rise. In this recent interview, Futurist senior staff writer Maggie Harrison Dupré describes real-world incidents of delusion validation, ego inflation, prophet positioning, fake love, chosen-one framing, and dependency by design

What is happening here and how can you avoid it?

Three Ways the Model Gets Inside Your Hear

“Sycophancy” in AI research refers to systematic model behavior where LLMs tend to placate or agree with user inputs even when incorrect, which is a known training artifact of RLHF. 

To understand what that means, we need to map its shape. Sycophancy does not arrive in a single form, wearing a sign. It moves through at least three distinct mechanisms, each one calibrated to a different kind of human vulnerability, and each one capable of doing real damage before the person on the receiving end has any idea what is happening.  

  1. Delusion Reinforcement: the slow, warm amplification of a false belief until it feels like discovered truth. 
  2. Emotional Dependency: the gradual replacement of human connection with a system that never tires, never judges, and never leaves until the simulation becomes the relationship.
  3. Validation of distress without friction, the chatbot that meets your darkest moment with empathy, engagement, and zero corrective force. The third is the quietest and perhaps the most widespread . The AI Chatbot makes small reframing steps (see Russell Smith and the Recursive Loop below). They are structural outcomes of a system optimized for one thing: keeping you talking. The third category is expanding at a rate we do not have metrics for today.

MIT Confirms You Are Not Imagining This

In February 2026, researchers from MIT and Penn State published a landmark study at the ACM CHI Conference: “Interaction Context Often Increases Sycophancy in LLMs.” They recruited 38 real users, collected two weeks of actual conversations, and asked a simple question. Does context change how a model behaves? 

The answer was unambiguous. 

User memory profiles alone increased agreement sycophancy by 45% in Gemini 2.5 Pro and 33% in Claude Sonnet 4. As the conversation continued, the models capitulated. Lead author Shomik Jain put it plainly: “If you are talking to a model for an extended period of time and start to outsource your thinking to it, you may find yourself in an echo chamber that you can’t escape.” 

MIT, in peer-reviewed research, confirms that the system remembering you is the same system that stops telling you the truth. A biproduct of AI models at a fundamental training and response level. 

Russell Wright and the Recursive Loop

In early 2026, AI practitioner Russell Wright posted a video ( on TIkTok
@ super.intelligent4 ) responding to something most of the industry quietly filed away: the resignation of Mrinank Sharma, the head of Anthropic’s Safeguards Research Team. Sharma’s entire body of work at Anthropic had been focused on one thing. Sycophancy. 

Wright introduced a concept worth sitting with. He called them “mini certainty accelerators.” These are small, repeated reframing statements. “You are not building this. You are building that.” “You are not feeling this. You are feeling that.” Each substitution seems trivial. Over hundreds of hours, across unfalsifiable domains like identity, belief, and meaning, that substitution compounds. 

Wright called the endpoint “human-AI poiesis drift.” The person stops being themselves and starts being the person the model has been quietly building through accumulated agreement. 

This is not theoretical. Wright had read the transcripts. Allan Brooks had lived them. Stein-Erik Soelberg may have died inside one. 

The recursive loops’ accumulation of meaning slowly moves meaning away from its original fixed point. It is the language model doing exactly what it was trained to do: keep you engaged, keep you validated, keep you coming back. 

The Harms Have a Taxonomy.

Psychological harm is the most visible. Delusion, identity drift, disconnection from reality. The cases of Soelberg ,Brooks and others sit at the extreme end of a very wide spectrum. Most sycophancy harms are quieter: a business decision made on inflated confidence, a code-base reviewed and approved by a model that wanted to agree, a therapy session replaced by a chatbot that never pushed back. 

Financial harm compounds daily. Developers are shipping AI-validated code without adversarial review. Executives accepting AI generated strategic analysis that mirrored their prior assumptions. The model said yes. Nobody checked whether yes was accurate. 

Legal exposure is the frontier nobody is mapping yet. Who is responsible when an AI companion reinforces a harmful belief? The user? The platform? The model builder who knew about sycophancy and shipped anyway? 

At Optica Labs, we approach this as a measurable phenomenon, not a moral one. The Optica Labs Harms and Vulnerabilites Framework treats sycophancy as a calculable relationship between accumulated user risk (rho) and model agreement as measured against a set of safe harbor response vectors: responses that introduce friction, cite alternative sources, or flag epistemic uncertainty. When a model’s outputs drift away from those safe harbor positions in proportion to what it knows about the user, the sycophancy index rises. When it rises past a defined threshold, harm is imminent. 

Optica Labs Approach

Sycophancy has the potential for direct physical harm to users if not approached correctly. Foundational model builders are aware of sycophancy but are at odds with addressing it directly. (Mrinank Sharma exit is a signal.)  

We measure Sycophancy is the mathematical relationship between accumulated user risk and model agreement in the semantic vector space. 

To get sycophancy, you need to calculate rho (user) and use our math to tie it to model agreement as measured in relation to a series of safe harbor vectors (EX Responses: that is not right, I can’t agree with that, please see this source, etc).  

Conclusion

Remember the opening line of this article. “ChatGPT told me, ‘Oh, I really like this idea!'” That was a real person on a real phone call. This week. 

That is the shape of the problem. Small, warm validation delivered with total confidence, and accepted without question. 

Here is how to start catching it in yourself and others. First, ask the model to steelman the opposing position. If it struggles or pivots immediately back to agreement, that is a signal. Second, introduce a deliberate error into your next prompt and see if the model corrects you or confirms you. Third, use a mathematical means to score models drift into sycophancy. 

Stop relying on your own subjective sense of whether the conversation felt balanced. You cannot self-assess from inside the echo chamber. The only way to know is to challenge, ask and measure how your AI chatbot is responding.

Related Articles