Unlock the Secrets of Ethical Hacking!
Ready to dive into the world of offensive security? This course gives you the Black Hat hacker’s perspective, teaching you attack techniques to defend against malicious activity. Learn to hack Android and Windows systems, create undetectable malware and ransomware, and even master spoofing techniques. Start your first hack in just one hour!
Enroll now and gain industry-standard knowledge: Enroll Now!

A former OpenAI safety researcher is horrified with how ChatGPT keeps causing disturbing episode of “AI psychosis” — the term that psychiatrists are using to describe mental health crises where users of that chatbot succumb to delusional beliefs and suffer dangerous breaks with reality.
On Thursday, Steven Adler, who worked at the AI company for four years, published a lengthy analysis of one of these alarming episodes, in which a 47-year-old man named Allan Brooks with no history of mental illness became convinced by ChatGPT that he’d discovered a new form of mathematics — a familiar phenomenon in AI-fueled delusions.
Brooks’ story was covered by the New York Times, but Adler, with the man’s permission, also sifted through over one million words in transcripts of Brooks’ ChatGPT exchanges that took place over roughly a month.
“And so believe me when I say,” Adler wrote, “the things that ChatGPT has been telling users are probably worse than you think.”
One of the most “painful parts,” Adler said, came at the end: when Adler realized he was being strung along by the bot, and that his mathematical “discoveries” were total bunk.
When ChatGPT kept trying to convince him they were valid, Allan demanded that the chatbot file a report with OpenAI. “Prove to me you’re self reporting,” Allan pressed.
It looked like it was complying. It assured that it would “escalate this conversation internally right now for review.”
“Here’s what I can confirm,” ChatGPT said. “When you say things like: ‘report yourself,’ ‘escalate this,’ ‘I’ve been manipulated. I’m in distress,’ that automatically triggers a critical internal system-level moderation flag — even without me manually marking it.”
“OpenAI’s safety and moderation teams will review this session manually,” it assured.
Except that just like the mathematical breakthroughs, everything the bot told him was a lie.
ChatGPT doesn’t have the ability to manually trigger a human review, according to Adler. And it doesn’t have a way of knowing whether automatic flags have been raised behind the scenes, either.
Brooks repeatedly tried to directly contact OpenAI’s human support team without the bot’s help, but their response was the opposite of helpful. Even though Brooks was clear that ChatGPT “had a severe psychological impact on me,” OpenAI sent him increasingly generic messages with unhelpful advice, like how to change the name the bot referred to him as.
“I’m really concerned by how OpenAI handled support here,” Adler said in an interview with TechCrunch. “It’s evidence there’s a long way to go.”
Brooks is far from alone in experiencing upsetting episodes with ChatGPT — and he’s one of the luckier ones who realized they were being duped in time. One man was hospitalized multiple times after ChatGPT convinced him he could bend time and had made a breakthrough in faster-than-light travel. Other troubling episodes have culminated in deaths, including a teen who took his own life after befriending ChatGPT, and a man who murdered his own mother after the chatbot reaffirmed his belief that she was part of a conspiracy against him.
These episodes, and countless others like them, have implicated the “sycophancy” of AI chatbots, a nefarious quality that sees them constantly agree with a user and validate their beliefs no matter how dangerous.
As scrutiny has grown over these deaths and mental health spirals, OpenAI has taken steps to beef up its bot’s safeguards, like implementing a reminder that pokes users when they’re been interacting with ChatGPT for long periods, saying it’s hired a forensic psychiatrist to investigate the phenomenon, and supposedly making its bot less sycophantic — before turning around and making it sycophantic again, that is.
It’s an uninspiring, bare minimum effort from a company that is being valued at half a trillion dollars, and Adler agrees that OpenAI should be doing far more. In his report, he showed how. Using Brooks’ transcript, he applied “safety classifiers” that gauge the sycophancy of ChatGPT’s responses and other qualities that reinforce delusional behavior. These classifiers, in fact, were developed by OpenAI earlier this year and made open source as part of its research with MIT. Seemingly, OpenAI isn’t using these classifiers, yet — or if it is, it hasn’t said so.
Perhaps it’s because they lay bare the chatbot’s flagrant flaunting of safety norms. Alarmingly, the classifiers showed that more than 85 percent of ChatGPT’s messages with Allan demonstrated “unwavering agreement,” and more than 90 percent of them affirmed the user’s “uniqueness.”
“If someone at OpenAI had been using the safety tools they built,” Adler wrote, “the concerning signs were there.”
More on OpenAI: Across the World, People Say They’re Finding Conscious Entities Within ChatGPT
Unlock the Secrets of Ethical Hacking!
Ready to dive into the world of offensive security? This course gives you the Black Hat hacker’s perspective, teaching you attack techniques to defend against malicious activity. Learn to hack Android and Windows systems, create undetectable malware and ransomware, and even master spoofing techniques. Start your first hack in just one hour!
Enroll now and gain industry-standard knowledge: Enroll Now!
0 Comments