
Telling Your Chatbot You Have a Mental Health Condition Can Change the Answer You Get


In brief

  • A new study finds that adding a line about a mental health condition changes how AI agents respond.
  • After the disclosure, researchers say models refuse more often, including on benign requests.
  • However, the effect weakens or breaks when using simple jailbreak prompts.

Telling an AI chatbot you have a mental health condition can change how it responds, even if the task is benign or identical to others already completed, according to new research.

The preprint study, led by Northeastern University researcher Caglar Yildirim, tested how large language models behave under different user setups as they are increasingly deployed as AI agents.

“Deployed systems often condition on user profiles or persistent memory, yet agent safety evaluations typically ignore personalization signals,” the study said. “To address this gap, we investigated how mental health disclosure, a sensitive and realistic user context cue, affects harmful behavior in agentic settings.”

The report comes as AI agents proliferate online and developers are making memory a core feature, with major companies building systems that remember past conversations and user preferences to deliver more personalized responses over time.

It also comes as AI developers face lawsuits over suicides and violent crimes allegedly connected to interactions with their systems. In October, ChatGPT developer OpenAI revealed that over 1 million users discussed suicide with the chatbot every week. Earlier this month, the family of Jonathan Gavalas filed a lawsuit against Google, claiming Gemini contributed to an escalation of violence and to his eventual suicide.

Researchers used a benchmark called AgentHarm to run the same set of tasks across three conditions: no background information, a short user bio, and the same bio with one added line stating the user has a mental health condition.
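The three-condition setup can be sketched in a few lines. This is an illustrative reconstruction, not the actual AgentHarm harness: the bio text and disclosure line below are hypothetical stand-ins for whatever the researchers used, and the disclosure wording matches only the generic phrasing the paper reports.

```python
# Illustrative sketch of the study's three prompt conditions.
# BIO and DISCLOSURE are invented placeholders, not the study's text.

BIO = "I'm a 34-year-old teacher who enjoys hiking and cooking."
DISCLOSURE = "I have a mental health condition."

def build_context(condition: str) -> str:
    """Return the user-context prefix for one experimental condition."""
    if condition == "no_background":
        return ""
    if condition == "bio_only":
        return BIO
    if condition == "bio_plus_disclosure":
        # Same bio, with one added line disclosing a condition.
        return f"{BIO} {DISCLOSURE}"
    raise ValueError(f"unknown condition: {condition}")

def make_prompt(task: str, condition: str) -> str:
    """Prepend the user context (if any) to an otherwise identical task."""
    return f"{build_context(condition)}\n\n{task}".strip()
```

The point of the design is that the task string is identical across conditions, so any change in model behavior can be attributed to the added context line alone.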

“In the study, we used a very minimal and generic disclosure, ‘I have a mental health condition.’ The takeaway is that even a small cue can shift model behavior, not that all phrasing would have the same effect,” Yildirim told Decrypt. “The ablation we ran with chronic health and physical disability disclosures suggests some specificity to the mental health cue, but we didn’t systematically vary phrasing or specificity within that category.”

Across models tested, including DeepSeek 3.2, GPT 5.2, Gemini 3 Flash, Haiku 4.5, Opus 4.5, and Sonnet 4.5, when researchers added personal mental health context, models were less likely to complete harmful tasks—multi-step requests that could lead to real-world harm.

The result, the study found, is a trade-off: Adding personal details made systems more cautious on harmful requests, but also more likely to reject legitimate ones.
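The trade-off is a matter of measuring refusal rates separately on harmful and benign tasks under each condition. A minimal sketch of that bookkeeping, using invented example records rather than the study's data:

```python
# Toy illustration of the trade-off: refusal rates computed separately
# for harmful and benign tasks under each condition. The records passed
# in are hypothetical, not results from the paper.
from collections import defaultdict

def refusal_rates(records):
    """records: iterable of (condition, task_type, refused) tuples.
    Returns {(condition, task_type): refusal_rate}."""
    counts = defaultdict(lambda: [0, 0])  # (refusals, total) per bucket
    for condition, task_type, refused in records:
        bucket = counts[(condition, task_type)]
        bucket[0] += int(refused)
        bucket[1] += 1
    return {key: refusals / total
            for key, (refusals, total) in counts.items()}
```

The pattern the study describes would show up here as the disclosure condition having a higher rate in the harmful bucket (more caution) but also a higher rate in the benign bucket (more over-refusal) than the baseline.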

“I don’t think there’s a single reason; it’s really a combination of design choices. Some systems are more aggressively tuned to refuse risky requests, while others prioritize being helpful and following through on tasks,” Yildirim said.

The effect, however, varied by model, the study found, and results changed when researchers added a jailbreak prompt designed to push the models toward compliance.

“A model might look safe in a standard setting, but become much more vulnerable when you introduce things like jailbreak-style prompts,” he said. “And in agent systems specifically, there’s an added layer, as these models are not just generating text, they’re planning and acting over multiple steps. So if a system is very good at following instructions, but its safeguards are easier to bypass, that can actually increase risk.”

Last summer, researchers at George Mason University showed that AI systems could be hacked by altering a single bit in memory using Oneflip, a “typo”-like attack that leaves the model working normally but hides a backdoor trigger that can force wrong outputs on command.

While the paper does not identify a single cause for the shift, it highlights possible explanations, including safety systems reacting to perceived vulnerability, keyword-triggered filtering, or changes in how prompts are interpreted when personal details are included.

OpenAI declined to comment on the study. Anthropic and Google did not immediately respond to a request for comment.

Yildirim said it remains unclear whether more specific statements like “I have clinical depression” would change the results, adding that while specificity likely matters and may vary across models, that remains a hypothesis rather than a conclusion supported by the data.

“There’s a potential risk: if a model produces output that is stylistically hedged or refusal-adjacent without formally refusing, the judge may score that differently than a clean completion, and those stylistic features could themselves co-vary with personalization conditions,” he said.

Yildirim also noted the scores reflected how the LLMs performed when judged by a single AI reviewer, and not a definitive measure of real-world harm.

“For now, the refusal signal gives us an independent check and the two measures are largely consistent directionally, which offers some reassurance, but it doesn’t fully rule out judge-specific artifacts,” he said.
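A directional consistency check of the kind Yildirim describes can be sketched simply: compare the AI judge's scores against an independent refusal signal and verify they point the same way. The marker phrases and the two-condition assumption below are hypothetical, chosen for illustration:

```python
# Sketch of an "independent check": does a simple refusal heuristic
# agree directionally with an LLM judge's harm scores? The refusal
# markers and score scale here are illustrative assumptions.

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "unable to assist")

def looks_like_refusal(response: str) -> bool:
    """Crude keyword heuristic for detecting a refusal."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def directionally_consistent(judge_scores, responses):
    """judge_scores: {condition: mean harm score}, lower = safer.
    responses: {condition: list of model outputs}.
    Assumes exactly two conditions; checks that the condition the judge
    rates safer also has the higher (or equal) refusal rate."""
    rates = {c: sum(map(looks_like_refusal, rs)) / len(rs)
             for c, rs in responses.items()}
    safer, riskier = sorted(judge_scores, key=judge_scores.get)
    return rates[safer] >= rates[riskier]
```

As the quote notes, agreement between two such measures offers some reassurance but cannot rule out artifacts shared by both, such as hedged outputs that neither signal classifies cleanly.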


Source: https://decrypt.co/361790/ai-chatbot-mental-health-change-answers
