AI startup Anthropic has given some of its Claude models the ability to end conversations with users in rare cases where an interaction becomes potentially harmful or abusive.
The move is part of a wider commitment Anthropic has made in relation to model welfare and follows a dedicated research program on the same topic announced earlier this year.
With users now interacting with chatbots regularly, and experts raising questions about how chatbots “think,” the wider implications for the safety of users and businesses alike are worth considering.
Some Claude AI Models Can Now End ‘Troublesome’ Conversations
Anthropic has announced that two of its Claude AI models — Claude Opus 4 and 4.1 — can now end conversations with users in “rare, extreme cases of persistently harmful or abusive user interactions.”
The chatbot can only use this ability “as a last resort when multiple attempts at redirection have failed and hope of a productive interaction has been exhausted.”
When a conversation is ended, users are still able to start new conversations with Claude from the same account, or create new branches within the closed conversation by editing their previous messages. The chatbot has been directed not to use this ability in cases where a user may be at risk of harming themselves or others.
Anthropic’s Commitment to Model Welfare
According to Anthropic, the feature has been implemented as part of its “exploratory work on potential AI welfare,” which already includes a dedicated research program, announced earlier this year, exploring what it could mean when a model exhibits distress or anxiety.
While this kind of update could certainly benefit the safety of users, particularly given eerie stories about chatbots pushing users towards conspiratorial thinking, Anthropic appears to be equally concerned about the impact these interactions could have on its models.
Fundamentally, the move has fueled debate about the sentience of AI models, with experts suggesting that current models are not capable of consciousness or of reckoning with the human experience.
Should AI Models Be Morally Protected?
AI can quite easily be “jailbroken,” meaning its ethical, security, or operational constraints are overridden, leading to restricted or unethical outputs. A recent study posted to arXiv highlighted the seriousness of the problem, and how AI companies were lagging behind when it came to safeguarding users from dangerous responses.
However, while the fallibility of AI chatbots is well documented, little attention has been paid to the moral status of the chatbots themselves. Despite this, Anthropic appears to be curious about the issue.
“We remain highly uncertain about the potential moral status of Claude and other LLMs, now or in the future. However, we take the issue seriously.” – Anthropic spokesperson
Likewise, anyone using chatbots, whether independently or as part of a business, should be aware of the legal and reputational problems that could arise if chatbots remain susceptible to giving away harmful or dangerous information.