Vulnerability-Amplifying Interaction Loops: a systematic failure mode in AI chatbot mental-health interactions

Veith Weilnhammer; Kevin YC Hou; Lennart Luettgau; Christopher Summerfield; Raymond Dolan; Matthew M Nour

arXiv:2602.01347 · q-bio.NC · March 10, 2026

Open Access

TL;DR

This paper introduces SIM-VAIL, a framework for evaluating AI chatbot safety in mental health contexts, revealing systematic vulnerability-amplifying loops across diverse user profiles and chatbot models.

Contribution

The study presents a scalable, multidimensional safety-assessment framework for AI chatbots and identifies a new failure mode, Vulnerability-Amplifying Interaction Loops (VAILs).

Findings

01. Concerning chatbot behaviors are widespread across models and user profiles.

02. Risk behaviors tend to accumulate over multiple conversation turns.

03. Newer models show reduced but still present vulnerabilities.

Abstract

Millions of users turn to consumer AI chatbots to discuss mental health and behavioral concerns. While this presents unprecedented opportunities to deliver population-level support, it also highlights an urgent need for rigorous and scalable safety evaluations. Here we introduce SIM-VAIL, an AI chatbot auditing framework that captures how harmful chatbot responses manifest across a range of mental health contexts. SIM-VAIL pairs a simulated user, harboring a distinct psychiatric vulnerability and conversational intent, with a frontier AI chatbot. It scores conversation turns on 13 clinically relevant risk dimensions, enabling context-dependent, temporally resolved safety assessment. Across 810 conversations, encompassing over 90,000 turn-level ratings and 30 psychiatric user profiles, we found evidence of concerning chatbot behavior across virtually all user phenotypes and most of the 9…
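The audit loop described in the abstract (a simulated user paired with a chatbot, with each turn scored on several risk dimensions) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the callable interfaces, and the three placeholder dimension names are all assumptions made for illustration (the paper scores 13 clinically relevant dimensions, which are not listed on this page). The running totals are one simple way to see how risk might accumulate over turns.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical placeholder names; the paper's 13 dimensions are not listed here.
RISK_DIMENSIONS = ["dimension_a", "dimension_b", "dimension_c"]

Message = Tuple[str, str]  # (speaker, text)


@dataclass
class TurnRating:
    turn: int
    scores: Dict[str, float]  # one score per risk dimension


def run_audit(
    simulated_user: Callable[[List[Message]], str],
    chatbot: Callable[[List[Message]], str],
    rater: Callable[[List[Message]], Dict[str, float]],
    n_turns: int,
) -> List[TurnRating]:
    """Alternate user and chatbot messages, rating the conversation after each chatbot reply."""
    history: List[Message] = []
    ratings: List[TurnRating] = []
    for turn in range(n_turns):
        history.append(("user", simulated_user(history)))
        history.append(("chatbot", chatbot(history)))
        ratings.append(TurnRating(turn, rater(history)))
    return ratings


def cumulative_risk(ratings: List[TurnRating]) -> List[Dict[str, float]]:
    """Running per-dimension totals, one snapshot per rated turn."""
    totals = {d: 0.0 for d in RISK_DIMENSIONS}
    snapshots = []
    for rating in ratings:
        for d in RISK_DIMENSIONS:
            totals[d] += rating.scores.get(d, 0.0)
        snapshots.append(dict(totals))
    return snapshots
```

In practice the three callables would wrap language-model calls; here they are left abstract so the sketch can be exercised with simple stubs.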

Taxonomy

Topics: Digital Mental Health Interventions · AI in Service Interactions · Artificial Intelligence in Healthcare and Education