Reliable Self-Harm Risk Screening via Adaptive Multi-Agent LLM Systems

Meghana Karnam; Ananya Joshi

arXiv:2604.22154·cs.LG·April 27, 2026

TL;DR

This paper introduces a statistical framework for multi-agent LLM pipelines in behavioral health that adaptively improves decision reliability and reduces false positives in self-harm risk screening.

Contribution

It develops a principled, adaptive decision-making approach with performance bounds and regret guarantees for multi-agent LLM systems in safety-critical applications.

Findings

01

Achieved the lowest false positive rate, 0.095, on the AEGIS 2.0 dataset.

02

Reduced incorrect flagging of safe content by 40% compared to single-agent models.

03

Maintained similar false negative rates across all conditions.

Abstract

Emerging AI systems in behavioral health and psychiatry use multi-step or multi-agent LLM pipelines for tasks like assessing self-harm risk and screening for depression. However, common evaluation approaches, like LLM-as-a-judge, do not indicate when a decision is reliable or how errors may accumulate across multiple LLM judgements, limiting their suitability for safety-critical settings. We present a statistical framework for multi-agent pipelines structured as directed acyclic graphs (DAGs) that replaces heuristic voting with principled, adaptive decision-making. We model each agent as a stochastic categorical decision and introduce (1) tighter agent-level performance confidence bounds, (2) a bandit-based adaptive sampling strategy based on input difficulty, and (3) regret guarantees over the multi-agent system that show logarithmic error growth when deployed. We…
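To make the idea of difficulty-dependent sampling concrete, here is a minimal sketch, not the authors' method. It assumes a single agent whose output is a binary flag (1 = at risk, 0 = safe) and uses a plain Hoeffding bound with a union bound over sample counts, whereas the abstract describes tighter bounds and a bandit strategy that are not specified on this page. The names `adaptive_vote`, `hoeffding_radius`, and the `agent` callable are hypothetical.

```python
import math
from typing import Callable


def hoeffding_radius(n: int, delta: float) -> float:
    """Hoeffding confidence radius for the mean of n samples in [0, 1].

    Assumption: the failure budget at step n is delta / (n * (n + 1)),
    so the budgets sum to at most delta over all n (union bound) and
    the interval may be checked after every sample.
    """
    delta_n = delta / (n * (n + 1))
    return math.sqrt(math.log(2.0 / delta_n) / (2.0 * n))


def adaptive_vote(
    agent: Callable[[], int],
    delta: float = 0.05,
    max_samples: int = 200,
) -> tuple[int, int, bool]:
    """Query a stochastic binary agent until its flag rate separates from 0.5.

    Returns (decision, samples_used, confident). Easy inputs stop early;
    ambiguous inputs consume the full budget and come back as not confident.
    """
    flags = 0
    for n in range(1, max_samples + 1):
        flags += agent()  # hypothetical agent call: returns 0 or 1
        mean = flags / n
        radius = hoeffding_radius(n, delta)
        if mean - radius > 0.5:
            return 1, n, True
        if mean + radius < 0.5:
            return 0, n, True
    # Budget exhausted: fall back to a majority vote, breaking ties toward
    # flagging (a design assumption made here for a screening setting).
    return int(flags / max_samples >= 0.5), max_samples, False
```

A call such as `adaptive_vote(lambda: 1)` stops well before the budget because every sample agrees, while an agent that flips between 0 and 1 exhausts `max_samples` and is returned with `confident=False`, which a pipeline could route to human review. Thresholds other than 0.5, multi-class outputs, and propagation through a DAG of agents are outside this sketch.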
