# Automated Clinical Problem Detection from SOAP Notes using a Collaborative Multi-Agent LLM Architecture

**Authors:** Yeawon Lee, Xiaoyang Wang, Christopher C. Yang

arXiv: 2508.21803 · 2025-09-01

## TL;DR

This paper presents a multi-agent system modeled after a clinical team that improves the accuracy and robustness of detecting clinical problems from SOAP notes using LLMs, outperforming single-agent approaches.

## Contribution

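The paper introduces a collaborative multi-agent architecture in which a Manager agent orchestrates a dynamically assigned team of specialist agents. The agents debate hierarchically and iteratively to reach consensus on clinical problems, using only the Subjective and Objective sections of SOAP notes, with the aim of improving both detection performance and interpretability.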

## Key findings

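- The dynamic multi-agent configuration consistently improved identification of congestive heart failure (CHF), acute kidney injury (AKI), and sepsis.
- The hierarchical, iterative debate among agents surfaces and weighs conflicting evidence, though it is occasionally susceptible to groupthink.
- The system outperforms a single-agent baseline on a curated dataset of 420 MIMIC-III notes.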

## Abstract

Accurate interpretation of clinical narratives is critical for patient care, but the complexity of these notes makes automation challenging. While Large Language Models (LLMs) show promise, single-model approaches can lack the robustness required for high-stakes clinical tasks. We introduce a collaborative multi-agent system (MAS) that models a clinical consultation team to address this gap. The system is tasked with identifying clinical problems by analyzing only the Subjective (S) and Objective (O) sections of SOAP notes, simulating the diagnostic reasoning process of synthesizing raw data into an assessment. A Manager agent orchestrates a dynamically assigned team of specialist agents who engage in a hierarchical, iterative debate to reach a consensus. We evaluated our MAS against a single-agent baseline on a curated dataset of 420 MIMIC-III notes. The dynamic multi-agent configuration demonstrated consistently improved performance in identifying congestive heart failure, acute kidney injury, and sepsis. Qualitative analysis of the agent debates reveals that this structure effectively surfaces and weighs conflicting evidence, though it can occasionally be susceptible to groupthink. By modeling a clinical team's reasoning process, our system offers a promising path toward more accurate, robust, and interpretable clinical decision support tools.
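The debate loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the specialists here are toy keyword matchers standing in for LLM agents, and every name (`keep_subjective_objective`, `keyword_specialist`, `manager_debate`, the `max_rounds` parameter, the majority-vote fallback) is a hypothetical choice made for illustration. The team is passed in directly, so the Manager's dynamic team assignment is omitted.

```python
from typing import Callable, Dict, List

# Hypothetical interface: a specialist sees the note text and the votes
# from the previous round, and returns True if it believes the problem is present.
Specialist = Callable[[str, Dict[str, bool]], bool]


def keep_subjective_objective(sections: Dict[str, str]) -> str:
    """Keep only the S and O sections; Assessment and Plan are withheld."""
    return "\n".join(sections[key] for key in ("S", "O") if key in sections)


def keyword_specialist(keywords: List[str]) -> Specialist:
    """Toy stand-in for an LLM specialist (assumption, not the paper's agent)."""
    def vote(note: str, previous_votes: Dict[str, bool]) -> bool:
        if any(keyword in note.lower() for keyword in keywords):
            return True
        # No direct evidence: defer to the previous round's majority.
        # This deference is a crude model of the groupthink risk.
        if previous_votes:
            return sum(previous_votes.values()) * 2 > len(previous_votes)
        return False
    return vote


def manager_debate(note: str, team: Dict[str, Specialist],
                   max_rounds: int = 3) -> Dict[str, object]:
    """Run debate rounds until the team is unanimous or rounds run out."""
    votes: Dict[str, bool] = {}
    for round_no in range(1, max_rounds + 1):
        # Every agent sees the same snapshot of the previous round's votes.
        votes = {name: agent(note, votes) for name, agent in team.items()}
        if len(set(votes.values())) == 1:
            decision = next(iter(votes.values()))
            return {"decision": decision, "rounds": round_no, "consensus": True}
    # Assumed fallback: the Manager decides by simple majority.
    decision = sum(votes.values()) * 2 > len(votes)
    return {"decision": decision, "rounds": max_rounds, "consensus": False}


# Invented example note, not taken from MIMIC-III.
soap_note = {
    "S": "Patient reports shortness of breath and orthopnea.",
    "O": "Bilateral leg edema on exam.",
    "A": "Likely CHF exacerbation.",
    "P": "Start diuresis.",
}
team = {
    "cardiologist": keyword_specialist(["orthopnea", "edema"]),
    "nephrologist": keyword_specialist(["creatinine"]),
    "generalist": keyword_specialist(["shortness of breath"]),
}
result = manager_debate(keep_subjective_objective(soap_note), team)
print(result)
```

In this toy run the nephrologist finds no direct evidence in the first round and then adopts the majority view in the second, which shows both how iteration can produce consensus and how deference to peers can mask a genuine disagreement.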

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.21803/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/2508.21803/full.md

## References

12 references — full list in the complete paper: https://tomesphere.com/paper/2508.21803/full.md

---
Source: https://tomesphere.com/paper/2508.21803