AI-MASLD: Metabolic Dysfunction and Information Steatosis of Large Language Models in Unstructured Clinical Narratives
Yuan Shen, Xiaojun Wu, Linghua Yu

TL;DR
This study systematically evaluates how well large language models extract medical information from noisy clinical narratives. It finds that they exhibit a functional decline analogous to metabolic dysfunction, which raises safety concerns for AI in healthcare.
Contribution
The paper introduces the concept of AI-MASLD and empirically demonstrates the limitations of LLMs in clinical information extraction under noisy, real-world conditions.
Findings
All four tested models showed functional defects when the input contained noise.
Qwen3-Max performed best overall.
Models collapsed under extreme noise.
Abstract
This study aims to simulate real-world clinical scenarios to systematically evaluate the ability of Large Language Models (LLMs) to extract core medical information from patient chief complaints laden with noise and redundancy, and to verify whether they exhibit a functional decline analogous to Metabolic Dysfunction-Associated Steatotic Liver Disease (MASLD). We employed a cross-sectional analysis design based on standardized medical probes, selecting four mainstream LLMs as research subjects: GPT-4o, Gemini 2.5, DeepSeek 3.1, and Qwen3-Max. An evaluation system comprising twenty medical probes across five core dimensions was used to simulate a genuine clinical communication environment. All probes had gold-standard answers defined by clinical experts and were assessed via a double-blind, inverse rating scale by two independent clinicians. The results show that all tested models…
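The scoring setup described in the abstract (probes grouped into dimensions, each rated by two independent clinicians on an inverse scale) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration: the function names, the placeholder model and dimension labels, the numeric values, averaging the two raters, and reading "inverse" as "lower is better". None of it is taken from the paper.

```python
from collections import defaultdict
from statistics import mean


def aggregate_inverse_scores(ratings):
    """Average two raters per probe, then average probes per dimension.

    ratings: iterable of (model, dimension, probe_id, rater_a, rater_b).
    Returns {model: {dimension: mean probe score}}.
    Assumption: on the inverse scale, a lower score means better performance.
    """
    per_dimension = defaultdict(lambda: defaultdict(list))
    for model, dimension, _probe_id, rater_a, rater_b in ratings:
        # Assumed reconciliation rule: simple mean of the two clinicians.
        per_dimension[model][dimension].append((rater_a + rater_b) / 2)
    return {
        model: {dim: mean(scores) for dim, scores in dims.items()}
        for model, dims in per_dimension.items()
    }


def total_score(dimension_means):
    """Sum the per-dimension means into one overall score (lower is better)."""
    return sum(dimension_means.values())


# Placeholder data only; these are NOT results from the study.
example_ratings = [
    ("model_a", "dimension_1", 1, 0, 1),
    ("model_a", "dimension_1", 2, 2, 2),
    ("model_a", "dimension_2", 3, 1, 1),
    ("model_b", "dimension_1", 1, 3, 4),
    ("model_b", "dimension_1", 2, 2, 3),
    ("model_b", "dimension_2", 3, 4, 4),
]

summary = aggregate_inverse_scores(example_ratings)
totals = {model: total_score(dims) for model, dims in summary.items()}
best_model = min(totals, key=totals.get)  # lowest total under the inverse-scale assumption
```

Under these assumptions the model with the lowest total ranks best. A real analysis would also report inter-rater agreement rather than only averaging the two clinicians.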
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Genomics and Rare Diseases
