Whole-body CT attenuation and volume charts from routine clinical scans via evidence-grounded LLM report filtering
Christian Wachinger, Bernhard Renger, Christopher Späth, Jan Kirschke, Marcus Makowski

TL;DR
This study develops an LLM-based filtering system to create healthy reference distributions from large clinical CT datasets, enabling standardized quantitative analysis of organ volumes and tissue attenuation.
Contribution
It introduces an ensemble of LLMs for pathology filtering and constructs comprehensive, covariate-adjusted reference charts for 106 anatomical structures from routine CT scans.
Findings
Filtered pathological findings from the radiology reports of over 350,000 CT exams to build pathology-reduced cohorts.
Established reference charts accounting for age, sex, contrast, and acquisition parameters.
Revealed structure- and contrast-dependent longitudinal changes.
Abstract
Interpreting quantitative CT biomarkers, such as organ volume and tissue attenuation, requires large-scale healthy reference distributions. However, creating these is challenging because clinical datasets are often heavily enriched with pathology. Here, we develop an evidence-grounded, cross-verified large language model (LLM) ensemble to filter pathological findings from radiology reports, enabling the construction of pathology-reduced cohorts from over 350,000 CT examinations. Five LLMs first flag structure-level abnormality candidates grounded in verbatim report evidence and then resolve disagreements via cross-verification. Using distribution-aware generalized additive models for location, scale, and shape, we establish comprehensive whole-body reference charts for 106 anatomical structures (volumes and attenuation) across adulthood, accounting for age, sex, contrast…
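The first, evidence-grounded flagging stage described in the abstract could look roughly like the sketch below. It is not the authors' implementation: the `Flag` record, the `grounded` and `filter_report` helpers, the case- and whitespace-insensitive matching, and the rule that only unanimous flags are accepted are all assumptions made for illustration. Structures on which the models disagree are only collected here, because the abstract does not describe how cross-verification resolves them.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Flag:
    """One abnormality candidate from one model (hypothetical record)."""
    structure: str  # anatomical structure name, e.g. "liver"
    evidence: str   # text the model claims to have quoted from the report


def grounded(flag, report):
    """True if the quoted evidence occurs in the report.

    Assumption: matching ignores case and collapses whitespace.
    """
    quote = " ".join(flag.evidence.split()).lower()
    text = " ".join(report.split()).lower()
    return bool(quote) and quote in text


def filter_report(report, model_outputs):
    """Return (abnormal, disputed) sets of structure names.

    model_outputs holds one list of Flag objects per model.
    Assumption: a structure counts as abnormal only if every model
    flags it with grounded evidence; partially flagged structures
    are returned as disputed for a later cross-verification step.
    """
    n_models = len(model_outputs)
    votes = Counter()
    for flags in model_outputs:
        # Count each structure at most once per model.
        votes.update({f.structure for f in flags if grounded(f, report)})
    abnormal = {s for s, v in votes.items() if v == n_models}
    disputed = {s for s, v in votes.items() if v < n_models}
    return abnormal, disputed


# Toy example with three hypothetical model outputs.
report = "Liver: multiple hypodense lesions. Spleen is unremarkable. Kidneys normal."
outputs = [
    [Flag("liver", "multiple hypodense lesions")],
    [Flag("liver", "multiple hypodense lesions"), Flag("spleen", "splenomegaly")],
    [Flag("liver", "Multiple  hypodense lesions"), Flag("kidney_left", "Kidneys normal")],
]
abnormal, disputed = filter_report(report, outputs)
```

In the toy example the spleen flag is discarded because its quote does not appear in the report, which is the point of grounding each flag in report text: a model cannot raise an abnormality without pointing at the sentence that supports it.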
