External Benchmarking of Lung Ultrasound Models for Pneumothorax-Related Signs: A Manifest-Based Multi-Source Study

Takehiro Ishikawa

arXiv:2603.26832·eess.IV·March 31, 2026

TL;DR

This study builds a multi-source external benchmark for lung ultrasound AI models and uses it to test cross-domain generalization and task validity. The results expose the limits of binary lung-sliding classification for pneumothorax-related signs.

Contribution

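The paper introduces a manifest-based external benchmark for lung ultrasound models. The released manifest allows the benchmark to be reconstructed from publicly accessible videos, which enables reproducible evaluation across sources and shows that pneumothorax-related signs are more complex than a binary label can capture.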

Findings

01

The single-site classifier achieved ROC-AUC 0.9625 in-domain but only 0.7050 on the external benchmark.

02

The model treated lung pulse as normal, indicating that the binary lung-sliding task is incomplete.

03

Lung point was identified as an intermediate ambiguity state rather than a binary class.
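
To make the ROC-AUC figures above concrete, here is a minimal sketch of how ROC-AUC can be computed from a model's predicted probability of absent sliding, P(absent). It uses the standard rank-based definition (the probability that a randomly chosen positive clip scores higher than a randomly chosen negative one, with ties counted as half). The function name `roc_auc` and the toy labels and scores are assumptions for illustration only; they are not the paper's code or data.

```python
def roc_auc(labels, scores):
    """Rank-based ROC-AUC.

    labels: 1 for absent lung sliding (positive), 0 for normal sliding.
    scores: predicted P(absent) for each clip.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative clip")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive clip ranked above negative clip
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy values for illustration, NOT results from the paper.
labels = [1, 1, 1, 0, 0, 0]
p_absent = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, p_absent)
print(f"ROC-AUC = {auc:.4f}")
```

The pairwise loop is quadratic in the number of clips, which is acceptable for a benchmark of a few hundred clips; a sort-based implementation would be preferable for larger sets.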

Abstract

Background and Aims: Reproducible external benchmarks for pneumothorax-related lung ultrasound (LUS) AI are scarce, and binary lung-sliding classification may obscure clinically important signs. We therefore developed a manifest-based external benchmark and used it to test both cross-domain generalization and task validity. Methods: We curated 280 clips from 190 publicly accessible LUS source videos and released a reconstruction manifest containing URLs, timestamps, crop coordinates, labels, and probe shape. Labels were normal lung sliding, absent lung sliding, lung point, and lung pulse. A previously published single-site binary classifier was evaluated on this benchmark; challenge-state analysis examined lung point and lung pulse using the predicted probability of absent sliding, P(absent). Results: The single-site comparator achieved ROC-AUC 0.9625 in-domain but 0.7050 on the…
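
The abstract lists the manifest fields as URLs, timestamps, crop coordinates, labels, and probe shape. As a rough sketch of what one manifest record might look like in code, the example below models a single entry and validates it. All field names (`url`, `start_s`, `end_s`, `crop`, `label`, `probe_shape`), the label strings, the crop layout, and the example values are assumptions made for illustration; the actual schema of the released manifest may differ.

```python
from dataclasses import dataclass

# Hypothetical label strings for the four classes named in the abstract.
LABELS = {"normal_sliding", "absent_sliding", "lung_point", "lung_pulse"}


@dataclass(frozen=True)
class ManifestEntry:
    url: str            # source video URL
    start_s: float      # clip start time in seconds
    end_s: float        # clip end time in seconds
    crop: tuple         # assumed layout: (x, y, width, height) in pixels
    label: str          # one of LABELS
    probe_shape: str    # free-text probe shape descriptor (assumed)

    def __post_init__(self):
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")
        if self.end_s <= self.start_s:
            raise ValueError("end_s must be greater than start_s")
        x, y, w, h = self.crop
        if x < 0 or y < 0 or w <= 0 or h <= 0:
            raise ValueError("crop must have a non-negative origin and positive size")

    @property
    def duration_s(self):
        """Clip length in seconds."""
        return self.end_s - self.start_s


# Placeholder values for illustration, not an entry from the real manifest.
entry = ManifestEntry(
    url="https://example.com/video",
    start_s=12.0,
    end_s=17.0,
    crop=(100, 50, 400, 300),
    label="lung_pulse",
    probe_shape="linear",
)
print(entry.duration_s)
```

Storing only pointers and coordinates, as sketched here, lets others rebuild the clips from the public sources without redistributing the videos themselves.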
