Anatomy of a failure: When, how, and why deep vision fails in scientific domains
Ji-Hun Oh, Dou Hoon Kwark, Kianoush Falahkheirkhah, Kevin Yeh, John Cheville, Volodymyr Kindratenko, Rohit Bhargava

TL;DR
Naively applied deep learning (DL) can fail in scientific imaging for modality-specific reasons. On infrared (IR) pathology data, models underperform and can suffer catastrophic prediction collapse, even though IR carries richer information than RGB images of stained tissue.
Contribution
This paper reveals fundamental failure modes of deep learning in scientific imaging, highlighting the mismatch between data priors and model biases, and proposes a framework for modality-specific analysis.
Findings
DL models underperform on IR data despite its richness.
IR data priors interact poorly with the simplicity bias of DL models, causing prediction collapse.
State-of-the-art robustification strategies do not prevent these failures.
Abstract
Mirroring its ubiquity in popular media and all human activities, the use of deep learning (DL) is rapidly growing in scientific imaging modalities. However, unlike everyday RGB pictures, pixels encode precise physicochemical properties in scientific imaging across potentially thousands of channels. While DL is well validated on human-centric RGB perceptual tasks, its effectiveness for scientific imaging remains uncertain. Here, we show that the naive application of DL frameworks to scientific images can lead to critical failures. We evaluate the use of DL for pathology, comparing RGB images of stained tissue with the quantitative and information-rich biochemical signatures of infrared (IR) imaging. Despite this informational advantage, DL models trained on IR data paradoxically underperform. We investigate this discrepancy to find that IR data priors interact poorly with the simplicity…
