The Impact of Preprocessing Methods on Racial Encoding and Model Robustness in CXR Diagnosis

Dishantkumar Sutariya; Eike Petersen

arXiv:2603.05157 · cs.CV · March 6, 2026

TL;DR

This study examines how different image preprocessing techniques affect racial bias and model accuracy in chest X-ray diagnosis, finding that lung cropping can reduce racial shortcut learning without sacrificing diagnostic performance.

Contribution

The paper demonstrates that lung cropping, a simple preprocessing step, can mitigate racial shortcut learning in CXR models while maintaining diagnostic accuracy, which addresses a key fairness concern.

Findings

1. Lung cropping reduces racial shortcut learning.
2. Preprocessing methods can influence model bias.
3. Diagnostic accuracy remains stable with lung cropping.

Abstract

Deep learning models can identify racial identity with high accuracy from chest X-ray (CXR) recordings. Thus, there is widespread concern about the potential for racial shortcut learning, where a model inadvertently learns to systematically bias its diagnostic predictions as a function of racial identity. Such racial biases threaten healthcare equity and model reliability, as models may systematically misdiagnose certain demographic groups. Since racial shortcuts are diffuse - non-localized and distributed throughout the whole CXR recording - image preprocessing methods may influence racial shortcut learning, yet the potential of such methods for reducing biases remains underexplored. Here, we investigate the effects of image preprocessing methods including lung masking, lung cropping, and Contrast Limited Adaptive Histogram Equalization (CLAHE). These approaches aim to suppress…
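To make two of the preprocessing methods named in the abstract concrete, here is a minimal sketch of lung masking and lung cropping. It is an illustration under stated assumptions, not the authors' implementation: it assumes a binary lung mask is already available (for example from a separate segmentation model), it interprets cropping as taking the bounding box of that mask, and the function names `mask_lungs` and `crop_to_lungs` and the `margin` parameter are hypothetical.

```python
import numpy as np


def mask_lungs(image: np.ndarray, lung_mask: np.ndarray) -> np.ndarray:
    """Lung masking: set every pixel outside the lung mask to zero.

    Assumes `lung_mask` has the same height and width as `image`
    and is nonzero inside the lungs.
    """
    return np.where(lung_mask.astype(bool), image, 0).astype(image.dtype)


def crop_to_lungs(image: np.ndarray, lung_mask: np.ndarray, margin: int = 0) -> np.ndarray:
    """Lung cropping: keep only the bounding box around the lung mask.

    `margin` (hypothetical parameter) pads the box by that many pixels,
    clipped to the image borders.
    """
    rows = np.any(lung_mask, axis=1)  # rows containing lung pixels
    cols = np.any(lung_mask, axis=0)  # columns containing lung pixels
    if not rows.any():
        return image  # empty mask: leave the image unchanged

    r0, r1 = np.where(rows)[0][[0, -1]]  # first and last lung row
    c0, c1 = np.where(cols)[0][[0, -1]]  # first and last lung column

    r0 = max(int(r0) - margin, 0)
    c0 = max(int(c0) - margin, 0)
    r1 = min(int(r1) + margin, image.shape[0] - 1)
    c1 = min(int(c1) + margin, image.shape[1] - 1)
    return image[r0:r1 + 1, c0:c1 + 1]
```

Masking keeps the original image size but blanks everything outside the lungs, whereas cropping discards the surrounding region and changes the image size, so a cropped image would typically be resized before being passed to a model. CLAHE is not shown here; OpenCV provides an implementation via `cv2.createCLAHE`, though whether the authors used it is not stated in the abstract.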

Taxonomy

Topics: COVID-19 diagnosis using AI · Radiomics and Machine Learning in Medical Imaging · Domain Adaptation and Few-Shot Learning