Mitigating Bias in Deep Learning: Training Unbiased Models on Biased Data for the Morphological Classification of Galaxies
Esteban Medina-Rosales, Guillermo Cabrera-Vives, and Christopher J. Miller

TL;DR
This paper addresses bias in galaxy morphology classification by demonstrating how biased training data transfer biases to models and proposing a deep learning de-biasing method to mitigate this issue.
Contribution
The paper introduces a novel deep learning de-biasing approach that reduces biases in galaxy morphology models trained on biased datasets.
Findings
Models trained with the de-biasing method show less observational bias in their predictions than models trained directly on the biased labels.
Training on biased data transfers biases to deep learning models.
The proposed method effectively mitigates bias in galaxy classification models.
Abstract
Galaxy morphologies and their relation with physical properties have been a relevant subject of study in the past. Most galaxy morphology catalogs have been labelled by human annotators or by machine learning models trained on human-labelled data. Human-generated labels have been shown to contain biases in terms of the observational properties of the data, such as image resolution. These biases are independent of the annotators; that is, they are present even in catalogs labelled by experts. In this work, we demonstrate that training deep learning models on biased galaxy data produces biased models, meaning that the biases in the training data are transferred to the predictions of the new models. We also propose a method to train deep learning models that considers this inherent labelling bias, to obtain a de-biased model even when training on biased data. We show that models trained using…
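To make "bias in terms of observational properties" concrete, the sketch below shows one simple way such a bias could be measured: bin galaxies by an observational parameter and compare the fraction assigned to a class across bins. It is an illustration only, not the paper's method. The synthetic data, the `resolution` scale, the miss-rate model, and the helper names `fraction_by_bin` and `bias_spread` are all assumptions made for this example, as is the premise that intrinsic morphology is independent of resolution in the sample.

```python
import random


def fraction_by_bin(values, labels, n_bins=4):
    """Fraction of positive labels in equal-count bins of `values` (low to high)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    size = len(order) // n_bins
    fractions = []
    for b in range(n_bins):
        start = b * size
        # The last bin absorbs any remainder.
        end = (b + 1) * size if b < n_bins - 1 else len(order)
        idx = order[start:end]
        fractions.append(sum(labels[i] for i in idx) / len(idx))
    return fractions


def bias_spread(fractions):
    """Largest difference in class fraction between any two bins."""
    return max(fractions) - min(fractions)


rng = random.Random(0)
n = 20000

# Hypothetical observational parameter: 0 = worst resolution, 1 = best.
resolution = [rng.random() for _ in range(n)]

# Assumed intrinsic labels, independent of resolution (1 = feature present).
true_label = [1 if rng.random() < 0.5 else 0 for _ in range(n)]

# Assumed labelling bias: features are missed more often at poor resolution.
biased_label = [
    t if rng.random() < 0.4 + 0.6 * r else 0
    for t, r in zip(true_label, resolution)
]

true_spread = bias_spread(fraction_by_bin(resolution, true_label))
biased_spread = bias_spread(fraction_by_bin(resolution, biased_label))

print(f"spread with intrinsic labels: {true_spread:.3f}")
print(f"spread with biased labels:    {biased_spread:.3f}")
```

Under these assumptions the class fraction is roughly flat across resolution bins for the intrinsic labels and rises with resolution for the biased ones. Applying the same diagnostic to a model's predictions would be one way to check whether a bias in the training labels has carried over to the model.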
Taxonomy
Topics: Advanced Statistical Methods and Models · Data Visualization and Analytics · Machine Learning and Data Classification
