Towards Robust Classification Model by Counterfactual and Invariant Data Generation
Chun-Hao Chang, George Alexandru Adam, Anna Goldenberg

TL;DR
This paper introduces two data generation techniques for image classification, one producing counterfactual images and one producing invariant images, to make models robust to spurious correlations and to improve generalization and interpretability.
Contribution
The paper proposes two data generation processes, driven by human annotations of causal features, that reduce spuriousness and improve model robustness in image classification.
Findings
Models trained with the generated data outperform state-of-the-art methods in accuracy when spurious correlations break.
Saliency maps of these models focus more on causal features, giving better explanations.
Robustness improves across several challenging datasets.
Abstract
Despite the success of machine learning applications in science, industry, and society in general, many approaches are known to be non-robust, often relying on spurious correlations to make predictions. Spuriousness occurs when some features correlate with labels but are not causal; relying on such features prevents models from generalizing to unseen environments where such correlations break. In this work, we focus on image classification and propose two data generation processes to reduce spuriousness. Given human annotations of the subset of the features responsible (causal) for the labels (e.g. bounding boxes), we modify this causal set to generate a surrogate image that no longer has the same label (i.e. a counterfactual image). We also alter non-causal features to generate images still recognized as the original labels, which helps to learn a model invariant to these features. In…
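The two generation processes described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `make_counterfactual` and `make_invariant`, the `(x0, y0, x1, y1)` box format, and the constant grey fill are all assumptions made for the example. The excerpt above does not say how the modified regions are filled in or which label a counterfactual image receives, so the sketch only produces the images.

```python
import numpy as np

def make_counterfactual(image, box, fill_value=0.5):
    """Overwrite the annotated causal region so the original label no longer applies.

    The constant fill is an assumption for illustration only.
    """
    x0, y0, x1, y1 = box  # hypothetical box format: pixel corners, end-exclusive
    out = image.copy()
    out[y0:y1, x0:x1] = fill_value
    return out

def make_invariant(image, box, fill_value=0.5):
    """Keep the causal region and overwrite everything else, so the label is preserved."""
    x0, y0, x1, y1 = box
    out = np.full_like(image, fill_value)
    out[y0:y1, x0:x1] = image[y0:y1, x0:x1]
    return out

# Toy example: an 8x8 RGB image with values in [0, 1) and a 4x4 causal box.
rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
box = (2, 2, 6, 6)

cf = make_counterfactual(image, box)   # causal pixels removed
inv = make_invariant(image, box)       # non-causal pixels removed
```

Both functions return new arrays and leave the input untouched, so the original image and its two surrogates can be added to the same training set.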
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Cell Image Analysis Techniques
