NeuronTune: Towards Self-Guided Spurious Bias Mitigation

Guangtao Zheng; Wenqian Ye; Aidong Zhang

arXiv:2505.24048·cs.LG·June 2, 2025

TL;DR

NeuronTune is a post hoc method that identifies and regulates neurons responsible for spurious biases in neural networks, improving robustness without needing external annotations of biases.

Contribution

It introduces a self-guided, post hoc approach to mitigate spurious bias by intervening in the model's internal neuron activations, without relying on external bias annotations.

Findings

1. Significantly reduces spurious bias across architectures.
2. Operates without external bias annotations.
3. Improves model robustness in various data modalities.

Abstract

Deep neural networks often develop spurious bias: a reliance on correlations between non-essential features and classes for predictions. For example, a model may identify objects based on frequently co-occurring backgrounds rather than intrinsic features, resulting in degraded performance on data lacking these correlations. Existing mitigation approaches typically depend on external annotations of spurious correlations, which may be difficult to obtain and are not relevant to the spurious bias in a model. In this paper, we take a step towards self-guided mitigation of spurious bias by proposing NeuronTune, a post hoc method that directly intervenes in a model's internal decision process. Our method probes a model's latent embedding space to identify and regulate neurons that lead to spurious prediction behaviors. We theoretically justify our approach and show that it brings the model…
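
The abstract describes probing the latent embedding space to identify and regulate neurons tied to spurious predictions, but the truncated text does not give the selection rule. The sketch below is only an illustration under stated assumptions, not the authors' algorithm: it assumes a neuron is "suspect" when its median activation is higher on misclassified samples of a class than on correctly classified ones, and it assumes "regulating" means zeroing that dimension. The function names `find_suspect_neurons` and `suppress_neurons` and the `margin` parameter are hypothetical.

```python
import numpy as np


def find_suspect_neurons(embeddings, labels, preds, margin=0.0):
    """Flag embedding dimensions that look tied to wrong predictions.

    Assumed heuristic (not taken from the paper): within each class, a
    dimension is suspect if its median activation on misclassified samples
    exceeds its median on correctly classified samples by more than `margin`.
    """
    suspect = np.zeros(embeddings.shape[1], dtype=bool)
    for c in np.unique(labels):
        in_class = labels == c
        correct = in_class & (preds == c)
        wrong = in_class & (preds != c)
        # A class needs both correct and wrong samples to be comparable.
        if not correct.any() or not wrong.any():
            continue
        med_correct = np.median(embeddings[correct], axis=0)
        med_wrong = np.median(embeddings[wrong], axis=0)
        # Union of flagged dimensions across classes.
        suspect |= (med_wrong - med_correct) > margin
    return suspect


def suppress_neurons(embeddings, suspect):
    """Return a copy of the embeddings with suspect dimensions set to zero."""
    masked = embeddings.copy()
    masked[:, suspect] = 0.0
    return masked
```

In a setup like this, `embeddings` would be penultimate-layer activations from a held-out set, and the classification head could then be re-fit on the masked embeddings. Both of those choices are assumptions made for this example.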

Code & Models

Repositories

gtzheng/neurontune (PyTorch, official)

Videos

NeuronTune: Towards Self-Guided Spurious Bias Mitigation · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning

Methods: High-Order Consensuses