Mitigating Spurious Correlations in NLI via LLM-Synthesized Counterfactuals and Dynamic Balanced Sampling
Christopher Román Jaimes

TL;DR
This paper presents a scalable approach to reduce spurious correlations in NLI models by detecting artifacts, generating synthetic contrast sets with LLMs, and employing dynamic sampling to improve model robustness and accuracy.
Contribution
It introduces LF-LMI for artifact detection, an LLM-based synthetic contrast set generation pipeline, and a dynamic sampling strategy to mitigate forgetting during training.
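As background for the artifact-detection step, the sketch below computes plain local mutual information (LMI) between tokens and labels, the statistic commonly used to surface label-correlated words in NLI data. It is not the paper's LF-LMI: this summary does not state the log-frequency modification, so none is applied here. The function name `lmi_scores` and the toy examples are illustrative assumptions.

```python
import math
from collections import Counter

def lmi_scores(examples):
    """Plain LMI(w, l) = p(w, l) * log(p(l | w) / p(l)).

    `examples` is an iterable of (tokens, label) pairs. Probabilities are
    estimated from token-occurrence counts. This is standard LMI, not the
    paper's LF-LMI variant, whose exact form is not given in this summary.
    """
    pair_counts = Counter()
    token_counts = Counter()
    label_counts = Counter()  # token occurrences seen under each label
    for tokens, label in examples:
        for tok in tokens:
            pair_counts[(tok, label)] += 1
            token_counts[tok] += 1
            label_counts[label] += 1

    total = sum(token_counts.values())
    scores = {}
    for (tok, label), n in pair_counts.items():
        p_joint = n / total
        p_label_given_tok = n / token_counts[tok]
        p_label = label_counts[label] / total
        scores[(tok, label)] = p_joint * math.log(p_label_given_tok / p_label)
    return scores

# Toy hypotheses (hypothetical data): "nobody" co-occurs only with contradiction.
toy = [
    (["nobody", "sleeps"], "contradiction"),
    (["a", "man", "sleeps"], "entailment"),
    (["nobody", "runs"], "contradiction"),
    (["a", "dog", "runs"], "neutral"),
]
ranked = sorted(lmi_scores(toy).items(), key=lambda kv: kv[1], reverse=True)
```

Tokens with a high score for a label are candidate artifacts: they predict the label without any reference to the premise.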
Findings
Improved consistency from 63.5% to 81.0% on a challenging benchmark.
Maintained 88.4% in-domain accuracy.
Significantly outperformed naive fine-tuning.
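To make the two reported metrics concrete, here is a minimal sketch that assumes "consistency" follows the usual contrast-set definition: a group of related examples counts as consistent only if the model labels every example in it correctly. The summary does not spell out the definition, so treat this as an assumption. The names `accuracy`, `consistency`, and `group_ids` are hypothetical.

```python
from collections import defaultdict

def accuracy(preds, golds):
    """Fraction of individual examples predicted correctly."""
    if not golds:
        raise ValueError("no examples")
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

def consistency(group_ids, preds, golds):
    """Fraction of contrast groups in which every example is correct.

    Assumed definition: one wrong prediction makes the whole group
    inconsistent, so consistency can never exceed accuracy.
    """
    all_correct = defaultdict(lambda: True)
    for gid, p, g in zip(group_ids, preds, golds):
        all_correct[gid] = all_correct[gid] and (p == g)
    if not all_correct:
        raise ValueError("no groups")
    return sum(all_correct.values()) / len(all_correct)

# Three contrast groups of two examples each (hypothetical predictions).
group_ids = [0, 0, 1, 1, 2, 2]
golds = ["e", "c", "e", "c", "c", "c"]
preds = ["e", "c", "e", "n", "c", "c"]
acc = accuracy(preds, golds)                 # 5 of 6 examples correct
cons = consistency(group_ids, preds, golds)  # 2 of 3 groups fully correct
```

Under this definition a model can score well on accuracy while failing many groups, which is why consistency is the stricter robustness measure.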
Abstract
Natural Language Inference (NLI) models frequently rely on spurious correlations rather than semantic reasoning. Existing mitigation strategies often incur high annotation costs or trigger catastrophic forgetting during fine-tuning. We propose an automated, scalable pipeline to address these limitations. First, we introduce Log-Frequency LMI (LF-LMI) to accurately detect semantic artifacts. Second, we generate a high-quality synthetic contrast set via an LLM-synthesis pipeline with multi-judge verification. Finally, we introduce Dynamic Balanced Sampling, a training strategy that rotates the original data distribution to prevent forgetting. Our method improves consistency on a challenging benchmark from 63.5% to 81.0% while maintaining 88.4% in-domain accuracy, significantly outperforming naive fine-tuning.
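The abstract does not give the details of Dynamic Balanced Sampling. One plausible reading is sketched below: each epoch pairs the full synthetic contrast set with a different, equally sized slice of the original training data, so the model keeps seeing in-domain examples without the original data swamping the synthetic set. The 1:1 ratio, the slice-rotation scheme, and the name `rotating_epochs` are all assumptions made for illustration, not the paper's specification.

```python
import random

def rotating_epochs(original, synthetic, num_epochs, seed=0):
    """Yield one shuffled training list per epoch.

    Assumed scheme: every epoch contains all synthetic examples plus a
    fresh slice of the original data of the same size. When the original
    pool is exhausted, it is reshuffled and slicing restarts.
    """
    rng = random.Random(seed)
    pool = list(original)
    rng.shuffle(pool)
    k = min(len(synthetic), len(pool))  # slice size for a 1:1 mix
    cursor = 0
    for _ in range(num_epochs):
        if cursor + k > len(pool):
            rng.shuffle(pool)
            cursor = 0
        window = pool[cursor:cursor + k]
        cursor += k
        epoch = window + list(synthetic)
        rng.shuffle(epoch)
        yield epoch

# Hypothetical data: 10 original examples, 3 synthetic ones, 3 epochs.
original = [("orig", i) for i in range(10)]
synthetic = [("syn", i) for i in range(3)]
epochs = list(rotating_epochs(original, synthetic, num_epochs=3))
```

Compared with naive fine-tuning on the synthetic set alone, a scheme like this keeps the original distribution in every epoch, which is the mechanism the abstract credits with preventing forgetting.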
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
