Disagreement-Regularized Importance Sampling for Adversarial Label Corruption

Csongor Horváth; Ida-Maria Sintorn; Prashant Singh

arXiv:2605.07551 · cs.LG · May 11, 2026

TL;DR

This paper introduces Disagreement-Regularized Importance Sampling (DR-IS), a novel method that enhances robustness against adversarial label corruption by leveraging loss rank disagreement, with theoretical guarantees and empirical validation.

Contribution

The paper proposes DR-IS, a new subsampling technique based on rank disagreement, with proven concentration bounds and improved robustness over existing methods.

Findings

01 DR-IS maintains robustness under targeted high-norm attacks.

02 Finite-sample bounds certify strict separation between boundary-clean and bulk corrupted examples when the structural expectation gap between the two groups is positive.

03 Empirical results show DR-IS outperforms magnitude-based methods like EL2N.

Abstract

Standard Importance Sampling (IS) collapses under label corruption because high-norm examples, prioritized for variance reduction, are often adversarial outliers. We formalize this misalignment using an $\epsilon$-contamination model and propose Disagreement-Regularized Importance Sampling (DR-IS), a sub-sampling method based on loss rank disagreement across an independent proxy ensemble. We prove finite-sample concentration bounds showing that the empirical rank disagreement of bulk corrupted examples is bounded above, and that of boundary-clean examples is bounded below, both at rate $O(\log(N/\delta)/K)$ with probability $1 - \delta$; when the structural expectation gap $\Delta'$ between the two groups is positive and the boundary-clean set is at least as large as the selected subset, these bounds certify strict separation and control the contamination rate of the selected subset.…
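To make the idea of loss rank disagreement concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact disagreement statistic, so the sketch assumes it can be approximated by the standard deviation of each example's normalized loss rank across the K proxy models. The function names `rank_disagreement` and `select_subset`, and the choice to keep the highest-disagreement examples, are illustrative assumptions, motivated by the abstract's statement that disagreement is bounded above for bulk corrupted examples and below for boundary-clean ones.

```python
import numpy as np


def rank_disagreement(losses: np.ndarray) -> np.ndarray:
    """Per-example disagreement of loss ranks across proxy models.

    losses: array of shape (K, N), losses[k, i] is the loss of example i
    under proxy model k. ASSUMPTION: disagreement is measured as the
    standard deviation of normalized ranks; the paper may use a
    different statistic.
    """
    K, N = losses.shape
    # argsort of argsort turns each row of losses into ranks 0..N-1.
    ranks = np.argsort(np.argsort(losses, axis=1), axis=1)
    # Normalize ranks to [0, 1] so the score does not depend on N.
    normalized = ranks / max(N - 1, 1)
    return normalized.std(axis=0)


def select_subset(losses: np.ndarray, m: int) -> np.ndarray:
    """Return indices of the m examples with the highest rank disagreement.

    ASSUMPTION: examples whose rank is stable across the ensemble
    (low disagreement) are treated as likely bulk corrupted and dropped.
    """
    scores = rank_disagreement(losses)
    return np.argsort(-scores)[:m]


# Toy data (hypothetical): 3 proxy models, 4 examples. Example 3 has the
# largest loss under every model, so its rank never changes.
toy_losses = np.array([
    [0.1, 0.2, 0.3, 0.9],
    [0.3, 0.1, 0.2, 0.9],
    [0.2, 0.3, 0.1, 0.9],
])
toy_scores = rank_disagreement(toy_losses)
toy_selected = select_subset(toy_losses, m=3)
```

In this toy case the example that every proxy model ranks identically receives zero disagreement and is the one left out of the selected subset. A magnitude-based score would instead rank that example highest, which is the failure mode the abstract attributes to standard importance sampling under label corruption.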
