Reweighting Improves Conditional Risk Bounds
Yikai Zhang, Jiahe Lin, Fengpei Li, Songzhu Zheng, Anant Raj, Anderson Schneider, Yuriy Nevmyvaka

TL;DR
This paper shows that, under a Bernstein condition, weighted empirical risk minimization can outperform standard ERM in specific data regions: large-margin regions in classification and low-variance regions in heteroscedastic regression. Synthetic experiments support the result.
Contribution
It introduces a weighted ERM approach that leverages data-dependent weights to improve risk bounds in certain sub-regions, under a general Bernstein condition.
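For orientation, the weighted ERM objective can be written roughly as follows. This is a sketch only: the symbols $\mathcal{F}$ (hypothesis class), $\ell$ (loss), and $w$ (weight function, written here as depending on the input $x_i$) are assumed notation, not taken from the paper.

```latex
\hat{f}_{w} \in \arg\min_{f \in \mathcal{F}} \; \frac{1}{n} \sum_{i=1}^{n} w(x_i)\, \ell\bigl(f(x_i),\, y_i\bigr),
\qquad \text{with } w \equiv 1 \text{ recovering standard ERM.}
```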
Findings
Weighted ERM achieves better bounds in large-margin classification regions.
Weighted ERM improves performance in low-variance regions of heteroscedastic regression.
Synthetic data experiments support the theoretical advantages.
Abstract
In this work, we study the weighted empirical risk minimization (weighted ERM) schema, in which an additional data-dependent weight function is incorporated when the empirical risk function is being minimized. We show that under a general "balanceable" Bernstein condition, one can design a weighted ERM estimator to achieve superior performance in certain sub-regions over the one obtained from standard ERM, and the superiority manifests itself through a data-dependent constant term in the error bound. These sub-regions correspond to large-margin ones in classification settings and low-variance ones in heteroscedastic regression settings, respectively. Our findings are supported by evidence from synthetic data experiments.
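The heteroscedastic regression case can be illustrated with a small simulation. The sketch below is not the paper's estimator or experiment: it assumes a linear model, a noise level that is known and depends only on the sign of x, and inverse-variance weights (classical weighted least squares) as a stand-in for the paper's data-dependent weights. The names `weighted_erm_linear`, `make_data`, `run`, and `THETA_TRUE` are hypothetical.

```python
import numpy as np

# Hypothetical ground-truth parameters (intercept, slope) for the simulation.
THETA_TRUE = np.array([1.0, 2.0])


def weighted_erm_linear(X, y, w):
    """Minimize sum_i w_i * (y_i - x_i @ theta)^2 via least squares on rescaled data."""
    sw = np.sqrt(w)
    theta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return theta


def make_data(n, rng):
    """Draw heteroscedastic data: small noise for x < 0, large noise otherwise."""
    x = rng.uniform(-1.0, 1.0, size=n)
    X = np.column_stack([np.ones(n), x])
    sigma = np.where(x < 0, 0.1, 2.0)  # assumed known noise level per point
    y = X @ THETA_TRUE + sigma * rng.normal(size=n)
    return X, y, sigma


def run(trials=200, n=200, seed=0):
    """Average excess squared risk on the low-variance region x in [-1, 0)."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-1.0, 0.0, 50, endpoint=False)
    X_grid = np.column_stack([np.ones_like(grid), grid])
    risk_erm, risk_weighted = [], []
    for _ in range(trials):
        X, y, sigma = make_data(n, rng)
        theta_erm = weighted_erm_linear(X, y, np.ones(n))      # standard ERM
        theta_w = weighted_erm_linear(X, y, 1.0 / sigma**2)    # inverse-variance weights
        risk_erm.append(np.mean((X_grid @ (theta_erm - THETA_TRUE)) ** 2))
        risk_weighted.append(np.mean((X_grid @ (theta_w - THETA_TRUE)) ** 2))
    return float(np.mean(risk_erm)), float(np.mean(risk_weighted))


if __name__ == "__main__":
    erm_risk, weighted_risk = run()
    print(f"standard ERM excess risk on x < 0: {erm_risk:.5f}")
    print(f"weighted ERM excess risk on x < 0: {weighted_risk:.5f}")
```

In this toy setup the weighted fit leans on the low-noise points, so its excess risk on the region x < 0 should come out smaller than that of the unweighted fit. In practice the noise level would have to be estimated, which this sketch does not attempt.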
Taxonomy
Topics: Statistical Methods and Inference · Imbalanced Data Classification Techniques · Explainable Artificial Intelligence (XAI)
