Understanding Difficulty-based Sample Weighting with a Universal Difficulty Measure

Xiaoling Zhou; Ou Wu; Weiyao Zhu; Ziyang Liang

arXiv:2301.04850·cs.LG·January 13, 2023

TL;DR

This paper introduces a universal difficulty measure based on generalization error for sample weighting in deep learning, providing theoretical insights into its effectiveness for improving model training and generalization.

Contribution

It proposes a theoretically guaranteed universal difficulty measure and explains why difficulty-based weighting enhances deep learning performance.

Findings

01 Generalization error can serve as a universal difficulty measure.

02 Difficulty-based weighting positively influences optimization dynamics.

03 Theoretical justification for the effectiveness of difficulty-based weighting.

Abstract

Sample weighting is widely used in deep learning. Many weighting methods essentially use the learning difficulty of training samples to calculate their weights. In this study, this scheme is called difficulty-based weighting. Two important issues arise when explaining this scheme. First, no unified difficulty measure with a theoretical guarantee exists for training samples. The learning difficulty of a sample is determined by multiple factors, including noise level, imbalance degree, margin, and uncertainty. Nevertheless, existing measures consider only a single factor or a subset of these factors, not all of them together. Second, a comprehensive theoretical explanation of why difficulty-based weighting schemes are effective in deep learning is lacking. In this study, we theoretically prove that the generalization error of a sample can be…
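To make the idea of difficulty-based weighting concrete, here is a minimal sketch in plain Python. It is not the paper's method: it assumes the per-sample training loss as a stand-in difficulty score, whereas the paper argues for generalization error as the measure. The function names, the softmax-style mapping, and the `temperature` and `hard_first` parameters are all hypothetical choices made for illustration.

```python
import math


def difficulty_weights(losses, temperature=1.0, hard_first=True):
    """Map per-sample losses (an assumed difficulty proxy) to weights that average to 1.

    hard_first=True gives larger weights to high-loss samples;
    hard_first=False gives larger weights to low-loss samples.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not losses:
        return []
    sign = 1.0 if hard_first else -1.0
    scaled = [sign * loss / temperature for loss in losses]
    peak = max(scaled)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    n = len(losses)
    # Rescale the softmax so the weights sum to n (mean weight of 1).
    return [n * e / total for e in exps]


def weighted_mean_loss(losses, weights):
    """Average of the per-sample losses after applying the sample weights."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


losses = [0.1, 0.5, 2.0]
hard = difficulty_weights(losses)                    # emphasizes the hardest sample
easy = difficulty_weights(losses, hard_first=False)  # emphasizes the easiest sample
```

Normalizing the weights to a mean of 1 keeps the weighted objective on a scale comparable to the unweighted one, so only the relative emphasis between samples changes. Whether hard-first or easy-first emphasis is preferable depends on the data (for example, on how noisy the labels are), which is part of what a difficulty measure is meant to capture.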

Taxonomy

Topics: Machine Learning and Data Classification · Fault Detection and Control Systems · Adversarial Robustness in Machine Learning