IMAE for Noise-Robust Learning: Mean Absolute Error Does Not Treat Examples Equally and Gradient Magnitude's Variance Matters
Xinshao Wang, Yang Hua, Elyor Kodirov, David A. Clifton, Neil M. Robertson

TL;DR
This paper analyzes the limitations of MAE in noise-robust learning, revealing its unequal treatment of examples and proposing IMAE, which adjusts gradient magnitude variance to improve robustness and fitting ability.
Contribution
The paper uncovers MAE's underfitting and unequal example treatment issues and introduces IMAE, a simple variance adjustment method that enhances noise robustness and model fitting.
Findings
MAE is theoretically noise-robust but underfits in practice.
MAE emphasizes uncertain examples rather than treating all examples equally.
IMAE improves fitting ability while maintaining noise robustness.
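The second finding can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a softmax classifier and takes MAE to be the L1 distance between the predicted probabilities and the one-hot label, so the loss reduces to 2(1 - p_y). The function names (`mae_grad_l1`, `cce_grad_l1`, `logits_for`) are illustrative. It measures the L1 norm of the gradient with respect to the logits, which acts as an implicit per-example weight.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def mae_grad_l1(z, y):
    # MAE loss: sum_j |p_j - onehot_j| = 2 * (1 - p_y).
    # Its gradient w.r.t. logit k is -2 * p_y * (delta_ky - p_k).
    p = softmax(z)
    onehot = np.eye(len(z))[y]
    grad = -2.0 * p[y] * (onehot - p)
    return np.abs(grad).sum()

def cce_grad_l1(z, y):
    # Cross-entropy gradient w.r.t. logits is p - onehot.
    p = softmax(z)
    onehot = np.eye(len(z))[y]
    return np.abs(p - onehot).sum()

def logits_for(p_y, num_classes=10):
    # Hypothetical helper: logits whose softmax gives probability p_y
    # to class 0 and spreads the rest uniformly over the other classes.
    probs = np.full(num_classes, (1.0 - p_y) / (num_classes - 1))
    probs[0] = p_y
    return np.log(probs)

if __name__ == "__main__":
    for p_y in (0.01, 0.25, 0.5, 0.75, 0.99):
        z = logits_for(p_y)
        print(f"p_y={p_y:.2f}  MAE={mae_grad_l1(z, 0):.4f}  CCE={cce_grad_l1(z, 0):.4f}")
```

Under these assumptions the MAE gradient magnitude equals 4 p_y (1 - p_y): it peaks at p_y = 0.5 and vanishes both for confidently correct and for confidently wrong predictions, while the cross-entropy magnitude 2(1 - p_y) grows monotonically as p_y falls. That is consistent with the observation that MAE emphasizes uncertain examples, and the small weights near p_y = 0 offer one intuition for why it can underfit.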
Abstract
In this work, we study robust deep learning against abnormal training data from the perspective of example weighting built in empirical loss functions, i.e., gradient magnitude with respect to logits, an angle that has not been thoroughly studied so far. Consequently, we have two key findings: (1) Mean Absolute Error (MAE) Does Not Treat Examples Equally. We present new observations and insightful analysis about MAE, which is theoretically proven to be noise-robust. First, we reveal its underfitting problem in practice. Second, we show that MAE's noise-robustness comes from emphasising uncertain examples rather than from treating training samples equally, as claimed in prior work. (2) The Variance of Gradient Magnitude Matters. We propose an effective and simple solution to enhance MAE's fitting ability while preserving its noise-robustness. Without changing MAE's overall weighting scheme, i.e.,…
Taxonomy
Topics: Machine Learning and Data Classification · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
