Impact of Noisy Supervision in Foundation Model Learning

Hao Chen; Zihan Wang; Ran Tao; Hongxin Wei; Xing Xie; Masashi Sugiyama; Bhiksha Raj; Jindong Wang

arXiv:2403.06869·cs.LG·May 6, 2025·1 cites

Open Access

TL;DR

This paper investigates the impact of label noise in large-scale pre-training datasets for foundation models, revealing that slight noise can help in-domain performance but harms out-of-domain generalization, and proposes a mitigation method called NMTune.

Contribution

The paper provides a comprehensive analysis of label noise in pre-training datasets, demonstrates its effects on model generalization, and introduces NMTune to mitigate those effects across various models and tasks.

Findings

01 Slight noise benefits in-domain performance.
02 Noise deteriorates out-of-domain generalization.
03 NMTune effectively mitigates noise effects.

Abstract

Foundation models are usually pre-trained on large-scale datasets and then adapted to downstream tasks through tuning. However, the large-scale pre-training datasets, often inaccessible or too expensive to handle, can contain label noise that may adversely affect the generalization of the model and pose unexpected risks. This paper stands out as the first work to comprehensively understand and analyze the nature of noise in pre-training datasets and then effectively mitigate its impacts on downstream tasks. Specifically, through extensive experiments of fully-supervised and image-text contrastive pre-training on synthetic noisy ImageNet-1K, YFCC15M, and CC12M datasets, we demonstrate that, while slight noise in pre-training can benefit in-domain (ID) performance, where the training and testing data share a similar distribution, it always deteriorates out-of-domain (OOD) performance,…
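To make "synthetic noisy" labels concrete, here is a minimal sketch of one common way to corrupt a clean label set: uniform (symmetric) label flipping at a fixed ratio. This is an illustration only. The abstract does not specify the paper's noise procedure, so the function name `inject_symmetric_label_noise`, its parameters, and the uniform-flip scheme are all assumptions.

```python
import numpy as np


def inject_symmetric_label_noise(labels, num_classes, noise_ratio, seed=0):
    """Flip a fixed fraction of labels to a different, uniformly chosen class.

    Hypothetical helper for illustration; not the paper's procedure.
    Returns the noisy labels and the indices that were flipped.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    noisy = labels.copy()

    # Number of samples to corrupt, chosen without replacement.
    n_flip = int(round(noise_ratio * labels.size))
    flip_idx = rng.choice(labels.size, size=n_flip, replace=False)

    # An offset in [1, num_classes - 1] guarantees the new label differs
    # from the original one after wrapping around with the modulo.
    offsets = rng.integers(1, num_classes, size=n_flip)
    noisy[flip_idx] = (labels[flip_idx] + offsets) % num_classes
    return noisy, flip_idx


# Example: corrupt 10% of a toy 1000-class label vector.
clean = np.arange(10_000) % 1000
noisy, flipped = inject_symmetric_label_noise(clean, num_classes=1000, noise_ratio=0.1)
print("fraction of labels changed:", (noisy != clean).mean())
```

Sweeping `noise_ratio` over a few values and pre-training on each corrupted copy is the kind of setup under which the ID and OOD effects described above could be compared.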

Taxonomy

Topics: Neural Networks and Applications