CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images
Sheng Guo, Weilin Huang, Haozhi Zhang, Chenfan Zhuang, Dengke Dong, Matthew R. Scott, Dinglong Huang

TL;DR
CurriculumNet introduces a curriculum learning approach for training deep neural networks on large-scale, noisy web images without human annotations, significantly improving performance and robustness.
Contribution
It develops an unsupervised data complexity measurement and curriculum strategy to effectively handle noisy labels and data imbalance in web image datasets.
Findings
Achieved state-of-the-art results on WebVision, ImageNet, Clothing-1M, and Food-101.
Surprisingly, noisy images can enhance model generalization as a form of regularization.
Reached a top-5 error rate of 5.2% on the WebVision challenge, outperforming previous methods.
Abstract
We present a simple yet efficient approach capable of training deep neural networks on large-scale weakly-supervised web images, which are crawled raw from the Internet by using text queries, without any human annotation. We develop a principled learning strategy by leveraging curriculum learning, with the goal of handling a massive amount of noisy labels and data imbalance effectively. We design a new learning curriculum by measuring the complexity of data using its distribution density in a feature space, and rank the complexity in an unsupervised manner. This allows for an efficient implementation of curriculum learning on large-scale web images, resulting in a high-performance CNN model, where the negative impact of noisy labels is reduced substantially. Importantly, we show by experiments that those images with highly noisy labels can surprisingly improve the generalization…
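The density-based ranking described in the abstract can be illustrated with a minimal sketch. It assumes each image has already been mapped to a feature vector, estimates a local density for each sample by counting neighbours within a cutoff distance, and splits the ranked samples into subsets from densest (presumably cleanest) to sparsest (presumably noisiest). The function names, the quantile-based cutoff, and the equal-sized split into three subsets are illustrative assumptions, not the authors' implementation.

```python
import math
from typing import List, Sequence


def pairwise_distances(features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Euclidean distance between every pair of feature vectors."""
    n = len(features)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(features[i], features[j])
            dist[i][j] = dist[j][i] = d
    return dist


def local_density(features: Sequence[Sequence[float]],
                  cutoff_quantile: float = 0.5) -> List[int]:
    """Count, for each sample, the neighbours closer than a cutoff distance.

    The cutoff is taken at a quantile of all pairwise distances; the default
    quantile is an arbitrary choice for this sketch.
    """
    n = len(features)
    dist = pairwise_distances(features)
    flat = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
    cutoff = flat[min(len(flat) - 1, int(cutoff_quantile * len(flat)))]
    return [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
            for i in range(n)]


def curriculum_subsets(features: Sequence[Sequence[float]],
                       num_subsets: int = 3,
                       cutoff_quantile: float = 0.5) -> List[List[int]]:
    """Rank samples by density (high to low) and split into ordered subsets.

    Returns lists of sample indices; the first list holds the densest samples.
    """
    density = local_density(features, cutoff_quantile)
    order = sorted(range(len(features)), key=lambda i: -density[i])
    size = math.ceil(len(order) / num_subsets)
    return [order[k:k + size] for k in range(0, len(order), size)]


if __name__ == "__main__":
    # Toy features for one category: a tight cluster plus two outliers.
    feats = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0), (-4.0, 6.0)]
    for stage, subset in enumerate(curriculum_subsets(feats, cutoff_quantile=0.4)):
        print(f"stage {stage}: samples {subset}")
```

In a curriculum setting, training would start on the first subset and add the later, sparser subsets in stages. Ranking would normally be done per category, so that density reflects how typical an image is of its own label.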
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
