Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach
Zeren Sun, Yazhou Yao, Xiu-Shen Wei, Yongshun Zhang, Fumin Shen, Jianxin Wu, Jian Zhang, Heng-Tao Shen

TL;DR
This paper introduces two large-scale webly supervised fine-grained datasets, WebFG-496 and WebiNat-5089, and proposes a novel peer-learning method that outperforms baseline models on fine-grained recognition without extensive manual labeling.
Contribution
The paper provides the first high-quality benchmark datasets for webly supervised fine-grained recognition and introduces a novel peer-learning approach for improved performance.
Findings
Peer-learning outperforms baseline models.
New datasets enable large-scale webly supervised fine-grained recognition.
Achieved state-of-the-art results on WebFG-496 and WebiNat-5089.
Abstract
Learning from the web can ease the extreme dependence of deep learning on large-scale manually labeled datasets. Especially for fine-grained recognition, which aims to distinguish subordinate categories, leveraging free web data can significantly reduce labeling costs. Despite its significant practical and research value, the webly supervised fine-grained recognition problem has not been extensively studied in the computer vision community, largely due to the lack of high-quality datasets. To fill this gap, in this paper we construct two new benchmark webly supervised fine-grained datasets, termed WebFG-496 and WebiNat-5089. Concretely, WebFG-496 consists of three sub-datasets containing a total of 53,339 web training images with 200 species of birds (Web-bird), 100 types of aircraft (Web-aircraft), and 196 models of cars (Web-car). For WebiNat-5089, it…
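As a quick sanity check on the WebFG-496 figures quoted in the abstract, the three sub-dataset category counts can be summed; the total matches the 496 in the dataset name, which suggests the name refers to the number of categories. The variable names below are illustrative only and do not come from the paper or its code.

```python
# Category counts per sub-dataset, as stated in the abstract.
# The dictionary and variable names are hypothetical, chosen for illustration.
webfg_categories = {
    "Web-bird": 200,      # species of birds
    "Web-aircraft": 100,  # types of aircraft
    "Web-car": 196,       # models of cars
}

total_categories = sum(webfg_categories.values())
total_training_images = 53_339  # total web training images, from the abstract

# Rough average number of web training images per category.
avg_images_per_category = total_training_images / total_categories

print(f"categories: {total_categories}")
print(f"avg training images per category: {avg_images_per_category:.1f}")
```

The average is only a coarse indicator; the abstract does not say how images are distributed across categories.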
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
