Neglected Free Lunch -- Learning Image Classifiers Using Annotation Byproducts
Dongyoon Han, Junsuk Choe, Seonghyeok Chun, John Joon Young Chung, Minsuk Chang, Sangdoo Yun, Jean Y. Song, Seong Joon Oh

TL;DR
This paper introduces a new training paradigm called learning using annotation byproducts (LUAB), which leverages auxiliary annotation data like mouse traces to improve image classifier robustness without extra annotation costs.
Contribution
It demonstrates that incorporating annotation byproducts as auxiliary tasks enhances model generalization and robustness in image classification.
Findings
LUAB improves classifier robustness and generalization.
Annotation byproducts serve as weak human attention signals.
No additional annotation costs are needed for LUAB.
Abstract
Supervised learning of image classifiers distills human knowledge into a parametric model through pairs of images and corresponding labels (X,Y). We argue that this simple and widely used representation of human knowledge neglects rich auxiliary information from the annotation procedure, such as the time-series of mouse traces and clicks left after image selection. Our insight is that such annotation byproducts Z provide approximate human attention that weakly guides the model to focus on the foreground cues, reducing spurious correlations and discouraging shortcut learning. To verify this, we create ImageNet-AB and COCO-AB. They are ImageNet and COCO training sets enriched with sample-wise annotation byproducts, collected by replicating the respective original annotation tasks. We refer to the new paradigm of training models with annotation byproducts as learning using annotation…
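To make the idea of training on (X, Y, Z) concrete, the following is a minimal sketch of a multi-task objective: the usual classification loss on the label Y plus a weighted auxiliary regression loss on a byproduct Z. It is an illustration under stated assumptions, not the authors' implementation. The function names, the choice of a single normalized click coordinate as Z, the L1 distance, and the `aux_weight` value are all hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def l1_distance(pred, target):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def luab_style_loss(logits, label, pred_point, byproduct_point, aux_weight=0.5):
    """Classification loss plus a weighted auxiliary loss on a byproduct.

    Assumptions (not taken from the paper): the byproduct Z is one (x, y)
    click location normalized to [0, 1], the model has an extra head that
    predicts such a point, and the auxiliary loss is a plain L1 distance.
    """
    main = cross_entropy(logits, label)
    if byproduct_point is None:
        # Samples without a recorded byproduct fall back to the plain loss.
        return main
    aux = l1_distance(pred_point, byproduct_point)
    return main + aux_weight * aux


# Example: two-class logits, true class 0, predicted vs. recorded click.
loss = luab_style_loss([0.0, 0.0], 0, (0.5, 0.5), (0.25, 0.75))
```

The auxiliary term only adds a training signal; at inference time the classifier output is used as usual, so a model trained this way would be deployed exactly like a standard classifier.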
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Generative Adversarial Networks and Image Synthesis
