TL;DR
This paper introduces a novel deep learning framework that enhances multi-label image classification by distilling knowledge from a weakly-supervised detection model, eliminating the need for bounding box annotations and improving accuracy and efficiency.
Contribution
The paper proposes a new end-to-end framework that uses knowledge distillation from a weakly-supervised detection model to improve multi-label classification without bounding box labels.
Findings
Outperforms state-of-the-art methods in classification accuracy on the MS-COCO and NUS-WIDE datasets.
Keeps test-time cost low: the WSD teacher model is discarded at inference, so only the classification model runs.
Abstract
Multi-label image classification is a fundamental but challenging task towards general visual understanding. Existing methods have found that region-level cues (e.g., features from RoIs) can facilitate multi-label classification. Nevertheless, such methods usually require laborious object-level annotations (i.e., object labels and bounding boxes) for effective learning of the object-level visual features. In this paper, we propose a novel and efficient deep framework to boost multi-label classification by distilling knowledge from a weakly-supervised detection task without bounding box annotations. Specifically, given the image-level annotations, (1) we first develop a weakly-supervised detection (WSD) model, and then (2) construct an end-to-end multi-label image classification framework augmented by a knowledge distillation module that guides the classification model by the WSD model according…
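Illustrative sketch
The abstract is cut off before it says how the WSD model guides the classifier, so the following is only a generic illustration of teacher-student distillation for multi-label classification, not the paper's method. It assumes the teacher (standing in for the WSD model) supplies one soft image-level probability per class, and that the student is trained on a weighted mix of the ground-truth loss and a term pulling it towards the teacher. The function names, the `alpha` weight, and the 0.5 threshold are all hypothetical.

```python
import math


def sigmoid(x):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(prob, target, eps=1e-7):
    """Binary cross-entropy for one label; target may be soft (in [0, 1])."""
    p = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def distillation_loss(student_logits, teacher_probs, labels, alpha=0.5):
    """Hypothetical training loss for the student classifier.

    student_logits: per-class logits from the classification model
    teacher_probs:  per-class image-level probabilities from the teacher
                    (assumed here to come from the WSD model)
    labels:         per-class ground-truth 0/1 image-level labels
    alpha:          assumed weight of the teacher term (not from the paper)
    """
    if not (len(student_logits) == len(teacher_probs) == len(labels)):
        raise ValueError("all inputs must have one entry per class")
    n = len(labels)
    student_probs = [sigmoid(z) for z in student_logits]
    # Supervised term: agree with the image-level annotations.
    hard = sum(bce(p, y) for p, y in zip(student_probs, labels)) / n
    # Distillation term: agree with the teacher's soft predictions.
    soft = sum(bce(p, t) for p, t in zip(student_probs, teacher_probs)) / n
    return (1.0 - alpha) * hard + alpha * soft


def predict(student_logits, threshold=0.5):
    """Test-time prediction uses the student alone; no teacher is needed."""
    return [1 if sigmoid(z) >= threshold else 0 for z in student_logits]
```

Because `predict` takes only the student's logits, the teacher adds cost during training but none at inference, which is the efficiency property listed under Findings.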
