Knowledge Concentration: Learning 100K Object Classifiers in a Single CNN
Jiyang Gao, Zijian (James) Guo, Zhen Li, Ram Nevatia

TL;DR
This paper introduces a novel knowledge concentration approach that distills knowledge from multiple specialist networks into a single CNN, enabling efficient classification of 100K object categories with improved performance.
Contribution
It proposes a multi-teacher single-student knowledge distillation framework with self-paced learning and structurally connected layers for large-scale fine-grained image classification.
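As a rough illustration of the multi-teacher single-student idea, the sketch below averages a standard soft-target distillation loss over several specialists, each covering a slice of the student's output vector. It is a minimal sketch under stated assumptions, not the paper's actual loss: the function names, the contiguous-slice layout, and the temperature value are hypothetical, and self-paced learning and structurally connected layers are not modeled.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def multi_teacher_distill_loss(student_logits, teachers, temperature=2.0):
    """Average soft-target cross-entropy over all specialists.

    `teachers` is a list of ((start, end), teacher_logits) pairs, where
    [start, end) is the slice of the student's output assumed to cover
    that specialist's categories (an illustrative layout, not the paper's).
    """
    losses = []
    for (start, end), teacher_logits in teachers:
        p_teacher = softmax(teacher_logits, temperature)
        p_student = softmax(student_logits[start:end], temperature)
        # Cross-entropy between the teacher's and the student's distributions.
        ce = -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))
        losses.append(ce)
    return sum(losses) / len(losses)


# Two hypothetical specialists: one over categories 0-2, one over 3-4.
teachers = [((0, 3), [2.0, 0.5, -1.0]), ((3, 5), [0.0, 1.5])]
matched = multi_teacher_distill_loss([2.0, 0.5, -1.0, 0.0, 1.5], teachers)
mismatched = multi_teacher_distill_loss([-1.0, 0.5, 2.0, 1.5, 0.0], teachers)
```

In this toy setup the loss is lowest when the student reproduces each specialist's logits on that specialist's slice, which is the behavior a single student network is trained toward.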
Findings
Significantly outperforms baseline models on OpenImage and EFT datasets.
Efficiently handles 100K categories with a single CNN.
Reduces model size and training complexity compared to multiple experts.
Abstract
Fine-grained image labels are desirable for many computer vision applications, such as visual search or mobile AI assistants. These applications rely on image classification models that can produce hundreds of thousands (e.g. 100K) of diversified fine-grained image labels on input images. However, training a network at this vocabulary scale is challenging and suffers from intolerably large model size and slow training speed, which leads to unsatisfactory classification performance. A straightforward solution would be to train separate expert networks (specialists), with each specialist focusing on learning one specific vertical (e.g. cars, birds...). However, deploying dozens of expert networks in a practical system would significantly increase system complexity and inference latency, and consume large amounts of computational resources. To address these challenges, we propose a…
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
