TL;DR
The paper introduces Pool of Experts (PoE), a framework that quickly creates lightweight, task-specific models from generic neural networks without training, enabling real-time querying for mobile and embedded applications.
Contribution
PoE extracts a pool of experts from a well-trained generic network using a novel conditional knowledge distillation, then combines them into a task-specific model on request without any additional training.
Findings
PoE builds accurate, compact models in real time.
It significantly reduces model customization time compared to traditional training.
Empirical results show high accuracy with minimal latency.
Abstract
In spite of the great success of deep learning technologies, training and delivery of a practically serviceable model is still a highly time-consuming process. Furthermore, a resulting model is usually too generic and heavyweight, and hence essentially goes through another expensive model compression phase to fit in a resource-limited device like embedded systems. Inspired by the fact that a machine learning task specifically requested by mobile users is often much simpler than it is supported by a massive generic model, this paper proposes a framework, called Pool of Experts (PoE), that instantly builds a lightweight and task-specific model without any training process. For a realtime model querying service, PoE first extracts a pool of primitive components, called experts, from a well-trained and sufficiently generic network by exploiting a novel conditional knowledge distillation…
Taxonomy
Methods: Knowledge Distillation
