Model Compression in Practice: Lessons Learned from Practitioners Creating On-device Machine Learning Experiences
Fred Hohman, Mary Beth Kery, Donghao Ren, Dominik Moritz

TL;DR
This paper shares practical insights and lessons learned from experts at Apple on compressing large ML models for on-device use, aiming to make efficient on-device ML more accessible and widespread.
Contribution
It compiles tacit knowledge and practical strategies from industry experts to guide model compression for on-device ML, filling gaps in existing research.
Findings
Pragmatic considerations for model compression design process
Trade-offs between model size, accuracy, and efficiency
Design recommendations for tooling to facilitate on-device ML
Abstract
On-device machine learning (ML) promises to improve the privacy, responsiveness, and proliferation of new, intelligent user experiences by moving ML computation onto everyday personal devices. However, today's large ML models must be drastically compressed to run efficiently on-device, a hurdle that requires deep, yet currently niche expertise. To engage the broader human-centered ML community in on-device ML experiences, we present the results from an interview study with 30 experts at Apple who specialize in producing efficient models. We compile tacit knowledge that experts have developed through practical experience with model compression across different hardware platforms. Our findings offer pragmatic considerations missing from prior work, covering the design process, trade-offs, and technical strategies that go into creating efficient models. Finally, we distill design…
Taxonomy
Topics: Software System Performance and Reliability · Advanced Data Storage Technologies · Scientific Computing and Data Management
