Look-ups are not (yet) all you need for deep learning inference
Calvin McCarter, Nicholas Dronen

TL;DR
This paper explores fast hash-based approximations for matrix multiplication to accelerate neural network inference, proposing improvements and fine-tuning methods, but finds accuracy remains significantly below exact methods.
Contribution
It introduces targeted improvements and a fine-tuning procedure for hash-based matrix multiplication approximations in deep learning inference.
Findings
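Improves on previous hash-based approximate matrix multiplication by exploiting access to both training data and fixed, already-learned weight matrices.
Proposes a fine-tuning procedure for accelerating entire neural networks while minimizing the loss in accuracy.
On a simple image classification task, accuracy improves over prior work but remains substantially below that of exact matrix multiplication.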
Abstract
Fast approximations to matrix multiplication have the potential to dramatically reduce the cost of neural network inference. Recent work on approximate matrix multiplication proposed to replace costly multiplications with table-lookups by fitting a fast hash function from training data. In this work, we propose improvements to this previous work, targeted to the deep learning inference setting, where one has access to both training data and fixed (already learned) model weight matrices. We further propose a fine-tuning procedure for accelerating entire neural networks while minimizing loss in accuracy. Finally, we analyze the proposed method on a simple image classification task. While we show improvements to prior work, overall classification accuracy remains substantially diminished compared to exact matrix multiplication. Our work, despite this negative result, points the way towards…
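To make the table-lookup idea in the abstract concrete, here is a minimal sketch of lookup-based approximate dot products. It is not the paper's method: it swaps in a plain nearest-prototype search where the prior work fits a fast hash function, and it samples prototypes from the training rows instead of learning them. All function names (`fit_prototypes`, `build_tables`, `encode`, `approx_dot`) are hypothetical and chosen for illustration only.

```python
import random


def split(vec, n_sub):
    """Split a vector into n_sub contiguous, equal-length subvectors."""
    step = len(vec) // n_sub
    return [vec[i * step:(i + 1) * step] for i in range(n_sub)]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def dot(u, v):
    """Exact dot product, used only offline when building the tables."""
    return sum(a * b for a, b in zip(u, v))


def fit_prototypes(train_rows, n_sub, n_proto, seed=0):
    """Assumption: prototypes are sampled from training subvectors.
    A real system would learn them (e.g. k-means or a learned hash)."""
    rng = random.Random(seed)
    subs = [split(row, n_sub) for row in train_rows]
    return [rng.sample([s[c] for s in subs], n_proto) for c in range(n_sub)]


def build_tables(prototypes, weight_col, n_sub):
    """Offline step: precompute dot(prototype, weight subvector).
    This is possible because the weights are fixed at inference time."""
    w_subs = split(weight_col, n_sub)
    return [[dot(p, w_subs[c]) for p in protos]
            for c, protos in enumerate(prototypes)]


def encode(row, prototypes, n_sub):
    """Map each input subvector to the index of its nearest prototype.
    This search stands in for the fast hash function."""
    return [min(range(len(protos)), key=lambda k: sq_dist(sub, protos[k]))
            for sub, protos in zip(split(row, n_sub), prototypes)]


def approx_dot(codes, tables):
    """Online step: sum one table entry per subspace, with no multiplies."""
    return sum(tables[c][k] for c, k in enumerate(codes))
```

A short usage example: with `train = [[1, 2, 3, 4], [5, 6, 7, 8]]`, `protos = fit_prototypes(train, n_sub=2, n_proto=2)`, and `tables = build_tables(protos, [1, 0, 2, 1], n_sub=2)`, the input `[1, 2, 7, 8]` is encoded exactly and `approx_dot(encode([1, 2, 7, 8], protos, 2), tables)` equals the exact dot product. The input `[1, 2, 7, 9]` snaps to the same prototypes, so the lookup result differs from the exact value; this quantization error is the kind of approximation that can reduce accuracy.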
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Advanced Image and Video Retrieval Techniques
