Look-ups are not (yet) all you need for deep learning inference

Calvin McCarter; Nicholas Dronen

arXiv:2207.05808 · cs.LG · July 14, 2022 · 1 citation

TL;DR

This paper explores fast hash-based approximations for matrix multiplication to accelerate neural network inference, proposing improvements and fine-tuning methods, but finds accuracy remains significantly below exact methods.

Contribution

It introduces targeted improvements and a fine-tuning procedure for hash-based matrix multiplication approximations in deep learning inference.

Findings

01

Improves on prior hash-based approximate matrix multiplication by exploiting access to both training data and fixed, already-learned weight matrices.

02

Proposes a fine-tuning procedure for accelerating entire neural networks while minimizing the loss in accuracy.

03

On a simple image classification task, classification accuracy remains substantially below that of exact matrix multiplication.

Abstract

Fast approximations to matrix multiplication have the potential to dramatically reduce the cost of neural network inference. Recent work on approximate matrix multiplication proposed to replace costly multiplications with table-lookups by fitting a fast hash function from training data. In this work, we propose improvements to this previous work, targeted to the deep learning inference setting, where one has access to both training data and fixed (already learned) model weight matrices. We further propose a fine-tuning procedure for accelerating entire neural networks while minimizing loss in accuracy. Finally, we analyze the proposed method on a simple image classification task. While we show improvements to prior work, overall classification accuracy remains substantially diminished compared to exact matrix multiplication. Our work, despite this negative result, points the way towards…
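The table-lookup idea in the abstract can be illustrated with a small product-quantization-style sketch. It is not the authors' method: the abstract describes fitting a fast hash function, whereas this sketch swaps in a plain nearest-prototype (k-means) encoder, which is slower but easier to follow. All function names, shapes, and parameters here are hypothetical and chosen for illustration only. The point it shows is that once the weight matrix `B` is fixed, the products between prototypes and `B` can be precomputed, so inference needs only an encoding step plus table lookups and additions.

```python
import numpy as np

def fit_prototypes(A_train, n_codebooks, n_protos, n_iters=10, seed=0):
    """Learn prototypes per column subspace with a few k-means steps.
    Assumption: k-means stands in for the learned hash function."""
    rng = np.random.default_rng(seed)
    subspaces = np.array_split(np.arange(A_train.shape[1]), n_codebooks)
    protos = []
    for cols in subspaces:
        X = A_train[:, cols]
        P = X[rng.choice(len(X), n_protos, replace=False)].copy()
        for _ in range(n_iters):
            dist = ((X[:, None, :] - P[None, :, :]) ** 2).sum(-1)
            assign = dist.argmin(1)
            for k in range(n_protos):
                if np.any(assign == k):  # leave empty clusters unchanged
                    P[k] = X[assign == k].mean(0)
        protos.append(P)
    return subspaces, protos

def build_tables(subspaces, protos, B):
    """Precompute prototype-times-weights products, one table per subspace."""
    return [P @ B[cols, :] for cols, P in zip(subspaces, protos)]

def encode(A, subspaces, protos):
    """Map each row of A to one prototype index per subspace."""
    codes = np.empty((A.shape[0], len(subspaces)), dtype=np.int64)
    for c, (cols, P) in enumerate(zip(subspaces, protos)):
        dist = ((A[:, cols][:, None, :] - P[None, :, :]) ** 2).sum(-1)
        codes[:, c] = dist.argmin(1)
    return codes

def lookup_matmul(codes, tables):
    """Approximate A @ B by summing looked-up table rows (no multiplies)."""
    out = np.zeros((codes.shape[0], tables[0].shape[1]))
    for c, T in enumerate(tables):
        out += T[codes[:, c]]
    return out

# Toy usage with synthetic data (sizes are arbitrary).
rng = np.random.default_rng(1)
A_train = rng.normal(size=(512, 16))   # stand-in for training activations
B = rng.normal(size=(16, 5))           # stand-in for fixed model weights
subspaces, protos = fit_prototypes(A_train, n_codebooks=4, n_protos=16)
tables = build_tables(subspaces, protos, B)

A_test = rng.normal(size=(8, 16))
approx = lookup_matmul(encode(A_test, subspaces, protos), tables)
exact = A_test @ B
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
print(f"relative error: {rel_err:.3f}")
```

The encoder above still computes distances, so it is not itself multiplication-free; in the hash-based setting the abstract refers to, that step is what the learned hash function replaces. The approximation error printed at the end reflects this toy setup only and says nothing about the paper's reported results.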

Code & Models

Repositories

calvinmccarter/itlumm
none · Official

Taxonomy

Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Advanced Image and Video Retrieval Techniques