Efficient Inference via Universal LSH Kernel

Zichang Liu; Benjamin Coleman; Anshumali Shrivastava

arXiv:2106.11426·cs.LG·June 23, 2021

TL;DR

This paper introduces the Representer Sketch, a mathematically provable method for approximating large model inference efficiently using hashing, achieving significant reductions in storage and computation without accuracy loss.

Contribution

The paper presents the Representer Sketch, a novel kernel-based sketching method that enables efficient inference for large models, surpassing existing techniques like quantization and pruning.

Findings

1. Up to 114x reduction in storage requirements.
2. Up to 59x reduction in computation complexity.
3. No accuracy drop observed.

Abstract

Large machine learning models achieve unprecedented performance on various tasks and have become the go-to technique. However, deploying these compute- and memory-hungry models in resource-constrained environments poses new challenges. In this work, we propose the mathematically provable Representer Sketch, a concise set of count arrays that can approximate the inference procedure with simple hashing computations and aggregations. Representer Sketch builds upon the popular Representer Theorem from the kernel literature, hence the name, providing a generic, fundamental alternative for efficient inference that goes beyond popular approaches such as quantization, iterative pruning, and knowledge distillation. A neural network function is transformed into its weighted kernel density representation, which can be estimated very efficiently with our sketching algorithm. Empirically, we…
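The idea of estimating a weighted kernel density with count arrays, hashing, and aggregation can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes signed random projections as the LSH family, a plain mean over repetitions as the aggregation, and random points and weights in place of a trained model's representation. All function names (`srp_hash`, `build_sketch`, `query_sketch`, `exact_kernel_sum`) and parameter values are hypothetical.

```python
import numpy as np

def srp_hash(X, planes):
    """Signed random projection hash (assumed LSH family).
    planes: (R, p, d) random hyperplanes; X: (n, d) points.
    Returns an (n, R) integer array of bucket ids in [0, 2**p)."""
    bits = (np.einsum("rpd,nd->nrp", planes, X) > 0).astype(np.int64)
    powers = 1 << np.arange(planes.shape[1])
    return bits @ powers

def build_sketch(X, alpha, planes):
    """Build R count arrays; row r accumulates alpha_i in bucket h_r(x_i)."""
    R, p, _ = planes.shape
    codes = srp_hash(X, planes)
    sketch = np.zeros((R, 2 ** p))
    for r in range(R):
        sketch[r] = np.bincount(codes[:, r], weights=alpha, minlength=2 ** p)
    return sketch

def query_sketch(sketch, q, planes):
    """Hash the query once per repetition and average the counters it hits."""
    codes = srp_hash(q[None, :], planes)[0]
    return float(sketch[np.arange(sketch.shape[0]), codes].mean())

def exact_kernel_sum(X, alpha, q, p):
    """Reference value: sum_i alpha_i * k(x_i, q), where k is the
    p-bit SRP collision probability (1 - theta / pi) ** p."""
    cos = (X @ q) / (np.linalg.norm(X, axis=1) * np.linalg.norm(q))
    theta = np.arccos(np.clip(cos, -1.0, 1.0))
    return float(np.sum(alpha * (1.0 - theta / np.pi) ** p))

# Illustrative sizes only; none of these values come from the paper.
rng = np.random.default_rng(0)
n, d, p, R = 200, 16, 4, 2000
X = rng.standard_normal((n, d))          # stand-in for representer points
alpha = rng.uniform(0.0, 1.0, size=n)    # stand-in for representer weights
q = rng.standard_normal(d)               # a query input
planes = rng.standard_normal((R, p, d))

sketch = build_sketch(X, alpha, planes)
estimate = query_sketch(sketch, q, planes)
exact = exact_kernel_sum(X, alpha, q, p)
print(f"sketch estimate: {estimate:.3f}, exact kernel sum: {exact:.3f}")
```

In this sketch, a query costs R hash evaluations and R counter lookups regardless of how many points were inserted, and the stored state is R arrays of 2^p counters. This is the general mechanism behind the kind of storage and computation savings the abstract describes, under the stated assumptions.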

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Machine Learning and Data Classification

Methods: Pruning