Cheetah: Optimizing and Accelerating Homomorphic Encryption for Private Inference

Brandon Reagen; Wooseok Choi; Yeongil Ko; Vincent Lee; Gu-Yeon Wei; Hsien-Hsin S. Lee; David Brooks

arXiv:2006.00505·cs.CR·October 12, 2020

TL;DR

Cheetah combines algorithmic optimizations with a custom hardware accelerator to bring homomorphic encryption (HE) based neural network inference close to plaintext inference speeds, so that inference can run on encrypted client data at practical speed.

Contribution

The paper introduces HE-parameter tuning, operator scheduling optimizations, and a dedicated accelerator architecture to speed up HE inference.

Findings

01. HE-parameter tuning and operator scheduling together deliver a 79x speedup over the state of the art in HE inference

02. Approaches plaintext inference speeds with hardware acceleration

03. Supports common neural networks such as ResNet50, VGG16, and AlexNet

Abstract

As the application of deep learning continues to grow, so does the amount of data used to make predictions. While traditionally, big-data deep learning was constrained by computing performance and off-chip memory bandwidth, a new constraint has emerged: privacy. One solution is homomorphic encryption (HE). Applying HE to the client-cloud model allows cloud services to perform inference directly on the client's encrypted data. While HE can meet privacy constraints, it introduces enormous computational challenges and remains impractically slow in current systems. This paper introduces Cheetah, a set of algorithmic and hardware optimizations for HE DNN inference to achieve plaintext DNN inference speeds. Cheetah proposes HE-parameter tuning optimization and operator scheduling optimizations, which together deliver 79x speedup over the state-of-the-art. However, this still falls short of…
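To make the client-cloud model in the abstract concrete, the sketch below shows a server computing a dot product on encrypted inputs without ever decrypting them. It is only an illustration and not the scheme or implementation used in the paper: it swaps in a toy Paillier cryptosystem (additively homomorphic) because it fits in a few lines. The function names (`keygen`, `encrypt`, `decrypt`, `encrypted_dot`), the tiny primes, and the integer weights are all assumptions made for this example, and the parameters are far too small to be secure.

```python
import math
import random


def keygen(p=1789, q=1861):
    """Toy Paillier key pair. The primes are far too small for real security."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    # With g = n + 1, L(g^lam mod n^2) equals lam mod n, so mu is its inverse.
    mu = pow(lam, -1, n)  # modular inverse; requires Python 3.8+
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Client side: encrypt a non-negative integer m < n."""
    n, g = pub
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, c):
    """Client side: recover the plaintext from a ciphertext."""
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def encrypted_dot(pub, ciphertexts, weights):
    """Server side: dot product of encrypted inputs with plaintext weights.

    Multiplying ciphertexts adds the plaintexts; raising a ciphertext to a
    non-negative integer w multiplies its plaintext by w.
    """
    n, _ = pub
    acc = encrypt(pub, 0)
    for c, w in zip(ciphertexts, weights):
        acc = (acc * pow(c, w, n * n)) % (n * n)
    return acc


# Client encrypts its inputs; the server sees only ciphertexts.
pub, priv = keygen()
x = [3, 1, 4]          # client's private inputs
w = [2, 7, 1]          # server's plaintext weights (hypothetical values)
enc_x = [encrypt(pub, v) for v in x]
enc_y = encrypted_dot(pub, enc_x, w)
print(decrypt(pub, priv, enc_y))  # 17
```

Even in this toy setting, each encrypted multiply-accumulate costs several modular exponentiations on large integers, which gives a rough sense of why HE inference is so much slower than plaintext inference and why the paper targets both algorithmic and hardware optimizations.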
