APINT: A Full-Stack Framework for Acceleration of Privacy-Preserving Inference of Transformers based on Garbled Circuits
Hyunjun Cho, Jaeho Jeon, Jaehoon Heo, Joo-Young Kim

TL;DR
APINT is a full-stack framework that accelerates privacy-preserving inference of transformers by optimizing both the garbled-circuit software stack and the hardware, reducing latency and energy consumption.
Contribution
APINT introduces a novel protocol together with circuit-generation, scheduling, and hardware-acceleration techniques to address the garbled-circuit (GC) latency bottleneck in privacy-preserving inference of transformers (PiT).
Findings
Achieves a 12.2x latency reduction on a CPU platform.
Reduces hardware accelerator latency by 3.3x.
Reduces energy consumption by 4.6x.
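To put the factors above in more familiar terms, the short Python sketch below converts each "Nx" figure into the fraction of the baseline that remains. It assumes an "Nx reduction" means the new value is the baseline divided by N. This summary does not state which baseline each figure was measured against, and the helper name `remaining_fraction` is hypothetical.

```python
def remaining_fraction(factor: float) -> float:
    """Fraction of the baseline left after an N-fold reduction (assumed: new = old / N)."""
    if factor <= 0:
        raise ValueError("reduction factor must be positive")
    return 1.0 / factor


# Factors as reported in the findings; baselines are not given in this summary.
findings = {
    "CPU latency": 12.2,
    "accelerator latency": 3.3,
    "energy consumption": 4.6,
}

for name, factor in findings.items():
    left = remaining_fraction(factor)
    print(f"{name}: {factor}x reduction, {left:.1%} of baseline remains")
```

Under that reading, a 12.2x reduction leaves roughly 8% of the baseline latency, so the factors cannot be compared directly unless they share a baseline.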
Abstract
As the importance of Privacy-Preserving Inference of Transformers (PiT) increases, a hybrid protocol that integrates Garbled Circuits (GC) and Homomorphic Encryption (HE) is emerging for its implementation. While this protocol is preferred for its ability to maintain accuracy, it has a severe drawback of excessive latency. To address this, existing protocols primarily focused on reducing HE latency, thus making GC the new latency bottleneck. Furthermore, previous studies only focused on individual computing layers, such as protocol or hardware accelerator, lacking a comprehensive solution at the system level. This paper presents APINT, a full-stack framework designed to reduce PiT's overall latency by addressing the latency problem of GC through both software and hardware solutions. APINT features a novel protocol that reallocates possible GC workloads to alternative methods (i.e., HE…
Taxonomy
Topics: Physical Unclonable Functions (PUFs) and Hardware Security · Privacy-Preserving Technologies in Data · Cryptography and Data Security
