TL;DR
This paper introduces multi-word modular arithmetic (MoMA) for efficient cryptographic kernel implementation on GPUs, significantly improving performance of homomorphic encryption and zero-knowledge proof operations.
Contribution
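The paper formalizes MoMA, which decomposes large bit-width integer arithmetic into operations on machine words, and develops a rewrite system that implements it through recursive rewriting of data types, enabling high-performance cryptographic kernels on GPUs.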
Findings
MoMA-based BLAS outperforms existing multi-precision libraries
MoMA-based NTT achieves near-ASIC performance on GPUs
Cryptographic kernels show orders-of-magnitude speedup
Abstract
Fully homomorphic encryption (FHE) and zero-knowledge proofs (ZKPs) are emerging as solutions for data security in distributed environments. However, the widespread adoption of these encryption techniques is hindered by their significant computational overhead, primarily resulting from core cryptographic operations that involve large integer arithmetic. This paper presents a formalization of multi-word modular arithmetic (MoMA), which breaks down large bit-width integer arithmetic into operations on machine words. We further develop a rewrite system that implements MoMA through recursive rewriting of data types, designed for compatibility with compiler infrastructures and code generators. We evaluate MoMA by generating cryptographic kernels, including basic linear algebra subprogram (BLAS) operations and the number theoretic transform (NTT), targeting various GPUs. Our MoMA-based BLAS…
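To make the idea of breaking large bit-width arithmetic into machine-word operations concrete, here is a minimal Python sketch of double-word modular addition: a 128-bit value is held as two 64-bit words, and every step uses only single-word adds, subtracts, and comparisons. This is an illustration under stated assumptions, not the paper's formalization or generated code. The 64-bit word size, the (hi, lo) layout, and all function names (`split`, `join`, `add_words`, `sub_words`, `dw_less`, `dw_mod_add`) are hypothetical choices made for this example.

```python
# Illustrative sketch only: names and word layout are assumptions,
# not taken from the paper.
WORD_BITS = 64
MASK = (1 << WORD_BITS) - 1


def split(x):
    """Split a value below 2**128 into a (hi, lo) pair of 64-bit words."""
    return (x >> WORD_BITS) & MASK, x & MASK


def join(hi, lo):
    """Recombine a (hi, lo) pair into one Python integer."""
    return (hi << WORD_BITS) | lo


def add_words(a, b, carry_in=0):
    """Single-word add with carry: returns (sum_word, carry_out)."""
    s = a + b + carry_in
    return s & MASK, s >> WORD_BITS


def sub_words(a, b, borrow_in=0):
    """Single-word subtract with borrow: returns (diff_word, borrow_out)."""
    d = a - b - borrow_in
    return d & MASK, 1 if d < 0 else 0


def dw_less(a, b):
    """Compare two double words using only word-level comparisons."""
    return a[0] < b[0] or (a[0] == b[0] and a[1] < b[1])


def dw_mod_add(a, b, q):
    """Compute (a + b) mod q for double words with a, b < q < 2**128."""
    lo, carry = add_words(a[1], b[1])
    hi, carry = add_words(a[0], b[0], carry)
    # Subtract q once if the sum overflowed 128 bits or is not below q.
    if carry or not dw_less((hi, lo), q):
        lo, borrow = sub_words(lo, q[1])
        hi, _ = sub_words(hi, q[0], borrow)
    return hi, lo


# Usage: check the word-level result against Python's built-in big integers.
q = (1 << 127) - 1
a, b = q - 1, q - 2
assert join(*dw_mod_add(split(a), split(b), split(q))) == (a + b) % q
```

The same pattern could in principle be applied again, treating a 256-bit value as a pair of 128-bit values, which is one way to read the abstract's "recursive rewriting of data types"; how the paper's rewrite system actually does this is not shown in the text above.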
