Truncated multiplication and batch software SIMD AVX512 implementation for faster Montgomery multiplications and modular exponentiation
Laurent-Stéphane Didier (IMATH), Nadia Mrabet, Léa Glandus (IMATH), Jean-Marc Robert (IMATH)

TL;DR
This paper introduces optimized SIMD AVX512 implementations for batch multi-precision modular arithmetic, including a novel truncated Montgomery multiplication that significantly accelerates cryptographic computations.
Contribution
It presents a new truncated multiplication method for faster Montgomery reductions and demonstrates substantial speed improvements over existing libraries in batch cryptographic operations.
Findings
Truncated Montgomery modular multiplication is almost 20% faster than the conventional non-truncated versions.
The implementations are more than 4 times faster than GMP and OpenSSL for modular operations.
Fixed-window exponentiation shows speedups of 1.75x and 1.38x for 1024-bit and 2048-bit sizes, respectively.
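For readers unfamiliar with the fixed-window exponentiation mentioned in the findings, here is a minimal Python sketch of the general technique. It is not the paper's implementation: the function name `fixed_window_pow`, the window width `w=4`, and the `bits` parameter are assumptions made for illustration, and plain Python integers are not constant-time.

```python
def fixed_window_pow(base, exp, mod, w=4, bits=None):
    """Left-to-right fixed-window exponentiation (illustrative sketch).

    The exponent is cut into windows of w bits. Each window costs w squarings
    and exactly one multiplication by a precomputed power, so the sequence of
    operations does not depend on the exponent bits.
    """
    if bits is None:
        bits = max(exp.bit_length(), 1)
    n_windows = -(-bits // w)          # ceiling division
    window_mask = (1 << w) - 1

    # Precompute base^0, base^1, ..., base^(2^w - 1) mod mod.
    table = [1 % mod]
    for _ in range(window_mask):
        table.append(table[-1] * base % mod)

    acc = 1 % mod
    for i in reversed(range(n_windows)):
        for _ in range(w):
            acc = acc * acc % mod      # w squarings per window
        digit = (exp >> (w * i)) & window_mask
        # Always multiply, even when digit == 0 (table[0] is 1).
        # A real constant-time version would also hide the table index.
        acc = acc * table[digit] % mod
    return acc
```

Passing a fixed `bits` value (for example 1024 or 2048) keeps the number of windows independent of the actual exponent, which is the usual choice when the exponent is secret.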
Abstract
This paper presents software implementations of batch computations on multi-precision integers. In this work, we use the Single Instruction Multiple Data (SIMD) AVX512 instruction set of x86-64 processors, in particular the vectorized fused multiplier-adder VPMADD52. We focus on batch multiplications, squarings, modular multiplications, modular squarings and constant-time modular exponentiations of 8 values using a word-slicing storage. We explore the use of Schoolbook and Karatsuba approaches with operands up to 4108 and 4154 bits respectively. We also introduce a truncated multiplication that speeds up the computation of the Montgomery modular reduction in the context of software implementation. Our Truncated Montgomery modular multiplication improvement offers speed gains of almost 20% over the conventional non-truncated versions. Compared to the…
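For orientation, the conventional (non-truncated) Montgomery multiplication that the abstract uses as its baseline can be sketched in Python as follows. This is an illustrative sketch, not the paper's AVX512 code and not its truncated variant. The names `to_limbs`, `mont_mul` and `word_slice` are hypothetical, the 52-bit limb width is chosen to mirror the 52-bit multiplier of VPMADD52, and the layout in `word_slice` is one plausible reading of "word-slicing storage". Python 3.8 or later is assumed for `pow(m, -1, ...)`.

```python
W = 52                      # limb width in bits (assumption: mirrors VPMADD52)
MASK = (1 << W) - 1

def to_limbs(x, n):
    """Split x into n little-endian limbs of W bits each."""
    return [(x >> (W * i)) & MASK for i in range(n)]

def mont_mul(a, b, m, n):
    """Return a * b * R^-1 mod m, with R = 2^(W*n).

    Requires m odd, m < R, and a, b < m. One limb of b is consumed per
    iteration, and the reduction step clears the lowest limb of the
    accumulator so that the shift by W bits is exact.
    """
    m_inv = (-pow(m, -1, 1 << W)) & MASK    # -m^-1 mod 2^W
    t = 0
    for b_i in to_limbs(b, n):
        t += a * b_i                        # accumulate partial product
        q = ((t & MASK) * m_inv) & MASK     # t + q*m is divisible by 2^W
        t = (t + q * m) >> W                # exact division by 2^W
    return t - m if t >= m else t           # final conditional subtraction

def word_slice(values, n):
    """Transpose a batch of integers into n rows.

    Row i holds limb i of every value, so one 512-bit vector could carry the
    same limb position of 8 independent operands (assumed layout).
    """
    return [[(v >> (W * i)) & MASK for v in values] for i in range(n)]
```

The result of `mont_mul` stays in the Montgomery domain: multiplying two values of the form x*R mod m yields (x*y)*R mod m, which is why operands are converted once before a long chain of multiplications such as an exponentiation.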
