Efficient Additions and Montgomery Reductions of Large Integers for SIMD
Pengchang Ren, Reiji Suda, Vorapong Suppakitpaisarn

TL;DR
This paper introduces SIMD-optimized algorithms for large integer addition and Montgomery reduction, significantly improving the performance of post-quantum cryptography implementations on modern processors.
Contribution
A novel SIMD-friendly addition algorithm and precomputed Montgomery reduction techniques that outperform existing methods for large integers over 512 bits.
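For context, the serial dependency that the precomputation targets can be seen in textbook word-by-word Montgomery reduction. The sketch below is that standard baseline, not the paper's precomputed variant; the 64-bit limb width and the names `montgomery_reduce`, `n_prime`, and `n_limbs` are assumptions made for illustration.

```python
# Textbook word-serial Montgomery reduction (baseline sketch only, not the
# paper's method). The limb width W = 64 is an assumption for illustration.
W = 64
R_WORD = 1 << W


def montgomery_reduce(t, n, n_limbs):
    """Return t * R^-1 mod n, where R = 2^(W * n_limbs).

    Requires an odd modulus n and an input t < n * R.
    """
    # -n^-1 mod 2^W, computed once per modulus (needs Python 3.8+).
    n_prime = (-pow(n, -1, R_WORD)) % R_WORD
    for _ in range(n_limbs):
        # q depends on the low limb of t, which the previous iteration has
        # just modified: each multiplication must wait for the one before it.
        q = ((t & (R_WORD - 1)) * n_prime) % R_WORD
        # Adding q * n zeroes the low limb, which is then shifted out.
        t = (t + q * n) >> W
    # One conditional subtraction brings the result into [0, n).
    return t - n if t >= n else t
```

Each loop iteration needs the updated low limb of `t` before it can form `q`, which is the chain of serial multiplications that limits SIMD parallelism.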
Findings
30% speed-up in the CTIDH implementation
11% speed-up in CSIDH on AVX-512
7% speed-up for SIKEp503 on A64FX
Abstract
This paper presents efficient algorithms, designed to leverage SIMD for performing Montgomery reductions and additions on integers larger than 512 bits. The existing algorithms encounter inefficiencies when parallelized using SIMD due to extensive dependencies in both operations, particularly noticeable in costly operations like ARM's SVE. To mitigate this problem, a novel addition algorithm is introduced that simulates the addition of large integers using a smaller addition, quickly producing the same set of carries. These carries are then utilized to perform parallel additions on large integers. For Montgomery reductions, serial multiplications are replaced with precomputations that can be effectively calculated using SIMD extensions. Experimental evidence demonstrates that these proposed algorithms substantially enhance the performance of state-of-the-art implementations of several…
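The idea of simulating a large addition with a smaller one can be illustrated with a well-known mask-based formulation. This is a minimal sketch in plain Python, assuming 64-bit limbs standing in for SIMD lanes; the function names are hypothetical and the paper's actual algorithm may differ in its details.

```python
# Sketch only: limb width, names, and data layout are assumptions,
# not taken from the paper's implementation.
W = 64                       # bits per limb (one SIMD lane)
MASK = (1 << W) - 1


def to_limbs(x, n):
    """Split a non-negative integer into n little-endian W-bit limbs."""
    return [(x >> (W * i)) & MASK for i in range(n)]


def from_limbs(limbs):
    """Reassemble little-endian limbs into one integer."""
    return sum(limb << (W * i) for i, limb in enumerate(limbs))


def add_carry_simulated(a, b):
    """Add two equal-length limb vectors; return (sum_limbs, carry_out)."""
    n = len(a)
    # Step 1: lane-wise sums with no carry propagation (one SIMD add).
    s = [(x + y) & MASK for x, y in zip(a, b)]
    # Step 2: one bit per lane. g marks lanes that overflowed; p marks lanes
    # equal to all ones, through which an incoming carry would pass.
    g = sum(1 << i for i in range(n) if s[i] < a[i])
    p = sum(1 << i for i in range(n) if s[i] == MASK)
    # Step 3: a single small addition replays the carry chain of the big one.
    t = p + (g << 1)
    carries = t ^ p          # bit i set: lane i receives a carry-in
    # Step 4: apply all carry-ins at once (one masked SIMD add).
    out = [(s[i] + ((carries >> i) & 1)) & MASK for i in range(n)]
    return out, (carries >> n) & 1
```

For eight limbs (512 bits), steps 1 and 4 are each a single lane-parallel operation, and the only sequential work left is the 9-bit addition in step 3.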
Taxonomy
Topics: Coding theory and cryptography · Cryptography and Residue Arithmetic · Cryptography and Data Security
