HERO-Sign: Hierarchical Tuning and Efficient Compiler-Time GPU Optimizations for SPHINCS+ Signature Generation
Yaoyun Zhou, Qian Wang

TL;DR
HERO-Sign is a GPU-optimized implementation of SPHINCS+ that uses hierarchical tuning and compiler-level optimizations to accelerate signature generation, achieving up to 3.13x higher throughput than state-of-the-art GPU implementations.
Contribution
The paper introduces a hierarchical tuning framework and compiler-level optimizations specifically designed for GPU acceleration of SPHINCS+ signature generation, including a Tree Fusion strategy and adaptive compilation techniques.
Findings
Achieves up to 3.13x throughput improvement over state-of-the-art GPU implementations.
Reduces kernel launch latency by two orders of magnitude.
Effectively exploits parallelism in SPHINCS+ components across various GPU architectures.
Abstract
SPHINCS+ is a stateless hash-based signature scheme that provides strong post-quantum security, but its signature generation is slow due to intensive hash computations. GPUs offer massive parallelism that can potentially accelerate SPHINCS+ signatures. However, existing GPU-based optimizations either fail to fully exploit the inherent parallelism of SPHINCS+'s Merkle tree structure or lack fine-grained, compiler-level customization across its diverse computational kernels. This paper proposes HERO-Sign, a GPU-accelerated SPHINCS+ implementation that adopts hierarchical tuning and efficient compiler-time optimizations. HERO-Sign reexamines the parallelization opportunities enabled by data independence across SPHINCS+ components, including FORS, MSS, and WOTS+. It introduces a Tree Fusion strategy for FORS, which contains a large number of independent branches. The fusion strategy is…
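The data independence mentioned in the abstract can be illustrated with a small CPU-side sketch. It is not the paper's Tree Fusion strategy or its GPU kernels. It only shows that each tree in a FORS-like forest can be reduced to its root without reading any other tree's data, which is the property that makes the trees safe to process concurrently. Everything here is an assumption made for illustration: the function names `H`, `tree_root`, and `fors_like_roots` are hypothetical, plain SHA-256 stands in for the SPHINCS+ tweakable hash, and the leaf derivation omits the real addressing scheme.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def H(*parts):
    """Stand-in hash (assumption): plain SHA-256, not the SPHINCS+ tweakable hash."""
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()


def tree_root(seed, tree_idx, height):
    """Merkle root of one toy tree with 2**height leaves.

    Leaves depend only on (seed, tree_idx, leaf index), so no tree
    ever reads data that belongs to another tree.
    """
    level = [
        H(seed, tree_idx.to_bytes(4, "big"), i.to_bytes(4, "big"))
        for i in range(2 ** height)
    ]
    # Hash adjacent pairs until a single node (the root) remains.
    while len(level) > 1:
        level = [H(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def fors_like_roots(seed, k, height, workers=4):
    """Compute the roots of k independent trees, one task per tree."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so roots[t] belongs to tree t.
        return list(pool.map(lambda t: tree_root(seed, t, height), range(k)))


if __name__ == "__main__":
    roots = fors_like_roots(b"\x00" * 32, k=8, height=4)
    print(len(roots), "roots computed")
```

The thread pool is used only to make the independence explicit; in CPython it should not be expected to give a speedup for small hash inputs. On a GPU, the same independence is what would allow separate trees to be assigned to separate threads or thread blocks.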
Taxonomy
Topics: Cryptographic Implementations and Security · Parallel Computing and Optimization Techniques · Security and Verification in Computing
