An Efficient Sparse Kernel Generator for O(3)-Equivariant Deep Networks
Vivek Bharadwaj, Austin Glover, Aydin Buluc, James Demmel

TL;DR
This paper presents a GPU sparse kernel generator for the Clebsch-Gordan tensor product in O(3)-equivariant neural networks, significantly accelerating computation and reducing memory usage for spatial deep learning tasks.
Contribution
The authors introduce a GPU kernel generator that optimizes the CG tensor product, reporting speedups of up to 1.3x over NVIDIA's cuEquivariance and 10x over the e3nn package.
Findings
Up to 1.3x speedup over NVIDIA's cuEquivariance
10x speedup over the e3nn package
6.2x inference-time speedup for the MACE model
Abstract
Rotation-equivariant graph neural networks, i.e., networks designed to guarantee certain geometric relations between their inputs and outputs, yield state-of-the-art performance on spatial deep learning tasks. They exhibit high data efficiency during training and significantly reduced inference time for interatomic potential calculations compared to classical approaches. Key to these models is the Clebsch-Gordan (CG) tensor product, a kernel that contracts two dense feature vectors with a highly structured sparse tensor to produce a dense output vector. The operation, which may be repeated millions of times for typical equivariant models, is a costly and inefficient bottleneck. We introduce a GPU sparse kernel generator for the CG tensor product that provides significant speedups over the best existing open and closed-source implementations. Our implementation achieves high performance…
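The contraction described in the abstract, two dense vectors combined through a sparse three-way tensor into a dense output, can be illustrated with a minimal pure-Python sketch. This is not the paper's GPU kernel or any library's API: the function name `sparse_tensor_product`, the coordinate-list storage, and the use of the Levi-Civita tensor as a stand-in are all assumptions made for illustration. The Levi-Civita tensor is chosen because, up to normalization and basis choice, it plays the role of the coefficient tensor coupling two l=1 vectors into an l=1 output (the cross product).

```python
# Hypothetical illustration only: nonzeros of a sparse 3-way tensor T,
# stored in coordinate form as (i, j, k, value) tuples.
# Here T is the Levi-Civita tensor, a stand-in for a real CG coefficient tensor.
LEVI_CIVITA = [
    (0, 1, 2, 1.0), (1, 2, 0, 1.0), (2, 0, 1, 1.0),
    (1, 0, 2, -1.0), (2, 1, 0, -1.0), (0, 2, 1, -1.0),
]


def sparse_tensor_product(x, y, nonzeros, out_dim):
    """Compute z[k] = sum over stored nonzeros of T[i, j, k] * x[i] * y[j]."""
    z = [0.0] * out_dim
    for i, j, k, value in nonzeros:
        # Only stored nonzeros are visited; zero entries of T cost nothing.
        z[k] += value * x[i] * y[j]
    return z


x = [1.0, 0.0, 0.0]  # dense input vector
y = [0.0, 1.0, 0.0]  # dense input vector
z = sparse_tensor_product(x, y, LEVI_CIVITA, out_dim=3)
print(z)  # [0.0, 0.0, 1.0], i.e. the cross product of x and y
```

Only 6 of the 27 entries of this tensor are nonzero, so the loop does 6 multiply-accumulate steps instead of 27. Exploiting that kind of fixed, known-in-advance sparsity is the general idea a generated kernel builds on.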
Taxonomy
Topics: Medical Imaging and Analysis · Medical Image Segmentation Techniques · Seismic Imaging and Inversion Techniques
