PolyKAN: Efficient Fused GPU Operators for Polynomial Kolmogorov-Arnold Network Variants
Mingkun Yu, Heming Zhong, Dan Huang, Yutong Lu, Jiazhi Jiang

TL;DR
PolyKAN is a GPU-accelerated operator library that speeds up training and inference for polynomial Kolmogorov-Arnold Network variants without changing model accuracy.
Contribution
This work introduces what the authors describe as the first general open-source implementation of KAN and its variants, fusing the forward and backward passes of polynomial KAN layers into a small set of optimized CUDA kernels.
Findings
Achieves 1.2–10× faster inference
Achieves 1.4–12× faster training
Maintains identical accuracy across workloads
Abstract
Kolmogorov-Arnold Networks (KANs) promise higher expressive capability and stronger interpretability than Multi-Layer Perceptrons, particularly in the domain of AI for Science. However, practical adoption has been hindered by the low GPU utilization of existing parallel implementations. To address this challenge, we present a GPU-accelerated operator library named PolyKAN, which is the first general open-source implementation of KAN and its variants. PolyKAN fuses the forward and backward passes of polynomial KAN layers into a concise set of optimized CUDA kernels. Four orthogonal techniques underpin the design: (i) a lookup table with linear interpolation that replaces expensive runtime math-library functions; (ii) 2D tiling to expose thread-level parallelism while preserving memory locality; (iii) a two-stage reduction scheme converting scattered atomic updates into a…
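To make technique (i) concrete, the following is a minimal CPU-side sketch in Python of a lookup table with linear interpolation. It is not PolyKAN's CUDA code. It assumes, purely for illustration, that the function being tabulated is the Chebyshev polynomial T_k(x) = cos(k·arccos(x)) on [-1, 1]; the abstract does not say which functions the library tabulates. The names `chebyshev_exact`, `build_table`, and `lookup`, and the table size of 1025, are hypothetical.

```python
import math


def chebyshev_exact(k, x):
    """Reference value T_k(x) = cos(k * arccos(x)) for x in [-1, 1]."""
    return math.cos(k * math.acos(x))


def build_table(k, size=1025):
    """Sample T_k at `size` evenly spaced points covering [-1, 1].

    The table size is an illustrative assumption, not a value from the paper.
    """
    step = 2.0 / (size - 1)
    # Clamp each sample point so rounding can never push it outside acos's domain.
    return [chebyshev_exact(k, max(-1.0, min(1.0, -1.0 + i * step)))
            for i in range(size)]


def lookup(table, x):
    """Approximate T_k(x) by interpolating linearly between two table entries."""
    x = max(-1.0, min(1.0, x))                # clamp the input to the table range
    pos = (x + 1.0) * 0.5 * (len(table) - 1)  # fractional index into the table
    i = min(int(pos), len(table) - 2)         # left neighbour; keep i + 1 in range
    frac = pos - i
    return table[i] + frac * (table[i + 1] - table[i])


# Compare the table-based approximation with the exact formula for k = 3.
table = build_table(k=3)
samples = [j / 500.0 - 1.0 for j in range(1001)]
max_err = max(abs(lookup(table, x) - chebyshev_exact(3, x)) for x in samples)
print("max abs error:", max_err)
```

The trade-off in this sketch is that each evaluation costs two table reads and one multiply-add instead of a trigonometric call, at the price of an interpolation error that shrinks quadratically with the spacing between table entries.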
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Memory and Neural Computing · Graph Theory and Algorithms
