LUT-KAN: Segment-wise LUT Quantization for Fast KAN Inference
Oleksandr Kuznetsov

TL;DR
LUT-KAN compiles KAN layers into segment-wise quantized lookup tables, speeding up CPU inference by up to 12x with a negligible F1 drop, and provides a reproducible evaluation framework for spline-based models.
Contribution
The paper proposes a LUT-based quantization approach for KAN layers that enables fast, accurate, and reproducible CPU inference, with explicit boundary and out-of-bounds policies.
Findings
Achieves up to 12x speedup in CPU inference latency.
Maintains classification accuracy with negligible F1 score drop.
Reduces memory overhead by approximately 10x at LUT resolution 64.
Abstract
Kolmogorov-Arnold Networks (KAN) replace scalar weights by learnable univariate functions, often implemented with B-splines. This design can be accurate and interpretable, but it makes inference expensive on CPU because each layer requires many spline evaluations. Standard quantization toolchains are also hard to apply because the main computation is not a matrix multiply but repeated spline basis evaluation. This paper introduces LUT-KAN, a segment-wise lookup-table (LUT) compilation and quantization method for PyKAN-style KAN layers. LUT-KAN converts each edge function into a per-segment LUT with affine int8/uint8 quantization and linear interpolation. The method provides an explicit and reproducible inference contract, including boundary conventions and out-of-bounds (OOB) policies. We propose an "honest baseline" methodology for speed evaluation: B-spline evaluation and LUT…
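To make the abstract's description concrete, the following is a minimal NumPy sketch of one edge function compiled into per-segment lookup tables with affine uint8 quantization and linear interpolation. It is an illustration under stated assumptions, not the paper's implementation: the function names `build_segment_lut` and `eval_segment_lut` are hypothetical, the per-segment min/max scaling, uniform sampling, clamp-to-range OOB policy, and half-open segment convention are all assumed choices, and `np.sin` stands in for a trained spline.

```python
import numpy as np


def build_segment_lut(f, knots, resolution=64):
    """Sample f uniformly on each knot segment and quantize to uint8.

    Hypothetical helper. Returns (codes, scales, offsets), where codes has
    shape (n_segments, resolution + 1) and each segment carries its own
    affine parameters: value ~= offset + scale * code.
    """
    knots = np.asarray(knots, dtype=np.float64)
    t = np.linspace(0.0, 1.0, resolution + 1)
    widths = knots[1:] - knots[:-1]
    # One row of sample points per segment, both endpoints included.
    xs = knots[:-1, None] + t[None, :] * widths[:, None]
    ys = f(xs)
    lo = ys.min(axis=1)
    hi = ys.max(axis=1)
    # Assumed scaling: map each segment's [min, max] onto [0, 255].
    scales = np.where(hi > lo, (hi - lo) / 255.0, 1.0)
    codes = np.rint((ys - lo[:, None]) / scales[:, None]).astype(np.uint8)
    return codes, scales, lo


def eval_segment_lut(x, knots, codes, scales, offsets):
    """Evaluate the quantized LUT at x with linear interpolation."""
    knots = np.asarray(knots, dtype=np.float64)
    n_seg, n_pts = codes.shape
    # Assumed OOB policy: clamp inputs to the knot range.
    x = np.clip(np.asarray(x, dtype=np.float64), knots[0], knots[-1])
    # Assumed boundary convention: segments are [k_i, k_{i+1}),
    # with the last segment closed on the right.
    seg = np.clip(np.searchsorted(knots, x, side="right") - 1, 0, n_seg - 1)
    # Fractional position inside the segment, in LUT index units.
    u = (x - knots[seg]) / (knots[seg + 1] - knots[seg]) * (n_pts - 1)
    j = np.minimum(u.astype(np.int64), n_pts - 2)
    frac = u - j
    y0 = codes[seg, j].astype(np.float64)
    y1 = codes[seg, j + 1].astype(np.float64)
    # Interpolate in code space, then dequantize with the segment's affine map.
    return offsets[seg] + scales[seg] * (y0 + frac * (y1 - y0))


if __name__ == "__main__":
    knots = np.linspace(-3.0, 3.0, 7)
    codes, scales, offsets = build_segment_lut(np.sin, knots, resolution=64)
    xs = np.linspace(-3.0, 3.0, 1001)
    approx = eval_segment_lut(xs, knots, codes, scales, offsets)
    print("max abs error:", np.max(np.abs(approx - np.sin(xs))))
```

In this sketch, inference needs only a segment search, two table reads, and one multiply-add per input, with no spline basis evaluation. Storing separate scale and offset values per segment keeps quantization error proportional to the local range of the function rather than its global range.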
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Security and Verification in Computing · Cryptographic Implementations and Security
