KANtize: Exploring Low-bit Quantization of Kolmogorov-Arnold Networks for Efficient Inference

Sohaib Errabii; Olivier Sentieys; Marcello Traiola

arXiv:2603.17230·cs.AR·March 19, 2026

TL;DR

This paper explores low-bit quantization of Kolmogorov-Arnold Networks (KANs), demonstrating significant computational and hardware efficiency gains with minimal accuracy loss, especially when using 2-3 bit quantization.

Contribution

It introduces low-bit quantization techniques for KANs, enabling efficient inference with negligible accuracy loss and substantial hardware resource savings.

Findings

01

Quantizing B-splines to 2-3 bits maintains accuracy.

02

50x reduction in BitOps with low-bit quantized B-spline tables.

03

Hardware implementations show significant resource and speed improvements.
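To make the idea of a low-bit quantized B-spline table concrete, here is a minimal sketch, not the paper's implementation. It tabulates cubic B-spline basis functions on a uniform grid with the Cox-de Boor recursion and stores them as 3-bit unsigned codes. The degree, grid size, table length, and the simple unsigned uniform quantizer are all assumptions chosen for illustration, and the helper names `bspline_basis` and `quantize_unsigned` are hypothetical.

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function at the points x (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=np.float64)[:, None]
    t = np.asarray(knots, dtype=np.float64)
    # Degree 0: indicator of each half-open knot interval [t_i, t_{i+1}).
    basis = ((x >= t[:-1]) & (x < t[1:])).astype(np.float64)
    for k in range(1, degree + 1):
        left = (x - t[:-(k + 1)]) / (t[k:-1] - t[:-(k + 1)])
        right = (t[k + 1:] - x) / (t[k + 1:] - t[1:-k])
        basis = left * basis[:, :-1] + right * basis[:, 1:]
    return basis  # shape: (len(x), len(knots) - 1 - degree)

def quantize_unsigned(values, bits):
    """Uniformly quantize values in [0, 1] to `bits`-bit unsigned codes (bits <= 8)."""
    levels = 2 ** bits - 1
    codes = np.round(np.clip(values, 0.0, 1.0) * levels).astype(np.uint8)
    return codes, 1.0 / levels  # integer codes and the dequantization scale

# Hypothetical configuration (assumed for illustration, not taken from the paper).
degree, grid_size, table_entries, bits = 3, 5, 64, 3
h = 1.0 / grid_size
knots = np.linspace(-degree * h, 1.0 + degree * h, grid_size + 2 * degree + 1)

# Sample the input range [0, 1) at bin midpoints and tabulate the basis values.
x = (np.arange(table_entries) + 0.5) / table_entries
table = bspline_basis(x, knots, degree)
codes, scale = quantize_unsigned(table, bits)

# Worst-case rounding error of the dequantized table.
max_err = np.max(np.abs(codes * scale - table))
print(table.shape, codes.dtype, max_err <= scale / 2)
```

B-spline basis values lie in [0, 1], which is why an unsigned quantizer with no zero-point is enough in this sketch. At inference time, a table like `codes` would replace the recursive basis evaluation with a lookup indexed by the quantized input.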

Abstract

Kolmogorov-Arnold Networks (KANs) have gained attention for their potential to outperform Multi-Layer Perceptrons (MLPs) in terms of parameter efficiency and interpretability. Unlike traditional MLPs, KANs use learnable non-linear activation functions, typically spline functions, expressed as linear combinations of basis splines (B-splines). B-spline coefficients serve as the model's learnable parameters. However, evaluating these spline functions increases computational complexity during inference. Conventional quantization reduces this complexity by lowering the numerical precision of parameters and activations. However, the impact of quantization on KANs, and especially its effectiveness in reducing computational complexity, is largely unexplored, particularly for quantization levels below 8 bits. The study investigates the impact of low-bit quantization on KANs and its impact on…
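The abstract's description of a KAN activation can be written out as follows. The symbols are assumptions introduced here for illustration: $\varphi$ is one learnable activation, $B_{i,k}$ are degree-$k$ B-spline basis functions on a fixed knot grid, $c_i$ are the learnable coefficients, and $G$ is the number of grid intervals. The quantizer $Q_b$ is the standard symmetric uniform $b$-bit quantizer with step size $s$, shown only as a generic example and not necessarily the scheme used in the paper.

```latex
% One KAN activation as a linear combination of B-splines
\varphi(x) = \sum_{i=0}^{G+k-1} c_i \, B_{i,k}(x)

% Generic symmetric uniform b-bit quantizer with step size s
Q_b(v) = s \cdot \operatorname{clip}\!\left( \left\lfloor \frac{v}{s} \right\rceil,\; -2^{\,b-1},\; 2^{\,b-1}-1 \right)
```

Under this notation, quantizing the parameters means replacing each $c_i$ with $Q_b(c_i)$, and quantizing the basis functions means replacing each $B_{i,k}(x)$ with a low-bit approximation.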

Taxonomy

Topics: Numerical Methods and Algorithms · Ferroelectric and Negative Capacitance Devices · Parallel Computing and Optimization Techniques