Multi-Exit Kolmogorov-Arnold Networks: enhancing accuracy and parsimony

James Bagrow; Josh Bongard

arXiv:2506.03302 · cs.LG · August 22, 2025

TL;DR

This paper introduces multi-exit Kolmogorov-Arnold Networks that enable early predictions at multiple depths, improving training and interpretability while maintaining high accuracy across various datasets.

Contribution

The paper proposes multi-exit KANs with a novel differentiable learning-to-exit algorithm, enhancing model efficiency, interpretability, and performance in scientific modeling tasks.

Findings

01. Multi-exit KANs outperform single-exit versions on diverse datasets.

02. Early exits often provide the most accurate and interpretable predictions.

03. The learning-to-exit algorithm effectively balances contributions from multiple exits.
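
To make "balancing contributions from multiple exits" concrete, here is a minimal sketch of one differentiable way to weight exits. It is an illustration under stated assumptions, not the paper's algorithm: the softmax parameterization and the names `weighted_exit_loss` and `update_exit_logits` are hypothetical. Each exit gets a learnable logit, the weights are the softmax of those logits, and the combined loss is the weighted sum of per-exit losses, so gradient descent on the logits shifts weight toward exits with lower loss.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def weighted_exit_loss(logits, exit_losses):
    """Combined loss sum_i w_i * L_i with w = softmax(logits).

    Assumption: the exit weights are a softmax over learnable logits.
    Returns the total loss, the weights, and the gradient with respect
    to the logits, which is w_j * (L_j - total) for exit j.
    """
    w = softmax(logits)
    total = sum(wi * li for wi, li in zip(w, exit_losses))
    grad = [wi * (li - total) for wi, li in zip(w, exit_losses)]
    return total, w, grad

def update_exit_logits(logits, exit_losses, lr=0.5):
    """One gradient-descent step on the exit logits (per-exit losses held fixed)."""
    _, _, grad = weighted_exit_loss(logits, exit_losses)
    return [z - lr * g for z, g in zip(logits, grad)]

# Hypothetical per-exit losses: the shallowest exit happens to fit best.
losses = [0.1, 0.5, 0.9]
logits = [0.0, 0.0, 0.0]
for _ in range(20):
    logits = update_exit_logits(logits, losses)
weights = softmax(logits)
```

In this sketch, minimizing the weighted loss alone pushes all weight toward the single best exit, so a practical scheme would likely need some additional mechanism to keep the other exits trained; that choice is left open here.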

Abstract

Kolmogorov-Arnold Networks (KANs) uniquely combine high accuracy with interpretability, making them valuable for scientific modeling. However, it is unclear a priori how deep a network needs to be for any given task, and deeper KANs can be difficult to optimize and interpret. Here we introduce multi-exit KANs, where each layer includes its own prediction branch, enabling the network to make accurate predictions at multiple depths simultaneously. This architecture provides deep supervision that improves training while discovering the right level of model complexity for each task. Multi-exit KANs consistently outperform standard, single-exit versions on synthetic functions, dynamical systems, and real-world datasets. Remarkably, the best predictions often come from earlier, simpler exits, revealing that these networks naturally identify smaller, more parsimonious and interpretable models…
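
The architecture described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the fixed three-function basis stands in for the learnable splines a real KAN would use, each exit branch is assumed to be a single KAN-style layer, and all function names are hypothetical. The point is the structure: every hidden layer feeds its own prediction branch, and the training loss sums the errors of all exits (deep supervision).

```python
import math
import random

# Assumption: a small fixed basis replaces the learnable B-splines of a real KAN.
BASIS = (lambda x: x, lambda x: x * x, math.sin)

def init_kan_layer(n_in, n_out, rng):
    """Coefficients c[j][i][k] of edge function phi_ji(x) = sum_k c[j][i][k] * BASIS[k](x)."""
    return [[[rng.uniform(-0.5, 0.5) for _ in BASIS] for _ in range(n_in)]
            for _ in range(n_out)]

def kan_layer(coeffs, x):
    """Each output is a sum, over inputs, of a univariate function of that input."""
    return [sum(c * b(xi) for edge, xi in zip(row, x) for c, b in zip(edge, BASIS))
            for row in coeffs]

def init_multi_exit(widths, n_out, seed=0):
    """Build the main body plus one exit branch per hidden layer."""
    rng = random.Random(seed)
    body = [init_kan_layer(a, b, rng) for a, b in zip(widths[:-1], widths[1:])]
    exits = [init_kan_layer(b, n_out, rng) for b in widths[1:]]
    return body, exits

def forward(body, exits, x):
    """Return one prediction per exit, shallowest first."""
    preds, h = [], x
    for layer, exit_branch in zip(body, exits):
        h = kan_layer(layer, h)
        preds.append(kan_layer(exit_branch, h))
    return preds

def multi_exit_loss(preds, target, weights):
    """Weighted sum of per-exit mean squared errors (deep supervision)."""
    per_exit = [sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)
                for pred in preds]
    return sum(w * l for w, l in zip(weights, per_exit)), per_exit

# Two hidden layers of width 3 on a 2-dimensional input, scalar output.
body, exits = init_multi_exit([2, 3, 3], n_out=1)
preds = forward(body, exits, [0.1, -0.2])
total, per_exit = multi_exit_loss(preds, [0.0], weights=[0.5, 0.5])
```

Comparing the entries of `per_exit` after training is what would reveal whether a shallower exit already fits the data as well as the deepest one.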
