Leveraging KANs for Expedient Training of Multichannel MLPs via Preconditioning and Geometric Refinement

Jonas A. Actor; Graham Harper; Ben Southworth; Eric C. Cyr

arXiv:2505.18131 · cs.LG · May 26, 2025


TL;DR

This paper uses the structural relationship between Kolmogorov-Arnold Networks (KANs) and multichannel MLPs to accelerate MLP training, reporting faster training and improved accuracy on scientific machine learning tasks.

Contribution

It establishes a structural link between KANs and multichannel MLPs, enabling a hierarchical refinement scheme for expedited training and improved model performance.

Findings

1. Hierarchical refinement accelerates MLP training significantly.

2. Training of spline knot locations enhances accuracy.

3. Structural insights from KANs improve training efficiency.
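
To make the idea of hierarchical refinement concrete, here is a minimal sketch in plain Python. It assumes a one-dimensional piecewise-linear spline and a midpoint knot-insertion rule; the function names `interp` and `refine` are hypothetical, and this is a generic illustration of nested spline spaces rather than the refinement scheme used in the paper, which this page does not spell out. The point it shows is that a coarse spline can be moved to a finer knot grid without changing the function it represents, so training could in principle continue on the finer model from where the coarse one left off.

```python
def interp(x, knots, coeffs):
    """Evaluate a piecewise-linear spline with nodal values `coeffs` at sorted `knots`."""
    if x <= knots[0]:
        return coeffs[0]
    if x >= knots[-1]:
        return coeffs[-1]
    for i in range(len(knots) - 1):
        a, b = knots[i], knots[i + 1]
        if a <= x <= b:
            t = (x - a) / (b - a)
            return (1.0 - t) * coeffs[i] + t * coeffs[i + 1]


def refine(knots, coeffs):
    """Insert a midpoint knot in every interval (assumed refinement rule).

    Each new coefficient is the average of its two neighbours, which is the
    value of the coarse spline at the midpoint, so the function is unchanged.
    """
    new_knots, new_coeffs = [knots[0]], [coeffs[0]]
    for i in range(1, len(knots)):
        new_knots.append(0.5 * (knots[i - 1] + knots[i]))
        new_coeffs.append(0.5 * (coeffs[i - 1] + coeffs[i]))
        new_knots.append(knots[i])
        new_coeffs.append(coeffs[i])
    return new_knots, new_coeffs
```

Calling `refine` repeatedly gives a coarse-to-fine hierarchy of knot grids, each representing the same function as the previous level but with more trainable coefficients.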

Abstract

Multilayer perceptrons (MLPs) are a workhorse machine learning architecture, used in a variety of modern deep learning frameworks. Recently, however, Kolmogorov-Arnold Networks (KANs) have become increasingly popular due to their success on a range of problems, particularly scientific machine learning tasks. In this paper, we exploit the relationship between KANs and multichannel MLPs to gain structural insight into how to train MLPs faster. We demonstrate that the KAN basis (1) provides geometric localized support and (2) acts as a preconditioned descent in the ReLU basis, overall resulting in expedited training and improved accuracy. Our results show the equivalence between free-knot spline KAN architectures and a class of MLPs that are refined geometrically along the channel dimension of each weight tensor. We exploit this structural equivalence to define a hierarchical refinement…
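
As a small illustration of how a localized spline basis relates to the ReLU basis, the sketch below checks a standard identity: a piecewise-linear hat function on uniform knots equals a scaled second difference of three shifted ReLUs. The function names and the choice of a uniform knot spacing `h` are assumptions made for this example; it is not the authors' code, and it does not reproduce the paper's preconditioning argument, only the elementary change of basis that makes such a relationship plausible.

```python
def relu(x):
    """Rectified linear unit on a scalar."""
    return max(x, 0.0)


def hat(x, center, h):
    """Piecewise-linear hat function centered at `center` with knot spacing `h`.

    It is nonzero only on (center - h, center + h), i.e. it has local support.
    """
    return max(0.0, 1.0 - abs(x - center) / h)


def hat_from_relu(x, center, h):
    """The same hat function written as a second difference of three ReLUs."""
    return (
        relu(x - (center - h))
        - 2.0 * relu(x - center)
        + relu(x - (center + h))
    ) / h


def max_abs_gap(center=0.5, h=0.25, samples=201):
    """Largest pointwise difference between the two forms on [-1, 2]."""
    xs = [-1.0 + 3.0 * i / (samples - 1) for i in range(samples)]
    return max(abs(hat(x, center, h) - hat_from_relu(x, center, h)) for x in xs)
```

Because each hat function is a fixed linear combination of ReLUs, coefficients in one basis map linearly to coefficients in the other, and gradient descent on the hat coefficients corresponds to a linearly transformed descent on the ReLU coefficients.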
