Wahkon: A Statistically Principled Deep RKHS Superposition Network

Yongkai Chen; Wenxuan Zhong; Ping Ma

arXiv:2605.14041 · stat.ME · May 15, 2026

TL;DR

Wahkon is a deep RKHS superposition network that combines Kolmogorov's superposition principle with RKHS regularization. It offers finite-sample guarantees and interpretability, and the authors report that it outperforms MLPs, NTKs, and Kolmogorov-Arnold Networks on benchmarks.

Contribution

Wahkon unifies Kolmogorov's superposition principle with RKHS regularization, which yields a finite-dimensional deep representer theorem that makes training tractable and gives explicit layerwise complexity control.
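
For orientation, the two classical ingredients named above can be written out as follows. This is a sketch of the standard textbook statements only: the Kolmogorov superposition representation of a continuous function on the unit cube, and the single-layer RKHS representer theorem for penalized least squares. The symbols (\Phi_q, \phi_{q,p}, k, K, c, \lambda) are notation assumed here, and the paper's deep, layerwise version is not reproduced on this page.

```latex
% Kolmogorov superposition representation of a continuous f on [0,1]^d
f(x_1,\dots,x_d) \;=\; \sum_{q=0}^{2d} \Phi_q\!\Big( \sum_{p=1}^{d} \phi_{q,p}(x_p) \Big)

% Classical representer theorem in an RKHS \mathcal{H} with kernel k,
% where K_{ij} = k(x_i, x_j)
\hat f \;=\; \arg\min_{f\in\mathcal H}\;
  \frac{1}{n}\sum_{i=1}^{n} \big(y_i - f(x_i)\big)^2 + \lambda\,\|f\|_{\mathcal H}^2
\quad\Longrightarrow\quad
\hat f(\cdot) \;=\; \sum_{i=1}^{n} c_i\, k(\cdot, x_i),
\qquad c = (K + n\lambda I)^{-1} y
```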

Findings

01

Wahkon outperforms MLPs, NTKs, and Kolmogorov-Arnold Networks in benchmarks.

02

The authors establish minimax-optimal convergence rates under mild smoothness assumptions.

03

The paper provides a finite-dimensional deep representer theorem with explicit layerwise complexity control.

Abstract

Deep learning excels at prediction but often lacks finite-sample guarantees and calibrated uncertainty; reproducing kernel Hilbert space (RKHS) methods provide those guarantees but struggle to adapt in high dimensions. We propose Wahkon, a deep RKHS superposition network that unifies Kolmogorov's superposition principle with RKHS regularization in the smoothing-spline tradition of Wahba. This yields a finite-dimensional deep representer theorem that makes training tractable and provides explicit layerwise complexity control. We show the penalized estimator is exactly the MAP (maximum a posteriori) estimate under a hierarchical Gaussian-process prior, extending the spline/GP duality to deep compositions. Using metric-entropy arguments, we establish minimax-optimal convergence rates under mild smoothness and clarify how depth and width trade off with regularity. Empirically, Wahkon…
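
The spline/GP duality mentioned in the abstract can be checked numerically in its classical single-layer form. The following is a minimal sketch under stated assumptions, not the Wahkon estimator: it uses one RKHS layer, a Gaussian (RBF) kernel, synthetic 1-D data, and hypothetical values for the lengthscale and the penalty `lam`. It compares the penalized least-squares fit given by the representer theorem with the posterior mean of a zero-mean Gaussian process whose noise variance is set to `n * lam`.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.5):
    """Gaussian (RBF) kernel matrix between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

# Synthetic data (assumption: chosen only for illustration).
rng = np.random.default_rng(0)
n = 30
x_train = np.sort(rng.uniform(0.0, 1.0, n))
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal(n)
x_test = np.linspace(0.0, 1.0, 7)

lam = 1e-2  # hypothetical regularization strength

K = rbf_kernel(x_train, x_train)       # n x n Gram matrix
K_star = rbf_kernel(x_test, x_train)   # test-vs-train kernel matrix

# Penalized view: minimize (1/n) * sum (y_i - f(x_i))^2 + lam * ||f||_H^2.
# By the representer theorem, f = sum_i c_i k(., x_i) with
# c = (K + n * lam * I)^{-1} y.
coef = np.linalg.solve(K + n * lam * np.eye(n), y_train)
f_penalized = K_star @ coef

# Bayesian view: f ~ GP(0, k), y_i = f(x_i) + noise with variance n * lam.
# The posterior mean (which is also the MAP estimate for a Gaussian
# posterior) is computed here through a Cholesky factorization.
sigma2 = n * lam
chol = np.linalg.cholesky(K + sigma2 * np.eye(n))
alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
f_gp_mean = K_star @ alpha

# Largest pointwise disagreement between the two predictions.
max_gap = float(np.max(np.abs(f_penalized - f_gp_mean)))
print("max |penalized - GP mean| =", max_gap)
```

The two predictions agree up to floating-point error because both reduce to the same linear system once the noise variance is matched to `n * lam`. The abstract states that Wahkon extends this correspondence to deep compositions under a hierarchical Gaussian-process prior; that extension is not implemented here.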
