K-Means as a Radial Basis Function Network: a Variational and Gradient-based Equivalence
Felipe de Jesus Felix Arredondo, Alejandro Ucan-Puc, Carlos Astengo Noguez

TL;DR
This paper proves that K-Means clustering is the low-temperature limit of a differentiable RBF neural network, which allows clustering to be optimized jointly with deep learning architectures, and it addresses the numerical instability of low-temperature softmax by substituting Entmax-1.5.
Contribution
It establishes a formal variational and gradient-based equivalence between K-Means and RBF networks, proposes Entmax-1.5 as a numerically stable replacement for low-temperature softmax, and thereby enables end-to-end differentiable clustering.
Findings
RBF training trajectories match K-Means updates in the limit
Entmax-1.5 stabilizes low-temperature softmax computations
Soft RBF centroids converge monotonically to K-Means fixed points
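The first finding can be illustrated with a minimal NumPy sketch. It assumes softmax responsibilities over negative squared distances scaled by a temperature, and it uses the responsibility-weighted mean as the center update (the stationary point of the weighted loss when responsibilities are held fixed). The function names, the toy data, and this particular parameterization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def squared_distances(X, C):
    # Pairwise squared Euclidean distances, shape (n_points, n_centers).
    return ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)

def soft_centroid_update(X, C, temperature):
    # Assumed form: softmax responsibilities over -||x - c||^2 / temperature.
    logits = -squared_distances(X, C) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # stabilize the exponentials
    R = np.exp(logits)
    R /= R.sum(axis=1, keepdims=True)
    # Responsibility-weighted mean of the points for each center.
    return (R.T @ X) / R.sum(axis=0)[:, None]

def kmeans_update(X, C):
    # Classical K-Means step: hard assignment, then per-cluster mean.
    labels = squared_distances(X, C).argmin(axis=1)
    return np.stack([X[labels == j].mean(axis=0) for j in range(len(C))])

# Hypothetical toy data: two well-separated blobs and rough initial centers.
rng = np.random.default_rng(0)
X = np.concatenate([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])
C0 = np.array([[0.5, 0.5], [4.5, 4.5]])

hard = kmeans_update(X, C0)
gap_hot = np.abs(soft_centroid_update(X, C0, temperature=100.0) - hard).max()
gap_cold = np.abs(soft_centroid_update(X, C0, temperature=1e-3) - hard).max()
print(f"gap at T=100: {gap_hot:.3e}, gap at T=1e-3: {gap_cold:.3e}")
```

On this toy problem the gap between the soft update and the hard K-Means update shrinks as the temperature decreases, which is the behavior the limit statement describes.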
Abstract
This work establishes a rigorous variational and gradient-based equivalence between the classical K-Means algorithm and differentiable Radial Basis Function (RBF) neural networks with smooth responsibilities. By reparameterizing the K-Means objective and embedding its distortion functional into a smooth weighted loss, we prove that the RBF objective Γ-converges to the K-Means solution as the temperature parameter vanishes. We further demonstrate that the gradient-based updates of the RBF centers recover the exact K-Means centroid update rule and induce identical training trajectories in the limit. To address the numerical instability of the Softmax transformation in the low-temperature regime, we propose the integration of Entmax-1.5, which ensures stable polynomial convergence while preserving the underlying Voronoi partition structure. These results bridge the…
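To make the Entmax-1.5 step concrete, here is a hedged sketch that uses the standard definition of the mapping, p_i = max(z_i / 2 - tau, 0)^2 with tau chosen so the probabilities sum to one. The bisection solver, the function name `entmax15_bisect`, and the choice of negative scaled squared distances as scores are assumptions made for illustration; the abstract does not say how the authors compute it.

```python
import numpy as np

def entmax15_bisect(scores, n_iter=60):
    # Entmax-1.5: p_i = max(z_i / 2 - tau, 0)^2, with tau set so that sum(p) = 1.
    # tau is found by bisection here (an illustrative choice of solver).
    z = np.asarray(scores, dtype=float) / 2.0
    hi = z.max()      # at tau = max(z) the total mass is 0
    lo = hi - 1.0     # at tau = max(z) - 1 the total mass is at least 1
    for _ in range(n_iter):
        tau = 0.5 * (lo + hi)
        mass = (np.clip(z - tau, 0.0, None) ** 2).sum()
        if mass >= 1.0:
            lo = tau
        else:
            hi = tau
    p = np.clip(z - lo, 0.0, None) ** 2
    return p / p.sum()  # remove the residual bisection error

# Hypothetical squared distances from one point to three centers.
d2 = np.array([0.1, 0.5, 2.0])

# Moderate temperature: the far center gets exactly zero weight.
p_warm = entmax15_bisect(-d2 / 1.0)    # approximately [0.64, 0.36, 0.0]

# Low temperature: all mass on the nearest center, with no exponentials involved.
p_cold = entmax15_bisect(-d2 / 0.01)   # [1.0, 0.0, 0.0]
print(p_warm, p_cold)
```

Because the mapping is a clipped polynomial rather than an exponential, it reaches exact zeros at finite temperature, and the largest probability always belongs to the nearest center, so the hard assignment agrees with the Voronoi cell of that center.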
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Tensor decomposition and applications · Face and Expression Recognition
