SUPN: Shallow Universal Polynomial Networks
Zachary Morrow, Michael Penwarden, Brian Chen, Aurya Javeed, Akil Narayan, John D. Jakeman

TL;DR
SUPNs are shallow networks that pair a single layer of learnable polynomials with one neural-network hidden layer. They reach sufficient expressivity with far fewer parameters, often beat DNNs and KANs in approximation error at a given parameter count, and outperform polynomial projection on non-smooth functions.
Contribution
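This paper introduces SUPNs, a shallow network architecture that replaces all but the last hidden layer with a single layer of polynomials with learnable coefficients. It provides theoretical convergence guarantees and reports approximation error that is often lower than that of DNNs and KANs at the same parameter count.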
Findings
SUPNs converge at the same rate as the best polynomial approximation.
SUPNs often outperform DNNs and KANs in approximation error for a given parameter count.
SUPNs outperform polynomial projection on non-smooth functions.
Abstract
Deep neural networks (DNNs) and Kolmogorov-Arnold networks (KANs) are popular methods for function approximation due to their flexibility and expressivity. However, they typically require a large number of trainable parameters to produce a suitable approximation. Beyond making the resulting network less transparent, overparameterization creates a large optimization space, likely producing local minima in training that have quite different generalization errors. In this case, network initialization can have an outsize impact on the model's out-of-sample accuracy. For these reasons, we propose shallow universal polynomial networks (SUPNs). These networks replace all but the last hidden layer with a single layer of polynomials with learnable coefficients, leveraging the strengths of DNNs and polynomials to achieve sufficient expressivity with far fewer parameters. We prove that SUPNs…
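The architecture described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a scalar input, a monomial basis, a tanh activation, and a linear output layer, none of which are specified in the text above. The names `poly_eval`, `supn_forward`, and `supn_param_count` are hypothetical.

```python
import math


def poly_eval(coeffs, x):
    """Evaluate sum_k coeffs[k] * x**k with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def supn_forward(x, poly_coeffs, out_weights):
    """Sketch of a shallow polynomial network for scalar x.

    poly_coeffs[j] holds the learnable coefficients of the j-th polynomial;
    each polynomial feeds one hidden unit (tanh is an assumed activation),
    and out_weights combines the hidden units linearly.
    """
    return sum(
        w * math.tanh(poly_eval(c, x))
        for c, w in zip(poly_coeffs, out_weights)
    )


def supn_param_count(width, degree):
    """Trainable parameters in this sketch: polynomial coefficients plus output weights."""
    return width * (degree + 1) + width


# Two hidden units: p_1(x) = x and p_2(x) = 1 - 2x^2 (illustrative values only).
poly_coeffs = [[0.0, 1.0], [1.0, 0.0, -2.0]]
out_weights = [0.5, -1.0]
y = supn_forward(0.5, poly_coeffs, out_weights)
```

In this sketch the parameter count grows linearly in both the width and the polynomial degree, which is one way to read the abstract's claim of "far fewer parameters" relative to stacking several dense layers.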
Taxonomy
Topics: Model Reduction and Neural Networks · Polynomial and algebraic computation · Stochastic Gradient Optimization Techniques
