Robust Basis Spline Decoupling for the Compression of Transformer Models
Joppe De Jonghe, Van Tien Pham, Mariya Ishteva

TL;DR
This paper introduces a B-spline-based decoupling framework for neural network compression, offering improved stability and expressiveness over traditional polynomial methods, and demonstrates its effectiveness on transformer models.
Contribution
The paper proposes a B-spline decoupling approach for neural network compression, paired with a robust optimization algorithm for fitting the decoupled model.
Findings
Enables significant parameter reduction in transformer models.
Maintains competitive accuracy with compressed models.
Provides a numerically stable and expressive decoupling method.
Abstract
Decoupling is a powerful modeling paradigm for representing multivariate functions as compositions of linear transformations and univariate nonlinear functions. A single-layer decoupling can be viewed as a fully connected neural network with a single hidden layer and flexible activation functions, providing a direct link with neural networks. Because of this, the use of decoupling methods has gained increasing attention in neural network domains, particularly compression, since it enables structured approximations with reduced parameter complexity. Existing tensor-based decoupling methods typically rely on polynomial or piecewise-linear parameterizations of the internal nonlinear functions, which can suffer from numerical instability or limited expressiveness. In this work, we introduce a B-spline-based decoupling framework that generalizes these existing approaches. By exploiting the…
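The single-layer decoupling described in the abstract, a linear map followed by univariate nonlinear functions and a second linear map, can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation or their optimization algorithm. The function names, the cubic degree, and the clamped knot vector are all assumptions chosen for illustration, and each internal function is written as a B-spline evaluated with the standard Cox-de Boor recursion.

```python
def bspline_basis(k, p, knots, z):
    """Value at z of the k-th B-spline basis function of degree p (Cox-de Boor)."""
    if p == 0:
        # Half-open interval, so the right end of the knot range evaluates to 0.
        return 1.0 if knots[k] <= z < knots[k + 1] else 0.0
    left = right = 0.0
    d1 = knots[k + p] - knots[k]
    if d1 > 0:  # skip terms with repeated knots (0/0 is treated as 0)
        left = (z - knots[k]) / d1 * bspline_basis(k, p - 1, knots, z)
    d2 = knots[k + p + 1] - knots[k + 1]
    if d2 > 0:
        right = (knots[k + p + 1] - z) / d2 * bspline_basis(k + 1, p - 1, knots, z)
    return left + right


def spline_activation(coeffs, p, knots, z):
    """Univariate function g(z) = sum_k coeffs[k] * B_{k,p}(z)."""
    return sum(c * bspline_basis(k, p, knots, z) for k, c in enumerate(coeffs))


def decoupled_forward(x, V, W, coeffs, p, knots):
    """Evaluate y = W g(V^T x) with one spline g_i per branch.

    V is m x r (inputs x branches), W is n x r (outputs x branches),
    coeffs[i] holds the spline coefficients of branch i.
    """
    r = len(coeffs)
    z = [sum(V[j][i] * x[j] for j in range(len(x))) for i in range(r)]
    h = [spline_activation(coeffs[i], p, knots, z[i]) for i in range(r)]
    return [sum(w_row[i] * h[i] for i in range(r)) for w_row in W]


# Hypothetical example: cubic splines on a clamped knot vector over [0, 1).
degree = 3
knots = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0]
# Greville abscissae as coefficients make each spline reproduce g(z) = z.
identity_coeffs = [0.0, 1.0 / 6.0, 0.5, 5.0 / 6.0, 1.0]

V = [[1.0, 0.0], [0.0, 1.0]]  # m = 2 inputs, r = 2 branches
W = [[1.0, 2.0]]              # n = 1 output
y = decoupled_forward([0.2, 0.7], V, W, [identity_coeffs] * 2, degree, knots)
print(y)
```

In this layout, a decoupling with r branches and K spline coefficients per branch stores r(m + n) + rK numbers, compared with mn for a dense m-by-n weight matrix, which is the kind of parameter reduction the abstract alludes to when r is small.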
