LU decomposition and Toeplitz decomposition of a neural network

Yucong Liu, Simiao Jiao, and Lek-Heng Lim

arXiv:2211.13935 · cs.LG · November 28, 2022 · 1 citation

Open Access

TL;DR

This paper proves that neural networks remain universal approximators when their weight matrices are restricted to structured forms, either alternating lower and upper triangular matrices (as in an LU decomposition) or Toeplitz matrices. The restriction reduces the number of parameters, and experiments confirm the efficiency gains.

Contribution

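The paper shows that networks whose weight matrices alternate between lower and upper triangular matrices, or are all Toeplitz (or all Hankel), can approximate any continuous function to arbitrary accuracy. This extends universal approximation theorems to structured weight matrices and to convolutional networks.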

Findings

01. Structured matrices reduce parameters significantly

02. Universal approximation holds with LU and Toeplitz matrices

03. Experiments show minimal accuracy loss with structured constraints
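To make the first finding concrete, the sketch below compares the number of free parameters in a single square weight matrix under each structure. These are the standard counts for $n \times n$ matrices, not figures taken from the paper's experiments, and the function names are illustrative.

```python
def dense_params(n: int) -> int:
    """An unconstrained n x n matrix has n^2 free entries."""
    return n * n


def triangular_params(n: int) -> int:
    """A lower (or upper) triangular n x n matrix has n(n+1)/2 free entries."""
    return n * (n + 1) // 2


def toeplitz_params(n: int) -> int:
    """An n x n Toeplitz matrix is constant along each of its 2n - 1 diagonals.

    The same count applies to a Hankel matrix (constant anti-diagonals).
    """
    return 2 * n - 1


n = 100
print(dense_params(n))       # 10000
print(triangular_params(n))  # 5050
print(toeplitz_params(n))    # 199
```

A triangular layer stores roughly half the entries of a dense layer, while a Toeplitz layer grows only linearly in the width.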

Abstract

It is well-known that any matrix $A$ has an LU decomposition. Less well-known is the fact that it has a 'Toeplitz decomposition' $A = T_1 T_2 \cdots T_r$ where the $T_i$'s are Toeplitz matrices. We will prove that any continuous function $f : \mathbb{R}^n \to \mathbb{R}^m$ has an approximation to arbitrary accuracy by a neural network that takes the form $L_1 \sigma_1 U_1 \sigma_2 L_2 \sigma_3 U_2 \cdots L_r \sigma_{2r-1} U_r$, i.e., where the weight matrices alternate between lower and upper triangular matrices, $\sigma_i(x) := \sigma(x - b_i)$ for some bias vector $b_i$, and the activation $\sigma$ may be chosen to be essentially any uniformly continuous nonpolynomial function. The same result also holds with Toeplitz matrices, i.e., $f \approx T_1 \sigma_1 T_2 \sigma_2 \cdots \sigma_{r-1} T_r$ to arbitrary accuracy, and likewise for Hankel matrices. A consequence of our Toeplitz result…
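The network form $L_1 \sigma_1 U_1 \sigma_2 L_2 \cdots L_r \sigma_{2r-1} U_r$ can be sketched as a forward pass in plain Python. This is a minimal illustration, not the authors' implementation: it assumes square $n \times n$ factors (so $n = m$), random weights, and $\sigma = \tanh$ as one example of a uniformly continuous nonpolynomial activation. All function names are hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def shifted_activation(z, b, sigma=math.tanh):
    """Apply sigma_i(z) = sigma(z - b_i) entrywise."""
    return [sigma(zi - bi) for zi, bi in zip(z, b)]


def structured_network(x, factors, biases):
    """Evaluate W_1 s_1 W_2 s_2 ... s_{k-1} W_k at x.

    `factors` and `biases` are listed left to right as in the formula,
    so the rightmost factor W_k is applied to x first.
    """
    h = matvec(factors[-1], x)
    for W, b in zip(reversed(factors[:-1]), reversed(biases)):
        h = matvec(W, shifted_activation(h, b))
    return h


def random_triangular(n, lower, rng):
    """Random n x n lower (lower=True) or upper triangular matrix."""
    return [
        [rng.gauss(0, 1) if (j <= i if lower else j >= i) else 0.0
         for j in range(n)]
        for i in range(n)
    ]


# Assumed sizes for illustration: width n = 4 and r = 3 (L, U) pairs.
rng = random.Random(0)
n, r = 4, 3
factors = []
for _ in range(r):
    factors.append(random_triangular(n, True, rng))   # L_i
    factors.append(random_triangular(n, False, rng))  # U_i
biases = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(2 * r - 1)]

x = [rng.gauss(0, 1) for _ in range(n)]
y = structured_network(x, factors, biases)
print(len(factors), len(biases), len(y))  # 6 5 4
```

The same `structured_network` function evaluates the Toeplitz form $T_1 \sigma_1 T_2 \cdots \sigma_{r-1} T_r$ if `factors` holds $r$ Toeplitz matrices and `biases` holds $r - 1$ bias vectors.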

Taxonomy

Topics: Matrix Theory and Algorithms · Neural Networks and Applications · Blind Source Separation Techniques

Methods: Test