Frobenius-Type Norms and Inner Products of Matrices and Linear Maps with Applications to Neural Network Training

Roland Herzog; Frederik Köhne; Leonie Kreis; Anton Schiela

arXiv:2311.15419·cs.LG·November 28, 2023·1 cites

Open Access

TL;DR

This paper shows that the Frobenius norm and inner product of a matrix or linear map depend on the inner products chosen in the domain and co-domain spaces. The result is a family of Frobenius-type norms that can be used to precondition neural network training.

Contribution

It introduces a generalization of the Frobenius norm and inner product, showing their dependence on domain and co-domain inner products, enabling new preconditioning techniques for neural networks.

Findings

01 Frobenius-type norms depend on the inner products chosen in the domain and co-domain spaces.

02 The classical Frobenius norm is one special case within a broader family of Frobenius-type norms.

03 The extra freedom in choosing these norms can be used, among other things, to precondition neural network training.

Abstract

The Frobenius norm is a frequent choice of norm for matrices. In particular, the underlying Frobenius inner product is typically used to evaluate the gradient of an objective with respect to a matrix variable, such as those occurring in the training of neural networks. We provide a broader view on the Frobenius norm and inner product for linear maps or matrices, and establish their dependence on inner products in the domain and co-domain spaces. This shows that the classical Frobenius norm is merely one special element of a family of more general Frobenius-type norms. The significant extra freedom furnished by this realization can be used, among other things, to precondition neural network training.
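
To make the dependence on the domain and co-domain inner products concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the inner products on the domain R^n and co-domain R^m are represented by symmetric positive definite matrices, here called M_X and M_Y (names chosen for illustration), and uses the standard Hilbert-Schmidt construction <A, B> = trace(M_X^{-1} A^T M_Y B); the paper's own notation and conventions may differ. Under these assumptions, the gradient with respect to this inner product is obtained from the usual Euclidean gradient G as M_Y^{-1} G M_X, which is one way the choice of inner products acts as a preconditioner.

```python
import numpy as np


def frobenius_type_inner(A, B, M_X, M_Y):
    """Hilbert-Schmidt inner product trace(M_X^{-1} A^T M_Y B).

    M_X (n x n) and M_Y (m x m) are assumed symmetric positive definite;
    they represent the inner products on the domain and co-domain.
    """
    return np.trace(np.linalg.solve(M_X, A.T @ M_Y @ B))


def frobenius_type_gradient(G, M_X, M_Y):
    """Convert a Euclidean gradient G into the gradient R = M_Y^{-1} G M_X
    with respect to the inner product above (assumed construction)."""
    return np.linalg.solve(M_Y, G @ M_X)


def random_spd(k, rng):
    """Helper for illustration only: a random symmetric positive definite matrix."""
    Q = rng.standard_normal((k, k))
    return Q @ Q.T + k * np.eye(k)


rng = np.random.default_rng(0)
m, n = 3, 4
A = rng.standard_normal((m, n))  # a linear map from R^n to R^m
D = rng.standard_normal((m, n))  # an arbitrary direction
G = rng.standard_normal((m, n))  # stand-in for a Euclidean gradient
M_X = random_spd(n, rng)
M_Y = random_spd(m, rng)

# With identity inner products, the classical Frobenius norm is recovered.
classical_sq = frobenius_type_inner(A, A, np.eye(n), np.eye(m))
reference_sq = np.linalg.norm(A, "fro") ** 2

# The converted gradient R represents the same directional derivative:
# <R, D> in the Frobenius-type inner product equals sum(G * D).
R = frobenius_type_gradient(G, M_X, M_Y)
lhs = frobenius_type_inner(R, D, M_X, M_Y)
rhs = np.sum(G * D)
```

With M_X and M_Y both set to the identity, the converted gradient coincides with G, so ordinary gradient descent on a weight matrix corresponds to the classical Frobenius choice in this sketch.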

Taxonomy

Topics: Matrix Theory and Algorithms