Geometry-aware training of factorized layers in tensor Tucker format

Emanuele Zangrando; Steffen Schotthöfer; Gianluca Ceruti; Jonas Kusch; Francesco Tudisco

arXiv:2305.19059 · cs.LG · August 22, 2025 · 1 citation

Open Access · 1 Video

TL;DR

This paper presents a novel geometry-aware training method for factorized neural network layers using Tucker decomposition, enabling dynamic rank adjustment, improved training efficiency, and competitive performance.

Contribution

It introduces a Tucker-based layer factorization training approach that is initialization-insensitive and dynamically updates ranks, with theoretical guarantees and practical benefits.

Findings

1. Achieves high compression rates during training.
2. Maintains or improves performance compared to full models.
3. Provides theoretical convergence and approximation guarantees.

Abstract

Reducing parameter redundancies in neural network architectures is crucial for achieving feasible computational and memory requirements during training and inference phases. Given its easy implementation and flexibility, one promising approach is layer factorization, which reshapes weight tensors into a matrix format and parameterizes them as the product of two small rank matrices. However, this approach typically requires an initial full-model warm-up phase, prior knowledge of a feasible rank, and it is sensitive to parameter initialization. In this work, we introduce a novel approach to train the factors of a Tucker decomposition of the weight tensors. Our training proposal proves to be optimal in locally approximating the original unfactorized dynamics independently of the initialization. Furthermore, the rank of each mode is dynamically updated during training. We provide a…
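To make the Tucker format mentioned in the abstract concrete, here is a minimal NumPy sketch of how a weight tensor can be stored as a small core tensor plus one factor matrix per mode, and how many parameters that saves. It is only an illustration of the storage format, not the training algorithm of the paper. The kernel shape, the per-mode ranks, and the helper names `mode_product`, `tucker_reconstruct`, and `tucker_param_count` are hypothetical choices made for this example.

```python
import numpy as np


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` of shape (new_dim, old_dim)."""
    moved = np.moveaxis(tensor, mode, 0)            # bring the chosen mode to the front
    out = np.tensordot(matrix, moved, axes=(1, 0))  # contract over old_dim
    return np.moveaxis(out, 0, mode)                # put the new axis back in place


def tucker_reconstruct(core, factors):
    """Rebuild the full tensor W = core x_1 U_1 x_2 U_2 ... from its Tucker factors."""
    full = core
    for mode, U in enumerate(factors):
        full = mode_product(full, U, mode)
    return full


def tucker_param_count(shape, ranks):
    """Parameters stored in Tucker format: the core plus one factor per mode."""
    return int(np.prod(ranks)) + sum(n * r for n, r in zip(shape, ranks))


rng = np.random.default_rng(0)
shape = (64, 32, 3, 3)  # hypothetical conv kernel: out, in, height, width
ranks = (8, 8, 3, 3)    # hypothetical per-mode Tucker ranks

# Random core and factor matrices with orthonormal columns (via reduced QR).
core = rng.standard_normal(ranks)
factors = [
    np.linalg.qr(rng.standard_normal((n, r)))[0]
    for n, r in zip(shape, ranks)
]

W = tucker_reconstruct(core, factors)

full_params = int(np.prod(shape))
tucker_params = tucker_param_count(shape, ranks)
compression = 1.0 - tucker_params / full_params

print(f"full tensor shape: {W.shape}")
print(f"full parameters:   {full_params}")
print(f"Tucker parameters: {tucker_params}")
print(f"compression rate:  {compression:.1%}")
```

Each mode has its own rank in this format, which is why the abstract can speak of updating the rank of each mode separately. In the matrix-only factorization that the abstract contrasts with, the tensor is first reshaped to a matrix and a single rank is chosen.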

Videos

Geometry-aware training of factorized layers in tensor Tucker format · SlidesLive

Taxonomy

Topics: Tensor decomposition and applications · Model Reduction and Neural Networks · Computational Physics and Python Applications

Methods: fail · Pruning · TuckER · Focus