Implicit Regularization in Deep Tensor Factorization

Paolo Milanesi (QARMA); Hachem Kadri (LIS; QARMA; AMU SCI); Stéphane Ayache (QARMA); Thierry Artières (QARMA)

arXiv:2105.01346 · cs.AI · May 5, 2021

TL;DR

This paper investigates how gradient descent implicitly regularizes tensor completion tasks by promoting low-rank solutions, emphasizing the importance of dynamics over norm minimization in understanding this phenomenon.

Contribution

It extends the study of implicit regularization from matrices to tensors using Tucker and TT factorizations, introducing deep unconstrained models and analyzing their dynamics.
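
To make the two parameterizations concrete, here is a minimal NumPy sketch of how an order-3 tensor is rebuilt from Tucker and TT factors. The function names, shapes, and ranks are illustrative assumptions, not taken from the paper, and the sketch covers only the standard (shallow) factorizations rather than the deep unconstrained models the paper introduces.

```python
import numpy as np


def tucker_reconstruct(core, factors):
    """Order-3 Tucker: W[i,j,k] = sum_{a,b,c} G[a,b,c] U1[i,a] U2[j,b] U3[k,c]."""
    u1, u2, u3 = factors
    return np.einsum("abc,ia,jb,kc->ijk", core, u1, u2, u3)


def tt_reconstruct(cores):
    """Order-3 TT: W[i,j,k] = sum_{a,b} A[i,a] B[a,j,b] C[b,k]."""
    a, b, c = cores
    return np.einsum("ia,ajb,bk->ijk", a, b, c)


rng = np.random.default_rng(0)

# Hypothetical sizes: a 4 x 5 x 6 tensor with Tucker ranks (2, 3, 2).
dims, ranks = (4, 5, 6), (2, 3, 2)
core = rng.standard_normal(ranks)
factors = [rng.standard_normal((d, r)) for d, r in zip(dims, ranks)]
w_tucker = tucker_reconstruct(core, factors)

# Hypothetical TT ranks (2, 3) for the same 4 x 5 x 6 shape.
tt_cores = [
    rng.standard_normal((4, 2)),
    rng.standard_normal((2, 5, 3)),
    rng.standard_normal((3, 6)),
]
w_tt = tt_reconstruct(tt_cores)
```

In both cases the rank of the first unfolding (rows indexed by the first mode) is bounded by the first factorization rank, which is the sense in which these parameterizations can express low-rank tensors.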

Findings

01. Gradient descent promotes low-rank tensor solutions.
02. Tensor nuclear norm and effective rank are key quantities in analysis.
03. Experiments validate the dynamical perspective of regularization.
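
As a rough illustration of the setting behind these findings, the sketch below runs plain gradient descent on a squared loss over the observed entries of a tensor, using a shallow order-3 Tucker parameterization with small random initialization. It is not the paper's experimental setup: the function name `tucker_completion_gd`, the sizes, the step size, the initialization scale, and the synthetic rank-one target are all assumptions chosen for a small runnable example.

```python
import numpy as np


def tucker(core, u1, u2, u3):
    """Rebuild an order-3 tensor from a Tucker core and three factor matrices."""
    return np.einsum("abc,ia,jb,kc->ijk", core, u1, u2, u3)


def tucker_completion_gd(target, mask, ranks, lr=0.05, steps=3000,
                         init_scale=0.1, seed=0):
    """Gradient descent on 0.5 * ||mask * (W - target)||^2 with W in Tucker form."""
    rng = np.random.default_rng(seed)
    core = init_scale * rng.standard_normal(ranks)
    u1, u2, u3 = (init_scale * rng.standard_normal((d, r))
                  for d, r in zip(target.shape, ranks))
    losses = []
    for _ in range(steps):
        # Residual on observed entries only; unobserved entries contribute zero.
        resid = mask * (tucker(core, u1, u2, u3) - target)
        losses.append(0.5 * float(np.sum(resid ** 2)))
        # Gradients of the loss with respect to each parameter block.
        g_core = np.einsum("ijk,ia,jb,kc->abc", resid, u1, u2, u3)
        g_u1 = np.einsum("ijk,abc,jb,kc->ia", resid, core, u2, u3)
        g_u2 = np.einsum("ijk,abc,ia,kc->jb", resid, core, u1, u3)
        g_u3 = np.einsum("ijk,abc,ia,jb->kc", resid, core, u1, u2)
        # Simultaneous update of all blocks.
        core, u1, u2, u3 = (core - lr * g_core, u1 - lr * g_u1,
                            u2 - lr * g_u2, u3 - lr * g_u3)
    return tucker(core, u1, u2, u3), losses


# Synthetic rank-one target, normalized to unit Frobenius norm (an assumption).
rng = np.random.default_rng(1)
a, b, c = (rng.standard_normal(6) for _ in range(3))
target = np.einsum("i,j,k->ijk", a, b, c)
target /= np.linalg.norm(target)

# Observe roughly half of the entries at random.
mask = (rng.random(target.shape) < 0.5).astype(float)

completed, losses = tucker_completion_gd(target, mask, ranks=(3, 3, 3))
```

The ranks `(3, 3, 3)` deliberately exceed the rank of the target, so nothing in the parameterization forces a low-rank answer. Inspecting the singular values of the unfoldings of `completed` is one way to probe what the optimization itself favors.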

Abstract

Attempts to study the implicit regularization associated with gradient descent (GD) have identified matrix completion as a suitable test-bed. Recent findings suggest that this phenomenon cannot be phrased as a norm-minimization problem, implying that a paradigm shift is required and that the dynamics have to be taken into account. In the present work we address the more general setup of tensor completion by leveraging two popular tensor factorizations, namely Tucker and TensorTrain (TT). We track relevant quantities such as the tensor nuclear norm, effective rank, and generalized singular values, and we introduce deep Tucker and TT unconstrained factorizations to deal with the completion task. Experiments on both synthetic and real data show that gradient descent promotes low-rank solutions and validate the conjecture that the phenomenon has to be addressed from a dynamical perspective.
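
One of the tracked quantities, effective rank, can be sketched in a few lines. The version below assumes the common entropy-based definition (the exponential of the Shannon entropy of the normalized singular values) and applies it to each mode unfolding of a tensor. Whether the paper uses exactly this definition, or applies it to unfoldings in this way, is an assumption, and the helper names are hypothetical.

```python
import numpy as np


def effective_rank(matrix, eps=1e-12):
    """exp(entropy) of the singular values normalized to sum to one."""
    s = np.linalg.svd(matrix, compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]  # drop numerically zero entries so log(p) is defined
    entropy = -(p * np.log(p)).sum()
    return float(np.exp(entropy))


def unfold(tensor, mode):
    """Mode-n unfolding: rows indexed by `mode`, columns by all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_effective_ranks(tensor):
    """Effective rank of every mode unfolding of the tensor."""
    return [effective_rank(unfold(tensor, m)) for m in range(tensor.ndim)]


# A rank-one tensor has a single nonzero singular value in every unfolding.
rng = np.random.default_rng(0)
x, y, z = (rng.standard_normal(n) for n in (4, 5, 6))
rank_one = np.einsum("i,j,k->ijk", x, y, z)
ranks_rank_one = mode_effective_ranks(rank_one)

# The identity matrix has equal singular values, so its effective rank equals its size.
rank_identity = effective_rank(np.eye(3))
```

Unlike the exact rank, this quantity varies continuously with the singular values, which makes it convenient to track along a gradient descent trajectory.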

Taxonomy

Methods: TuckER