Dropout as a Low-Rank Regularizer for Matrix Factorization
Jacopo Cavazza, Pietro Morerio, Benjamin Haeffele, Connor Lane, Vittorio Murino, Rene Vidal

TL;DR
This paper gives a theoretical analysis of dropout for matrix factorization, showing that it acts as a low-rank regularizer and that, for variable-size factorizations, it attains the global minimum of a convex approximation problem with squared nuclear norm regularization.
Contribution
It establishes the equivalence between dropout and a deterministic regularization model for matrix factorization, and proves dropout's effectiveness as a low-rank regularizer.
Findings
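Dropout is equivalent to a deterministic model in which the factors are regularized by the sum of the products of the squared Euclidean norms of their columns.
With a variable-size factorization, dropout attains the global minimum of a convex approximation problem with (squared) nuclear norm regularization.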
Dropout can be viewed as data-dependent singular-value thresholding.
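The thresholding view can be made concrete with a small sketch. It assumes the convex objective has the form ||X - Y||_F^2 + lam * ||Y||_*^2, which is an assumption made here for illustration, since this summary does not state the exact objective or weighting. Under that assumption, the optimality conditions give a soft threshold `mu` on the singular values that depends on the singular values themselves. The function name `squared_nuclear_prox` and the parameter `lam` are hypothetical.

```python
import numpy as np

def squared_nuclear_prox(X, lam):
    """Sketch: minimize ||X - Y||_F^2 + lam * ||Y||_*^2 over Y (assumed objective)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)  # s is sorted in descending order
    mu = 0.0
    for rho in range(1, len(s) + 1):
        # Candidate threshold if exactly the top `rho` singular values stay active.
        cand = lam * s[:rho].sum() / (1.0 + lam * rho)
        if s[rho - 1] > cand:
            mu = cand
        else:
            break
    y = np.maximum(s - mu, 0.0)  # soft-threshold the singular values
    return (U * y) @ Vt, mu, y

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 5))
lam = 0.5
Y, mu, y = squared_nuclear_prox(X, lam)
print("threshold:", mu)
print("singular values before:", np.linalg.svd(X, compute_uv=False))
print("singular values after: ", y)
```

Unlike ordinary singular-value thresholding with a fixed threshold, here `mu` satisfies `mu = lam * y.sum()`, so the amount of shrinkage is determined by the data.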
Abstract
Regularization for matrix factorization (MF) and approximation problems has been carried out in many different ways. Due to its popularity in deep learning, dropout has been applied also for this class of problems. Despite its solid empirical performance, the theoretical properties of dropout as a regularizer remain quite elusive for this class of problems. In this paper, we present a theoretical analysis of dropout for MF, where Bernoulli random variables are used to drop columns of the factors. We demonstrate the equivalence between dropout and a fully deterministic model for MF in which the factors are regularized by the sum of the product of squared Euclidean norms of the columns. Additionally, we inspect the case of a variable sized factorization and we prove that dropout achieves the global minimum of a convex approximation problem with (squared) nuclear norm regularization. As a…
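The equivalence described in the abstract can be checked numerically on a small example. The sketch below makes two assumptions that this summary does not spell out: each column pair is kept with probability `theta`, and kept columns are rescaled by `1/theta`. Under those assumptions, the expected dropout loss equals the plain squared loss plus `(1 - theta) / theta` times the sum of products of squared column norms. The function names are hypothetical, and the expectation is computed exactly by enumerating all masks, so no sampling is involved.

```python
import itertools
import numpy as np

def expected_dropout_loss(X, U, V, theta):
    """Exact expectation of ||X - (1/theta) U diag(r) V^T||_F^2 over Bernoulli(theta) masks r."""
    d = U.shape[1]
    total = 0.0
    for mask in itertools.product([0, 1], repeat=d):
        r = np.array(mask, dtype=float)
        prob = np.prod(np.where(r == 1, theta, 1 - theta))  # probability of this mask
        resid = X - (U * r) @ V.T / theta  # U * r zeroes out the dropped columns
        total += prob * np.sum(resid ** 2)
    return total

def deterministic_loss(X, U, V, theta):
    """Squared loss plus the sum of products of squared column norms (assumed weighting)."""
    fit = np.sum((X - U @ V.T) ** 2)
    reg = np.sum(np.sum(U ** 2, axis=0) * np.sum(V ** 2, axis=0))
    return fit + (1 - theta) / theta * reg

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))
U = rng.standard_normal((5, 3))
V = rng.standard_normal((4, 3))
theta = 0.7

lhs = expected_dropout_loss(X, U, V, theta)
rhs = deterministic_loss(X, U, V, theta)
print(lhs, rhs)  # the two values should agree up to floating-point error
```

Enumerating masks costs 2^d evaluations, so this check is only practical for a handful of columns; it is meant to illustrate the identity, not to serve as a training procedure.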
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Tensor decomposition and applications
Methods: Dropout
