On subdifferential chain rule of matrix factorization and beyond
Jiewen Guan, Anthony Man-Cho So

TL;DR
This paper establishes conditions under which the equality-type Clarke subdifferential chain rule holds for matrix factorization and factorization machines, shows that computing a stationary point of such problems is trivial, and discusses extensions to tensors and neural networks.
Contribution
It analyzes equality-type Clarke subdifferential chain rules for matrix factorization and factorization machines in slightly overparameterized regimes, examines the tightness of that analysis, and discusses implications for optimization and possible extensions.
Findings
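The subdifferential chain rule holds everywhere when the latent dimension exceeds some multiple of the problem size (slight overparameterization) and the loss function is locally Lipschitz.
Computing a stationary point is trivial for all problems of this type.
Tensor generalizations and neural network extensions are discussed but remain mostly open.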
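The second finding can be made concrete with a small numerical sketch. It assumes the usual formulation h(X, Y) = f(X Yᵀ), which this summary does not spell out, and uses the entrywise l1 loss f(Z) = ||Z - M||_1 purely as an illustrative locally Lipschitz, nonsmooth loss. Under that formulation, every element of the chain-rule set has the form (G Y, Gᵀ X) with G a subgradient of f at X Yᵀ, so at X = 0, Y = 0 the set collapses to zero. Whether this is the exact construction the paper has in mind is an assumption; the function names below are hypothetical.

```python
import numpy as np

def l1_loss_subgradient(Z, M):
    """Return one Clarke subgradient of f(Z) = ||Z - M||_1 (entrywise l1 norm).

    np.sign gives 0 at zero entries, which lies in the interval [-1, 1]
    of valid subgradient values there.
    """
    return np.sign(Z - M)

def chain_rule_element(X, Y, M):
    """Return (G @ Y, G.T @ X) for one subgradient G of f at X @ Y.T.

    Assumes h(X, Y) = f(X @ Y.T); this pair is one element of the
    chain-rule set for h at (X, Y).
    """
    G = l1_loss_subgradient(X @ Y.T, M)
    return G @ Y, G.T @ X

rng = np.random.default_rng(0)
m, n, r = 4, 3, 8                  # illustrative sizes, with r larger than m and n
M = rng.standard_normal((m, n))    # synthetic target matrix

X0 = np.zeros((m, r))
Y0 = np.zeros((n, r))
gX, gY = chain_rule_element(X0, Y0, M)

# Both blocks vanish at the origin, regardless of which subgradient G is chosen.
print(np.abs(gX).max(), np.abs(gY).max())
```

The same cancellation happens for any choice of G, because both blocks are multiplied by a zero factor. This is why the origin is a natural candidate for a trivially computable stationary point in this sketch.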
Abstract
In this paper, we study equality-type Clarke subdifferential chain rules for matrix factorization and factorization machines. Specifically, we show for these problems that, provided the latent dimension is larger than some multiple of the problem size (i.e., slightly overparameterized) and the loss function is locally Lipschitz, the subdifferential chain rules hold everywhere. In addition, we examine the tightness of the analysis through some interesting constructions and make some important observations from the perspective of optimization; e.g., we show that for all problems of this type, computing a stationary point is trivial. Some tensor generalizations and neural extensions are also discussed, although they remain mostly open.
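For orientation, an equality-type chain rule for matrix factorization can be written as follows. The notation is an assumption made here for illustration (the abstract does not fix symbols): $f$ is the locally Lipschitz loss, $X$ and $Y$ are the factors, $r$ is the latent dimension, and $\partial$ denotes the Clarke subdifferential.

```latex
% Assumed notation: f : R^{m x n} -> R locally Lipschitz,
% X in R^{m x r}, Y in R^{n x r}, and h(X, Y) = f(X Y^T).
\[
  h(X, Y) = f\bigl(X Y^{\top}\bigr), \qquad
  \partial h(X, Y) \;=\;
  \bigl\{\, \bigl(G Y,\; G^{\top} X\bigr) \;:\; G \in \partial f\bigl(X Y^{\top}\bigr) \,\bigr\}.
\]
```

For Clarke subdifferentials, the inclusion "$\subseteq$" holds in general for a locally Lipschitz $f$ composed with a smooth map. The reverse inclusion is what can fail without further assumptions, and it is the equality that the abstract claims holds everywhere once $r$ is large enough.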
Taxonomy
Topics: Matrix Theory and Algorithms · Advanced Topics in Algebra · graph theory and CDMA systems
