Understanding Learning Invariance in Deep Linear Networks
Hao Duan, Guido Montúfar

TL;DR
This paper provides a theoretical comparison of data augmentation, regularization, and hard-wiring for achieving invariance in deep linear networks, revealing their critical points and convergence properties.
Contribution
It offers the first theoretical analysis comparing these invariance methods in deep linear networks, highlighting their critical points and convergence behaviors.
Findings
Hard-wiring and data augmentation have identical critical points, consisting solely of saddles and the global optimum.
Regularization introduces additional critical points, mostly saddles.
Regularization paths converge to hard-wired solutions.
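The first and third findings can be checked numerically in a much simpler setting than the one the paper studies. The following is a minimal NumPy sketch, assuming a single unconstrained linear map (no depth, no rank bound), a hypothetical cyclic-shift group acting on four input coordinates, and random data. It compares only global minimizers and says nothing about saddles; the function names and the specific penalty are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 4, 2, 20  # input dim, output dim, sample count (illustrative values)

# Assumed symmetry group: cyclic shifts of the d input coordinates.
shift = np.roll(np.eye(d), 1, axis=0)
group = [np.linalg.matrix_power(shift, j) for j in range(d)]
P = sum(group) / len(group)  # orthogonal projector onto shift-invariant inputs

X = rng.normal(size=(d, n))
Y = rng.normal(size=(k, n))

def fit_augmented(X, Y):
    """Least squares on the data set augmented with every group orbit."""
    X_aug = np.hstack([g @ X for g in group])
    Y_aug = np.hstack([Y for _ in group])
    return Y_aug @ np.linalg.pinv(X_aug)

def fit_hardwired(X, Y):
    """Least squares over maps of the form W @ P, invariant by construction."""
    return Y @ np.linalg.pinv(P @ X, rcond=1e-10) @ P

def fit_regularized(X, Y, lam):
    """Least squares plus lam * sum_g ||W g - W||_F^2, solved in closed form."""
    I = np.eye(d)
    R = sum((g - I) @ (g - I).T for g in group)
    A = X @ X.T + lam * R  # symmetric positive definite here
    return np.linalg.solve(A, X @ Y.T).T

W_aug = fit_augmented(X, Y)
W_hard = fit_hardwired(X, Y)
print("augmented == hard-wired:", np.allclose(W_aug, W_hard, atol=1e-8))

# Distance from the regularized solution to the hard-wired one as lam grows.
lams = (1.0, 1e2, 1e4, 1e8)
errs = [np.linalg.norm(fit_regularized(X, Y, lam) - W_hard) for lam in lams]
for lam, err in zip(lams, errs):
    print(f"lam={lam:g}  distance to hard-wired solution: {err:.2e}")
```

In this toy case the augmented fit is itself invariant (`W_aug @ g` equals `W_aug` for every `g` in `group`), and the regularized solution approaches the hard-wired one as the penalty weight increases.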
Abstract
Equivariant and invariant machine learning models exploit symmetries and structural patterns in data to improve sample efficiency. While empirical studies suggest that data-driven methods such as regularization and data augmentation can perform comparably to explicitly invariant models, theoretical insights remain scarce. In this paper, we provide a theoretical comparison of three approaches for achieving invariance: data augmentation, regularization, and hard-wiring. We focus on mean squared error regression with deep linear networks, which parametrize rank-bounded linear maps and can be hard-wired to be invariant to specific group actions. We show that the critical points of the optimization problems for hard-wiring and data augmentation are identical, consisting solely of saddles and the global optimum. By contrast, regularization introduces additional critical points, though they…
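For orientation, the three problems can be written roughly as follows. This is a sketch under assumed notation: a finite group $G$ acting on inputs through an orthogonal representation $\rho$, data matrices $X$ and $Y$, end-to-end map $W = W_L \cdots W_1$, and penalty weight $\lambda > 0$. The paper's exact objectives may differ in normalization or in how the invariance constraint is imposed.

```latex
\begin{align*}
\text{hard-wiring:} \quad
  & \min_{W_1,\dots,W_L} \ \| W X - Y \|_F^2
    \quad \text{s.t. } W \rho(g) = W \ \ \forall g \in G, \\
\text{data augmentation:} \quad
  & \min_{W_1,\dots,W_L} \ \frac{1}{|G|} \sum_{g \in G} \| W \rho(g) X - Y \|_F^2, \\
\text{regularization:} \quad
  & \min_{W_1,\dots,W_L} \ \| W X - Y \|_F^2
    + \lambda \sum_{g \in G} \| W \rho(g) - W \|_F^2 .
\end{align*}
```

In each case the product $W = W_L \cdots W_1$ has rank at most the smallest layer width, which is the rank bound the abstract refers to.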
Taxonomy
Topics: Neural Networks and Applications
Methods: Focus
