A Group Theoretic Perspective on Unsupervised Deep Learning
Arnab Paul, Suresh Venkatasubramanian

TL;DR
This paper uses group theory to explain why deep networks trained with layer-wise pretraining learn simple features first and build more complex, higher-order representations in deeper layers.
Contribution
It connects pretraining in deep learning to the interplay of orbits and stabilizers of group actions, and proposes shadow groups, whose elements closely approximate the network's transformations, to analyze feature simplicity and hierarchy.
Findings
Pretraining aligns with searching for features with minimal group orbits.
Shadow groups approximate neural network transformations, revealing feature simplicity.
Deeper layers capture higher-order, more complex representations.
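The link between small orbits and "simple" features can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a finite group (the four rotations by multiples of 90 degrees) acting on tiny binary images, and the three patterns and the helper names `act`, `orbit`, and `stabilizer` are hypothetical choices made for illustration. It shows that a pattern fixed by more group elements (a larger stabilizer) has a smaller orbit, as the orbit-stabilizer theorem requires: |orbit| × |stabilizer| = |group|.

```python
import numpy as np

# Assumption for illustration: the group is C4, rotations by k * 90 degrees.
GROUP = [0, 1, 2, 3]

def act(k, img):
    """Apply the group element k (rotate by k * 90 degrees) to an image."""
    return np.rot90(img, k)

def orbit(img):
    """Set of distinct images reachable from img under the group action."""
    return {act(k, img).tobytes() for k in GROUP}

def stabilizer(img):
    """Group elements that leave img unchanged."""
    return [k for k in GROUP if np.array_equal(act(k, img), img)]

# Three hypothetical 3x3 "features" with different amounts of symmetry.
plus = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=np.uint8)
bar = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=np.uint8)
corner = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=np.uint8)

for name, img in [("plus", plus), ("bar", bar), ("corner", corner)]:
    o, s = len(orbit(img)), len(stabilizer(img))
    # Orbit-stabilizer theorem: the product always equals the group order.
    assert o * s == len(GROUP)
    print(f"{name}: orbit size {o}, stabilizer size {s}")
```

In this toy setting the fully symmetric pattern has the smallest orbit, which matches the intuition in the findings above that minimal-orbit features are the simplest ones. The groups studied in the paper act on the network itself and are not this finite rotation group.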
Abstract
Why does Deep Learning work? What representations does it capture? How do higher-order representations emerge? We study these questions from the perspective of group theory, thereby opening a new approach towards a theory of Deep Learning. One factor behind the recent resurgence of the subject is a key algorithmic step called pretraining: first search for a good generative model for the input samples, and repeat the process one layer at a time. We show deeper implications of this simple principle, by establishing a connection with the interplay of orbits and stabilizers of group actions. Although the neural networks themselves may not form groups, we show the existence of shadow groups whose elements serve as close approximations. Over the shadow groups, the pretraining step, originally introduced as a mechanism to better initialize a network, becomes equivalent to a…
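The pretraining step described in the abstract (fit a model to the inputs, then repeat one layer at a time) can be sketched as greedy layer-wise training. The code below is a minimal illustration, not the authors' method: it assumes tied-weight sigmoid autoencoders trained by full-batch gradient descent stand in for the generative model, and the function names, layer sizes, learning rate, and random data are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, epochs=200, lr=0.5, seed=0):
    """Fit one tied-weight sigmoid autoencoder to X (assumed stand-in model)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=0.1, size=(d, n_hidden))
    b = np.zeros(n_hidden)   # encoder bias
    c = np.zeros(d)          # decoder bias
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W + b)          # encode
        R = sigmoid(H @ W.T + c)        # decode with the same (tied) weights
        err = R - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the squared reconstruction error.
        dA = err * R * (1.0 - R) / n    # gradient at decoder pre-activation
        dZ = (dA @ W) * H * (1.0 - H)   # gradient at encoder pre-activation
        W -= lr * (X.T @ dZ + dA.T @ H) # encoder and decoder contributions
        b -= lr * dZ.sum(axis=0)
        c -= lr * dA.sum(axis=0)
    return W, b, losses

def greedy_pretrain(X, layer_sizes):
    """Train one layer at a time; each layer's codes feed the next layer."""
    params, inp, history = [], X, []
    for k, n_hidden in enumerate(layer_sizes):
        W, b, losses = train_autoencoder(inp, n_hidden, seed=k)
        params.append((W, b))
        history.append(losses)
        inp = sigmoid(inp @ W + b)      # representation passed upward
    return params, inp, history

# Hypothetical data: 50 samples with 8 features in [0, 1].
X_demo = np.random.default_rng(42).random((50, 8))
params, top, history = greedy_pretrain(X_demo, [5, 3])
print([W.shape for W, _ in params], top.shape)
```

Each layer is trained only on the output of the layer below it, which is the "one layer at a time" structure the abstract refers to. The choice of autoencoder here is for brevity only.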
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Neural Networks and Applications · Topological and Geometric Data Analysis
