A Novel Stochastic Gradient Descent Algorithm for Learning Principal Subspaces
Charline Le Lan, Joshua Greaves, Jesse Farebrother, Mark Rowland, Fabian Pedregosa, Rishabh Agarwal, Marc G. Bellemare

TL;DR
This paper introduces a stochastic gradient descent algorithm that learns principal subspaces from sample entries. It scales to large datasets, can be combined with neural network representations, and is supported by theoretical analysis and experiments.
Contribution
It develops a novel algorithm for learning principal subspaces from sample entries that can be integrated with neural networks and scaled to large datasets.
Findings
The algorithm learns principal subspaces from partial data (sample entries).
Theoretical analysis guarantees bias control in gradient estimates.
Experimental results demonstrate success on synthetic, image, and reinforcement learning data.
Abstract
Many machine learning problems encode their data as a matrix with a possibly very large number of rows and columns. In several applications like neuroscience, image compression or deep reinforcement learning, the principal subspace of such a matrix provides a useful, low-dimensional representation of individual data. Here, we are interested in determining the $d$-dimensional principal subspace of a given matrix from sample entries, i.e. from small random submatrices. Although a number of sample-based methods exist for this problem (e.g. Oja's rule \citep{oja1982simplified}), these assume access to full columns of the matrix or particular matrix structure such as symmetry and cannot be combined as-is with neural networks \citep{baldi1989neural}. In this paper, we derive an algorithm that learns a principal subspace from sample entries, can be applied when the approximate subspace is…
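For context, the classical baseline the abstract contrasts against can be sketched in a few lines. The code below is not the paper's algorithm: it is an Oja-style update with QR re-orthonormalization, which consumes one full column of the matrix per step (exactly the access pattern the abstract says sample-entry methods should avoid). The function names `oja_subspace` and `projector_distance`, the step size, the epoch count, and the synthetic matrix `Psi` are all illustrative assumptions.

```python
import numpy as np

def oja_subspace(columns, d, lr=0.01, epochs=20, seed=0):
    """Oja-style estimate of the d-dimensional principal subspace.

    Illustrative baseline only: each step reads one FULL column of the
    matrix, unlike the sample-entry setting described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n, m = columns.shape
    # Random orthonormal starting basis (n x d).
    W, _ = np.linalg.qr(rng.standard_normal((n, d)))
    for _ in range(epochs):
        for j in rng.permutation(m):
            x = columns[:, j]
            # Stochastic step along x x^T W, then re-orthonormalize.
            W = W + lr * np.outer(x, x @ W)
            W, _ = np.linalg.qr(W)
    return W

def projector_distance(W, U):
    """Frobenius distance between the projectors onto span(W) and span(U)."""
    return np.linalg.norm(W @ W.T - U @ U.T)

# Synthetic matrix (assumed for illustration): rank-3 signal plus small noise.
rng = np.random.default_rng(1)
n, m, d = 20, 200, 3
U_true, _ = np.linalg.qr(rng.standard_normal((n, d)))
Psi = U_true @ np.diag([3.0, 2.5, 2.0]) @ rng.standard_normal((d, m))
Psi += 0.05 * rng.standard_normal((n, m))

# Reference subspace from a full SVD, feasible here because Psi is small.
U_svd = np.linalg.svd(Psi, full_matrices=False)[0][:, :d]

W = oja_subspace(Psi, d)
err = projector_distance(W, U_svd)
print(f"projector distance to SVD subspace: {err:.4f}")
```

The projector distance is used because a subspace basis is only defined up to rotation, so comparing `W` and `U_svd` entry by entry would be misleading.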
Taxonomy
Topics: Machine Learning and ELM · Face and Expression Recognition · Stochastic Gradient Optimization Techniques
