Stacked unsupervised learning with a network architecture found by supervised meta-learning
Kyle Luther, H. Sebastian Seung

TL;DR
This paper introduces a biologically plausible stacked unsupervised learning algorithm that clusters MNIST digits with accuracy comparable to unsupervised methods based on backpropagation. Its only prior knowledge is implicit in the network architecture, and meta-learning is used to optimize hyperparameters.
Contribution
The authors develop a stacked unsupervised learning (SUL) algorithm whose unsupervised clustering accuracy on MNIST is comparable to that of backpropagation-based unsupervised methods. It relies on architecture-based priors and on meta-learning for hyperparameter tuning.
Findings
Achieves unsupervised MNIST clustering accuracy comparable to that of unsupervised methods based on backpropagation.
Uses an architecture of convolutional "energy layers" inspired by energy models of primary visual cortex, with meta-learning for hyperparameter optimization.
Is outperformed only by self-supervised methods that require training data augmentation.
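The findings above are stated in terms of clustering accuracy. This summary does not spell out the paper's evaluation protocol, so the sketch below assumes the common convention: cluster IDs are matched one-to-one to class labels, and the best matching is scored. The function name `clustering_accuracy` and the toy labels are hypothetical, chosen only for illustration.

```python
from itertools import permutations

def clustering_accuracy(true_labels, cluster_ids, num_classes):
    """Accuracy under the best one-to-one mapping from clusters to classes.

    Assumption: this is the usual unsupervised clustering accuracy, not
    necessarily the exact protocol used in the paper.
    """
    # counts[c][k] = number of samples placed in cluster c whose true label is k
    counts = [[0] * num_classes for _ in range(num_classes)]
    for label, cluster in zip(true_labels, cluster_ids):
        counts[cluster][label] += 1
    # Brute-force search over all cluster-to-class assignments (small K only).
    best = max(
        sum(counts[c][perm[c]] for c in range(num_classes))
        for perm in permutations(range(num_classes))
    )
    return best / len(true_labels)

# Toy example with 3 classes: cluster IDs are arbitrary, one sample is misplaced.
true_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
cluster_ids = [2, 2, 2, 0, 0, 1, 1, 1, 1]
acc = clustering_accuracy(true_labels, cluster_ids, 3)
print(f"clustering accuracy: {acc:.3f}")
```

The brute-force search is only practical for a handful of clusters. For the ten MNIST classes, an assignment solver such as `scipy.optimize.linear_sum_assignment` is the usual choice.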
Abstract
Stacked unsupervised learning (SUL) seems more biologically plausible than backpropagation, because learning is local to each layer. But SUL has fallen far short of backpropagation in practical applications, undermining the idea that SUL can explain how brains learn. Here we show an SUL algorithm that can perform completely unsupervised clustering of MNIST digits with comparable accuracy relative to unsupervised algorithms based on backpropagation. Our algorithm is exceeded only by self-supervised methods requiring training data augmentation by geometric distortions. The only prior knowledge in our unsupervised algorithm is implicit in the network architecture. Multiple convolutional "energy layers" contain a sum-of-squares nonlinearity, inspired by "energy models" of primary visual cortex. Convolutional kernels are learned with a fast minibatch implementation of the K-Subspaces…
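As a rough illustration of the sum-of-squares nonlinearity mentioned in the abstract, the sketch below implements a single classical energy unit in one dimension: the responses of a small group of linear filters are squared and summed. The filter shapes, patch size, and names (`energy_unit`, `grating`) are assumptions made for this example; the paper's convolutional energy layers and their learned kernels are not reproduced here.

```python
import math

def energy_unit(patch, filters):
    """Sum of squared linear filter responses for one input patch."""
    return sum(
        sum(w * x for w, x in zip(f, patch)) ** 2
        for f in filters
    )

n = 16                       # patch length (assumed)
freq = 2 * math.pi * 2 / n   # two full cycles across the patch

# A quadrature pair: same frequency, 90 degrees apart in phase.
cos_filter = [math.cos(freq * i) for i in range(n)]
sin_filter = [math.sin(freq * i) for i in range(n)]

def grating(phase):
    """A sinusoidal input pattern with the given phase offset."""
    return [math.cos(freq * i + phase) for i in range(n)]

phases = (0.0, 0.7, 1.9)
# Response of a single linear filter: changes with the phase of the input.
linear = [sum(w * x for w, x in zip(cos_filter, grating(p))) for p in phases]
# Energy response: approximately the same for every phase.
energies = [energy_unit(grating(p), [cos_filter, sin_filter]) for p in phases]

for p, lin, en in zip(phases, linear, energies):
    print(f"phase={p:.1f}  linear={lin:+.3f}  energy={en:.3f}")
```

In this toy setup the single-filter response varies with the phase of the input, while the energy stays at about (n/2)^2 = 64. That kind of invariance is what the classical energy model of visual cortex is meant to capture.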
Taxonomy
Topics: Neural dynamics and brain function · Domain Adaptation and Few-Shot Learning · CCD and CMOS Imaging Sensors
