Consistent Sparse Deep Learning: Theory and Computation

Yan Sun; Qifan Song; Faming Liang

arXiv:2102.13229 · stat.ML · March 9, 2021 · 6 cites

TL;DR

This paper introduces a theoretically justified, efficient method for learning sparse deep neural networks, with applications to large-scale network compression and variable selection.

Contribution

It proposes a frequentist-like method for learning sparse DNNs, justifies its consistency under the Bayesian framework, and provides practical algorithms for network compression and variable selection.

Findings

01

Establishes posterior consistency for the sparse DNN with a mixture Gaussian prior.

02

Shows that the structure of the sparse DNN can be consistently determined using a Laplace approximation-based marginal posterior inclusion probability approach.

03

Performs well in large-scale network compression and variable selection.

Abstract

Deep learning has been the engine powering many successes of data science. However, the deep neural network (DNN), as the basic model of deep learning, is often excessively over-parameterized, causing many difficulties in training, prediction and interpretation. We propose a frequentist-like method for learning sparse DNNs and justify its consistency under the Bayesian framework: the proposed method could learn a sparse DNN with at most $O(n/\log(n))$ connections and nice theoretical guarantees such as posterior consistency, variable selection consistency and asymptotically optimal generalization bounds. In particular, we establish posterior consistency for the sparse DNN with a mixture Gaussian prior, show that the structure of the sparse DNN can be consistently determined using a Laplace approximation-based marginal posterior inclusion probability approach, and use Bayesian evidence…
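
To make the mixture Gaussian prior and the inclusion-probability idea concrete, here is a minimal sketch in plain Python. It assumes a two-component prior on each connection weight, (1 - lam) * N(0, sigma0^2) + lam * N(0, sigma1^2), with a narrow "spike" and a wide "slab". The names `lam`, `sigma0`, `sigma1`, the default values, and the 0.5 threshold are illustrative assumptions, not settings taken from this page. The pruning rule evaluates the slab membership probability at a point estimate of each weight, which is a simplified stand-in for the Laplace approximation-based approach named in the abstract, not the authors' implementation.

```python
import math


def log_normal_pdf(x, sigma):
    """Log-density of N(0, sigma^2) evaluated at x."""
    return -0.5 * (x / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def mixture_log_prior(w, lam, sigma0, sigma1):
    """Log-density of (1 - lam) * N(0, sigma0^2) + lam * N(0, sigma1^2) at w.

    Uses the log-sum-exp trick so that tiny component densities do not underflow.
    """
    a = math.log(1.0 - lam) + log_normal_pdf(w, sigma0)  # spike component
    b = math.log(lam) + log_normal_pdf(w, sigma1)        # slab component
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def inclusion_probability(w, lam, sigma0, sigma1):
    """Probability that w was drawn from the slab component, given its value."""
    log_slab = math.log(lam) + log_normal_pdf(w, sigma1)
    log_spike = math.log(1.0 - lam) + log_normal_pdf(w, sigma0)
    d = log_slab - log_spike  # log-odds of slab versus spike
    # Numerically stable sigmoid of the log-odds.
    if d >= 0:
        return 1.0 / (1.0 + math.exp(-d))
    e = math.exp(d)
    return e / (1.0 + e)


def sparsify(weights, lam=1e-4, sigma0=0.01, sigma1=1.0, threshold=0.5):
    """Zero out weights whose inclusion probability is at or below the threshold.

    All default values are hypothetical and chosen only for illustration.
    """
    return [
        w if inclusion_probability(w, lam, sigma0, sigma1) > threshold else 0.0
        for w in weights
    ]


if __name__ == "__main__":
    trained = [0.001, -0.02, 0.5, -1.2]  # hypothetical point estimates of weights
    print(sparsify(trained))
```

With these assumed values, weights close to zero are attributed to the spike and removed, while clearly non-zero weights are attributed to the slab and kept. Applying a rule of this kind to every connection yields a sparse structure.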

Code & Models

Repositories

sylydya/Consistent-Sparse-Deep-Learning-Theory-and-Computation
PyTorch · Official

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis