Learning Sparse Structured Ensembles with SG-MCMC and Network Pruning
Yichi Zhang, Zhijian Ou

TL;DR
This paper introduces a two-stage method combining SG-MCMC, group sparse priors, and pruning to efficiently learn neural network ensembles with high accuracy and reduced computational costs.
Contribution
The authors present it as the first work to integrate SG-MCMC, group sparse priors, and network pruning for learning neural network ensembles, with the aim of keeping ensemble-level accuracy while cutting memory and computation cost.
Findings
Achieved a 21% relative reduction in LSTM language model perplexity compared with the baseline.
Reduced the ensemble's total parameter count to 30% of the baseline model's.
Reduced the ensemble's total computation to 70% of the baseline model's.
Abstract
An ensemble of neural networks is known to be more robust and accurate than an individual network, but usually at a cost that grows linearly in both training and testing. In this work, we propose a two-stage method to learn Sparse Structured Ensembles (SSEs) for neural networks. In the first stage, we run SG-MCMC with group sparse priors to draw an ensemble of samples from the posterior distribution of network parameters. In the second stage, we apply weight pruning to each sampled network and then retrain over the remaining connections. By learning SSEs with SG-MCMC and pruning in this way, we not only achieve high prediction accuracy, since SG-MCMC enhances exploration of the model-parameter space, but also significantly reduce memory and computation cost in both training and testing of NN ensembles. This is thoroughly evaluated in the experiments of learning SSE ensembles…
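The two stages described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: a toy linear model stands in for the neural networks, SGLD (the simplest SG-MCMC sampler) stands in for whichever sampler the paper uses, and every name and hyperparameter here (grad_neg_log_post, sgld_samples, prune_and_retrain, lam, eps, keep_frac) is a hypothetical choice made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: only the first 2 of 8 input features influence the 3 outputs.
n, d_in, d_out = 200, 8, 3
X = rng.normal(size=(n, d_in))
W_true = np.zeros((d_in, d_out))
W_true[:2] = [[1.0, -2.0, 0.5], [-1.5, 0.8, 2.0]]
Y = X @ W_true + 0.1 * rng.normal(size=(n, d_out))


def grad_neg_log_post(W, xb, yb, lam, noise_var=0.01):
    """Minibatch gradient of the negative log-posterior.

    Gaussian likelihood plus a group-lasso prior, with one group per
    input row of W (an assumed grouping, chosen for this toy model).
    """
    scale = n / len(xb)  # rescale the minibatch to the full dataset
    g_lik = scale * xb.T @ (xb @ W - yb) / noise_var
    row_norm = np.linalg.norm(W, axis=1, keepdims=True)
    g_prior = lam * W / (row_norm + 1e-8)
    return g_lik + g_prior


def sgld_samples(n_samples=4, steps=2000, eps=1e-5, lam=100.0, batch=32):
    """Stage 1: draw weight samples with SGLD, discarding the first half."""
    W = 0.1 * rng.normal(size=(d_in, d_out))
    burn_in = steps // 2
    thin = (steps - burn_in) // n_samples
    samples = []
    for t in range(steps):
        idx = rng.choice(n, size=batch, replace=False)
        g = grad_neg_log_post(W, X[idx], Y[idx], lam)
        # Langevin step: half-step along the gradient plus Gaussian noise.
        W = W - 0.5 * eps * g + np.sqrt(eps) * rng.normal(size=W.shape)
        if t >= burn_in and (t - burn_in) % thin == thin - 1:
            samples.append(W.copy())
    return samples


def prune_and_retrain(W, keep_frac=0.25, steps=200, lr=0.1):
    """Stage 2: magnitude-prune one sample, then retrain the kept weights."""
    k = int(round(keep_frac * W.size))
    thresh = np.sort(np.abs(W).ravel())[-k]
    mask = (np.abs(W) >= thresh).astype(W.dtype)
    W = W * mask
    for _ in range(steps):
        grad = X.T @ (X @ W - Y) / n
        W = (W - lr * grad) * mask  # pruned connections stay at zero
    return W, mask


samples = sgld_samples()
results = [prune_and_retrain(W) for W in samples]
pruned = [W for W, _ in results]
masks = [m for _, m in results]

# Ensemble prediction: average the outputs of the pruned networks.
ensemble_pred = np.mean([X @ W for W in pruned], axis=0)
mse = float(np.mean((ensemble_pred - Y) ** 2))
print("ensemble MSE:", mse)
```

In this sketch each pruned member keeps a quarter of its weights, so the ensemble's cost grows with the number of retained connections and not with the number of members times the dense model size, which is the trade-off the abstract describes.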
Taxonomy
Topics: Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning · Neural Networks and Applications
Methods: Pruning · Sigmoid Activation · Tanh Activation · Long Short-Term Memory
