# $S^{2}$-LBI: Stochastic Split Linearized Bregman Iterations for Parsimonious Deep Learning

**Authors:** Yanwei Fu, Donghao Li, Xinwei Sun, Shun Zhang, Yizhou Wang, Yuan Yao

arXiv: 1904.10873 · 2019-04-25

## TL;DR

This paper introduces $S^{2}$-LBI, a stochastic iterative algorithm that trains deep networks along a regularization path with structural sparsity, so that a network can be either simplified or enlarged. The approach is supported by theory on the algorithm's dynamics and by experiments on MNIST and CIFAR-10.

## Contribution

The paper presents a novel $S^{2}$-LBI algorithm that combines the computational efficiency of LBI with model selection consistency in learning structural sparsity. The solution path it computes can be used to enlarge or simplify a network.

## Key findings

- On MNIST, a network with only 1.5K parameters (1 convolutional layer of 5 filters and 1 FC layer) reaches 98.40% recognition accuracy.
- On MNIST, LeNet-5 still reaches 98.47% recognition accuracy after 82.5% of its parameters are simplified away.
- Validated effectiveness on MNIST and CIFAR-10 datasets.

## Abstract

This paper proposes a novel Stochastic Split Linearized Bregman Iteration ($S^{2}$-LBI) algorithm to efficiently train deep networks. $S^{2}$-LBI introduces an iterative regularization path with structural sparsity. It combines the computational efficiency of LBI with model selection consistency in learning the structural sparsity. The computed solution path intrinsically enables us to enlarge or simplify a network, which, theoretically, benefits from the dynamics of the $S^{2}$-LBI algorithm. The experimental results validate $S^{2}$-LBI on the MNIST and CIFAR-10 datasets. For example, on MNIST, we can either boost a network with only 1.5K parameters (1 convolutional layer of 5 filters and 1 FC layer) to 98.40% recognition accuracy, or prune 82.5% of the parameters in the LeNet-5 network and still achieve 98.47% recognition accuracy. In addition, we also have learning results on ImageNet, which will be added in the next version of our report.
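To make the idea of an iterative regularization path more concrete, here is a minimal sketch of a stochastic split linearized Bregman iteration on a toy problem. It is not the paper's implementation. It follows the general split-LBI form known from the literature (a dense weight vector `w` coupled to a sparse copy `gamma` through a proximity term, with `gamma` obtained by shrinking an auxiliary variable `z`), and the exact updates in the full paper may differ. Several things here are illustrative assumptions: the function name `split_lbi_sketch`, the hyperparameter names `kappa`, `nu`, and `alpha` and their values, a noise-free linear least-squares model standing in for a deep network, and plain elementwise shrinkage standing in for structural (group) sparsity.

```python
import numpy as np


def soft_threshold(z, lam=1.0):
    """Elementwise shrinkage: the proximal map of lam * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def split_lbi_sketch(X, y, kappa=10.0, nu=1.0, alpha=0.01,
                     n_iters=2000, batch_size=32, seed=0):
    """Toy stochastic split-LBI for least squares (hypothetical helper).

    Augmented objective (an assumption of this sketch):
        L(w, gamma) = mean squared loss(w) / 2 + ||w - gamma||^2 / (2 * nu)
    Returns the dense weights w, the sparse copy gamma, and the path of
    gamma over iterations (one row per iteration).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)      # dense weights, updated by gradient descent
    z = np.zeros(p)      # auxiliary variable driving the sparse copy
    gamma = np.zeros(p)  # sparse copy of w; its support is the selected model
    path = np.zeros((n_iters, p))
    for k in range(n_iters):
        # Minibatch gradient of the data-fit term.
        idx = rng.choice(n, size=batch_size, replace=False)
        Xb, yb = X[idx], y[idx]
        grad_loss = Xb.T @ (Xb @ w - yb) / batch_size
        # Gradients of the augmented objective, evaluated at the old iterates.
        grad_w = grad_loss + (w - gamma) / nu
        grad_gamma = (gamma - w) / nu
        w = w - kappa * alpha * grad_w
        z = z - alpha * grad_gamma
        # gamma stays exactly zero until |z| exceeds 1.
        gamma = kappa * soft_threshold(z)
        path[k] = gamma
    return w, gamma, path


# Toy data: 10 features, only the first two carry signal (assumed setup).
rng = np.random.default_rng(1)
X = rng.standard_normal((400, 10))
w_true = np.zeros(10)
w_true[:2] = [3.0, -2.0]
y = X @ w_true

w, gamma, path = split_lbi_sketch(X, y)
print("selected coordinates:", np.flatnonzero(gamma))
```

Each row of `path` is one point on the regularization path: early rows are all zero, and coordinates enter the support as the iteration proceeds. Stopping early or late along this path is what selects a smaller or larger model in this toy setting.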

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.10873/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1904.10873/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/1904.10873/full.md

---
Source: https://tomesphere.com/paper/1904.10873