# ADMM for Efficient Deep Learning with Global Convergence

**Authors:** Junxiang Wang, Fuxun Yu, Xiang Chen, Liang Zhao

arXiv: 1905.13611 · 2021-07-07

## TL;DR

This paper introduces dlADMM, an ADMM-based optimization framework for deep learning. It comes with a proof of global convergence under mild conditions, reduces time complexity from cubic to quadratic in (latent) feature dimensions, and outperforms most of the comparison methods on benchmark datasets.

## Contribution

The paper presents the first proof of global convergence for an ADMM-based method in a deep neural network problem, under mild conditions, and reduces time complexity from cubic to quadratic in (latent) feature dimensions.

## Key findings

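- dlADMM outperforms most of the comparison methods on benchmark datasets.
- Time complexity drops from cubic to quadratic in (latent) feature dimensions, using iterative quadratic approximations and backtracking in the subproblems.
- The paper provides the first proof of global convergence for an ADMM-based method in a deep neural network problem, under mild conditions.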

## Abstract

Alternating Direction Method of Multipliers (ADMM) has been used successfully in many conventional machine learning applications and is considered to be a useful alternative to Stochastic Gradient Descent (SGD) as a deep learning optimizer. However, as this is an emerging domain, several challenges remain, including 1) the lack of global convergence guarantees, 2) slow convergence towards solutions, and 3) cubic time complexity with regard to feature dimensions. In this paper, we propose a novel optimization framework for deep learning via ADMM (dlADMM) to address these challenges simultaneously. The parameters in each layer are updated backward and then forward so that the parameter information in each layer is exchanged efficiently. The time complexity is reduced from cubic to quadratic in (latent) feature dimensions via a dedicated algorithm design for the subproblems that uses iterative quadratic approximations and backtracking. Finally, we provide the first proof of global convergence for an ADMM-based method (dlADMM) in a deep neural network problem under mild conditions. Experiments on benchmark datasets demonstrated that our proposed dlADMM algorithm outperforms most of the comparison methods.
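
To make the phrase "iterative quadratic approximations and backtracking" concrete, here is a minimal sketch of that general technique in Python with NumPy. It is not the paper's algorithm: the objective `phi`, the names `quadratic_backtracking_step`, `tau`, and `eta`, and the least-squares form of the subproblem are all assumptions chosen for illustration. The point it shows is that a weight update can be obtained from matrix products alone, with no matrix inversion, by minimizing a quadratic upper model of the objective and enlarging the curvature parameter until the model really does bound the objective.

```python
import numpy as np


def phi(W, a, z):
    """Hypothetical smooth subproblem objective: 0.5 * ||W a - z||_F^2."""
    r = W @ a - z
    return 0.5 * np.sum(r * r)


def grad_phi(W, a, z):
    """Gradient of phi with respect to W (matrix products only)."""
    return (W @ a - z) @ a.T


def quadratic_backtracking_step(W, a, z, tau=1.0, eta=2.0, max_tries=50):
    """One update of W from a quadratic approximation with backtracking.

    The model is
        P(V; tau) = phi(W) + <grad, V - W> + (tau / 2) * ||V - W||_F^2,
    whose minimizer is V = W - grad / tau. The curvature tau is multiplied
    by eta until phi(V) <= P(V; tau), which guarantees that the accepted
    step does not increase phi (up to a small numerical tolerance).
    """
    g = grad_phi(W, a, z)
    f0 = phi(W, a, z)
    for _ in range(max_tries):
        W_new = W - g / tau
        d = W_new - W
        upper = f0 + np.sum(g * d) + 0.5 * tau * np.sum(d * d)
        if phi(W_new, a, z) <= upper + 1e-12:
            return W_new, tau
        tau *= eta  # model was too optimistic: increase curvature and retry
    return W, tau  # no acceptable step found; leave W unchanged
```

A caller would apply the step repeatedly, passing the returned `tau` back in so that later iterations start from a curvature that has already been accepted. An exact solve of the same least-squares problem would instead need a pseudo-inverse or a linear solve, which is where cubic cost in the feature dimension typically comes from.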

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.13611/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/1905.13611/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1905.13611/full.md

---
Source: https://tomesphere.com/paper/1905.13611