# Compressed Decentralized Proximal Stochastic Gradient Method for Nonconvex Composite Problems with Heterogeneous Data

**Authors:** Yonggui Yan, Jie Chen, Pin-Yu Chen, Xiaodong Cui, Songtao Lu, and Yangyang Xu

arXiv: 2302.14252 · 2023-03-01

## TL;DR

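This paper introduces a decentralized proximal stochastic gradient tracking method (DProxSGT) and a compressed variant of it for nonconvex stochastic composite problems whose data are heterogeneously distributed across workers. Both methods need only $\mathcal{O}(1)$ samples per worker for each proximal update and achieve an optimal sample complexity for producing a near-stationary point.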

## Contribution

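The paper proposes DProxSGT, a decentralized proximal stochastic gradient tracking method, and extends it to a compressed method that saves communication cost by compressing the information exchanged between workers. Gradient tracking is what lets both methods cope with heterogeneous data in the nonconvex setting.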
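To make "compressing the communicated information" concrete, here is a minimal sketch of one common compressor, top-k sparsification. This summary does not say which compressors the paper uses or analyzes, so the operator, the name `top_k`, and the contraction check below are illustrative assumptions rather than the authors' construction.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest.

    Illustrative compressor only; not taken from the paper.
    """
    out = np.zeros_like(v)
    if k > 0:
        keep = np.argpartition(np.abs(v), -k)[-k:]  # indices of the k largest |v_j|
        out[keep] = v[keep]
    return out

v = np.array([0.1, -3.0, 2.0, 0.5])
q = top_k(v, 2)

# Top-k is a contraction: ||Q(v) - v||^2 <= (1 - k/d) * ||v||^2.
d = v.size
error = np.sum((q - v) ** 2)
bound = (1 - 2 / d) * np.sum(v ** 2)
```

A worker that sends `q` instead of `v` transmits only `k` values and their indices, at the price of a bounded compression error.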

## Key findings

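- Achieves an optimal sample complexity for producing a near-stationary point, assuming smoothness of the expected loss function rather than of each sample function.
- Generalizes significantly better in neural network training than large-batch training methods and momentum variance-reduction methods.
- Handles heterogeneous data through the gradient tracking scheme.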

## Abstract

We first propose a decentralized proximal stochastic gradient tracking method (DProxSGT) for nonconvex stochastic composite problems, with data heterogeneously distributed on multiple workers in a decentralized connected network. To save communication cost, we then extend DProxSGT to a compressed method by compressing the communicated information. Both methods need only $\mathcal{O}(1)$ samples per worker for each proximal update, which is important to achieve good generalization performance on training deep neural networks. With a smoothness condition on the expected loss function (but not on each sample function), the proposed methods can achieve an optimal sample complexity result to produce a near-stationary point. Numerical experiments on training neural networks demonstrate the significantly better generalization performance of our methods over large-batch training methods and momentum variance-reduction methods, as well as the ability of the gradient tracking scheme to handle heterogeneous data.
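The ingredients named in the abstract (a proximal step, gradient tracking, neighbor averaging, and one sample per worker per update) can be sketched on a toy problem. This is a generic proximal gradient tracking loop under stated assumptions, not the paper's exact DProxSGT update order or parameters: the objective is assumed to have the form $\frac{1}{n}\sum_{i=1}^n f_i(x) + r(x)$ with $r = \lambda\|x\|_1$, the network is a ring, the local losses are least squares, and all function names and step sizes are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (an assumed choice of regularizer)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ring_mixing_matrix(n):
    """Symmetric, doubly stochastic mixing matrix for a ring of n workers."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 1 / 3
        W[i, (i - 1) % n] += 1 / 3
        W[i, (i + 1) % n] += 1 / 3
    return W

def prox_gradient_tracking(A, b, W, eta=0.01, lam=0.01, steps=500, seed=0):
    """A: (n, m, d) local features, b: (n, m) local targets, W: (n, n) mixing matrix."""
    rng = np.random.default_rng(seed)
    n, m, d = A.shape
    X = np.zeros((n, d))  # row i is worker i's local copy of the model

    def stoch_grad(X):
        idx = rng.integers(m, size=n)  # one sample per worker
        a = A[np.arange(n), idx]       # (n, d)
        resid = np.einsum("nd,nd->n", a, X) - b[np.arange(n), idx]
        return resid[:, None] * a      # least-squares sample gradients

    G = stoch_grad(X)
    Y = G.copy()  # row i tracks the network-average gradient
    for _ in range(steps):
        # Local proximal step along the tracked direction, then neighbor averaging.
        X = W @ soft_threshold(X - eta * Y, eta * lam)
        G_new = stoch_grad(X)
        # Tracking update: add the change in the local gradient, then average.
        Y = W @ (Y + G_new - G)
        G = G_new
    return X, Y, G

rng = np.random.default_rng(1)
n, m, d = 4, 20, 3
A = rng.normal(size=(n, m, d))
x_true = rng.normal(size=(n, d))        # a different model per worker: heterogeneous data
b = np.einsum("nmd,nd->nm", A, x_true)
W = ring_mixing_matrix(n)
X, Y, G = prox_gradient_tracking(A, b, W)
```

Because `W` is doubly stochastic, the average of the rows of `Y` always equals the average of the latest local stochastic gradients. Each worker therefore steps along an estimate of the global gradient rather than its own local one, which is the mechanism that lets gradient tracking handle heterogeneous data.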

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.14252/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/2302.14252/full.md

## References

65 references — full list in the complete paper: https://tomesphere.com/paper/2302.14252/full.md

---
Source: https://tomesphere.com/paper/2302.14252