# DSPG: Decentralized Simultaneous Perturbations Gradient Descent Scheme

**Authors:** Arunselvan Ramaswamy

arXiv: 1903.07050 · 2019-08-28

## TL;DR

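This paper introduces DSPG, a decentralized, sample-based stochastic approximation method for multi-agent optimization. It tolerates communication delays and applies when the objective is non-differentiable or can only be observed through samples. It extends classic SPSA by letting agents use old information from other agents and by permitting simple hyper-parameter choices (constant sensitivity parameters).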

## Contribution

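The paper proposes DSPG, a decentralized stochastic approximation algorithm that improves on classic SPSA in two ways: (i) agents may use old (delayed) information from other agents, and (ii) implementation is simplified through simple hyper-parameter choices. The paper analyzes the biases and variances that these two allowances introduce.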

## Key findings

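- Biases caused by communication delays can be countered by a careful choice of algorithm hyper-parameters.
- The variance of the gradient estimator and its effect on the rate of convergence are analyzed.
- Numerical results support the theoretical analysis, and an application to the stochastic consensus problem is discussed.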

## Abstract

Distributed descent-based methods are an essential toolset for solving optimization problems in multi-agent system scenarios. Here the agents seek to optimize a global objective function through mutual cooperation. Oftentimes, cooperation is achieved over a wireless communication network that is prone to delays and errors. There are many scenarios wherein the objective function is either non-differentiable or merely observable. In this paper, we present a cross-entropy based distributed stochastic approximation (SA) algorithm that finds a minimum of the objective, using only samples. We call this algorithm Decentralized Simultaneous Perturbation Stochastic Gradient, with Constant Sensitivity Parameters (DSPG). This algorithm is a two-fold improvement over the classic Simultaneous Perturbation Stochastic Approximation (SPSA) algorithm. Specifically, DSPG allows for (i) the use of old information from other agents and (ii) easy implementation through the use of simple hyper-parameter choices. We analyze the biases and variances that arise due to these two allowances. We show that the biases due to communication delays can be countered by a careful choice of algorithm hyper-parameters. The variance of the gradient estimator and its effect on the rate of convergence are studied. We present numerical results supporting our theory. Finally, we discuss an application to the stochastic consensus problem.
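
To make the two allowances in the abstract concrete, here is a minimal toy sketch of a simultaneous-perturbation gradient step with a constant sensitivity parameter and stale information from other agents. It is not the paper's DSPG update, whose exact form is only given in the full text. Everything in it is an assumption made for illustration: the names `objective`, `spsa_partial`, and `run`, the quadratic objective, the two-agent setup with one coordinate per agent, the fixed delay of 3 steps, the constant step size, and the noise-free function evaluations.

```python
import random


def objective(x, target=(1.0, -2.0)):
    """Hypothetical black-box objective: only its values are observed."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def spsa_partial(f, view, i, c, rng):
    """Two-sided simultaneous-perturbation estimate of the i-th partial
    derivative of f at `view`, using a constant sensitivity parameter c."""
    # Rademacher (+1/-1) perturbation applied to all coordinates at once.
    delta = [rng.choice((-1.0, 1.0)) for _ in view]
    plus = [v + c * d for v, d in zip(view, delta)]
    minus = [v - c * d for v, d in zip(view, delta)]
    # Two function evaluations, regardless of the dimension of `view`.
    return (f(plus) - f(minus)) / (2.0 * c * delta[i])


def run(steps=300, delay=3, c=0.1, step_size=0.1, seed=0):
    """Toy decentralized descent: agent i owns coordinate i (assumption)."""
    rng = random.Random(seed)
    n_agents = 2
    history = [[0.0] * n_agents]  # history[k] is the joint iterate at step k
    for _ in range(steps):
        current = history[-1]
        # Iterate from `delay` steps ago (or the initial one early on).
        stale = history[max(0, len(history) - 1 - delay)]
        new = []
        for i in range(n_agents):
            # Agent i sees its own coordinate exactly, but only an old
            # copy of the other agents' coordinates.
            view = [current[j] if j == i else stale[j] for j in range(n_agents)]
            g = spsa_partial(objective, view, i, c, rng)
            new.append(current[i] - step_size * g)
        history.append(new)
    return history[-1]
```

Calling `run()` returns the final joint iterate, which in this toy setting ends up close to the assumed minimizer `(1.0, -2.0)` despite the stale views. The quadratic objective is deliberately easy: for it, the two-sided estimate has the correct mean for any constant `c`, so the sketch says nothing about the biases that a constant sensitivity parameter or delays introduce for general objectives, which is what the paper analyzes.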

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.07050/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1903.07050/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1903.07050/full.md

---
Source: https://tomesphere.com/paper/1903.07050