Guaranteeing Both Consensus and Optimality in Decentralized Nonconvex Optimization with Multiple Local Updates

Jie Liu, Zuang Wang, and Yongqiang Wang

arXiv:2511.05242 · math.OC · November 10, 2025

TL;DR

This paper introduces MILE, a decentralized algorithm that guarantees both consensus and optimality in nonconvex optimization with multiple local updates. Performing several local updates per communication round reduces communication costs, which makes the method applicable to large-scale machine learning problems.

Contribution

MILE is the first decentralized method to ensure both consensus and optimality under multiple local updates in nonconvex settings, with a novel analysis framework and minimal communication overhead.

Findings

01. Achieves $O(1/T)$ convergence rate with stochastic gradients

02. Requires only one variable exchange per agent pair

03. Demonstrates effectiveness on benchmark datasets

Abstract

Scalable decentralized optimization in large-scale systems hinges on efficient communication. A common way to reduce communication overhead is to perform multiple local updates between two communication rounds, as in federated learning. However, extending this strategy to fully decentralized settings poses fundamental challenges. Existing decentralized algorithms with multiple local updates guarantee accurate convergence only under strong convexity, limiting applicability to the nonconvex problems prevalent in machine learning. Moreover, many methods require exchanging and storing auxiliary variables, such as gradient-tracking vectors or correction terms, to ensure convergence under data heterogeneity, incurring high communication and memory costs. In this paper, we propose MILE, a fully decentralized algorithm that guarantees both consensus and optimality under multiple local updates…
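
The communication pattern described in the abstract (several local gradient steps followed by one exchange with neighbors) can be sketched on a toy problem. The sketch below is not MILE; it is a generic local-update decentralized gradient descent, written under assumptions that do not come from the paper: scalar quadratic local objectives $f_i(x) = \frac{1}{2}(x - b_i)^2$, a ring of five agents with equal mixing weights of 1/3, and a constant step size. The function name `local_update_dgd` and all parameter values are hypothetical.

```python
# Illustrative sketch only: generic decentralized gradient descent with
# multiple local updates. This is NOT the MILE algorithm from the paper.
# Assumptions (ours): scalar quadratic objectives f_i(x) = 0.5*(x - b_i)^2,
# a ring topology, equal mixing weights of 1/3, and a constant step size.

def local_update_dgd(targets, step=0.1, local_steps=5, rounds=200):
    """Run `rounds` communication rounds, each preceded by `local_steps`
    local gradient steps, and return the final state of every agent."""
    n = len(targets)
    x = [0.0] * n  # every agent starts from the same point
    for _ in range(rounds):
        # Local phase: each agent descends its own objective, no communication.
        for _ in range(local_steps):
            x = [xi - step * (xi - bi) for xi, bi in zip(x, targets)]
        # Communication phase: each agent sends one variable to each ring
        # neighbor and averages what it receives with its own state.
        x = [(x[(i - 1) % n] + x[i] + x[(i + 1) % n]) / 3.0 for i in range(n)]
    return x


# Heterogeneous local data: each agent has a different minimizer b_i.
targets = [0.0, 1.0, 2.0, 3.0, 4.0]
states = local_update_dgd(targets)

mean_state = sum(states) / len(states)   # average of the agents' states
spread = max(states) - min(states)       # disagreement between agents
print("mean state:", mean_state)
print("spread across agents:", spread)
```

In this toy setting the average of the agents' states reaches the global minimizer (the mean of the `targets`), but the agents themselves do not agree: the spread stays bounded away from zero because the local steps pull each agent toward its own minimizer between exchanges. This is the kind of gap between optimality of the average state and consensus of the individual states that the abstract refers to.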

Taxonomy

Topics: Distributed Control Multi-Agent Systems · Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data