Turning Stale Gradients into Stable Gradients: Coherent Coordinate Descent with Implicit Landscape Smoothing for Lightweight Zeroth-Order Optimization
Chen Liang, Xiatao Sun, Qian Wang, Daniel Rakita

TL;DR
This paper introduces Coherent Coordinate Descent (CoCD), a deterministic zeroth-order optimizer that improves sample efficiency and stability by leveraging implicit landscape smoothing and historical gradient coherence.
Contribution
The work formalizes gradient coherence, connects CoCD to block cyclic coordinate descent, and demonstrates implicit smoothing effects that enhance convergence in lightweight zeroth-order optimization.
Findings
CoCD outperforms Block Cyclic Coordinate Descent (BCCD) in sample efficiency and convergence accuracy.
Larger finite-difference steps induce implicit smoothing, improving stability.
Deterministic updates surpass randomized methods in lightweight ZO optimization.
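The smoothing finding above can be illustrated with a toy one-dimensional example. This is a minimal sketch, not taken from the paper: the objective `f`, the helper `central_diff`, and both step sizes are assumptions chosen for illustration. A central finite difference with a large step averages out a high-frequency ripple, while a tiny step follows the ripple closely.

```python
import math

def f(x):
    # Hypothetical test objective: a smooth bowl plus a small, fast ripple.
    return x * x + 0.1 * math.sin(50.0 * x)

def central_diff(func, x, h):
    # Two-query central finite-difference estimate of the derivative at x.
    return (func(x + h) - func(x - h)) / (2.0 * h)

x = 1.0
smooth_slope = 2.0 * x                  # derivative of the bowl x*x alone
small_step = central_diff(f, x, 1e-4)   # tracks the ripple's local slope
large_step = central_diff(f, x, 0.5)    # averages the ripple over a wide window

print("small h error vs. smooth slope:", abs(small_step - smooth_slope))
print("large h error vs. smooth slope:", abs(large_step - smooth_slope))
```

In this toy setting the large-step estimate stays close to the slope of the underlying bowl, which is the sense in which a larger finite-difference step acts as an implicit low-pass filter on the landscape. How the paper selects the step size is not stated in this summary.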
Abstract
Zeroth-Order (ZO) optimization is pivotal for scenarios where backpropagation is unavailable, such as memory-constrained on-device learning and black-box optimization. However, existing methods face a stark trade-off: they are either sample-inefficient (e.g., standard finite differences) or suffer from high variance due to randomized estimation (e.g., random subspace methods). In this work, we propose Coherent Coordinate Descent (CoCD), a deterministic, sample-efficient, and budget-aware ZO optimizer. Theoretically, we formalize the notion of gradient coherence and demonstrate that CoCD is equivalent to Block Cyclic Coordinate Descent (BCCD) with "warm starts," effectively converting historical (stale) gradients from a liability into a computational asset. This mechanism enables query complexity per step while maintaining global descent directions. Furthermore, we derive error…
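The idea of reusing stale gradient entries in a block-cyclic scheme can be sketched as follows. This is an illustrative approximation under stated assumptions, not the authors' CoCD algorithm: the function name `stale_block_zo_descent`, the quadratic test objective, and all hyperparameters are hypothetical. Each step refreshes only one block of coordinates by central finite differences, then moves along the full gradient estimate, whose other entries are left over from earlier steps.

```python
import numpy as np

def stale_block_zo_descent(f, x0, block_size=2, h=1e-3, lr=0.05, steps=200):
    """Block-cyclic zeroth-order descent that keeps stale gradient entries.

    Hypothetical sketch: only one block of the gradient estimate is refreshed
    per step, but the update uses the whole (partly stale) estimate.
    """
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    g = np.zeros(d)  # running gradient estimate; entries persist across steps
    blocks = [np.arange(s, min(s + block_size, d)) for s in range(0, d, block_size)]
    queries = 0
    for t in range(steps):
        # Refresh the current block deterministically, in cyclic order.
        for i in blocks[t % len(blocks)]:
            e = np.zeros(d)
            e[i] = h
            g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
            queries += 2
        # Step along the full estimate, stale entries included.
        x -= lr * g
    return x, queries

# Assumed test problem: a separable quadratic with curvatures between 1 and 2.
curvature = np.linspace(1.0, 2.0, 8)

def quadratic(x):
    return 0.5 * float(np.sum(curvature * x * x))

x_final, n_queries = stale_block_zo_descent(quadratic, np.ones(8))
print("final objective:", quadratic(x_final), "function queries:", n_queries)
```

The per-step query cost here depends only on the block size (two queries per refreshed coordinate), not on the full dimension. The sketch omits whatever coherence conditions and error bounds the paper derives, so it should be read only as a picture of the warm-start mechanism.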
