CONGO: Compressive Online Gradient Optimization

Jeremy Carleton; Prathik Vijaykumar; Divyanshu Saxena; Dheeraj Narasimha; Srinivas Shakkottai; Aditya Akella

arXiv:2407.06325 · cs.LG · May 19, 2025

TL;DR

CONGO introduces a compressive sensing-based framework for zeroth-order online convex optimization that exploits gradient sparsity to improve sample efficiency and achieve optimal regret bounds, demonstrated through simulations and microservices benchmarks.

Contribution

The paper proposes the CONGO framework, applying compressive sensing to online optimization with sparse gradients, reducing sample complexity and improving regret bounds.
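To make the idea concrete, here is a minimal sketch of sparse gradient estimation from function values only. It is an illustration under stated assumptions, not the paper's algorithm: the function name `sparse_gradient_estimate`, the parameters `m`, `s`, and `delta`, the Gaussian measurement directions, and the toy objective are all hypothetical, and orthogonal matching pursuit (OMP) is swapped in as a simple recovery routine in place of whatever solver the CONGO variants use.

```python
import numpy as np

def sparse_gradient_estimate(f, x, m, s, delta=1e-4, seed=None):
    """Estimate an s-sparse gradient of f at x from m + 1 function samples.

    Hypothetical helper for illustration only; recovery uses plain OMP.
    """
    rng = np.random.default_rng(seed)
    d = x.size
    # Random measurement directions (rows of A), scaled to unit-norm columns on average.
    A = rng.standard_normal((m, d)) / np.sqrt(m)
    fx = f(x)
    # Each finite difference approximates the inner product <a_i, grad f(x)>.
    y = np.array([(f(x + delta * a) - fx) / delta for a in A])

    # Orthogonal matching pursuit: greedily pick the column most correlated
    # with the residual, then refit by least squares on the chosen support.
    support = []
    residual = y.copy()
    coef = np.zeros(0)
    for _ in range(s):
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j in support:
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef

    g = np.zeros(d)
    g[support] = coef
    return g

# Toy objective (an assumption): only 3 of 500 coordinates affect the value.
active = [7, 123, 402]

def toy_objective(x):
    return float(np.sum(x[active] ** 2))

x0 = np.zeros(500)
x0[active] = [1.5, -1.5, 1.5]
g_hat = sparse_gradient_estimate(toy_objective, x0, m=100, s=3, seed=0)
```

In this toy setup the estimate uses 101 function evaluations instead of the 501 that a coordinate-wise finite-difference scheme would need in 500 dimensions.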

Findings

01  CONGO outperforms traditional gradient descent methods in sparse settings.

02  Sample complexity scales with gradient sparsity, not full dimension.

03  Numerical and real-world benchmarks validate the approach.
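As a rough illustration of the second finding, the block below states the standard compressive sensing measurement scaling for random Gaussian measurement matrices. It is a textbook rule of thumb, not the paper's exact bound; the symbols $d$ (dimension), $s$ (sparsity), $m$ (number of measurements), and the constant $C$ are notation assumed here.

```latex
% Standard scaling for recovering an s-sparse vector in R^d
% from m random linear measurements (C is an unspecified constant):
m \;\ge\; C \, s \log\!\left(\frac{d}{s}\right)

% Worked example with d = 10^4 and s = 10 (natural logarithm):
s \log\!\left(\frac{d}{s}\right) = 10 \cdot \ln(10^3) \approx 10 \cdot 6.91 \approx 69
```

Up to the constant, the measurement count in this example is on the order of tens, compared with the $10^4$ samples that one finite difference per coordinate would require.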

Abstract

We address the challenge of zeroth-order online convex optimization where the objective function's gradient exhibits sparsity, indicating that only a small number of dimensions possess non-zero gradients. Our aim is to leverage this sparsity to obtain useful estimates of the objective function's gradient even when the only information available is a limited number of function samples. Our motivation stems from the optimization of large-scale queueing networks that process time-sensitive jobs. Here, a job must be processed by potentially many queues in sequence to produce an output, and the service time at any queue is a function of the resources allocated to that queue. Since resources are costly, the end-to-end latency for jobs must be balanced with the overall cost of the resources used. While the number of queues is substantial, the latency function primarily reacts to resource…

Peer Reviews

Decision · ICLR 2025 Poster

Reviewer 01 · Rating 8 · Confidence 3

Strengths

The paper is overall well written, and presents the setup, results, and proof clearly. The algorithms proposed appear efficient in terms of the regret and the sampling complexity.

Weaknesses

-- One might argue that the results are not too surprising: the regret follows from the regret of online gradient descent, while the sampling complexity follows from the compressive sensing results.
-- In CONGO-B, lines 827-829, gradient recovery requires solving an LP, which can be computationally inefficient, especially in high-dimensional settings. In addition, compressive sensing usually requires knowledge of the sparsity level before setting the number of samples. If such knowledge is lac

Reviewer 02 · Rating 6 · Confidence 3

Strengths

This is a well-written paper in general, and the idea of introducing compressed sensing for estimating the gradients is very inspiring. The numerical performance of the proposed scheme is excellent. The presentation of the paper is very clear and easy to read.

Weaknesses

The novelty of the proposed scheme may be limited (a rebuttal against this point is welcome, as the reviewer is not familiar with the zeroth-order optimization literature). The reviewer has seen a similar approach proposed in Wang et al., "Stochastic zeroth-order optimization in high dimensions", AISTATS'18, which uses a very similar idea but with LASSO (L_1) instead of CoSaMP (L_0). The numerical study did not consider this AISTATS'18 paper as a baseline, although being cited

Reviewer 03 · Rating 8 · Confidence 4

Strengths

1. The use of compressive sensing within an OCO framework is a fresh and well-motivated idea. By focusing on sparse gradients, the authors address both sample efficiency and dimensionality reduction, which are critical in high-dimensional settings.
2. The authors provide rigorous theoretical analysis, establishing regret bounds that demonstrate sublinear scaling with respect to the problem horizon, independent of the problem dimension.
3. The three algorithmic variants, CONGO-B, CONGO-Z, and

Weaknesses

1. The theoretical analysis assumes exact gradient sparsity, which may not be realistic for all real-world problems.
2. While CONGO outperforms standard gradient descent with SPSA, it’s mostly compared against methods that don’t leverage sparsity. A more comprehensive comparison with advanced sparse optimization techniques or regularized gradient estimators would help here.

Videos

CONGO: Compressive Online Gradient Optimization · SlidesLive

Taxonomy

Topics: Cloud Computing and Resource Management · IoT and Edge/Fog Computing · Age of Information Optimization