COCO Denoiser: Using Co-Coercivity for Variance Reduction in Stochastic Convex Optimization
Manuel Madeira, Renato Negrinho, João Xavier, Pedro M. Q. Aguiar

TL;DR
This paper introduces COCO Denoiser, a variance reduction technique for stochastic convex optimization that leverages convexity and smoothness to improve gradient estimates, enhancing existing algorithms' performance.
Contribution
The paper proposes COCO Denoiser, a novel estimator based on co-coercivity constraints, and demonstrates its effectiveness in reducing variance and improving stochastic optimization algorithms.
Findings
COCO Denoiser improves gradient estimates in stochastic optimization.
Applying COCO to algorithms like SGD and Adam enhances their performance.
Empirical results show better convergence with COCO in various settings.
Abstract
First-order methods for stochastic optimization have undeniable relevance, in part due to their pivotal role in machine learning. Variance reduction for these algorithms has become an important research topic. In contrast to common approaches, which rarely leverage global models of the objective function, we exploit convexity and L-smoothness to improve the noisy estimates outputted by the stochastic gradient oracle. Our method, named COCO denoiser, is the joint maximum likelihood estimator of multiple function gradients from their noisy observations, subject to co-coercivity constraints between them. The resulting estimate is the solution of a convex Quadratically Constrained Quadratic Problem. Although this problem is expensive to solve by interior point methods, we exploit its structure to apply an accelerated first-order algorithm, the Fast Dual Proximal Gradient method. Besides…
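To make the co-coercivity constraint concrete, here is a minimal sketch for the simplest case of two query points. It is not taken from the paper's code. It assumes isotropic Gaussian noise of equal variance on both observations, so that maximum likelihood reduces to least squares. For a convex, L-smooth function, the gradients t1, t2 at points x1, x2 satisfy <t1 - t2, x1 - x2> >= (1/L) * ||t1 - t2||^2. This is equivalent to the difference t1 - t2 lying in a ball centered at (L/2)(x1 - x2) with radius (L/2)||x1 - x2||. Under the stated assumption, the two-point problem keeps the sum of the noisy gradients and projects their difference onto that ball. The function names `coco_denoise_pair` and `cocoercivity_gap` are hypothetical, chosen for illustration.

```python
import math


def _dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def cocoercivity_gap(x1, x2, t1, t2, L):
    """Return <t1 - t2, x1 - x2> - ||t1 - t2||^2 / L (non-negative when feasible)."""
    dx = [a - b for a, b in zip(x1, x2)]
    dt = [a - b for a, b in zip(t1, t2)]
    return _dot(dt, dx) - _dot(dt, dt) / L


def coco_denoise_pair(x1, x2, g1, g2, L):
    """Least-squares fit of two gradients to noisy observations g1, g2,
    subject to the co-coercivity constraint between them.

    Assumption (not from the source): equal-variance isotropic Gaussian noise.
    """
    dx = [a - b for a, b in zip(x1, x2)]
    dg = [a - b for a, b in zip(g1, g2)]

    # The feasible set for the gradient difference is a ball.
    center = [0.5 * L * v for v in dx]
    radius = 0.5 * L * math.sqrt(_dot(dx, dx))

    # Project the observed difference onto the ball.
    offset = [a - c for a, c in zip(dg, center)]
    dist = math.sqrt(_dot(offset, offset))
    if dist > radius:
        scale = radius / dist
        d = [c + scale * o for c, o in zip(center, offset)]
    else:
        d = dg  # already feasible, so the observations are left unchanged

    # The sum of the two estimates is unconstrained and stays as observed.
    s = [a + b for a, b in zip(g1, g2)]
    t1 = [0.5 * (si + di) for si, di in zip(s, d)]
    t2 = [0.5 * (si - di) for si, di in zip(s, d)]
    return t1, t2


if __name__ == "__main__":
    # Noisy observations that violate co-coercivity for L = 1.
    x1, x2, g1, g2, L = [0.0], [1.0], [1.0], [-1.0], 1.0
    t1, t2 = coco_denoise_pair(x1, x2, g1, g2, L)
    print("gap before:", cocoercivity_gap(x1, x2, g1, g2, L))
    print("gap after: ", cocoercivity_gap(x1, x2, t1, t2, L))
```

With more than two query points the constraints couple every pair of estimates and no such simple projection applies, which is why the abstract turns to a first-order solver for the general problem.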
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research
Methods: Adam · Stochastic Gradient Descent
