Online Newton Step Algorithm with Estimated Gradient

Binbin Liu; Jundong Li; Yunquan Song; Xijun Liang; Ling Jian; Huan Liu

arXiv:1811.09955·cs.LG·March 18, 2019·1 cites

TL;DR

This paper introduces ONSEG, a second-order online learning algorithm that leverages estimated gradients to achieve faster convergence in bandit feedback scenarios, improving regret bounds over previous methods.

Contribution

The paper develops ONSEG, which extends the Online Newton Step algorithm with expected gradient estimation, reduces the expected regret bound from $O(T^{5/6})$ to $O(T^{2/3})$, and demonstrates empirical advantages.

Findings

01

ONSEG reduces expected regret from $O(T^{5/6})$ to $O(T^{2/3})$.

02

The algorithm outperforms existing methods on multiple real-world datasets.

03

Second-order information accelerates convergence in bandit online learning.

Abstract

Online learning with limited information feedback (bandit) tries to solve the problem where an online learner receives partial feedback information from the environment in the course of learning. Under this setting, Flaxman et al. [8] extended Zinkevich's classical Online Gradient Descent (OGD) algorithm [29] by proposing the Online Gradient Descent with Expected Gradient (OGDEG) algorithm. Specifically, it uses a simple trick to approximate the gradient of the loss function $f_{t}$ by evaluating it at a single point and bounds the expected regret as $O(T^{5/6})$ [8], where the number of rounds is $T$. Meanwhile, past research efforts have shown that compared with the first-order algorithms, second-order online learning algorithms such as Online Newton Step (ONS) [11] can significantly accelerate the convergence rate of traditional online learning algorithms. Motivated by this,…
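
The single-point gradient trick mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard estimator attributed to Flaxman et al. [8]: draw a direction $u$ uniformly from the unit sphere in $\mathbb{R}^d$ and use $(d/\delta)\, f(x + \delta u)\, u$ as the gradient estimate. The function names, the smoothing radius `delta`, and the averaging helper are hypothetical and chosen only for illustration. This is not the ONSEG update itself, which the truncated abstract does not specify.

```python
import numpy as np

def one_point_gradient_estimate(f, x, delta, rng):
    """Gradient estimate from a single evaluation of f (illustrative sketch).

    Assumes the estimator (d / delta) * f(x + delta * u) * u, with u drawn
    uniformly from the unit sphere. Names and signature are hypothetical.
    """
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # normalized Gaussian is uniform on the sphere
    return (d / delta) * f(x + delta * u) * u

def averaged_estimate(f, x, delta, n_samples, seed=0):
    """Average many one-point estimates to show they concentrate.

    A bandit learner only gets one evaluation per round; the averaging
    here exists purely to check the estimator numerically.
    """
    rng = np.random.default_rng(seed)
    total = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        total += one_point_gradient_estimate(f, x, delta, rng)
    return total / n_samples

if __name__ == "__main__":
    a = np.array([1.0, -2.0, 0.5])
    linear_loss = lambda z: float(a @ z)  # true gradient is a everywhere
    x0 = np.array([0.1, 0.2, -0.1])
    print(averaged_estimate(linear_loss, x0, delta=0.5, n_samples=20000))
```

For a linear loss the averaged estimate should land close to the true gradient `a`. For general losses the estimate is unbiased only for a smoothed version of $f$, so `delta` trades bias against variance.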

Taxonomy

Topics: Advanced Bandit Algorithms Research · Advanced Optimization Algorithms Research · Metaheuristic Optimization Algorithms Research
