Simple Projection-Free Algorithm for Contextual Recommendation with Logarithmic Regret and Robustness
Shinsaku Sakaue

TL;DR
This paper introduces a simple, efficient, projection-free algorithm for contextual recommendation that achieves logarithmic regret and is robust to suboptimal feedback, improving computational efficiency over prior methods.
Contribution
The paper proposes a simpler algorithm that removes the Mahalanobis projection step required by the prior ONS-based method, while achieving the same regret guarantee and remaining robust to suboptimal action feedback.
Findings
Achieves $O(d\,\log T)$ regret bound with simpler updates.
Removes Mahalanobis projection, reducing computational complexity.
Maintains robustness to suboptimal action feedback.
Abstract
Contextual recommendation is a variant of contextual linear bandits in which the learner observes an (optimal) action rather than a reward scalar. Recently, Sakaue et al. (2025) developed an efficient Online Newton Step (ONS) approach with an $O(d \log T)$ regret bound, where $d$ is the dimension of the action space and $T$ is the time horizon. In this paper, we present a simple algorithm that is more efficient than the ONS-based method while achieving the same regret guarantee. Our core idea is to exploit the improperness inherent in contextual recommendation, leading to an update rule akin to the second-order perceptron from online classification. This removes the Mahalanobis projection step required by ONS, which is often a major computational bottleneck. More importantly, the same algorithm remains robust to possibly suboptimal action feedback, whereas the prior ONS-based method…
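For readers unfamiliar with the second-order perceptron mentioned in the abstract, the following is a minimal sketch of the classic version for binary online classification. It is not the paper's algorithm: the class name `SecondOrderPerceptron`, the regularization parameter `a`, and the toy data are illustrative assumptions. The point it illustrates is that the update is a rank-one (Sherman-Morrison) correction performed only on mistake rounds, with no projection step.

```python
class SecondOrderPerceptron:
    """Classic second-order perceptron for labels in {-1, +1}.

    Illustrative sketch only; not the algorithm from the paper.
    """

    def __init__(self, dim, a=1.0):
        self.dim = dim
        # A_inv is the inverse of A = a*I + sum of x x^T over mistake rounds.
        self.A_inv = [[(1.0 / a if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        # v is the sum of y * x over mistake rounds.
        self.v = [0.0] * dim

    def _a_inv_times(self, x):
        # Matrix-vector product A^{-1} x.
        return [sum(row[j] * x[j] for j in range(self.dim)) for row in self.A_inv]

    def margin(self, x):
        # v^T A^{-1} x. By Sherman-Morrison this has the same sign as
        # v^T (A + x x^T)^{-1} x, the quantity used in the classic analysis.
        u = self._a_inv_times(x)
        return sum(vi * ui for vi, ui in zip(self.v, u))

    def predict(self, x):
        return 1 if self.margin(x) >= 0 else -1

    def update(self, x, y):
        """Process one labeled example; return True if a mistake was made."""
        if self.predict(x) == y:
            return False  # no update on correct rounds
        # Rank-one Sherman-Morrison update of A^{-1}; no projection is needed.
        u = self._a_inv_times(x)
        denom = 1.0 + sum(xi * ui for xi, ui in zip(x, u))
        for i in range(self.dim):
            for j in range(self.dim):
                self.A_inv[i][j] -= u[i] * u[j] / denom
        self.v = [vi + y * xi for vi, xi in zip(self.v, x)]
        return True


# Toy, linearly separable data: the label is the sign of the first coordinate.
data = [([1.0, 0.5], 1), ([-1.0, 0.5], -1), ([1.0, -0.5], 1), ([-1.0, -0.5], -1)]

model = SecondOrderPerceptron(dim=2)
mistakes = 0
for _ in range(2):  # two passes over the data
    for x, y in data:
        mistakes += model.update(x, y)
```

Each update costs O(d^2) arithmetic operations for the rank-one correction, which is the kind of cost profile the abstract contrasts with a Mahalanobis projection.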
