# Learning Orthogonal Projections in Linear Bandits

**Authors:** Qiyu Kang, Wee Peng Tay

arXiv: 1906.10981 · 2019-10-25

## TL;DR

This paper studies a linear stochastic bandit in which each arm's expected reward is an unknown linear function of the arm's projection onto a subspace. This projection reward is never observed directly. The paper gives a strategy that achieves $O(|\mathbb{D}|\log n)$ regret for finitely many arms and $O(n^{2/3}(\log n)^{1/2})$ regret when arms come from an infinite compact set.

## Contribution

The paper proposes a strategy for linear bandits with unobservable projection rewards and establishes regret bounds for both finite arm sets and infinite compact arm sets.

## Key findings

- Achieves $O(|bD|\log n)$ regret for finite arms.
- Achieves $O(n^{2/3}(\log n)^{1/2})$ regret when each arm is chosen from an infinite compact set.
- Experimental results confirm the strategy's efficiency.

## Abstract

In a linear stochastic bandit model, each arm is a vector in a Euclidean space and the observed return at each time step is an unknown linear function of the chosen arm at that time step. In this paper, we investigate the problem of learning the best arm in a linear stochastic bandit model, where each arm's expected reward is an unknown linear function of the projection of the arm onto a subspace. We call this the projection reward. Unlike the classical linear bandit problem in which the observed return corresponds to the reward, the projection reward at each time step is unobservable. Such a model is useful in recommendation applications where the observed return includes corruption by each individual's biases, which we wish to exclude in the learned model. In the case where there are finitely many arms, we develop a strategy to achieve $O(|\mathbb{D}|\log n)$ regret, where $n$ is the number of time steps and $|\mathbb{D}|$ is the number of arms. In the case where each arm is chosen from an infinite compact set, our strategy achieves $O(n^{2/3}(\log n)^{1/2})$ regret. Experiments verify the efficiency of our strategy.
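
To make the distinction between the observed return and the projection reward concrete, the following is a minimal sketch of the setting, not the paper's strategy. The dimensions, the randomly drawn subspace, the Gaussian noise, and the names `theta`, `P`, `observed_return`, and `projection_reward` are all illustrative assumptions. Which subspace is used, and what the learner knows about it, are details of the full paper that this summary does not state.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: ambient dimension d, subspace dimension k, 8 arms.
d, k, num_arms = 5, 2, 8

# Unknown parameter of the linear function (illustrative random values).
theta = rng.normal(size=d)

# Orthonormal basis of an assumed k-dimensional subspace, obtained by QR.
basis, _ = np.linalg.qr(rng.normal(size=(d, k)))
P = basis @ basis.T  # orthogonal projector onto that subspace

# A finite arm set: each row is one arm vector.
arms = rng.normal(size=(num_arms, d))


def observed_return(x, noise_std=0.1):
    """Noisy linear return of the full arm; this is what the learner sees."""
    return float(x @ theta + noise_std * rng.normal())


def projection_reward(x):
    """Linear function of the projected arm; never observed by the learner."""
    return float((P @ x) @ theta)


proj_rewards = np.array([projection_reward(x) for x in arms])
best_arm = int(np.argmax(proj_rewards))

# Per-arm gap to the best projection reward; summing the gaps of the arms
# actually played over n steps gives a cumulative regret of the kind bounded above.
gaps = proj_rewards.max() - proj_rewards
```

In this sketch the arm with the largest observed return need not be the arm with the largest projection reward, since the component of an arm outside the subspace contributes to the former but not the latter.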

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.10981/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1906.10981/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/1906.10981/full.md

---
Source: https://tomesphere.com/paper/1906.10981