Online Algorithm for Unsupervised Sequential Selection with Contextual Information

Arun Verma; Manjesh K. Hanawal; Csaba Szepesvári; Venkatesh Saligrama

arXiv:2010.12353·cs.LG·October 26, 2020·1 cites

TL;DR

This paper introduces a new variant of the stochastic contextual bandits problem called Contextual Unsupervised Sequential Selection (USS), where the loss cannot be directly observed, and proposes an algorithm with sub-linear regret under certain conditions.

Contribution

The paper formulates the contextual USS problem, in which ordered arms carry fixed costs and are selected sequentially, and develops an algorithm that achieves sub-linear regret when the problem instance satisfies the Contextual Weak Dominance (CWD) property.

Findings

01 The proposed algorithm performs well on synthetic datasets.

02 Experiments on real datasets validate the effectiveness of the approach.

03 Learning is feasible under the CWD property despite unsupervised feedback.

Abstract

In this paper, we study Contextual Unsupervised Sequential Selection (USS), a new variant of the stochastic contextual bandits problem where the loss of an arm cannot be inferred from the observed feedback. In our setup, arms are associated with fixed costs and are ordered, forming a cascade. In each round, a context is presented, and the learner selects the arms sequentially up to some depth. The total cost incurred by stopping at an arm is the sum of the fixed costs of the arms selected and the stochastic loss associated with that arm. The learner's goal is to learn a decision rule that maps contexts to arms so as to minimize the total expected loss. The problem is challenging because the setting is unsupervised: the total loss cannot be estimated. Clearly, learning is feasible only if the optimal arm can be inferred (explicitly or implicitly) from the problem structure. We…
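The cost structure described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes the expected losses are known for a single fixed context, which is exactly what the learner cannot observe in the USS setting. The function names and the numeric values are hypothetical and chosen only for illustration.

```python
from itertools import accumulate
from typing import List, Sequence


def total_expected_costs(fixed_costs: Sequence[float],
                         expected_losses: Sequence[float]) -> List[float]:
    """Expected total cost of stopping at each arm of the cascade.

    Stopping at arm i pays the fixed costs of arms 0..i (all arms
    selected so far) plus the expected loss of arm i itself.
    """
    if len(fixed_costs) != len(expected_losses):
        raise ValueError("need one fixed cost and one expected loss per arm")
    cumulative_costs = accumulate(fixed_costs)  # running sum of fixed costs
    return [c + l for c, l in zip(cumulative_costs, expected_losses)]


def best_stopping_arm(fixed_costs: Sequence[float],
                      expected_losses: Sequence[float]) -> int:
    """Index of the arm with the smallest expected total cost (oracle view)."""
    totals = total_expected_costs(fixed_costs, expected_losses)
    return min(range(len(totals)), key=totals.__getitem__)


# Hypothetical three-arm cascade for one context (assumed values).
fixed_costs = [0.0, 0.125, 0.25]       # cost of using each arm
expected_losses = [0.5, 0.25, 0.125]   # unknown to the learner in USS

print(total_expected_costs(fixed_costs, expected_losses))
print(best_stopping_arm(fixed_costs, expected_losses))
```

In this assumed instance the later arms are more accurate but more expensive, so the oracle stops at the middle arm. The learning problem in the paper is harder because these expected losses are never revealed through feedback.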

Videos

Online Algorithm for Unsupervised Sequential Selection with Contextual Information · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Machine Learning and Algorithms