Testing the Feasibility of Linear Programs with Bandit Feedback

Aditya Gangrade; Aditya Gopalan; Venkatesh Saligrama; Clayton Scott

arXiv:2406.15648·cs.LG·June 25, 2024

Open Access

TL;DR

This paper investigates the problem of testing the feasibility of unknown linear programs in the bandit setting, proposing a novel testing method with proven reliability and analyzing its sample complexity.

Contribution

It introduces the first feasibility testing method for linear bandits, characterizing its sample costs and establishing lower bounds, advancing understanding of constrained bandit problems.

Findings

01

Proposed a reliable test with sample costs scaling as $d^2/\Gamma^2$.

02

Established a minimax lower bound of $d/\Gamma^2$ on sample complexity.

03

Connected feasibility testing to minimax game analysis and low-regret algorithms.

Abstract

While the recent literature has seen a surge in the study of constrained bandit problems, all existing methods for these begin by assuming the feasibility of the underlying problem. We initiate the study of testing such feasibility assumptions, and in particular address the problem in the linear bandit setting, thus characterising the costs of feasibility testing for an unknown linear program using bandit feedback. Concretely, we test if $\exists x : A x \geq 0$ for an unknown $A \in \mathbb{R}^{m \times d}$, by playing a sequence of actions $x_t \in \mathbb{R}^{d}$, and observing $A x_t + \mathrm{noise}$ in response. By identifying the hypothesis as determining the sign of the value of a minimax game, we construct a novel test based on low-regret algorithms and a nonasymptotic law of iterated logarithms. We prove that this test is reliable, and adapts to the 'signal level,' $\Gamma$, of any…
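To make the feedback model in the abstract concrete, the following is a minimal sketch, not the paper's test. It assumes Gaussian noise, a finite hypothetical set of candidate actions, and a fixed sampling budget per action; the paper's test instead relies on low-regret algorithms and a law of iterated logarithms. All function names are illustrative.

```python
import numpy as np

def bandit_feedback(A, x, rng, sigma=0.1):
    """One noisy observation A @ x + noise (Gaussian noise is an assumption)."""
    return A @ x + sigma * rng.standard_normal(A.shape[0])

def minimax_value(A, candidates):
    """Value of the game: max over candidate actions x of min_i (A x)_i."""
    return max(float(np.min(A @ x)) for x in candidates)

def naive_feasibility_test(A, candidates, n_per_action=200, sigma=0.1, seed=0):
    """Non-adaptive baseline: average the feedback for each candidate action,
    then report the sign of the estimated max-min value."""
    rng = np.random.default_rng(seed)
    best = -np.inf
    for x in candidates:
        samples = np.stack(
            [bandit_feedback(A, x, rng, sigma) for _ in range(n_per_action)]
        )
        # Estimated worst-case constraint value for this action.
        best = max(best, float(samples.mean(axis=0).min()))
    return best > 0, best

# Hypothetical instances: A is hidden from the tester, used only to simulate feedback.
candidates_feasible = [np.array([1.0, 1.0]) / np.sqrt(2), np.array([-1.0, 0.0])]
A_feasible = np.eye(2)

candidates_infeasible = [np.array([1.0, 0.0]), np.array([-1.0, 0.0])]
A_infeasible = np.array([[1.0, 0.0], [-1.0, 0.0]])

print(naive_feasibility_test(A_feasible, candidates_feasible))
print(naive_feasibility_test(A_infeasible, candidates_infeasible))
```

The sign of the max-min value is what the hypothesis test must determine. Instances whose value is far from zero need fewer samples to resolve, which is the intuition behind a sample cost that adapts to the signal level.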

Taxonomy

Topics: Advanced Bandit Algorithms Research · Advanced Control Systems Optimization · Machine Learning and Algorithms