Testing the Feasibility of Linear Programs with Bandit Feedback
Aditya Gangrade, Aditya Gopalan, Venkatesh Saligrama, Clayton Scott

TL;DR
This paper investigates the problem of testing the feasibility of unknown linear programs in the bandit setting, proposing a novel testing method with proven reliability and analyzing its sample complexity.
Contribution
It introduces the first feasibility testing method for linear bandits, characterizing its sample costs and establishing lower bounds, advancing understanding of constrained bandit problems.
Findings
Proposed a reliable test with sample costs scaling as d^2/γ^2.
Established a minimax lower bound of d/γ^2 for sample complexity.
Connected feasibility testing to minimax game analysis and low-regret algorithms.
Abstract
While the recent literature has seen a surge in the study of constrained bandit problems, all existing methods for these begin by assuming the feasibility of the underlying problem. We initiate the study of testing such feasibility assumptions, and in particular address the problem in the linear bandit setting, thus characterising the costs of feasibility testing for an unknown linear program using bandit feedback. Concretely, we test if ∃x : Ax ≥ 0 for an unknown A ∈ R^{m×d}, by playing a sequence of actions x_t ∈ R^d, and observing Ax_t + noise in response. By identifying the hypothesis as determining the sign of the value of a minimax game, we construct a novel test based on low-regret algorithms and a nonasymptotic law of iterated logarithms. We prove that this test is reliable, and adapts to the 'signal level,' of any…
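To make the setup concrete, here is a minimal Python sketch of the feedback model and of feasibility read as the sign of a max-min game value. It is not the paper's test, which is sequential and built on low-regret algorithms with a law-of-iterated-logarithms stopping rule. This sketch is a naive fixed-budget baseline: estimate the matrix, then check the sign of the estimated game value. Everything specific here is an assumption made for illustration: the action set {(1, u) : u in [-1, 1]} (chosen so that x = 0 is not a trivial solution), the Gaussian noise, the function names, and the two example matrices.

```python
import random

def noisy_feedback(A, x, rng, sigma=1.0):
    """Bandit feedback: return A x plus independent Gaussian noise per constraint."""
    return [sum(a * xi for a, xi in zip(row, x)) + rng.gauss(0.0, sigma) for row in A]

def estimate_matrix(A, n_plays, rng, sigma=1.0):
    """Estimate a 2-column A by playing (1, 1) and (1, -1) n_plays times each.

    Both actions lie in the assumed action set {(1, u) : u in [-1, 1]}.
    """
    m = len(A)
    sum_plus = [0.0] * m
    sum_minus = [0.0] * m
    for _ in range(n_plays):
        y_plus = noisy_feedback(A, (1.0, 1.0), rng, sigma)
        y_minus = noisy_feedback(A, (1.0, -1.0), rng, sigma)
        for i in range(m):
            sum_plus[i] += y_plus[i]
            sum_minus[i] += y_minus[i]
    # Row i satisfies y_plus ~ a_i1 + a_i2 and y_minus ~ a_i1 - a_i2.
    return [[(sum_plus[i] + sum_minus[i]) / (2 * n_plays),
             (sum_plus[i] - sum_minus[i]) / (2 * n_plays)] for i in range(m)]

def game_value(A, n_grid=2000):
    """Approximate max over u in [-1, 1] of min_i (A (1, u))_i on a uniform grid."""
    best = float("-inf")
    for k in range(n_grid + 1):
        u = -1.0 + 2.0 * k / n_grid
        worst = min(row[0] + row[1] * u for row in A)
        best = max(best, worst)
    return best

def naive_feasibility_test(A, n_plays=2000, seed=0):
    """Fixed-budget baseline: declare feasible iff the estimated game value is positive."""
    rng = random.Random(seed)
    A_hat = estimate_matrix(A, n_plays, rng)
    return game_value(A_hat) > 0

# Hypothetical instances: constraints a_i1 + a_i2 * u >= 0 for u in [-1, 1].
A_feasible = [[0.5, 1.0], [0.5, -1.0]]      # needs -0.5 <= u <= 0.5
A_infeasible = [[-0.5, 1.0], [-0.5, -1.0]]  # needs u >= 0.5 and u <= -0.5

if __name__ == "__main__":
    print(naive_feasibility_test(A_feasible), naive_feasibility_test(A_infeasible))
```

A fixed budget like this has to be chosen before the signal level is known. The abstract's claim is that the proposed sequential test adapts to that level instead.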
Taxonomy
Topics: Advanced Bandit Algorithms Research · Advanced Control Systems Optimization · Machine Learning and Algorithms
