Policy Design for Active Sequential Hypothesis Testing using Deep Learning
Dhruva Kartik, Ekraam Sabir, Urbashi Mitra, Prem Natarajan

TL;DR
This paper introduces deep learning-based heuristics for active sequential hypothesis testing. The heuristics aim to identify the true hypothesis with fewer samples, and they outperform existing methods in certain scenarios.
Contribution
The paper proposes two heuristics for active sequential hypothesis testing modeled as a POMDP: one based on deep reinforcement learning and one based on KL-divergence games. Both are intended to improve on existing solutions.
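To make the role of KL divergence concrete, here is a minimal sketch. It is not the paper's heuristic. It scores each query by the smallest KL divergence between the observation distribution under the currently most likely hypothesis and the distribution under each alternative, then picks the query with the largest such score (a max-min rule). The observation model `MODEL`, the Bernoulli observations, and the function names are all illustrative assumptions.

```python
import math

# Hypothetical model: MODEL[query][hypothesis] = P(observation = 1).
# Two queries, three hypotheses; the values are illustrative only.
MODEL = [
    [0.9, 0.1, 0.1],  # query 0 separates hypothesis 0 from the rest
    [0.5, 0.8, 0.2],  # query 1 separates hypotheses 1 and 2
]

def bernoulli_kl(p, q):
    """KL divergence D(Bern(p) || Bern(q)) in nats, for p, q in (0, 1)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def maxmin_kl_query(model, likely):
    """Return the query that maximizes the worst-case KL divergence between
    the likely hypothesis and every alternative hypothesis."""
    def worst_case(row):
        return min(
            bernoulli_kl(row[likely], row[h])
            for h in range(len(row)) if h != likely
        )
    scores = [worst_case(row) for row in model]
    return max(range(len(model)), key=scores.__getitem__)
```

In this toy model, query 0 is chosen when hypothesis 0 is the most likely, because query 1 barely distinguishes hypothesis 0 from the others. When hypothesis 1 is the most likely, query 0 cannot separate it from hypothesis 2 at all, so query 1 is chosen instead.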
Findings
The deep RL heuristic outperforms traditional methods in sample efficiency.
The KL-divergence game heuristic achieves better accuracy in specific scenarios.
Numerical experiments validate the effectiveness of the proposed heuristics.
Abstract
Information theory has been very successful in obtaining performance limits for various problems such as communication, compression and hypothesis testing. Likewise, stochastic control theory provides a characterization of optimal policies for Partially Observable Markov Decision Processes (POMDPs) using dynamic programming. However, finding optimal policies for these problems is computationally hard in general, and thus heuristic solutions are employed in practice. Deep learning can be used as a tool for designing better heuristics in such problems. In this paper, the problem of active sequential hypothesis testing is considered. The goal is to design a policy that can reliably infer the true hypothesis using as few samples as possible by adaptively selecting appropriate queries. This problem can be modeled as a POMDP and bounds on its value function exist in the literature. However,…
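The problem setup in the abstract can be sketched as follows: a policy picks a query, an observation arrives, the posterior over hypotheses is updated by Bayes' rule, and the test stops once one hypothesis is sufficiently likely. This is a minimal illustration under assumed details, not the paper's implementation. The Bernoulli observation model `MODEL`, the stopping threshold, and the `round_robin` baseline policy are all hypothetical.

```python
import random

# Hypothetical model: MODEL[query][hypothesis] = P(observation = 1).
MODEL = [
    [0.9, 0.1, 0.1],
    [0.5, 0.8, 0.2],
]

def update_belief(belief, query, obs, model):
    """Bayes update of the posterior over hypotheses after one observation."""
    likes = [p if obs == 1 else 1 - p for p in model[query]]
    joint = [b * l for b, l in zip(belief, likes)]
    total = sum(joint)
    return [j / total for j in joint]

def run_test(true_hyp, policy, model, threshold=0.99, max_steps=1000, seed=0):
    """Query until one hypothesis has posterior >= threshold.
    Returns (declared hypothesis, number of samples used)."""
    rng = random.Random(seed)
    n = len(model[0])
    belief = [1.0 / n] * n  # uniform prior (an assumption)
    for step in range(1, max_steps + 1):
        query = policy(belief, step)
        # Simulate an observation from the true hypothesis.
        obs = 1 if rng.random() < model[query][true_hyp] else 0
        belief = update_belief(belief, query, obs, model)
        if max(belief) >= threshold:
            return belief.index(max(belief)), step
    return belief.index(max(belief)), max_steps

def round_robin(belief, step):
    """Baseline policy (an assumption here): cycle through the queries."""
    return step % len(MODEL)
```

Calling `run_test(0, round_robin, MODEL)` returns the declared hypothesis and the number of samples used. An adaptive policy would take the place of `round_robin`, using the current belief to choose the next query so that fewer samples are needed.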
Taxonomy
Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Machine Learning and Algorithms
