Geometry of the Minimum Volume Confidence Sets
Heguang Lin, Mengze Li, Daniel Pimentel-Alarcón, Matthew Malloy

TL;DR
This paper studies the geometric structure of minimum-volume confidence sets for the multinomial parameter. These sets are level-sets of a discontinuous function defined by an exact p-value, so their shape is hard to characterize and testing whether a single point belongs to a set is costly even for problems of modest size.
Contribution
It analyzes the geometry of the confidence sets by enumerating and covering the continuous regions of the exact p-value function, which bears directly on the cost of computing membership in a set.
Findings
Characterization of the geometry of the level-sets that define the confidence sets
Methods for enumerating continuous regions of the p-value function
Insights into disjointness of confidence sets for multinomial outcomes
Abstract
Computation of confidence sets is central to data science and machine learning, serving as the workhorse of A/B testing and underpinning the operation and analysis of reinforcement learning algorithms. This paper studies the geometry of the minimum-volume confidence sets for the multinomial parameter. When used in place of more standard confidence sets and intervals based on bounds and asymptotic approximation, learning algorithms can exhibit improved sample complexity. Prior work showed the minimum-volume confidence sets are the level-sets of a discontinuous function defined by an exact p-value. While the confidence sets are optimal in that they have minimum average volume, computation of membership of a single point in the set is challenging for problems of modest size. Since the confidence sets are level-sets of discontinuous functions, little is apparent about their geometry. This…
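To make the membership problem concrete, here is a minimal sketch of a brute-force membership test. It assumes the exact p-value of an observation x under parameter p is the total probability of all outcomes that are no more likely than x, and that the confidence set at level alpha contains every p whose p-value is at least alpha. The abstract does not spell out this definition, so treat it as an assumption. The function names and the tolerance parameter are illustrative and not taken from the paper.

```python
from itertools import combinations
from math import factorial, prod


def compositions(n, k):
    """Yield every k-tuple of non-negative integers that sums to n."""
    # Stars and bars: choose k-1 bar positions among n+k-1 slots.
    for bars in combinations(range(n + k - 1), k - 1):
        prev = -1
        counts = []
        for b in bars:
            counts.append(b - prev - 1)
            prev = b
        counts.append(n + k - 2 - prev)
        yield tuple(counts)


def multinomial_pmf(counts, p):
    """Probability of observing `counts` under multinomial parameter `p`."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    return coef * prod(pi ** c for pi, c in zip(p, counts))


def exact_p_value(x, p, tol=1e-12):
    """Assumed definition: total mass of outcomes no more likely than x."""
    n, k = sum(x), len(x)
    threshold = multinomial_pmf(x, p) * (1 + tol)  # tolerate float ties
    total = 0.0
    for y in compositions(n, k):
        q = multinomial_pmf(y, p)
        if q <= threshold:
            total += q
    return total


def in_confidence_set(x, p, alpha):
    """Membership test: p is in the level-alpha set if its p-value >= alpha."""
    return exact_p_value(x, p) >= alpha
```

The loop visits all C(n+k-1, k-1) outcomes for n samples and k categories, which grows quickly in both. Under this definition, the set of outcomes included in the sum changes abruptly as p moves, which is one way to see why the resulting function is discontinuous.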
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Imbalanced Data Classification Techniques
