Confidence Intervals for Unobserved Events

Amichai Painsky

arXiv:2211.03052 · math.ST · November 8, 2022 · 1 citation

Open Access

TL;DR

This paper introduces a dimension-free confidence interval scheme for the probabilities of unobserved events in a finite sample. The intervals do not grow with the alphabet size, and the scheme is reported to outperform existing methods in synthetic and real-world experiments, including large-alphabet settings.

Contribution

The work presents a dimension-free confidence interval framework for unobserved events, based on selective inference. Its bounds are nearly tight: they cannot be improved further without violating the prescribed coverage rate.

Findings

01 Dimension-free confidence intervals are nearly tight.

02 Proposed scheme outperforms existing methods in experiments.

03 Effective for large alphabet modeling.

Abstract

Consider a finite sample from an unknown distribution over a countable alphabet. Unobserved events are alphabet symbols which do not appear in the sample. Estimating the probabilities of unobserved events is a basic problem in statistics and related fields, which has been extensively studied in the context of point estimation. In this work we introduce a novel interval estimation scheme for unobserved events. Our proposed framework applies selective inference, as we construct confidence intervals (CIs) for the desired set of parameters. Interestingly, we show that the obtained CIs are dimension-free, as they do not grow with the alphabet size. Further, we show that these CIs are (almost) tight, in the sense that they cannot be further improved without violating the prescribed coverage rate. We demonstrate the performance of our proposed scheme in synthetic and real-world experiments, showing a…
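To make the setting concrete, the following is a minimal Python sketch of the quantities the abstract refers to. It is not the paper's scheme. It assumes i.i.d. draws and shows two textbook baselines: the Good-Turing point estimate of the total probability of unobserved symbols, and the classical zero-count upper bound for a single symbol fixed before seeing the data. The function names, the toy sample, and the toy alphabet are hypothetical and chosen only for illustration.

```python
from collections import Counter


def unobserved_symbols(sample, alphabet):
    """Return the alphabet symbols that never appear in the sample."""
    seen = set(sample)
    return [s for s in alphabet if s not in seen]


def good_turing_missing_mass(sample):
    """Good-Turing point estimate of the total probability of all
    unobserved symbols: the fraction of the sample made up of symbols
    that were seen exactly once."""
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(sample)


def zero_count_upper_bound(n, alpha=0.05):
    """One-sided (1 - alpha) upper bound on the probability p of ONE
    symbol fixed in advance that has zero occurrences in n i.i.d. draws.
    Solves (1 - p)**n = alpha for p. This baseline ignores the selection
    effect of picking symbols after looking at the data."""
    return 1.0 - alpha ** (1.0 / n)


# Toy data (assumption, for illustration only).
sample = list("abracadabra")
alphabet = "abcdefr"

missing = unobserved_symbols(sample, alphabet)
estimate = good_turing_missing_mass(sample)
bound = zero_count_upper_bound(len(sample), alpha=0.05)

print("unobserved symbols:", missing)
print("Good-Turing missing-mass estimate:", round(estimate, 4))
print("zero-count 95% upper bound for one fixed symbol:", round(bound, 4))
```

The zero-count bound is valid only for a symbol chosen before sampling. Unobserved symbols are, by definition, identified after looking at the sample, which is why the abstract frames the problem as selective inference.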

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Bayesian Modeling and Causal Inference