Computing Extremely Accurate Quantiles Using t-Digests
Ted Dunning, Otmar Ertl

TL;DR
This paper introduces the t-digest, an online algorithm for approximating quantiles with high accuracy, especially near the tails of a distribution. It keeps a small memory footprint and is robust to skewed data, which makes it suitable for real-time analytics.
Contribution
The paper presents a novel t-digest algorithm that achieves high-precision quantile estimates near distribution tails with small sketches and supports merging without accuracy loss.
Findings
High accuracy near distribution tails
Small sketch size for memory efficiency
Robustness to skewed and ordered data
Abstract
We present on-line algorithms for computing approximations of rank-based statistics that give high accuracy, particularly near the tails of a distribution, with very small sketches. Notably, the method allows a quantile q to be computed with an accuracy relative to max(q, 1-q) rather than absolute accuracy as with most other methods. This new algorithm is robust with respect to skewed distributions or ordered datasets and allows separately computed summaries to be combined with no loss in accuracy. An open-source Java implementation of this algorithm is available from the author. Independent implementations in Go and Python are also available.
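To make the idea of a small, mergeable sketch with tight tails concrete, here is a minimal Python sketch of one way a merging-style digest can work. It is an illustration under stated assumptions, not the authors' Java implementation. The abstract does not give the details, so the following are assumptions: the scale function k_1(q) = (delta / 2π) · asin(2q − 1) as commonly described for t-digests, the compression parameter delta = 100, and the hypothetical function names `k1`, `compress`, `merge`, and `quantile`.

```python
import math
import random


def k1(q, delta):
    # Assumed scale function: it changes fastest near q = 0 and q = 1,
    # so clusters in the tails are forced to stay small.
    q = min(1.0, max(0.0, q))
    return delta / (2 * math.pi) * math.asin(2 * q - 1)


def compress(centroids, delta=100):
    """Greedily merge sorted (mean, weight) pairs.

    A cluster keeps absorbing its right neighbour as long as the
    cluster spans at most one unit of k1 in quantile space.
    """
    if not centroids:
        return []
    centroids = sorted(centroids)
    total = sum(w for _, w in centroids)
    merged = []
    mean, weight = centroids[0]
    w_left = 0.0                      # weight to the left of the open cluster
    k_left = k1(0.0, delta)
    for m, w in centroids[1:]:
        q_right = (w_left + weight + w) / total
        if k1(q_right, delta) - k_left <= 1.0:
            weight += w               # absorb and update the weighted mean
            mean += (m - mean) * w / weight
        else:
            merged.append((mean, weight))
            w_left += weight
            k_left = k1(w_left / total, delta)
            mean, weight = m, w
    merged.append((mean, weight))
    return merged


def merge(a, b, delta=100):
    # Two separately built digests combine by re-compressing their union.
    return compress(list(a) + list(b), delta)


def quantile(centroids, q):
    """Estimate a quantile by interpolating between cluster midpoints."""
    if not centroids:
        raise ValueError("empty digest")
    total = sum(w for _, w in centroids)
    target = q * total
    cum = 0.0
    prev_mean = prev_mid = None
    for mean, w in centroids:
        mid = cum + w / 2             # rank at the centre of this cluster
        if target <= mid:
            if prev_mean is None:
                return mean
            frac = (target - prev_mid) / (mid - prev_mid)
            return prev_mean + frac * (mean - prev_mean)
        prev_mean, prev_mid = mean, mid
        cum += w
    return centroids[-1][0]


if __name__ == "__main__":
    random.seed(0)
    data = [random.random() for _ in range(10000)]
    left = compress([(x, 1.0) for x in data[:5000]])
    right = compress([(x, 1.0) for x in data[5000:]])
    digest = merge(left, right)
    print(len(digest), quantile(digest, 0.5), quantile(digest, 0.999))
```

In this sketch the digest never holds more than about delta clusters regardless of input size, and the clusters near q = 0 and q = 1 contain only a few points each, which is why tail estimates stay tight. A production implementation would also track the exact minimum and maximum and buffer incoming points before compressing, both omitted here for brevity.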
Taxonomy
Topics: Distributed and Parallel Computing Systems · Reservoir Engineering and Simulation Methods · Computability, Logic, AI Algorithms
