Scalable Multivariate Histograms
Raazesh Sainudiin, Warwick Tucker, Tilo Wiklund

TL;DR
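This paper presents a distributed variant of an adaptive histogram estimation procedure based on regular pavings. The distributed version can process significantly larger data sets while keeping the statistical and arithmetical properties of the original method, and a prototype implementation is available under a permissive license.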
Contribution
It presents a scalable, distributed version of an existing adaptive histogram estimation procedure based on regular pavings, expanding its applicability to larger datasets.
Findings
Enables processing of data sets significantly larger than the original, non-distributed procedure could handle.
Maintains statistical and arithmetical properties of the original method.
Provides a prototype implementation under a permissive license.
Abstract
We give a distributed variant of an adaptive histogram estimation procedure previously developed by the first author. The procedure is based on regular pavings and is known to have numerous appealing statistical and arithmetical properties. The distributed version makes it possible to process data sets significantly larger than was previously possible. We provide a prototype implementation under a permissive license.
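To make the idea of a histogram on a regular paving more concrete, the following is a minimal illustrative sketch and not the authors' algorithm or their prototype. It assumes the usual construction in which a box is bisected at the midpoint of its first widest side, and it simplifies the adaptive procedure to a fixed depth. All function names (`leaf_label`, `local_counts`, `merge_counts`, `density`) are hypothetical. The point it illustrates is that per-cell counts computed on separate data partitions can be combined by simple addition, which is the kind of property that makes a distributed variant plausible.

```python
from collections import Counter


def leaf_label(point, lo, hi, depth):
    """Binary label of the depth-`depth` paving cell that contains `point`.

    Assumption: each box is bisected at the midpoint of its first widest side.
    """
    lo, hi = list(lo), list(hi)
    label = ""
    for _ in range(depth):
        widths = [h - l for l, h in zip(lo, hi)]
        axis = widths.index(max(widths))  # first widest coordinate
        mid = (lo[axis] + hi[axis]) / 2.0
        if point[axis] < mid:
            hi[axis] = mid  # descend into the lower half
            label += "0"
        else:
            lo[axis] = mid  # descend into the upper half
            label += "1"
    return label


def local_counts(points, lo, hi, depth):
    """Count the points of one data partition per paving cell."""
    return Counter(leaf_label(p, lo, hi, depth) for p in points)


def merge_counts(a, b):
    """Combine the counts of two partitions; addition is associative and commutative."""
    return a + b


def density(counts, lo, hi, depth):
    """Histogram density per cell: count / (n * cell volume)."""
    total = sum(counts.values())
    root_volume = 1.0
    for l, h in zip(lo, hi):
        root_volume *= h - l
    cell_volume = root_volume / 2 ** depth  # every bisection halves the volume
    return {label: c / (total * cell_volume) for label, c in counts.items()}


# Hypothetical example: two partitions of points in the unit square.
lo, hi, depth = (0.0, 0.0), (1.0, 1.0), 2
part1 = [(0.1, 0.1), (0.9, 0.9)]
part2 = [(0.1, 0.2), (0.1, 0.9)]
merged = merge_counts(local_counts(part1, lo, hi, depth),
                      local_counts(part2, lo, hi, depth))
dens = density(merged, lo, hi, depth)
```

Merging the two partitions gives the same counts as processing all four points together, so the result does not depend on how the data were split. The actual procedure is adaptive, refining cells according to the data, which this fixed-depth sketch does not attempt to reproduce.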
Taxonomy
Topics: Numerical Methods and Algorithms · Neural Networks and Applications · Gaussian Processes and Bayesian Inference
