TL;DR
MCRapper is an efficient algorithm for computing Monte-Carlo Empirical Rademacher Averages over poset-structured families of functions, enabling statistically sound pattern mining with better accuracy and statistical power than existing methods.
Contribution
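The paper presents MCRapper, a novel algorithm that efficiently computes the Monte-Carlo Empirical Rademacher Average (MCERA) of function families with poset structure, so that a single method covers both significance testing and approximation of high-expectation patterns in pattern mining.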
Findings
MCRapper outperforms existing solutions in efficiency and accuracy.
The TFP-R algorithm guarantees a low probability of reporting false positives.
Experimental results show higher statistical power and better performance than existing methods.
Abstract
We present MCRapper, an algorithm for efficient computation of Monte-Carlo Empirical Rademacher Averages (MCERA) for families of functions exhibiting poset (e.g., lattice) structure, such as those that arise in many pattern mining tasks. The MCERA allows us to compute upper bounds to the maximum deviation of sample means from their expectations, thus it can be used to find both statistically-significant functions (i.e., patterns) when the available data is seen as a sample from an unknown distribution, and approximations of collections of high-expectation functions (e.g., frequent patterns) when the available data is a small sample from a large dataset. This feature is a strong improvement over previously proposed solutions that could only achieve one of the two. MCRapper uses upper bounds to the discrepancy of the functions to efficiently explore and prune the search space, a technique…
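To make the quantity in the abstract concrete, the following is a minimal sketch of a brute-force MCERA estimate, assuming the standard definition: draw n vectors of independent Rademacher (+1/-1) variables, take for each vector the supremum over the family of the signed sample mean, and average the n suprema. The names `mcera`, `values`, `transactions`, and `patterns` are hypothetical and chosen for illustration only. This sketch enumerates every function in the family explicitly; it does not implement MCRapper's bound-based pruning of the poset.

```python
import random


def mcera(values, n_trials, seed=0):
    """Brute-force n-trial MCERA of a finite family of functions.

    values[f][i] holds f(s_i): the value of function f on sample point s_i.
    Every row must have the same length m (the sample size).
    """
    rng = random.Random(seed)
    m = len(values[0])
    total = 0.0
    for _ in range(n_trials):
        # One vector of m independent Rademacher variables.
        sigma = [rng.choice((-1, 1)) for _ in range(m)]
        # Supremum over the family of the sigma-weighted sample mean.
        total += max(
            sum(s * v for s, v in zip(sigma, row)) / m for row in values
        )
    return total / n_trials


# Illustrative data (an assumption, not from the paper): itemset patterns
# evaluated as 0/1 indicator functions on a tiny transactional sample.
transactions = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "b", "c"}]
patterns = [{"a"}, {"b"}, {"a", "b"}, {"c"}]
values = [[1 if p <= t else 0 for t in transactions] for p in patterns]

estimate = mcera(values, n_trials=100)
```

Because each pattern here is an indicator function with values in [0, 1], the estimate lies in [-1, 1]. The cost of this brute-force version grows with the number of functions in the family, which is the cost the abstract says MCRapper reduces by pruning the search space.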
