Finding Skewed Subcubes Under a Distribution
Parikshit Gopalan, Roie Levin, Udi Wieder

TL;DR
This paper introduces algorithms for efficiently enumerating minimal skewed subcubes of a distribution over the Boolean hypercube. Such subcubes certify dependencies between small subsets of variables, with applications to algorithmic fairness and anomaly detection. The analysis relies on Fourier analysis and hypercontractivity.
Contribution
The paper defines a new notion of minimal skewed subcubes, gives list-size bounds and algorithms for enumerating them, and establishes computational hardness results related to the sparse noisy parity problem.
Findings
Efficient algorithms for finding minimal skewed subcubes in Boolean hypercubes.
A Fourier-analytic bound on the list size of such subcubes.
Hardness results linking the problem to sparse noisy parity.
Abstract
Say that we are given samples from a distribution over an n-dimensional space. We expect or desire the distribution to behave like a product distribution (or a k-wise independent distribution over its marginals for small k). We propose the problem of enumerating/list-decoding all large subcubes where the distribution deviates markedly from what we expect; we refer to such subcubes as skewed subcubes. Skewed subcubes are certificates of dependencies between small subsets of variables in the distribution. We motivate this problem by showing that it arises naturally in the context of algorithmic fairness and anomaly detection. In this work we focus on the special but important case where the space is the Boolean hypercube, and the expected marginals are uniform. We show that the obvious definition of skewed subcubes can lead to intractable list sizes, and propose a better definition of a…
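To make the setting in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm and does not use the paper's definition of skew. It assumes samples from the Boolean hypercube with uniform expected marginals, so a subcube that fixes k coordinates has expected mass 2^-k, and it flags a subcube when its empirical mass exceeds that by a chosen factor. The function name naive_skewed_subcubes and the parameters max_fixed and threshold are hypothetical, introduced only for illustration.

```python
from itertools import combinations


def naive_skewed_subcubes(samples, max_fixed, threshold):
    """List subcubes whose empirical mass is at least `threshold` times
    the mass 2^-k expected under the uniform distribution.

    samples   : list of equal-length tuples with entries 0 or 1
    max_fixed : largest number of coordinates a subcube may fix (assumption)
    threshold : ratio cutoff for calling a subcube skewed (assumption)

    Only over-represented subcubes that actually occur in the samples
    are reported. Runtime grows like n^max_fixed, so this is a toy.
    """
    n = len(samples[0])
    m = len(samples)
    found = []
    for k in range(1, max_fixed + 1):
        expected = 2.0 ** -k
        for coords in combinations(range(n), k):
            # Count how often each assignment to `coords` appears.
            counts = {}
            for x in samples:
                key = tuple(x[i] for i in coords)
                counts[key] = counts.get(key, 0) + 1
            for values, c in counts.items():
                ratio = (c / m) / expected
                if ratio >= threshold:
                    found.append((coords, values, ratio))
    return found


# Toy data: coordinates 0 and 1 are always equal, coordinate 2 is free.
samples = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
pairs = naive_skewed_subcubes(samples, max_fixed=2, threshold=1.5)
triples = naive_skewed_subcubes(samples, max_fixed=3, threshold=1.5)
```

On this toy data, pairs contains only the two subcubes that fix coordinates 0 and 1 to equal values, which exposes the dependency between those variables. With max_fixed=3, every refinement of those two subcubes is flagged as well, so the list grows without revealing any new dependency. This is one way a naive definition can inflate the list, and it suggests why restricting attention to minimal subcubes is attractive.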
