Shattering Thresholds for Random Systems of Sets, Words, and Permutations
Anant P. Godbole, Samantha Pinella, and Yan Zhuang

TL;DR
This paper investigates the thresholds at which random subsets, words, and permutations almost surely shatter all t-subsets, t-subwords, or permutation patterns, revealing sharp phase transitions using probabilistic and combinatorial tools.
Contribution
It introduces a unified framework for analyzing shattering thresholds across sets, words, and permutations, applying Talagrand's inequality to identify sharp probability thresholds.
Findings
Identifies sharp zero-one thresholds for shattering in various combinatorial structures.
Establishes probabilistic bounds for the number of random elements needed to shatter all t-subsets or patterns.
Demonstrates the applicability of isoperimetric inequalities in complex combinatorial threshold problems.
Abstract
This paper considers a problem that relates to the theories of covering arrays, permutation patterns, Vapnik-Chervonenkis (VC) classes, and probability thresholds. Specifically, we want to find the number of subsets of [n]:={1,2,...,n} that we need to select at random, in a certain probability space, so as to "shatter" all t-subsets of [n]. Moving from subsets to words, we ask for the number of n-letter words on a q-letter alphabet that are needed to shatter all t-subwords of the q^n words of length n. Finally, specializing to t=3, we explore the number of random permutations of [n] needed to shatter all length-3 permutation patterns in specified positions. We uncover a very sharp zero-one probability threshold for the emergence of such shattering; Talagrand's isoperimetric inequality in product spaces is used as a key tool.
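To make the notion of shattering concrete, here is a minimal sketch of a checker for the word setting. It assumes the usual covering-array reading of the abstract: a collection of words shatters a set of t positions if every one of the q^t possible patterns appears in those positions in at least one word. With q=2, a word can be read as the indicator vector of a subset of [n], which gives the subset case. The function names and the uniform sampling model are illustrative assumptions; the abstract only says the selection happens "in a certain probability space".

```python
import itertools
import random


def shatters_all(words, n, t, q):
    """Return True if every choice of t positions shows all q**t patterns.

    `words` is a collection of length-n sequences over the alphabet {0, ..., q-1}.
    This is the covering-array style definition assumed in the lead-in above.
    """
    for positions in itertools.combinations(range(n), t):
        # Project every word onto the chosen positions and collect distinct patterns.
        seen = {tuple(w[i] for i in positions) for w in words}
        if len(seen) < q ** t:
            return False
    return True


def random_words(m, n, q, rng):
    """Draw m words of length n with independent, uniform letters (an assumption)."""
    return [tuple(rng.randrange(q) for _ in range(n)) for _ in range(m)]


def estimate_shatter_probability(m, n, t, q, trials, seed=0):
    """Monte Carlo estimate of the probability that m random words shatter."""
    rng = random.Random(seed)
    hits = sum(shatters_all(random_words(m, n, q, rng), n, t, q) for _ in range(trials))
    return hits / trials
```

Calling `estimate_shatter_probability` over a range of `m` for fixed small `n`, `t`, and `q` is one way to see empirically how the probability of shattering moves from near 0 to near 1 as more words are drawn. It is only a simulation aid and says nothing about where the paper's threshold lies.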
Taxonomy
Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Bayesian Methods and Mixture Models
