Shatter functions with polynomial growth rates
Boris Bukh, Xavier Goaoc

TL;DR
This paper investigates how a specific value of the shatter function influences the overall growth rate of set systems, challenging existing conjectures and expanding understanding of combinatorial set theory.
Contribution
It provides new insights into the relationship between shatter function values and growth rates, and refutes a conjecture extending Sauer's Lemma.
Findings
Refutes a conjecture of Bondy and Hajnal
Establishes bounds on growth rates based on shatter function values
Enhances understanding of polynomial growth in set systems
Abstract
We study how a single value of the shatter function of a set system restricts its asymptotic growth. Along the way, we refute a conjecture of Bondy and Hajnal which generalizes Sauer's Lemma.
Boris Bukh Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA. Supported in part by Sloan Research Fellowship and by U.S. taxpayers through NSF grant DMS-1301548 and through NSF CAREER grant DMS-1555149.
Xavier Goaoc Université Paris-Est, LIGM (UMR 8049), CNRS, ENPC, ESIEE, UPEM, F-77454, Marne-la-Vallée, France. Supported by Institut Universitaire de France.
1 Introduction
A standard tool in combinatorial and computational geometry is the shatter function of a (geometric) set system $\mathcal{F}$. By set system we mean a family $\mathcal{F}$ of subsets of a ground set $X$. The trace of a set system $\mathcal{F}$ on a subset $Y \subseteq X$ is defined as
\[ \mathcal{F}|_{Y} = \{\, e \cap Y : e \in \mathcal{F} \,\}, \]
and the shatter function of $\mathcal{F}$ is
\[ f_{\mathcal{F}}(m) = \max_{Y \subseteq X,\ |Y| = m} \bigl|\mathcal{F}|_{Y}\bigr|, \]
where $|Y|$ denotes the cardinality of a set $Y$. The survey of Matoušek [Mat98] details several geometric and algorithmic applications of shatter functions. The asymptotic growth rate of a shatter function is often its most important feature.
In this paper, we study how the growth rate of a shatter function can be controlled by fixing one of its values. For example, a classical lemma of Sauer [Sau72] (and Vapnik and Chervonenkis [VC71] and Shelah [She72]) asserts that if $f_{\mathcal{F}}(m)$ is at most $2^{m}-1$ then $f_{\mathcal{F}}(n) = O(n^{m-1})$, for any natural number $m$. In particular, the growth of a shatter function exhibits a dichotomy: either $f_{\mathcal{F}}(n) = 2^{n}$ for all $n$, or $f_{\mathcal{F}}$ is bounded by a polynomial. We will be concerned below with conditions that ensure precise polynomial growth rates.
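The definitions of trace and shatter function, and the Sauer-type bound just mentioned, can be checked by brute force on a small example. The sketch below is only an illustration: the function names `trace`, `shatter` and `sauer_bound` are ours, the toy family of intervals is an assumption chosen for convenience, and the bound used is the standard quantitative form of Sauer's Lemma (a sum of binomial coefficients), which is not spelled out in the text above.

```python
from itertools import combinations
from math import comb


def trace(family, subset):
    """Distinct intersections of the sets in `family` with `subset`."""
    y = frozenset(subset)
    return {frozenset(e) & y for e in family}


def shatter(family, ground, m):
    """Largest trace size over all m-element subsets of `ground` (brute force)."""
    return max(len(trace(family, y)) for y in combinations(ground, m))


def sauer_bound(n, m):
    """Standard Sauer bound when no m-set is shattered: sum of C(n, i) for i < m."""
    return sum(comb(n, i) for i in range(m))


# Toy example (an assumption, not taken from the paper):
# all intervals {i, ..., j-1} of {0, ..., 5}, including the empty interval.
ground = list(range(6))
intervals = {frozenset(range(i, j)) for i in range(7) for j in range(i, 7)}

values = [shatter(intervals, ground, m) for m in range(1, 7)]
print(values)  # [2, 4, 7, 11, 16, 22]
```

Here the value at 3 is 7, one less than 2^3, so no 3-element set is shattered, and every value stays within `sauer_bound(n, 3)`, which is quadratic in `n`. For this particular family the bound happens to be attained exactly.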
New results.
Let denote the largest integer such that every set system with satisfies . We prove the following bounds.
Theorem 1.
For any integers ,
[TABLE]
We also obtain an analogous result for non-integral values of . The inequalities are more cumbersome, though; see Corollary 3 and Lemma 4 for the lower and upper bounds respectively.
We establish the upper bound by a probabilistic construction (Section 2). Interestingly, this upper bound already refutes a conjecture of Bondy and Hajnal [Bon72], see also [FP94, Problem 3.3], regarding a generalization of Sauer’s Lemma: they conjectured that if then for any large enough [footnote 1], where
[TABLE]
The case is Sauer’s Lemma and the conjecture was also proven for [BR95, Theorem 5]. If it were true, the Bondy–Hajnal conjecture would have implied that grows at least as fast as , which is what our upper bound prevents.
Footnote 1. Here “large enough” depends solely on and , and not on the set system . The original statement of the conjecture [Bon72] was without this precaution. According to Füredi and Pach [FP94, Problem 3.3], the necessity of allowing exceptional values is due to Frankl. Bollobás and Radcliffe [BR95, Theorem 11] also gave a probabilistic construction showing that must be larger than .
We obtain the lower bound by analyzing the density of certain patterns in simplicial complexes (Section 3). This builds on the argument of Bukh and Conlon [BC15] to bound the number of edges of graphs avoiding certain subgraphs. Theorem 2 below specializes to the lower bound in Theorem 1 for .
Theorem 2.
Let be a rational number, let be its denominator and let . Let be an integer. For any set system ,
[TABLE]
As real numbers can be approximated arbitrarily well by rational numbers, it is easy to extend the preceding theorem to irrational .
Corollary 3.
Let be a real number, and let . Let be an integer. For any set system ,
[TABLE]
Proof.
Let $q=\bigl\lceil(1/s)\sqrt{m/\log_{2}s}\bigr\rceil$. Let be the largest rational number of denominator such that . Since , we have . Also since , we have . Hence . ∎
Related work.
The only previous lower bound is due to Cheong et al. [CGN13, Theorem 1], who proved that by adapting the inductive proof of Sauer’s Lemma. The only upper bound that we are aware of is the easy ; for instance, we may split into almost equal parts, and let consist of those -sets that contain one vertex from each part.
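The "one vertex from each part" construction can be written down directly. In the sketch below, the ground set `range(n)`, the parameter name `parts` and the function names are our own placeholders, since the exact number of parts is not legible in the text above.

```python
from itertools import product


def class_sizes(n, parts):
    """Sizes of the near-equal classes i, i + parts, i + 2 * parts, ... of range(n)."""
    return [len(range(i, n, parts)) for i in range(parts)]


def transversal_family(n, parts):
    """All sets containing exactly one vertex from each class of range(n)."""
    classes = [range(i, n, parts) for i in range(parts)]
    return {frozenset(choice) for choice in product(*classes)}


family = transversal_family(7, 3)
print(class_sizes(7, 3), len(family))
```

The number of sets is the product of the class sizes, so for a fixed number of parts it grows like `(n / parts) ** parts`, a polynomial in `n` whose degree equals the number of parts.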
The argument of Bukh and Conlon was extended from graphs to hypergraphs by Fitch [Fit16]. Both his work and ours use generalizations of the balanced rooted trees from the work of Bukh and Conlon. There are technical differences, though. Fitch works with uniform hypergraphs, whereas we work with simplicial complexes, which results in a slightly different notion of density. Our construction (Proposition 8) uses a different idea from his in [Fit16, Lemma 1].
2 Random simplicial complexes
Recall that a simplicial complex is a set system closed under taking subsets [footnote 2]. In Lemma 4 we present a construction of random simplicial complexes that implies the upper bound of Theorem 1.
Footnote 2. This is usually called an abstract simplicial complex, but since we consider no embedded simplicial complex in this paper, we do not feel the need to emphasize the distinction.
Lemma 4.
For any real number and for each integer , for arbitrarily large there exists a set system on vertices, with and , where .
Proof.
Fix . For any large enough, we build a random -dimensional simplicial complex on vertices by examining each subset of up to vertices in the order of increasing size. For each subset , if all form faces of our complex, we turn into a face with probability . All choices are independent, and the complex is initialized with all vertices.
Adding a -dimensional face to requires adding each of its proper faces of dimension or more, plus the -face itself. The expected number of faces of dimension of is thus
[TABLE]
Let denote the expected number of faces of . Note that since , we have .
Let . Call an -element set “bad” if the set contains at least faces of dimension or more. Since there are at most complexes on any given set of vertices, the expected number of bad -sets is at most
[TABLE]
Let be the complex obtained from by removing vertices of all bad -sets. Accounting for the traces of size [math] and , we have
[TABLE]
As each vertex belongs to at most faces of , the expected number of faces in is at least
[TABLE]
So there exists a complex on at most vertices with at least faces and . We can ensure that the complex has exactly vertices by adding dummy vertices if necessary. ∎
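The random process at the start of the proof of Lemma 4 can be sketched as follows. This is a minimal illustration under stated assumptions: the acceptance probability is not legible in the text above, so it is passed in as a hypothetical callable `prob(size)`; the names `random_complex`, `max_size` and `seed` are ours; and the empty set is not stored as a face. The deletion of "bad" sets described later in the proof is not implemented.

```python
import random
from itertools import combinations


def random_complex(n, max_size, prob, seed=None):
    """Sample a random simplicial complex on vertices 0..n-1.

    Subsets are examined in order of increasing size. A subset whose
    boundary is already present becomes a face with probability
    prob(size), independently of all other choices.
    """
    rng = random.Random(seed)
    faces = {frozenset([v]) for v in range(n)}  # start with all vertices
    for size in range(2, max_size + 1):
        for cand in combinations(range(n), size):
            s = frozenset(cand)
            # Checking the subsets of size - 1 is enough, because the
            # faces built so far are already closed under taking subsets.
            if all(s - {v} in faces for v in s) and rng.random() < prob(size):
                faces.add(s)
    return faces


sample = random_complex(8, 3, lambda size: 0.5, seed=0)
print(len(sample))
```

With `prob` identically 1 the process returns the complete skeleton up to `max_size` vertices, and with `prob` identically 0 it returns only the vertices, which gives two easy sanity checks.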
For any , Lemma 4 with shows that
[TABLE]
Taking , the upper bound of Theorem 1 follows.
Remark.
For most geometric set systems the bound in Sauer’s Lemma is not sharp. This includes the family of halfspaces in . In fact, for this family no shatter condition implies the correct bound.
To see this we may use a probabilistic construction similar to that of Lemma 4. Start with the complete -dimensional skeleton of the -dimensional simplex, add every -simplex randomly and independently, each with probability with and , then delete every -simplex supported on a -element subset of vertices that spans at least -simplices. With positive probability, the resulting random simplicial complex satisfies
[TABLE]
Considering points on the moment curve shows that the set system of halfspaces in violates this shatter condition for every .
3 Proof of Theorem 2
We first remark that in proving upper bounds on we may restrict ourselves to simplicial complexes, since any set system can be “compressed” without changing its number of sets or increasing its shatter function.
Lemma 5 (Alon [Alo83] and Frankl [Fra83]).
For any finite set system there exists an abstract simplicial complex with and .
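One standard way to obtain such a compression is repeated down-shifting: an element is removed from a set whenever the smaller set is not already in the family. The text above does not say which proof of Lemma 5 it has in mind, so the sketch below is an assumed illustration of that classical technique, with function names of our choosing. It checks only the two properties that are easy to verify mechanically, namely that the number of sets is preserved and that the result is closed under taking subsets.

```python
def down_shift(family, x):
    """Replace each set e containing x by e - {x} when e - {x} is not in the family."""
    fam = {frozenset(e) for e in family}
    shifted = set()
    for e in fam:
        smaller = e - {x}
        shifted.add(smaller if (x in e and smaller not in fam) else e)
    return shifted


def compress(family):
    """Apply down-shifts until nothing changes; the result is closed under subsets."""
    fam = {frozenset(e) for e in family}
    ground = set().union(*fam)
    changed = True
    while changed:
        changed = False
        for x in ground:
            new = down_shift(fam, x)
            if new != fam:
                fam, changed = new, True
    return fam


original = [{1, 2, 3}, {2, 3}, {4}, {1, 4}]
compressed = compress(original)
print(len(original), len(compressed))
```

Each down-shift is injective on the family, so the number of sets never changes, and the loop terminates because the total size of the sets strictly decreases whenever something changes.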
We write for the set of vertices of a simplicial complex . Two simplices are nonadjacent in , if they are vertex disjoint, and there is no edge intersecting both and . A set of pairwise nonadjacent vertices is called an independent set. For a complex , the degree of a -simplex is the number of -simplices is contained in. We denote by the minimum degree of any -simplex in . We define the density of a subset of vertices of a simplicial complex to be , where is the number of non-empty simplices in with at least one vertex in .
3.1 Balanced rooted -trees and shatter functions
A -tree is defined inductively. First, a -simplex is a -tree. If is a -tree, and is a -simplex, then the complex obtained by gluing to a -simplex formed by and a new vertex is also a -tree. We say that is obtained by attaching a vertex to .
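The inductive definition can be mirrored by a small builder that tracks only the facets. The dimensions are not legible in the text above, so the sketch assumes the usual reading: start from a single simplex on `k + 1` vertices and, at each step, join a new vertex to a `k`-element subset of an existing facet. The parameter `k` and the names `simplex_tree` and `attachments` are placeholders of ours.

```python
def simplex_tree(k, attachments):
    """Return the facets of a tree-like complex built by attaching vertices.

    Assumed reading: the first facet is the simplex on vertices 0..k, and
    each entry of `attachments` is a k-element subset of an existing facet
    to which the next unused vertex is joined.
    """
    facets = [frozenset(range(k + 1))]
    next_vertex = k + 1
    for base in attachments:
        base = frozenset(base)
        if len(base) != k or not any(base < f for f in facets):
            raise ValueError(f"{set(base)} is not a k-element face of the tree")
        facets.append(base | {next_vertex})
        next_vertex += 1
    return facets


tree = simplex_tree(2, [{0, 1}, {1, 3}])
print([sorted(f) for f in tree])  # [[0, 1, 2], [0, 1, 3], [1, 3, 4]]
```

Under this reading, attaching `t` vertices always yields `t + 1` facets on `k + 1 + t` vertices.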
A rooted -tree consists of a -tree together with a distinguished -simplex and an independent set such that is nonadjacent to each of the vertices in . We call vertex roots and the simplex root of . The rest of the vertices we call unrooted.
[Figure: A rooted -tree with three root vertices.]
The min-density of a rooted -tree is the minimum of over all non-empty sets of unrooted vertices. If the minimum is attained by , the set of all unrooted vertices, then we call the tree balanced. We use balanced rooted trees to bound from below the shatter function of simplicial complexes as follows.
Lemma 6.
Let be integers. Suppose is a balanced -tree with facets and vertex roots. Every simplicial complex on vertices with contains vertices that span at least simplices.
Proof.
Let be the -tree in question, and let be a simplicial complex on vertices with . We assume that , for otherwise the result is trivially true, and argue that some -element set spans at least simplices.
Fix an arbitrary -simplex of . Consider copies of such that is mapped to and different -simplices of are mapped to different -simplices of . Each such copy can be obtained by embedding facets of one by one starting with the facet containing the root. Since has facets, there are at least copies of the tree such that is mapped to . Since , it follows that . The pigeonhole principle ensures that some of these copies have the same vertex roots; denote them by .
Let . As is balanced, has at least simplices, and at least of the simplices spanned by use a vertex from . Thus, by induction on , each spans at least simplices in .
The set contains at least copies of with prescribed roots, so and it follows that . Since , there is such that . Setting we obtain the result. ∎
A similar argument permits us to control the overlap of -simplices.
Lemma 7.
Suppose is a simplicial complex, and is a -simplex in which is contained in simplices of dimension . Then contains vertices that span at least $\min\bigl(N,\frac{2^{d+1}-2^{d'+1}}{d-d'}(m-d)\bigr)$ simplices.
Proof.
If these simplices are contained in some -element set, then we are obviously done. Otherwise, we can find -simplices whose union is of size between and . Let .
The number of simplices contained in that are not contained in is
[TABLE]
By induction on , it follows that the number of simplices spanned by is at least . ∎
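The bound in Lemma 7 is easy to evaluate exactly. The sketch below computes it with rational arithmetic and also brute-forces one plausible reading of its numerator: the number of subsets of a `(d + 1)`-element vertex set that are not contained in a fixed `(d' + 1)`-element subset. That interpretation is our assumption, as are the function names, and the empty set is included in the count.

```python
from fractions import Fraction
from itertools import combinations


def lemma7_bound(N, d, d_prime, m):
    """min(N, (2^(d+1) - 2^(d'+1)) / (d - d') * (m - d)) as an exact rational."""
    gain = Fraction(2 ** (d + 1) - 2 ** (d_prime + 1), d - d_prime)
    return min(Fraction(N), gain * (m - d))


def faces_outside(d, d_prime):
    """Count subsets of {0..d} that are not contained in {0..d_prime}."""
    core = set(range(d_prime + 1))
    return sum(
        1
        for size in range(d + 2)
        for s in combinations(range(d + 1), size)
        if not set(s) <= core
    )


print(faces_outside(3, 1), lemma7_bound(100, 3, 1, 7))  # 12 24
```

For `d = 3` and `d' = 1` the brute-force count is `2**4 - 2**2 = 12`, matching the numerator, and the second term of the minimum is `12 / 2 * (7 - 3) = 24`.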
3.2 Construction of balanced -trees of prescribed rational density
We now prove that balanced -trees of every rational density exceeding exist (Proposition 8). The case was previously handled in [BC15], and the following construction borrows some ideas from there.
Our construction starts with a simplicial complex on the vertex set , whose facets are -simplices of the form for all . Alternatively, we can describe as the complex consisting of all the sets satisfying . For we denote by the -simplex of defined by
[TABLE]
Observe that is a rooted -tree, and that form a partition of unrooted vertices of this tree. With a slight abuse of notation, we denote this rooted -tree also by .
If , we define to be the rooted -tree obtained by attaching to a rooted vertex to each of the following -simplices
[TABLE]
If , we define recursively to be the rooted -tree obtained from by attaching rooted vertices to each of .
Proposition 8.
For every choice of integers , and , is a balanced -tree with facets and rooted vertices of min-density .
Before we prove Proposition 8, we first argue that the min-density of is attained on particularly nice sets of unrooted vertices.
Lemma 9.
There exist such that .
Proof.
For a set of unrooted vertices, let the neighborhood of be the set of simplices of that contain at least one vertex from . We denote it by . In particular, . Let denote a set of unrooted vertices that minimizes and is of maximum size among such sets.
We first claim that is of the form . Suppose, for the sake of contradiction, that for some . Pick such that , and . By the optimality conditions on , it follows that
[TABLE]
which is equivalent to
[TABLE]
respectively. It follows that $\bigl\lvert N(S)\setminus N(S\setminus\{a\})\bigr\rvert<\bigl\lvert N(S\cup\{b\})\setminus N(S)\bigr\rvert$. To reach a contradiction we now exhibit an injective map from to .
Assume that , for the other case is analogous, and consider the map
[TABLE]
If , that is , and then and it follows that . Moreover, if then . This implies that maps to ; this map is easily seen to be injective. The existence of contradicts the optimality of , and thus each is [math] or .
We can now partition where each is a maximal union of consecutive ’s. Since
[TABLE]
it follows that . This completes the proof. ∎
Proof of Proposition 8.
In view of Lemma 9, it remains to compute and show that it is minimal for and . Let . For ,
[TABLE]
so we focus on the cases .
We first express in terms of , where for
[TABLE]
The computations are easiest when . In this case, whenever a simplex meets , we necessarily have . For each there are exactly simplices such that . Therefore the number of simplices of that meet is . Adding the simplices contained in the facets counted by we obtain
[TABLE]
When , the computation is similar except that we also need to count the simplices of such that . Call such a simplex dangling. If is dangling, then . Furthermore, for each there are simplices such that and exactly of them are dangling. The total number of dangling simplices is thus
[TABLE]
yielding
[TABLE]
Next, note that if and , then . Similarly, if , and , then . As we look for the minimum density, we may assume that
[TABLE]
We can thus assume that with and that either or with .
If then
[TABLE]
with equality if . If then
[TABLE]
As this exceeds , we conclude that the min-density of is attained for and is . ∎
3.3 Wrapping up
We can now prove Theorem 2. Let be a rational number, let denote its denominator, and let . We want to prove that for any simplicial complex ,
[TABLE]
Assume that has more than simplices on vertices. We will show that some of its vertices must span more than simplices. We assume that as otherwise the statement holds trivially.
For each , let . Let and note that . In particular, our assumption states that contains more than simplices. We note that .
Case 1.
We first consider the case where there is some with such that contains more than simplices of dimension . Pick the smallest such .
Now, delete from all -simplices of degree less than , also removing any simplices that contain them. By minimality of , doing so removes fewer than of the -simplices. Hence, the resulting complex contains at least one -simplex, and satisfies .
Write the rational number in the form with . Let be a balanced rooted -tree with parameters and as given by Proposition 8. Note that . By Lemma 6 there is a set of vertices on (and hence of ) that spans at least simplices. The requisite bound follows from and , and from .
Case 2.
In the remaining case, the number of simplices of dimension up to is at most
[TABLE]
so contains more than simplices of dimension greater than .
If contains a -simplex for some , then any -set containing this simplex contains at least simplices. So assume that is of dimension at most .
By the pigeonhole principle, there is a with such that contains at least simplices of dimension .
Since contains at most simplices of dimension , it follows by another application of the pigeonhole principle that there is a -simplex that is contained in at least simplices of dimension . Lemma 7 then yields an -element set spanning at least
[TABLE]
simplices. We have . Also, .
- [Alo83] Noga Alon. On the density of sets of vectors. Discrete Mathematics, 46(2):199–202, 1983.
- [BC15] Boris Bukh and David Conlon. Rational exponents in extremal graph theory. arXiv preprint arXiv:1506.06406, 2015.
- [Bon72] John A. Bondy. Induced subsets. Journal of Combinatorial Theory, Series B, 12(2):201–202, 1972.
- [BR95] B. Bollobás and A. J. Radcliffe. Defect Sauer results. Journal of Combinatorial Theory, Series A, 72:189–208, 1995.
- [CGN13] O. Cheong, X. Goaoc, and C. Nicaud. Set systems and families of permutations with small traces. European Journal of Combinatorics, 34:229–239, 2013.
- [Fit16] Matthew Fitch. Rational exponents for hypergraph Turan problems. arXiv preprint arXiv:1607.05788, 2016.
- [FP94] Z. Füredi and J. Pach. Traces of finite sets: extremal problems and geometric applications. Bolyai Soc. Math. Stud., pages 255–282, 1994.
- [Fra83] Peter Frankl. On the trace of finite sets. Journal of Combinatorial Theory, Series A, 34(1):41–45, 1983.
