Sparsity of integer solutions in the average case
Timm Oertel, Joseph Paat, Robert Weismantel

TL;DR
This paper investigates the average sparsity of feasible solutions in integer programming, showing that solutions tend to be sparse with O(m) nonzero entries under mild conditions, using ideas from the theory of groups, lattices, and Ehrhart polynomials.
Contribution
It establishes that, on average, integer program solutions are sparse with O(m) nonzero entries, improving understanding of solution structure.
Findings
Feasible solutions in integer programs are sparse on average.
The proof employs group, lattice, and Ehrhart polynomial theories.
Provides new upper bounds on the integer Carathéodory number.
Abstract
We examine how sparse feasible solutions of integer programs are, on average. Average case here means that we fix the constraint matrix and vary the right-hand side vectors. For a problem in standard form with m equations, there exist LP feasible solutions with at most m many nonzero entries. We show that under relatively mild assumptions, integer programs in standard form have feasible solutions with O(m) many nonzero entries, on average. Our proof uses ideas from the theory of groups, lattices, and Ehrhart polynomials. From our main theorem we obtain the best known upper bounds on the integer Carathéodory number provided that the determinants in the data are small.
Timm Oertel (1), Joseph Paat (2), Robert Weismantel (2)
(1) School of Mathematics, Cardiff University, United Kingdom
(2) Institute for Operations Research, ETH Zürich, Switzerland
1 Introduction
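Let $m, n \in \mathbb{Z}_{\ge 1}$ and $A \in \mathbb{Z}^{m \times n}$. We always assume that $A$ has full row rank. We also view $A$ as a set of its column vectors. So, $W \subseteq A$ implies that $W$ is a subset of the columns of $A$.
We aim to find a sparse integer vector in the set
$$P(A,b) := \{x \in \mathbb{Z}^n_{\ge 0} : Ax = b\},$$
where $b \in \mathbb{Z}^m$. That is, we aim at finding a solution $z \in P(A,b)$ such that $|\mathrm{supp}(z)|$ is as small as possible, where $\mathrm{supp}(x) := \{i \in \{1,\dots,n\} : x_i \neq 0\}$ for $x \in \mathbb{R}^n$. To this end, we define the support function of $(A,b)$ to be
$$\sigma(A,b) := \min\{|\mathrm{supp}(z)| : z \in P(A,b)\}.$$
If $P(A,b) = \emptyset$, then $\sigma(A,b) := \infty$. We define the support function of $A$ to be
$$\sigma(A) := \max\{\sigma(A,b) : b \in \mathbb{Z}^m,\ \sigma(A,b) < \infty\}.$$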
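As a small illustration of the support function, the following sketch computes the minimum support size by brute force. It assumes the standard-form feasible set of nonnegative integer vectors x with Ax = b, as described in the abstract. The names `feasible_solutions`, `min_support`, and the search cap `bound` are hypothetical helpers introduced here, not part of the paper. The bounded search is exact only when no solution needs an entry larger than `bound`, for example when all entries of A are positive integers and `bound` is at least the largest entry of b.

```python
from itertools import product

def feasible_solutions(A, b, bound):
    """Yield nonnegative integer vectors x with entries <= bound and A x = b."""
    m, n = len(A), len(A[0])
    for x in product(range(bound + 1), repeat=n):
        if all(sum(A[i][j] * x[j] for j in range(n)) == b[i] for i in range(m)):
            yield x

def min_support(A, b, bound):
    """Smallest number of nonzero entries among the solutions found.

    Returns None when no solution exists within the search box
    (the paper assigns an infinite value to infeasible right-hand sides).
    """
    sizes = [sum(1 for v in x if v != 0) for x in feasible_solutions(A, b, bound)]
    return min(sizes) if sizes else None

# One equation, three columns (illustrative data, not from the paper).
A = [[2, 3, 5]]
print(min_support(A, [10], 10))  # 10 = 2 * 5 uses a single column
print(min_support(A, [7], 10))   # 7 = 2 + 5 needs two columns
print(min_support(A, [1], 10))   # infeasible
```

Exhaustive enumeration is exponential in the number of columns, so this is only meant for checking tiny examples by hand.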
The question of determining generalizes problems that have been open for decades. A notable special case is the so-called integer Carathéodory number, i.e. the minimum number of Hilbert basis elements in a rational pointed polyhedral cone required to represent an integer point in the cone. We say that has the Hilbert basis property if its columns correspond to a Hilbert basis of . For with the Hilbert basis property, Cook et al. [8] showed that and Sebő showed that [12]. Bruns et al. [7] provide an example of with the Hilbert basis property with . However, for matrices with the Hilbert basis property, the true value of is unknown.
For general choices of , Eisenbrand and Shmonin [10] showed that , where is the max norm. Aliev et al. [2] and Aliev et al. [1] improved the previous result and showed that
[TABLE]
where . It turns out that the previous upper bound is close to the true value of . In fact, for every , Aliev et al. [1] provide an example of for which .
In this paper, we consider for most choices of . We formalize this ‘average case’ using the asymptotic support function of defined by
[TABLE]
Note that .
The value can be thought of as the smallest such that almost all feasible integer programs with constraint matrix have solutions with support of cardinality at most . The function was introduced by Bruns and Gubeladze in [6], where it was shown that for matrices with the Hilbert basis property. In general, an average case analysis of the support question has not been provided in the literature. Average case behavior of integer programs has been studied in specialized settings, see, e.g., [9] for packing problems in variables and [3] for problems with only one constraint. However, to the best of our knowledge, there are no other studies available that are concerned with the average case behavior of integer programs, in general.
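The idea of "almost all feasible right-hand sides" can be explored numerically. The sketch below is a simplification made for illustration: it handles only a single equation with positive integer coefficients and right-hand sides 1, ..., t, whereas the paper's definition ranges over integer right-hand side vectors. The function names `reachable` and `support_fractions` are hypothetical.

```python
from itertools import combinations

def reachable(cols, t):
    """reach[b] is True iff b is a nonnegative integer combination of cols (0 <= b <= t)."""
    reach = [False] * (t + 1)
    reach[0] = True
    for b in range(1, t + 1):
        reach[b] = any(b >= c and reach[b - c] for c in cols)
    return reach

def support_fractions(a, t):
    """For k = 1..len(a): fraction of feasible b in 1..t solvable with at most k columns."""
    feasible = reachable(a, t)
    total = sum(feasible[1:])
    fractions = {}
    for k in range(1, len(a) + 1):
        ok = [False] * (t + 1)
        for cols in combinations(a, k):
            r = reachable(cols, t)
            ok = [x or y for x, y in zip(ok, r)]
        fractions[k] = sum(ok[1:]) / total
    return fractions

# Illustrative data: one equation with coefficients 2, 3, 5.
print(support_fractions((2, 3, 5), 300))
```

For the coefficients 2, 3, 5 the fraction for k = 1 stays bounded away from 1 (right-hand sides coprime to 30 need two columns), while the fraction for k = 2 equals 1, which matches the intuition that the average-case support here is 2.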
Our analysis reveals that the sizes of the minors of the constraint matrix affect sparsity. It turns out that the number of factors in the prime decomposition of the minors also affects sparsity. Moreover, for matrices with large minors but few factors, there exist solutions whose support depends on the number of factors rather than the size of the minors. Recall that a prime is a natural number greater than or equal to 2 that is divisible only by itself and 1. We now formalize these parameters related to the minors of a matrix.
Let be of full row rank, where . Denote the set of absolute values of the minors by
[TABLE]
and denote the set of ‘number of prime factors’ in each minor by
[TABLE]
If consists of only one element (e.g., when ), then we denote the element by . If and , then . We denote the maximum and minimum of these sets by
[TABLE]
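The parameters above can be computed directly for small matrices. The following sketch makes two assumptions that should be checked against the paper's definitions: only nonzero maximal (m × m) minors are collected, and prime factors are counted with multiplicity. The names `det`, `nonzero_minors`, and `num_prime_factors` are hypothetical helpers.

```python
from itertools import combinations

def det(M):
    """Integer determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def nonzero_minors(A):
    """Set of absolute values of the nonzero m x m minors of an m x n integer matrix."""
    m, n = len(A), len(A[0])
    values = set()
    for cols in combinations(range(n), m):
        d = abs(det([[A[i][j] for j in cols] for i in range(m)]))
        if d != 0:
            values.add(d)
    return values

def num_prime_factors(k):
    """Number of prime factors of k >= 1, counted with multiplicity (assumption)."""
    count, p = 0, 2
    while p * p <= k:
        while k % p == 0:
            k //= p
            count += 1
        p += 1
    return count + (1 if k > 1 else 0)

# Illustrative matrix, not taken from the paper.
A = [[1, 0, 2],
     [0, 3, 4]]
minors = nonzero_minors(A)
print(sorted(minors), min(minors), max(minors))
print({d: num_prime_factors(d) for d in sorted(minors)})
```

Since the minimum over all minors is at most the value at any single invertible submatrix, evaluating one such submatrix already gives an upper bound on the minimum parameters.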
Our first main result bounds using these parameters.
Theorem 1.1
Let and such that Then
- (i) ,
- (ii) .
Theorem 1.1 guarantees that the average support is linear in in two special cases: (a) the minimum minor of is on the order of or (b) there is a prime minor. We emphasize that (ii) uses the minimum values and , which can be bounded by sampling any invertible submatrix of . Thus, can be bounded by finding a single invertible submatrix of .
Note that the bound in (1) includes the term . Our proof of Theorem 1.1 can be adjusted to prove and . We omit this analysis here to simplify the exposition. However, it should be mentioned that
[TABLE]
where the equation follows from the so-called Cauchy-Binet formula. Therefore, if has two nonzero minors, then Theorem 1.1 (i) improves (1), on average.
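The Cauchy-Binet formula used here states that the determinant of the Gram matrix of the rows equals the sum of the squares of all maximal minors. A minimal numerical check, with hypothetical helper names and an illustrative matrix that is not from the paper:

```python
from itertools import combinations

def det(M):
    """Integer determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def gram_det(A):
    """det(A A^T) for an m x n integer matrix A given as a list of rows."""
    m, n = len(A), len(A[0])
    gram = [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(m)]
            for i in range(m)]
    return det(gram)

def sum_squared_minors(A):
    """Sum of det(A_S)^2 over all m-element column subsets S."""
    m, n = len(A), len(A[0])
    return sum(det([[A[i][j] for j in cols] for i in range(m)]) ** 2
               for cols in combinations(range(n), m))

A = [[1, 0, 2],
     [0, 3, 4]]
print(gram_det(A), sum_squared_minors(A))  # both sides of the identity
```

In this example the maximal minors are 3, 4, and -6, so both sides equal 9 + 16 + 36 = 61.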
A corollary of Theorem 1.1 is that if has the Hilbert basis property, then the extreme rays of provide enough information to bound .
Corollary 1
Let and . Assume that has the Hilbert basis property and . Then
[TABLE]
If , then the bound in Corollary 1 improves the bound in [6].
By modifying a construction in [1], we obtain two interesting examples of . The first example shows that Theorem 1.1 (i) gives a tight bound. The second example shows that Theorem 1.1 (ii) gives a tight bound and that can be significantly smaller than .
Theorem 1.2
For every and , there is a matrix such that and .
For every and , there is a matrix such that and
The proof of Theorem 1.1 is based on a combination of group theory, lattice theory, and Ehrhart theory. On a high level, the combination of group and lattice theory bears similarities to papers of Gomory [11] and Aliev et al. [2]. Gomory investigated the value function of an IP and proved its periodicity when the right-hand side vector is sufficiently large. Aliev et al. showed periodicity for the function provided again that is sufficiently large. Our refined analysis allows us to quantify the number of right-hand sides for which the support function is small. This new contribution requires not only group and lattice theory, but also Ehrhart theory.
In Sections 2 and 3 we provide background on groups and subcones. In Section 4 we use the average support for each subcone to prove Theorem 1.1. We prove Theorem 1.2 in Appendix A.
2 The group structure of a parallelepiped
Let be an invertible matrix, which we also view as a set of linearly independent column vectors. Let denote the integer vectors in the fundamental parallelepiped generated by :
[TABLE]
For each , there is a unique such that , where [5, Lemma 2.1, page 286]. Thus, we can define a residue function by
[TABLE]
The image of under (i.e., ) creates a group using the operation defined by
[TABLE]
The identity of is the zero vector in , and
[TABLE]
see, e.g., [5, Corollary 2.6, page 286]. Equation (4) implies is finite.
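A two-dimensional sketch may help make the group concrete. It assumes that the fundamental parallelepiped is half-open (coefficients in [0, 1)), and that the residue of an integer vector z is the unique integer point g of the parallelepiped such that z - g is an integer combination of the columns. The function names and the example matrix are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import floor

def coefficients(B, z):
    """Exact solution lam of B lam = z for a 2 x 2 integer matrix B = [[a, b], [c, d]]."""
    (a, b), (c, d) = B
    det = a * d - b * c
    if det == 0:
        raise ValueError("B must be invertible")
    x, y = z
    return Fraction(d * x - b * y, det), Fraction(-c * x + a * y, det)

def parallelepiped_points(B):
    """Integer points B lam with 0 <= lam_1, lam_2 < 1."""
    (a, b), (c, d) = B
    xs, ys = [0, a, b, a + b], [0, c, d, c + d]
    box = product(range(min(xs), max(xs) + 1), range(min(ys), max(ys) + 1))
    return [z for z in box if all(0 <= l < 1 for l in coefficients(B, z))]

def residue(B, z):
    """Representative of z in the half-open parallelepiped (keeps fractional parts of lam)."""
    (a, b), (c, d) = B
    f1, f2 = (l - floor(l) for l in coefficients(B, z))
    return int(a * f1 + b * f2), int(c * f1 + d * f2)

def group_add(B, g, h):
    """Group operation: add as vectors, then reduce back into the parallelepiped."""
    return residue(B, (g[0] + h[0], g[1] + h[1]))

B = [[2, 1],
     [0, 3]]          # columns (2, 0) and (1, 3); determinant 6
points = parallelepiped_points(B)
print(len(points), points)
print(residue(B, (5, 4)))
print(group_add(B, (2, 1), (2, 2)))
```

The number of points found equals the absolute value of the determinant, which is why the group is finite.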
The choice of notation for is to emphasize that it is the group generated by the residues of all integer linear combinations of vectors in . We can also consider the group generated by any subset of vectors in . Given , we denote the subgroup of generated by by
[TABLE]
If , then . The set is a subgroup of because is a sublattice of .
We collect some basic properties about the group .
Lemma 1
Let be an invertible matrix. For every , .
Proof
For each , we can write as
[TABLE]
Thus, it suffices to show for each . If , then . If , then because is finite there exists with . Note that , so . ∎
Lemma 2
Let be an invertible matrix and . If with , then there exist (possibly with repetitions) such that .
Proof
Set . First, we show that for each there exist (possibly with repetitions) such that
[TABLE]
We prove (6) by induction on . The result is vacuously true for , so assume that (6) holds for and consider . Define
[TABLE]
By the induction hypothesis, there exist such that (6) holds. If , then proves (6) for . If , then by (6) and induction. Recall from (3). If for every , then and , which is a contradiction. Thus, there exists such that . The sequence , satisfies (6), which proves (6).
Let be chosen to satisfy (6). If , then set to conclude . It is left to consider the case when . We claim that this leads to a contradiction.
By (2) and (4), for primes . By (7), are subgroups of , so divide (see, e.g., [4, Chapter 2]). Also, and (6) imply that . Hence, and divides for each . This implies that has at least many prime factors. However, , and only has many prime factors. Thus, for some , which contradicts . ∎
3 Lattice points in cones
A set is a lattice if , for , and if then (see, e.g., [5, Chapter VII]). So, is a subgroup of . We assume that a lattice contains linearly independent vectors. For and , set .
We use the following lemma to find suitable translated subcones in which is bounded. The proof of Lemma 3 is in Appendix B.
Lemma 3
Let be linearly independent vectors and set . For and , there is a , where , such that .
Let . For each , Carathéodory’s Theorem implies that there is a linearly independent set such that . Thus,
[TABLE]
where and are the linearly independent subsets of . The following lemma states that for a given lattice , ‘most’ of the points in are found in translations of the subcones .
Lemma 4
Let be such that is -dimensional. Let and be as in (8). Let be a lattice and assume that . For each , choose any for each , and define . Then
[TABLE]
Proof
For set . The fraction in (9) equals
[TABLE]
which is at least as large as
[TABLE]
Thus, in order to prove (9), it is enough to prove
[TABLE]
By assumption, is -dimensional. Thus, we may assume that the sets each have linearly independent vectors.
Let and be the points that are coordinate-wise at most one more than in the coordinate system defined by :
[TABLE]
The set is finite.
The numerator of (10) considers , so take . We claim that
[TABLE]
where and with . Write as where for each and for some . We have for each and because and . In particular, and where . This proves (11). Note that we use the fact that is defined by rather than : if was defined by , then in the extreme case and , the vector is not in .
We use the fact that to show is contained in a finite union of lower-dimensional spaces. Although we showed , we can assume by extending it arbitrarily to have columns and setting for these new columns. Hence,
[TABLE]
For each and with , define the polytope
[TABLE]
By assumption, for each , so the vertices of are in . Ehrhart theory then implies that there is a polynomial of degree such that
[TABLE]
for each . The leading coefficient of is the dimensional volume of , which is positive, see [5, Chapter VIII]. Similarly, for the polytope
[TABLE]
there exists a polynomial of degree with positive leading coefficient such that for each
[TABLE]
Define
[TABLE]
We show that the values in (10) go to zero as by bounding the fraction
[TABLE]
for each . By the definition of , for every . So for each , say , it follows that
[TABLE]
Hence,
[TABLE]
If and , then, by (3), for , with , and for each . This implies that
[TABLE]
Hence,
[TABLE]
If , then by the definition of , . This implies that the number of points in is equal to . So,
[TABLE]
The polynomial on the right-hand side of (14), call it , is of degree and has a positive leading coefficient. Also, by (13) and (14),
[TABLE]
Recall that is of degree , is of degree , and and have positive leading coefficients. Moreover, the limit as is the same as . Hence,
[TABLE]
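The counting argument in the proof of Lemma 4 compares a polynomial of lower degree with one of higher degree. The sketch below illustrates this with the simplest lattice polytope, the standard simplex, which is chosen here only for illustration and does not appear in the paper. The number of integer points in its t-th dilate is a polynomial in t of degree d, while a facet contributes a polynomial of degree d - 1, so the ratio tends to zero.

```python
from itertools import product
from math import comb

def simplex_points(t, d):
    """Count x in Z^d with x >= 0 and x_1 + ... + x_d <= t by enumeration."""
    return sum(1 for x in product(range(t + 1), repeat=d) if sum(x) <= t)

def facet_points(t, d):
    """Count the integer points of the same dilate lying on the facet sum(x) == t."""
    return sum(1 for x in product(range(t + 1), repeat=d) if sum(x) == t)

d = 2
for t in (1, 5, 20):
    total, facet = simplex_points(t, d), facet_points(t, d)
    # The closed form comb(t + d, d) is the Ehrhart polynomial of the standard simplex.
    print(t, total, comb(t + d, d), facet, facet / total)
```

The printed ratio shrinks as t grows, mirroring how lattice points in lower-dimensional pieces become negligible in the limit.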
4 Proof of Theorem 1.1
The assumption indicates that we can write as
[TABLE]
where and are linearly independent sets; see (8). Also, has full row rank, so we assume that each contain linearly independent vectors. For , let .
First, we prove . In order to do this, we find a lattice and points such that
[TABLE]
and contains every such that . With these values, we will be able to apply Lemma 4 to prove the desired result.
Fix and set Let be the group defined in Section 2. In view of Lemma 2, there exist with and
[TABLE]
We emphasize that the choice of depends on . Define the lattice
[TABLE]
In Lemma 6, we show that does not depend on . Lemma 1 implies that . Thus,
[TABLE]
Lemma 5
There exists that satisfies the following: for every , either (so ) by (16)) or . The vector satisfies , where for each .
Proof of Lemma 5. For each , there exists such that
[TABLE]
where and for each and . By Lemma 3, there exists such that and Let such that . By (16), there is a such that . So, by (17),
[TABLE]
where for each . Note for each because . Thus, and . ∎
Lemma 6
For every pair , the lattices and are equal.
Proof of Lemma 6. It is enough to show that . Let . By Lemmata 3 and 5, there is a point such that . Also, by Lemma 5, . Hence, by (16), . Similarly, for each . These inclusions along with imply . ∎
Set . Lemma 5 implies that
[TABLE]
By (15) and (16), it follows that
[TABLE]
Hence, for each , it follows that
[TABLE]
By Lemma 4, it follows that . Also, the inequality for each implies and .
Consider the inequality . Without loss of generality, . Let be given from Lemma 5. Let . Using Lemma 3 and the fact that is -dimensional, the representative set from (17) can be chosen in . Let . By (18), there exists a such that
[TABLE]
where for each . The point is in , so and there are such that where . So,
[TABLE]
Thus, and . Hence, .
Finally, assume . Observe that , so . ∎
Acknowledgements
The authors would like to thank anonymous referees for helping to improve the presentation of the paper and Marie Putscher for identifying a typo in the proof of Lemma 4.
Appendix A Proof of Theorem 1.2
We construct both matrices and using a submatrix , which we construct first. Let and be prime. For , define and . Define the matrix The matrix has columns, so . The matrix is similar to the example in [1, Theorem 2] and the theory of so-called primorials. We claim
[TABLE]
Note that . The Frobenius number of is the largest integer that cannot be written as a positive integer linear combination of and . Hence, if we choose to be the Frobenius number of , then implies . If , then is not divisible by for any . Thus, if and , then . Finally, observe that if , then for large enough . The only negative column of is , so . This proves (20).
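For two coprime positive integers, the Frobenius number has a closed form. The sketch below assumes the standard convention of nonnegative integer combinations and uses the classical formula ab - a - b for coprime integers a, b >= 2; the function names and the sample values are hypothetical and not taken from the construction above.

```python
from math import gcd

def frobenius_two(a, b):
    """Largest integer that is not a nonnegative integer combination of coprime a, b >= 2."""
    if a < 2 or b < 2 or gcd(a, b) != 1:
        raise ValueError("a and b must be coprime integers >= 2")
    return a * b - a - b

def representable(n, a, b):
    """True iff n = i * a + j * b for some integers i, j >= 0."""
    return any((n - i * a) % b == 0 for i in range(n // a + 1))

a, b = 3, 5
f = frobenius_two(a, b)
print(f, representable(f, a, b))
print(all(representable(n, a, b) for n in range(f + 1, f + 50)))
```

Every right-hand side above the Frobenius number is representable using only these two values, which is the property the argument relies on.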
Now we define the matrix . Let and define
[TABLE]
where is the identity matrix and is the all zero matrix for . Note that . If is such that the last component is equivalent to , then by the arguments above. Now, the set of such that is contained in . So, for every , the set of feasible solutions in contains points such that . Moreover, if , then for every . Therefore,
[TABLE]
Using this and the fact that has columns, we have .
Now we define the matrix . Let be as above. Let be the all ones matrix and . Assume
[TABLE]
and set
[TABLE]
Note that , so Theorem 1.1 (ii) implies that . Let be such that . If , then the first components of are zero. So, similarly to above, there are such that . Hence, . ∎
Appendix B Proof of Lemma 3
Assume that . Let and . First, we show that . Since are linearly independent, is a full-dimensional simplicial cone. Hence, there exist linearly independent vectors such that and linearly independent vectors such that for each .
There is a set such that for each and for each . For , set
[TABLE]
Note that , so For each , it follows that
[TABLE]
So, and . Finally, for each , it follows that
[TABLE]
Hence, and .
Let . Then . Because is full-dimensional, there exists a point such that for . Note that and
[TABLE]
For , the result follows by induction. ∎
References
- [1] Aliev, I., De Loera, J., Eisenbrand, F., Oertel, T., Weismantel, R.: The support of integer optimal solutions. SIAM Journal on Optimization 28, 2152–2157 (2018)
- [2] Aliev, I., De Loera, J., Oertel, T., O’Neil, C.: Sparse solutions of linear diophantine equations. SIAM Journal on Applied Algebra and Geometry 1, 239–253 (2017)
- [3] Aliev, I., Henk, M., Oertel, T.: Integrality gaps of integer knapsack problems. In: Eisenbrand, F., Koenemann, J. (eds.) Integer Programming and Combinatorial Optimization. Lecture Notes in Computer Science, pp. 808–816 (2017)
- [4] Artin, M.: Algebra. Prentice Hall, New Jersey (1991)
- [5] Barvinok, A.: A Course in Convexity, vol. 54. American Mathematical Society, Providence, Rhode Island (2002)
- [6] Bruns, W., Gubeladze, J.: Normality and covering properties of affine semigroups. Journal für die reine und angewandte Mathematik 510, 151–178 (2004)
- [7] Bruns, W., Gubeladze, J., Henk, M., Martin, A., Weismantel, R.: A counterexample to an integer analogue of Carathéodory’s theorem. Journal für die reine und angewandte Mathematik 510, 179–185 (1999)
- [8] Cook, W., Fonlupt, J., Schrijver, A.: An integer analogue of Carathéodory’s theorem. Journal of Combinatorial Theory, Series B 40(1), 63–70 (1986)
