Phase transitions in integer linear problems
S. Colabrese, D. De Martino, L. Leuzzi, E. Marinari

TL;DR
This paper explores phase transitions in the solvability of sparse integer linear problems, revealing a Van-Der-Waals phase diagram and increased computational difficulty near critical points.
Contribution
It introduces a phase diagram framework for understanding the computational complexity of integer linear problems based on ensemble analysis.
Findings
Feasible solution space exhibits a Van-Der-Waals phase diagram.
Computational difficulty peaks near the critical point.
Increased complexity observed in the coexistence region.
Abstract
The resolution of linear system with positive integer variables is a basic yet difficult computational problem with many applications. We consider sparse uncorrelated random systems parametrised by the density and the ratio between number of variables and number of constraints . By means of ensemble calculations we show that the space of feasible solutions endows a Van-Der-Waals phase diagram in the plane (, ). We give numerical evidence that the associated computational problems become more difficult across the critical point and in particular in the coexistence region.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11
Figure 12Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Phase transitions in integer linear problems.
S. Colabrese1, D. De Martino2, L. Leuzzi3,4 and E. Marinari4,3,5
1 Dip. di Fisica, Univ. Tor Vergata, I-00133 Rome, Italy
2 Institute of Science and Technology Austria (IST Austria), Am Campus 1, Klosterneuburg A-3400, Austria
3 Soft and Living Matter Lab., Rome Unit of CNR-NANOTEC, Institute of Nanotechnology, National Research Council of Italy, P.le A. Moro 5, I-00185, Rome, Italy
4 Dipartimento di Fisica, Sapienza Universitá di Roma, P.le A.Moro 2, 00815, Rome, Italy
5 INFN, Sezione di Roma 1.
Abstract
The resolution of linear system with positive integer variables is a basic yet difficult computational problem with many applications. We consider sparse uncorrelated random systems parametrised by the density and the ratio between number of variables and number of constraints . By means of ensemble calculations we show that the space of feasible solutions endows a Van-Der-Waals phase diagram in the plane (, ). We give numerical evidence that the associated computational problems become more difficult across the critical point and in particular in the coexistence region.
1 Introduction
The use of concepts and techniques from statistical mechanics to analyze hard combinatorial problems has been very fruitful[1, 2], with examples that range from the paradigmatic K-sat[3, 4, 5, 6] to the traveling salesman problem[7] and number partitioning[8] to cite few.
Generically, upon defining an ensemble of random instances of the problem described by some parameters we can analyze how the solutions, and their structure, depend on these parameters. A typical example is a system with constraints connecting different variables. One significant parameter will, then, be the average connectivity of a single variable and another one the ratio between the number of variables and the number of constraints. If we are able to assess the statistics of the number of solutions and compute the average over different instances of the problem, we can adopt this solution entropy, possibly extensive, as a thermodynamic potential and look at singularities in the derivatives with respect to the parameters that can be interpreted as phase transitions among different phases [9, 10].
It turns out that, typically, we can distinguish a region in the parameter space in which solutions to the problem can always be found
- the SAT phase - from a region in which no solution can be found - the UNSAT phase. The associated decision problem, i.e. determining whether a solution exists or not, can be efficiently solved in regions deeply within each phase whereas it becomes effectively hard on the boundary between them [11].
A fundamental problem in combinatorics deals with the resolution of linear systems with non negative integer variables. The latter integrality constraint makes this issue computationally hard but still it inherits some useful results from linear and convex algebra[12]. Several kinds of problems can be assessed: decide if the system has non-trivial solutions (decision problem), describe the space by exhibiting a so-called Hilbert basis[13], count the lattice points inside a convex region[14] or maximize a linear function over integer vectors in a polyhedron (integer linear programming[15]). The latter problem is particularly interesting since it admits a straightforward approximation upon relaxing the integrality constraint and thus defining a linear programming problem, that is efficiently solvable in polynomial time and is connected to an important dual problem[12]. A fundamental strategy in combinatorial optimization thus consists in formulating problems as integer programs. Apart from classical applications in operations research and planning, integer-linear modeling is massively used in the field of constraint-based models of metabolic networks, in which the resolution of such systems is required for the computation of extreme pathways[16], elementary flux modules[17], conserved moieties[18, 19] and thermodynamically unfeasible cycles[20, 21]. We do point out, further, the recent surge of interest in statistical physics of the Gardner problem[22] of calculating the volume of the space of interactions of neural network models for given patterns, as it has been shown that this can be considered as a mean field model of jamming[23] of hard objects, that - in turn - can be connected to the resolution of integer linear systems. Indeed, it has been recently shown that the feasibility of the constraints defining such spaces is ruled by duality theorems defining dual spaces representing unfeasible patterns, that are represented by integer linear systems of equations[24].
Inspired by the aforementioned statistical mechanical approaches, we analyze here the solution spaces of large random sparse integer linear systems. In Sec. 2 we define the model and the ensemble, in Sec. 3 we present the thermodynamical behavior of these problems and their phase diagram obtained by means of annealed and of quenched replica calculations. In Sec. 4, we show how this framework can be used to analyze algorithms apt to solve integer programs and, finally, draw our conclusions.
2 The model
We consider a linear system of equations in unknowns :
[TABLE]
We will consider both the case where the variables are non-negative integers and the case where they are Boolean variables, i.e. they can only take the values [math] and . We will consider an ensemble of random systems where the coefficients are independent, identically distributed random variables:
[TABLE]
We study systems where (with a intensive constant). These systems are sparse, with a Poissonian structure, where is the average number of variables per equation. We will analyze large systems, where , , keeping fixed.
We can map the solutions of system (1) onto ground states of a model whose Hamiltonian has the form
[TABLE]
where
[TABLE]
It is interesting to consider the more general problem of calculating the partition function
[TABLE]
For example computing in the limit gives the number of solutions of (1).
The case where will turn out to be different from the general case where the are semi-positive definite integer variables (always with coefficients that can take the three values [math], and ).
3 The disorder average
3.1 The annealed approximation for , finite T.
We will start by studying the model in the annealed approximation, i.e. by averaging the partition function (and not its logarithm, as we will do in the quenched approach) over the disorder (2). We denote by an over-line the average over the disorder. We have that
[TABLE]
and, by developing the binomial,
[TABLE]
Since is small, we can set and use the relation
[TABLE]
where the are the modified Bessel functions of the first kind of order ([25]).
Now the Gaussian integral can be evaluated by noticing that
[TABLE]
and it gives
[TABLE]
By using the Stirling approximation for the binomial coefficients and the fact that is small (that allows to write for ) we write
[TABLE]
that defines
[TABLE]
where and
[TABLE]
The use of the Stirling approximation is justified when and are large with finite (i.e. our approach is valid for diluted models). We have also verified this fact numerically, by exact enumeration. We can now evaluate in our approximation by a saddle point approximation. We have to find the maxima of as a function of for ( emerges here as a natural order parameter).
The order parameter that emerges from the saddle point calculation is the average number of active variables per solution of system (1). Its asymptotic value for , where every vector variable is a solution, is .
By deriving with respect to we obtain a self consistent equation for its stationary points
[TABLE]
that can be solved numerically for each value of in order to obtain thermodynamic quantities like the free energy (), the energy () and the entropy (). In Fig. 1 we show the energy density obtained from these annealed analytic computation for different choices of , and and from a Monte Carlo simulations (of the original, quenched theory). No unexpected effects are seen, and the agreement is very reasonable.
By taking the limit we have
[TABLE]
The self-consistent equation now reads (noticing that )
[TABLE]
and can be simply solved in . In this way we obtain the equation of state
[TABLE]
In Fig. 2 we can see that for some values of there are three solutions (one on an unstable branch): the curves (16) are similar to the isothermal curves of a Van Der Waals fluid and the equilibrium curve can be obtained by Maxwell construction (Fig. 2, right).
We can calculate the stationary points from , whose solutions inserted back into (16) give a coexistence region that shrinks into a second-order critical point. The situation is summarized in the phase diagram in Fig. 2 (bottom): there are two phases with respectively low (UNSAT) and high (SAT) values of with a coexistence region that shrinks into a second order critical point akin to the Van Der Waals phase diagram of a fluid.
3.2 The annealed approximation for , .
Consider the annealed sum, where now :
[TABLE]
We can expand
[TABLE]
and we have the integrals
[TABLE]
that, once again, can be solved upon approximating
, where , and using the Bessel functions formula
[TABLE]
obtaining
[TABLE]
Upon defining the variables the saddle point , where
[TABLE]
it comes from the solution of the following optimization problem:
[TABLE]
For , is symmetric under permutation of the variables. In this limit we search for a symmetric solution , and we obtain an extended expression of for :
[TABLE]
Upon taking the maximum, the equation of state now reads
[TABLE]
The picture is qualitatively the same as before, i.e. a Van der Waals picture with first order transition ending in a second order critical point. The values of the first order critical and spinodal lines shift to higher values of at increasing . In fig 3 we plot the boundary off the coexistence region in the plane for several values of : upon increasing we witness simply to a shift of such a region towards higher values of and .
3.3 The quenched case for ,
Let us now look to the quenched case, where we assume that the coefficients are fixed, and compute the expectation value of the logarithm of the partition function . We will use replicas, and by introducing copies of the system labeled by a further index , , we will compute
[TABLE]
By expanding, as before, with first order modified Bessel functions we get that
[TABLE]
The terms containing at least a cosine function upon integration produce terms proportional to , that go to zero when , that is in the limit of interest for us. So we have that
[TABLE]
where runs over the possible combinations of indices. This defines, with the same set of approximation we had used before in the annealed case,
[TABLE]
The saddle point approximation can be computed by solving the optimization problem (with ) where one maximizes under the constraints and . Both the function and the domain of the search are symmetric under permutation of variables. We look for a replica symmetric solution, with . The case coincides with the annealed approximation. In the limit we have:
[TABLE]
The equation of state is
[TABLE]
Interestingly, in this case the boundary of the coexistence region can be given in parametric form from the equation , i.e.:
[TABLE]
where .
Figure 4 shows a behavior that is qualitatively very similar to the one of the annealed approximation (see Fig. 2). Upon looking at the shrinking of the coexistence region we can see that , i.e. it has a mean field behavior.
We will show how the knowledge of the system thermodynamics discussed so far can be used to analyze algorithms solving integer-linear optimization problems. We will show that the difficulty of performing an optimization task is related to the position of the instance in the phase diagram. In essence, sticking to the Van-Der-Waals metaphor, we can distinguish a purely gaseous () from a vapor () phase, where the finite size scaling exponent of the execution time increases and becomes strongly parameters-dependent, in particular reaching a maximum in the coexistence region in correspondence of the transition line.
4 Resolution of single instances
The statistical mechanical analysis of the system, that we have discussed in the previous section, is of paramount importance, since it allows us to qualify the typical behavior of the system, to discuss fluctuations, to examine critical and collective behavior and phase transitions. A second side of the problem is very important, and it is based on analyzing single instances of the problem. We do not consider here statistical averages, but the detailed behavior of a system for a given realization of the coefficients. For example the work of [6] has stressed how useful this can be (in the context, in that case, of satisfiability problems), even for developing new, powerful optimization algorithms. We consider here for simplicity a linear optimization problem (integer programming) defined over the system (1), where we maximize the objective function
[TABLE]
We employ the Matlab© solver intlinprog. We consider two vertical lines in the phase diagram at fixed (on different sides with respect to the critical point ) and perform a sweep over , points with fixed number of variables averaged over instances. In the figure below we report the relative average length of solutions (left) and the typical machine time (right, quad-core running at GHz) as a function of for the two cases.
In both cases we see a crossover from an UNSAT to a SAT phase that seems to be steeper for . The machine time is almost constant for while for it develops a maximum in correspondence of the coexistence region. Next we studied the machine time as a function of the system size ( number of variables) upon fixing the point in the phase diagram (finite size scaling): on two vertical lines () we consider .
In all cases we observe an exponential trend . While for the exponent is small (, i.e. we arrive up to ) and weakly dependent on , for the exponent is big ( we arrive up to ) and dependent on , with a peak in the coexistence region.
Conclusions
The analysis and resolution of difficult combinatorial problems has gained many insights from statistical mechanics, both in terms of a characterization of general properties of the solution space and in terms of providing new powerful resolution algorithms. A widely studied class of hard problems is the class of integer linear systems, given the wide scope of applications and the theoretical insights that are inherited from linear and convex algebra. In this note we have considered ensembles of sparse uncorrelated random systems parametrised by the average number of variables per equation and the ratio between total number of variables and total number of constraints . We have shown that the space of feasible solutions endows a Van-Der-Waals phase diagram in the plane (, ). We have found that the phase transition scenario depends on the integrality constraint, eventually vanishing for generic positive integers. We gave numerical evidence that associated computational problems become more difficult in regions where a SAT-UNSAT phase transition is present and, in particular, in the coexistence region.
Several further analysis could be of interest. First, we gave a picture for the whole solution space while it would be interesting to consider specific sets of optimization problems upon adding additional terms in the Hamiltonian (3). It is worthwhile noticing that the phase transition scenario that we have depicted is essentially within the universality class of the Ising model at odds with K-SAT problems that endows more complex phase transition scenario [5]. Along this line a possible development would be to test the robustness of the universality class across different ensembles in order to check whether it is related to some universal feature of the problem and at odds with the K-SAT behavior, e.g. in the convexity property. Finally, this statistical mechanical treatment could give relevant insights in algorithmic strategies like message passing and Monte Carlo, whose application in integer linear problems has been already used in a different context[19].
Appendix: Modified Bessel function of the first kind
We report here some useful formulas on modified Bessel function of the first kind from[25] that have been used in the above calculations (analytical and/or numerical).
Representations:
[TABLE]
Generating function:
[TABLE]
Approximation:
[TABLE]
Derivatives:
[TABLE]
For ratios see[26]:
[TABLE]
Acknowledgments
The research leading to these results has received funding from the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme () under REA grant agreement (DDM). DDM would like to thank Alexandr Kazda and Davide Raimondo for interesting discussions.
References
- [1]
Olivier C Martin, Rémi Monasson, and Riccardo Zecchina.
Statistical mechanics methods and phase transitions in optimization problems.
Theoretical computer science, 265(1):3–67, 2001.
- [2]
Marc Mezard and Andrea Montanari.
Information, physics, and computation.
Oxford University Press, 2009.
- [3]
R. Monasson and R. Zecchina.
Statistical mechanics of the random k-sat problem.
Phys. Rev. E, 56:1357 – 1361, 1997.
- [4]
L. Leuzzi and G. Parisi.
The k-sat problem in a simple limit.
J. Stat. Phys., 103:679 – 695, 2001.
- [5]
A. Crisanti, L. Leuzzi, and G. Parisi.
The 3-sat problem with large number of clauses in the -replica symmetry breaking scheme.
J. Phys. A, 35:481, 2002.
- [6]
Marc Mézard, Giorgio Parisi, and Riccardo Zecchina.
Analytic and algorithmic solution of random satisfiability problems.
Science, 297(5582):812–815, 2002.
- [7]
Scott Kirkpatrick and Gérard Toulouse.
Configuration space analysis of travelling salesman problems.
Journal de Physique, 46(8):1277–1292, 1985.
- [8]
Stephan Mertens.
Phase transition in the number partitioning problem.
Physical Review Letters, 81(20):4281, 1998.
- [9]
Alexander K Hartmann and Martin Weigt.
Phase transitions in combinatorial optimization problems: basics, algorithms and statistical mechanics.
John Wiley & Sons, 2006.
- [10]
Stephan Mertens.
Computational complexity for physicists.
Computing in Science & Engineering, 4(3):31–47, 2002.
- [11]
Rémi Monasson, Riccardo Zecchina, Scott Kirkpatrick, Bart Selman, and Lidror Troyansky.
Determining computational complexity from characteristic ‘phase transitions’.
Nature, 400(6740):133–137, 1999.
- [12]
Alexander Schrijver.
Theory of linear and integer programming.
John Wiley & Sons, 1998.
- [13]
Martin Henk and Robert Weismantel.
On Hilbert bases of polyhedral cones.
Konrad-Zuse-Zentrum für Informationstechnik, 1996.
- [14]
Alexander Barvinok.
Integer points in polyhedra, volume 452.
European Mathematical Society, 2008.
- [15]
Hendrik W Lenstra Jr.
Integer programming with a fixed number of variables.
Mathematics of operations research, 8(4):538–548, 1983.
- [16]
Christophe H Schilling, David Letscher, and Bernhard Ø Palsson.
Theory for the systemic definition of metabolic pathways and their use in interpreting metabolic function from a pathway-oriented perspective.
Journal of theoretical biology, 203(3):229–248, 2000.
- [17]
Stefan Schuster and Claus Hilgetag.
On elementary flux modes in biochemical reaction systems at steady state.
Journal of Biological Systems, 2(02):165–182, 1994.
- [18]
Stefan Schuster and Thomas Höfer.
Determining all extreme semi-positive conservation relations in chemical reaction systems: a test criterion for conservativity.
Journal of the Chemical Society, Faraday Transactions, 87(16):2561–2566, 1991.
- [19]
Andrea De Martino, Daniele De Martino, Roberto Mulet, and Andrea Pagnani.
Identifying all moiety conservation laws in genome-scale metabolic networks.
PloS one, 9(7):e100750, 2014.
- [20]
Arne Müller.
Thermodynamic constraints in metabolic networks.
PhD thesis, Master’s thesis, Freie Universität Berlin, Fachbereich Mathematik und Informatik. http://page. mi. fu-berlin. de/arnem/theses/master. pdf, 2012.
- [21]
Daniele De Martino, Fabrizio Capuani, Matteo Mori, Andrea De Martino, and Enzo Marinari.
Counting and correcting thermodynamically infeasible flux cycles in genome-scale metabolic networks.
Metabolites, 3(4):946–966, 2013.
- [22]
Elizabeth Gardner.
The space of interactions in neural network models.
Journal of physics A: Mathematical and general, 21(1):257, 1988.
- [23]
Silvio Franz and Giorgio Parisi.
The simplest model of jamming.
Journal of Physics A: Mathematical and Theoretical, 49(14):145001, 2016.
- [24]
Daniele De Martino.
The dual of the space of interactions in neural network models.
International Journal of Modern Physics C, 27(06):1650067, 2016.
- [25]
Milton Abramowitz and Irene A Stegun.
Handbook of mathematical functions: with formulas, graphs, and mathematical tables.
Number 55. Courier Corporation, 1964.
- [26]
DE Amos.
Computation of modified bessel functions and their ratios.
Mathematics of Computation, 28(125):239–251, 1974.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Olivier C Martin, Rémi Monasson, and Riccardo Zecchina. Statistical mechanics methods and phase transitions in optimization problems. Theoretical computer science , 265(1):3–67, 2001.
- 2[2] Marc Mezard and Andrea Montanari. Information, physics, and computation . Oxford University Press, 2009.
- 3[3] R. Monasson and R. Zecchina. Statistical mechanics of the random k-sat problem. Phys. Rev. E , 56:1357 – 1361, 1997.
- 4[4] L. Leuzzi and G. Parisi. The k-sat problem in a simple limit. J. Stat. Phys. , 103:679 – 695, 2001.
- 5[5] A. Crisanti, L. Leuzzi, and G. Parisi. The 3-sat problem with large number of clauses in the ∞ \infty -replica symmetry breaking scheme. J. Phys. A , 35:481, 2002.
- 6[6] Marc Mézard, Giorgio Parisi, and Riccardo Zecchina. Analytic and algorithmic solution of random satisfiability problems. Science , 297(5582):812–815, 2002.
- 7[7] Scott Kirkpatrick and Gérard Toulouse. Configuration space analysis of travelling salesman problems. Journal de Physique , 46(8):1277–1292, 1985.
- 8[8] Stephan Mertens. Phase transition in the number partitioning problem. Physical Review Letters , 81(20):4281, 1998.
