Engineering Kernelization for Maximum Cut
Damir Ferizovic, Demian Hespe, Sebastian Lamm, Matthias Mnich, Christian Schulz, and Darren Strash

TL;DR
This paper develops and tests new kernelization data reduction rules for the Max Cut problem, significantly improving solver performance on benchmark instances and enabling solutions to previously unsolvable large networks.
Contribution
The authors engineer a comprehensive set of efficient kernelization rules for Max Cut and demonstrate their practical effectiveness on diverse benchmark datasets.
Findings
Speedups of up to multiple orders of magnitude in solver runtimes.
Four instances that state-of-the-art solvers could not solve within a ten-hour time limit are now solved.
Significant improvements on synthetic, VLSI, image segmentation, social, and biological network datasets.
Abstract
Kernelization is a general theoretical framework for preprocessing instances of NP-hard problems into (generally smaller) instances with bounded size, via the repeated application of data reduction rules. For the fundamental Max Cut problem, kernelization algorithms are theoretically highly efficient for various parameterizations. However, the efficacy of these reduction rules in practice (to aid solving highly challenging benchmark instances to optimality) remains entirely unexplored. We engineer a new suite of efficient data reduction rules that subsume most of the previously published rules, and demonstrate their significant impact on benchmark data sets, including synthetic instances, and data sets from the VLSI and image segmentation application domains. Our experiments reveal that current state-of-the-art solvers can be sped up by up to multiple orders of magnitude when…
Table 2: Time (s) to compute a maximum cut on the original instance and on its kernel, with weighted path compression.

| Name | Vertices | Efficiency e(G) | LocalSolver, original | LocalSolver, kernel | [Speedup] | Biq Mac, original | Biq Mac, kernel | [Speedup] |
|---|---|---|---|---|---|---|---|---|
| ca-CSphd | 1 882 | 0.99 | 24.07 | 0.32 | [75.40] | - | 0.06 | [] |
| ego-facebook | 2 888 | 1.00 | 20.09 | 0.09 | [228.91] | - | 0.01 | [] |
| ENZYMES_g295 | 123 | 0.86 | 1.22 | 0.33 | [3.70] | 0.82 | 0.13 | [6.57] |
| road-euroroad | 1 174 | 0.79 | - | - | - | - | - | - |
| bio-yeast | 1 458 | 0.81 | - | - | - | - | 32 726.75 | [] |
| rt-twitter-copen | 761 | 0.85 | - | 834.71 | [] | - | 1.77 | [] |
| bio-diseasome | 516 | 0.93 | - | 4.91 | [] | - | 0.07 | [] |
| ca-netscience | 379 | 0.77 | - | 956.03 | [] | - | 0.67 | [] |
| soc-firm-hi-tech | 33 | 0.36 | 4.67 | 1.61 | [2.90] | 0.09 | 0.06 | [1.41] |
| g000302 | 317 | 0.21 | 0.58 | 0.49 | [1.17] | 1.88 | 0.74 | [2.53] |
| g001918 | 777 | 0.12 | 1.47 | 1.41 | [1.04] | 31.11 | 17.45 | [1.78] |
| g000981 | 110 | 0.28 | 10.73 | 4.73 | [2.27] | 531.47 | 21.53 | [24.68] |
| g001207 | 84 | 0.19 | 1.10 | 0.16 | [6.88] | 53.20 | 0.06 | [962.38] |
| g000292 | 212 | 0.03 | 0.45 | 0.45 | [1.01] | 0.43 | 0.37 | [1.14] |
| imgseg_271031 | 900 | 0.99 | 10.66 | 0.19 | [55.94] | - | 0.17 | [] |
| imgseg_105019 | 3 548 | 0.93 | 234.01 | 22.68 | [10.32] | f | 13 748.62 | [] |
| imgseg_35058 | 1 274 | 0.37 | 34.93 | 24.71 | [1.41] | - | - | - |
| imgseg_374020 | 5 735 | 0.82 | 1 739.11 | 72.23 | [24.08] | f | - | - |
| imgseg_106025 | 1 565 | 0.68 | 159.31 | 34.05 | [4.68] | - | - | - |

Table 3: Time (s) to compute a maximum cut on the original instance and on its kernel, without weighted path compression.

| Name | Vertices | Efficiency e(G) | LocalSolver, original | LocalSolver, kernel | [Speedup] | Biq Mac, original | Biq Mac, kernel | [Speedup] |
|---|---|---|---|---|---|---|---|---|
| ca-CSphd | 1 882 | 0.98 | 24.79 | 1.12 | [22.23] | - | 0.32 | [] |
| ego-facebook | 2 888 | 0.93 | 20.39 | 1.72 | [11.83] | 967.99 | 1.42 | [682.04] |
| ENZYMES_g295 | 123 | 0.82 | 1.83 | 0.36 | [5.09] | 0.96 | 0.37 | [2.60] |
| road-euroroad | 1 174 | 0.69 | - | - | - | - | - | - |
| bio-yeast | 1 458 | 0.72 | - | - | - | - | - | - |
| rt-twitter-copen | 761 | 0.80 | - | 409.47 | [] | - | 101.14 | [] |
| bio-diseasome | 516 | 0.93 | - | 6.66 | [] | - | 0.35 | [] |
| ca-netscience | 379 | 0.67 | - | 4 116.61 | [] | - | 2.10 | [] |
| soc-firm-hi-tech | 33 | 0.30 | 4.92 | 2.34 | [2.10] | 0.29 | 0.31 | [0.94] |
| g000302 | 317 | 0.10 | 0.71 | 0.50 | [1.41] | 1.28 | 0.89 | [1.44] |
| g001918 | 777 | 0.06 | 1.67 | 1.51 | [1.10] | 14.90 | 11.69 | [1.27] |
| g000981 | 110 | 0.22 | 11.32 | 1.97 | [5.74] | 0.98 | 0.44 | [2.23] |
| g001207 | 84 | 0.17 | 1.56 | 0.15 | [10.11] | 0.47 | 0.37 | [1.28] |
| g000292 | 212 | 0.01 | 0.69 | 0.51 | [1.35] | 0.56 | 0.62 | [0.91] |

| Name | | | | | | |
|---|---|---|---|---|---|---|
| inf-road_central | 14 081 816 | 1.20 | 0.59 | 362.32 | inf% | 2.70% |
| inf-power | 4 941 | 1.33 | 0.62 | 0.04 | 1.64% | 0.45% |
| web-google | 1 299 | 2.13 | 0.79 | 0.01 | 0.69% | 0.19% |
| ca-MathSciNet | 332 689 | 2.47 | 0.63 | 8.02 | 1.33% | 0.55% |
| ca-IMDB | 896 305 | 4.22 | 0.42 | 27.55 | 0.97% | 0.32% |
| web-Stanford | 281 903 | 7.07 | 0.18 | 105.17 | 0.34% | 0.30% |
| web-it-2004 | 509 338 | 14.09 | 0.91 | 22.10 | 0.08% | 0.02% |
| ca-coauthors-dblp | 540 486 | 28.20 | 0.25 | 72.39 | 0.05% | 0.04% |
The research leading to these results has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement no. 340506.
Damir Ferizovic: Karlsruhe Institute of Technology, Karlsruhe, Germany
Demian Hespe: Karlsruhe Institute of Technology, Karlsruhe, Germany
Sebastian Lamm: Karlsruhe Institute of Technology, Karlsruhe, Germany
Matthias Mnich: Universität Bonn, Bonn, Germany. Supported by DFG grant MN 59/1-1.
Christian Schulz: University of Vienna, Faculty of Computer Science, Vienna, Austria
Darren Strash: Hamilton College, Clinton, New York, USA
Abstract
Kernelization is a general theoretical framework for preprocessing instances of NP-hard problems into (generally smaller) instances with bounded size, via the repeated application of data reduction rules. For the fundamental Max Cut problem, kernelization algorithms are theoretically highly efficient for various parameterizations. However, the efficacy of these reduction rules in practice (to aid solving highly challenging benchmark instances to optimality) remains entirely unexplored.
We engineer a new suite of efficient data reduction rules that subsume most of the previously published rules, and demonstrate their significant impact on benchmark data sets, including synthetic instances, and data sets from the VLSI and image segmentation application domains. Our experiments reveal that current state-of-the-art solvers can be sped up by up to multiple orders of magnitude when combined with our data reduction rules. On social and biological networks in particular, kernelization enables us to solve four instances that were previously unsolved in a ten-hour time limit with state-of-the-art solvers; three of these instances are now solved in less than two seconds.
1 Introduction
The (unweighted) Max Cut problem is to partition the vertex set of a given graph into two sets so as to maximize the total number of edges between those two sets. Such a partition is called a maximum cut. Computing a maximum cut of a graph is a well-known problem in computer science; it is one of Karp's 21 NP-complete problems [26]. While signed and weighted variants are often considered throughout the literature [4, 5, 6, 9, 13, 23, 24], the simpler (unweighted) case still presents a significant challenge for researchers, and solving it quickly is of paramount importance to all variants. Max Cut variants have many applications, including social network modeling [23], statistical physics [4], portfolio risk analysis [24], VLSI design [6, 9], network design [5], and image segmentation [13].
Theoretical approaches to solving Max Cut primarily focus on producing efficient parameterized algorithms through data reduction rules, which reduce the input size in polynomial time while maintaining the ability to compute an optimal solution to the original input. If the resulting (irreducible) graph has size bounded by a function of a given parameter, then it is called a kernel. Recent works focus on parameters measuring the distance between the maximum cut size of the input graph and a lower bound guaranteed for all graphs. The algorithm then must decide if the input graph admits a cut whose size exceeds the lower bound by a given integer $k$. Two such lower bounds are the Edwards-Erdős bound [15, 16] and the spanning tree bound. Crowston et al. [11] were the first to show that unweighted Max Cut is fixed-parameter tractable when parameterized by distance above the Edwards-Erdős bound. Moreover, they show the problem admits a polynomial-size kernel with $O(k^5)$ vertices. Their result was extended to the more general Signed Max Cut problem, and the kernel size was decreased to $O(k^3)$ vertices [10]. Finally, Etscheid and Mnich [17] improved the kernel size to an optimal $O(k)$ vertices even for signed graphs, and showed how to compute it in linear time.
Many practical approaches exist to compute a maximum cut or (alternatively) a large cut. Two state-of-the-art exact solvers are Biq Mac (a solver for binary quadratic and Max-Cut problems) by Rendl et al. [31], and LocalSolver [8, 22], a powerful generic local search solver that also verifies optimality of a cut. Many heuristic (inexact) solvers are also available, including those using unconstrained binary quadratic optimization [35], local search [7], tabu search [27], and simulated annealing [3].
Curiously, data reduction, which has shown promise at preprocessing large instances of other fundamental NP-hard problems [2, 25, 28], is currently not used in implementations of Max Cut solvers. To the best of our knowledge, no research has been done on the efficiency of data reduction for Max Cut, in particular with the goal of achieving small kernels in practice.
Our Results. We introduce new data reduction rules for the Max Cut problem, and show that nearly all previous reduction rules for the Max Cut problem can be encompassed by only four reduction rules. Furthermore, we engineer efficient implementations of these reduction rules and show through extensive experiments that kernelization achieves a significant reduction on sparse graphs. Our experiments reveal that current state-of-the-art solvers can be sped up by up to multiple orders of magnitude when combined with our data reduction rules. We achieve speedups on all instances tested. On social and biological networks in particular, kernelization enables us to solve four instances that were previously unsolved in a ten-hour time limit with state-of-the-art solvers; three of these instances are now solved in less than two seconds with our kernelization.
2 Preliminaries
Throughout this paper, we consider finite, simple and undirected graphs $G = (V, E)$ together with additive edge weight functions $\omega$. For each vertex $v \in V$ let $N_G(v)$ denote its neighbors; its degree in $G$ is $\deg_G(v) = |N_G(v)|$. The neighborhood of a set $X \subseteq V$ is $N_G(X) = \bigcup_{v \in X} N_G(v) \setminus X$. For a vertex set $S \subseteq V$, let $G[S]$ denote the subgraph of $G$ induced by $S$. To specify the vertex and edge sets of a specific graph $G$, we use $V(G)$ and $E(G)$, respectively. The set of edges between the vertices of different vertex sets $A, B \subseteq V$ is written as $E(A, B)$.
For an integer $k \geq 1$, a path of length $k$ in $G$ is a sequence of distinct vertices $(v_0, \dots, v_k)$ such that $\{v_i, v_{i+1}\} \in E(G)$ for $i = 0, \dots, k-1$. A path with $v_0 = v_k$ is called a cycle of $G$. Graph $G$ is connected if there is a path from $u$ to $v$ for any pair of distinct vertices $u, v$ in $G$, and disconnected otherwise. A connected component of $G$ is an inclusion-maximal connected subgraph of $G$. For a vertex set $S \subseteq V$, the set of external vertices of $S$ is the set of vertices in $S$ that have some neighbor in $G$ outside $S$. In similar fashion, the remaining vertices of $S$ form the set of internal vertices.
A clique is a complete subgraph, and a near-clique is a clique minus a single edge. A clique tree is a connected graph whose biconnected components are cliques, and a clique forest is a graph whose connected components are clique trees. In such graphs, we use the term block to refer to a biconnected component, bridge, or isolated vertex. The class of clique-cycle forests is defined as follows. A clique is a clique-cycle forest, and so is a cycle. The disjoint union of two clique-cycle forests is a clique-cycle forest. In addition, a graph formed from a clique-cycle forest by identifying two vertices, each from a different (connected) component, is also a clique-cycle forest.
The Max Cut problem is to find a vertex set $S \subseteq V$ such that $|E(S, V \setminus S)|$ is maximized. We denote the cardinality of a maximum cut by $\beta(G)$. At times, we may need to reason about a maximum cut given a fixed partitioning of a subset of $G$'s vertices. A partition of vertices $V' \subseteq V$ is given as a 2-coloring $\delta : V' \to \{0, 1\}$. We let $\beta_\delta(G)$ denote the size of a maximum cut of $G$, given that $V'$ is partitioned according to $\delta$. The Weighted Max Cut problem is to find a vertex set $S \subseteq V$ of a given graph $G$ with additive weight function $\omega$ such that $\omega(E(S, V \setminus S))$ is maximum. The weight of a maximum cut is then given by $\beta(G, \omega)$. We denote instances of the Max Cut decision problem as $(G, k)$, where $G$ is a graph and $k$ is an integer. If the size of a maximum cut in $G$ is at least $k$, then $(G, k)$ is a "yes"-instance; otherwise, it is a "no"-instance.
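As a reference point for the definitions above, the following minimal sketch computes the maximum cut size of a tiny graph by trying every bipartition. The function name `max_cut_size` and the edge-list representation are illustrative assumptions and not part of the authors' implementation; the running time is exponential, so this is only useful for sanity checks on small examples.

```python
from itertools import product

def max_cut_size(n, edges):
    """Brute-force maximum cut size of a graph on vertices 0..n-1 (hypothetical helper)."""
    best = 0
    # Vertex 0 is fixed on side 0: a cut and its complement have the same size.
    for rest in product((0, 1), repeat=max(n - 1, 0)):
        side = (0,) + rest
        best = max(best, sum(side[u] != side[v] for u, v in edges))
    return best

triangle = [(0, 1), (1, 2), (0, 2)]
four_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
```

An odd cycle such as the triangle always leaves at least one edge uncut, while the bipartite four-cycle can have all of its edges cut.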
We address two more variations of Max Cut in this paper. The Vertex-Weighted Max Cut problem takes as input a graph $G$ and two vertex weight functions $w_0, w_1$; the objective is to compute a bipartition $V = V_0 \cup V_1$ that maximizes $\sum_{v \in V_0} w_0(v) + \sum_{v \in V_1} w_1(v) + |E(V_0, V_1)|$. The Signed Max Cut problem takes as input a graph $G$ together with an edge labeling $l : E(G) \rightarrow \{$"+", "−"$\}$; the goal is to find an $S \subseteq V$ which maximizes the quantity $|E^-(S, V \setminus S)| + |E^+(G[S])| + |E^+(G[V \setminus S])|$, where $E^c(A, B)$ and $E^c(G)$ denote the edges of $E(A, B)$ and $E(G)$ with label $c$, for $c \in \{$"−", "+"$\}$. Similarly, for the neighborhood of a vertex (set), we use the notations $N^+_G$ and $N^-_G$. We call a triangle positive if its number of "−"-edges is even. Any Max Cut instance can be transformed into a Signed Max Cut instance by labeling all edges with "−".
Let $\Sigma^*$ denote the set of input instances for a decision problem. A parameterized problem $\Pi \subseteq \Sigma^* \times \mathbb{N}$ is fixed-parameter tractable if there is an algorithm (called a fixed-parameter algorithm) that decides membership in $\Pi$ for any input pair $(x, k)$ in time $f(k) \cdot |x|^{O(1)}$ for some computable function $f$.
A data reduction rule (often shortened to reduction rule) for a parameterized problem $\Pi$ is a function that maps an instance $(x, k)$ of $\Pi$ to an equivalent instance $(x', k')$ of $\Pi$ such that the function is computable in time polynomial in $|x|$ and $k$. We call two instances of $\Pi$ equivalent if either both or none belong to $\Pi$. Observe that for two equivalent "yes"-instances $(G, \beta(G))$ and $(G', \beta(G'))$, the relationship $\beta(G) = \beta(G') + k$ holds for some integer $k$.
2.1 Related Work
Several studies have provided fixed-parameter algorithms for the Max Cut problem [10, 11, 17, 29]. Among these, a fair number of kernelization rules have been introduced with the goal of effectively reducing Max Cut instances [10, 11, 17, 29, 30, 18]. Those reductions typically have some constraints on the subgraphs, like being clique forests or clique-cycle forests. Later, we propose a new set of reductions that does not need this property and covers most of the known reductions [11, 17, 29, 18]. There are other reduction rules that are fairly simplistic and focus on very narrow cases [30]. We now explain the Edwards-Erdős bound and the spanning tree bound.
Edwards-Erdős Bound.
For a connected graph $G$, the Edwards-Erdős bound [15, 16] is defined as $EE(G) = \frac{|E(G)|}{2} + \frac{|V(G)| - 1}{4}$. A linear-time algorithm that computes a cut satisfying the Edwards-Erdős bound for any given graph is provided by Van Ngoc and Tuza [34]. The Max Cut Above Edwards-Erdős (Max Cut AEE) problem asks, for a graph $G$ and integer $k$, whether $G$ admits a cut of size $EE(G) + k$. All kernelization rules for Max Cut AEE require a vertex set $S$ such that $G - S$ is a clique forest. Etscheid and Mnich [17] propose an algorithm that computes such a set of at most vertices in time .
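To make the bound concrete, the sketch below evaluates the standard Edwards-Erdős expression $|E|/2 + (|V| - 1)/4$ and compares it against a brute-force maximum cut on two small connected graphs. The helper names `edwards_erdos_bound` and `max_cut_size` are hypothetical and chosen only for this illustration.

```python
from itertools import product

def edwards_erdos_bound(n, m):
    """Lower bound on the maximum cut of a connected graph with n vertices and m edges."""
    return m / 2 + (n - 1) / 4

def max_cut_size(n, edges):
    """Brute-force maximum cut size (exponential; only for tiny graphs)."""
    best = 0
    for rest in product((0, 1), repeat=max(n - 1, 0)):
        side = (0,) + rest
        best = max(best, sum(side[u] != side[v] for u, v in edges))
    return best

# Complete graph on five vertices: the bound is met with equality.
k5 = [(u, v) for u in range(5) for v in range(u + 1, 5)]
# Path on three vertices: the maximum cut lies strictly above the bound.
path3 = [(0, 1), (1, 2)]

excess_k5 = max_cut_size(5, k5) - edwards_erdos_bound(5, len(k5))
excess_path3 = max_cut_size(3, path3) - edwards_erdos_bound(3, len(path3))
```

The difference between the maximum cut and the bound is the quantity that the parameter of Max Cut AEE measures.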
Spanning Tree Bound. Another approach is based on utilizing the spanning forest of a graph [29]. For a given integer $k$, a cut whose size exceeds the number of edges of a spanning forest by $k$ is searched for. This decision problem is denoted as Max Cut AST (Max Cut Above Spanning Tree). For sparse graphs, this bound is larger than the Edwards-Erdős bound. The reductions for the problem require a vertex set $S$ such that $G - S$ is a clique-cycle forest.
3 New Data Reduction Rules
We now introduce our new data reduction rules and prove their correctness. The main feature of our new rules is that they do not depend on the computation of a clique forest to determine if they can be applied. Furthermore, our new rules subsume almost all rules from previous works [10, 11, 17, 29, 18] with the exception of Reduction Rules 10 and 11 by Crowston et al. [10]. We provide details in [19]. For an overview of how rules are subsumed, consult Table 1. Hence, our algorithm will only apply the rules proposed in this section. We provide proofs for the rules that proved most useful in our experimental evaluation.
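**Reduction Rule 1.** Let $G = (V, E)$ be a graph and let $S \subseteq V$ induce a clique in $G$. If at most $\lceil |S|/2 \rceil$ vertices of $S$ are external, then $\beta(G) = \beta(G') + \beta(K_{|S|})$ for the graph $G'$ obtained from $G$ by removing the internal vertices of $S$ and all edges of $G[S]$.
Proof.
Note that any partition of the clique into two vertex sets of size $\lceil |S|/2 \rceil$ and $\lfloor |S|/2 \rfloor$ is a maximum cut of $G[S]$. Suppose we fix the partitions of the at most $\lceil |S|/2 \rceil$ external vertices of $S$. Then the at least $\lfloor |S|/2 \rfloor$ internal vertices can be assigned to the partitions so that they contain $\lceil |S|/2 \rceil$ and $\lfloor |S|/2 \rfloor$ vertices, respectively. Thus, regardless of how the external vertices are partitioned, the size of a maximum cut of $G[S]$ remains the same. ∎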
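The clique rule can be checked on a small example. The sketch below assumes the rule reads as follows: if at most $\lceil |S|/2 \rceil$ vertices of the clique $S$ have a neighbor outside $S$, then all clique edges (and with them the internal vertices) can be dropped at a cost of $\lfloor |S|/2 \rfloor \cdot \lceil |S|/2 \rceil$ cut edges. The names `reduce_clique` and `max_cut_size` are hypothetical, and the function does not verify that the given set really is a clique.

```python
from itertools import product

def max_cut_size(n, edges):
    """Brute-force maximum cut size (exponential; only for tiny graphs)."""
    best = 0
    for rest in product((0, 1), repeat=max(n - 1, 0)):
        side = (0,) + rest
        best = max(best, sum(side[u] != side[v] for u, v in edges))
    return best

def reduce_clique(edges, clique):
    """Apply the assumed clique rule; return (remaining edges, offset) or None."""
    s = set(clique)
    # External vertices: clique members with at least one neighbor outside the clique.
    external = {x for u, v in edges if (u in s) != (v in s) for x in (u, v) if x in s}
    k = len(s)
    if len(external) > (k + 1) // 2:
        return None  # too many external vertices, rule not applicable
    kept = [(u, v) for u, v in edges if not (u in s and v in s)]
    # Offset is the maximum cut of a complete graph on k vertices.
    return kept, (k // 2) * ((k + 1) // 2)

k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
graph = k4 + [(0, 4), (4, 5)]          # vertex 0 is the only external vertex
reduced, offset = reduce_clique(graph, [0, 1, 2, 3])
```

Removed vertices simply become isolated here, which leaves the cut size unaffected and keeps vertex identifiers stable.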
We can exhaustively apply Reduction Rule 1 in time by scanning over all vertices in the graph. When scanning vertex $v$, we check whether $N_G(v) \cup \{v\}$ induces a clique. This finds all cliques with at least one internal vertex. Checking whether Reduction Rule 1 is applicable is then straightforward by counting the number of vertices with degree higher than the size of the clique.
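**Reduction Rule 2.** Let $(a, b, c, d)$ be an induced 3-path in a graph $G$ with $N_G(b) = \{a, c\}$ and $N_G(c) = \{b, d\}$. Construct $G'$ from $G$ by adding a new edge $\{a, d\}$ and removing the vertices $b$ and $c$. Then $\beta(G) = \beta(G') + 2$.
Proof.
Let $P = (a, b, c, d)$ and let $\delta$ be an assignment of vertices to the partitions of a cut in $G$. We distinguish two cases:
- Case $\delta(a) = \delta(d)$: If $\delta(b) = \delta(c) = \delta(a)$, then no edges of $P$ are cut. Notice that this cut is not maximum since moving $b$ between partitions increases the cut size by two. In every other assignment of $b$ and $c$, exactly two edges in $P$ are cut. In $G'$, the edge between $a$ and $d$ is not cut, so the path contributes two more cut edges than its replacement.
- Case $\delta(a) \neq \delta(d)$: By choosing $\delta(b) = \delta(d)$ and $\delta(c) = \delta(a)$, all three edges in $P$ are cut. In $G'$, the edge between $a$ and $d$ is cut, so $\beta(G) = \beta(G') + 2$. ∎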
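The path rule can be illustrated on a five-cycle, where contracting an induced 3-path leaves a triangle. This sketch assumes the rule replaces the two inner path vertices by a single edge between the endpoints at an offset of two cut edges; `compress_three_path` and `max_cut_size` are hypothetical names, and the preconditions (induced path, inner vertices of degree two) are not checked.

```python
from itertools import product

def max_cut_size(n, edges):
    """Brute-force maximum cut size (exponential; only for tiny graphs)."""
    best = 0
    for rest in product((0, 1), repeat=max(n - 1, 0)):
        side = (0,) + rest
        best = max(best, sum(side[u] != side[v] for u, v in edges))
    return best

def compress_three_path(edges, a, b, c, d):
    """Drop inner vertices b and c of the path (a, b, c, d) and connect a to d."""
    kept = [e for e in edges if b not in e and c not in e]
    return kept + [(a, d)], 2  # the offset of 2 is the assumed rule constant

five_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
reduced, offset = compress_three_path(five_cycle, 0, 1, 2, 3)
```

The reduced graph is the triangle on vertices 0, 3 and 4, whose maximum cut of two plus the offset matches the maximum cut of four in the five-cycle.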
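**Reduction Rule 3.** Let $G$ be a graph and let $S \subseteq V$ induce a near-clique in $G$. Let $G'$ be the graph obtained from $G$ by adding the missing edge $e'$ so that $S$ induces a clique in $G'$. If $|S|$ is odd or $S$ has more than two internal vertices, then $\beta(G) = \beta(G')$.
Proof.
Let $e' = \{u, v\}$ be the edge added to the graph and $\delta$ any 2-coloring of the external vertices of $S$. We show that a maximum cut of $G'$ exists such that $u$ and $v$ are in the same partition. As $G$ has one less edge than $G'$, this means that $\beta_\delta(G) = \beta_\delta(G')$, which implies that $\beta(G) = \beta(G')$.
Define $V_c$ as the set of vertices of $S$ assigned to partition $c$ for $c \in \{0, 1\}$, initially the external vertices $w$ with $\delta(w) = c$. Without loss of generality, assume $|V_0| \leq |V_1|$. Note that, given the partition for the external vertices, maximizing the cut of $G'[S]$ means minimizing $\big||V_0| - |V_1|\big|$. We distinguish three cases:
- $|V_1| - |V_0| > 1$: By adding $u$ and $v$ to $V_0$, $\big||V_0| - |V_1|\big|$ decreases. The rest of the internal vertices have to be distributed among $V_0$ and $V_1$ such that $\big||V_0| - |V_1|\big|$ is minimized.
- $|V_1| - |V_0| = 1$: By adding $u$ and $v$ to $V_0$, $\big||V_0| - |V_1|\big|$ stays $1$. If $|S|$ is odd, then $1$ is the minimal value possible and the number of remaining internal vertices is even. So the remaining internal vertices can be distributed evenly between $V_0$ and $V_1$. If $|S|$ is even, then an odd number of internal vertices are left (and at least one by the definition of the rule), which can be distributed to balance $V_0$ and $V_1$.
- $|V_1| - |V_0| = 0$: By adding $u$ and $v$ to $V_0$, $\big||V_0| - |V_1|\big|$ becomes 2. If $|S|$ is odd, then an odd number of internal vertices is left to assign such that $\big||V_0| - |V_1|\big|$ becomes 1. If $|S|$ is even, then there is an even number of internal vertices left, which can be distributed to balance $V_0$ and $V_1$. ∎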
Since some cliques are irreducible by currently known rules, it may be beneficial to also apply Reduction Rule 3 "in reverse". Although this "reverse" reduction neither reduces the vertex set nor (as our experiments suggest) leads to applications of other rules, it can undo unfruitful additions of edges made by Reduction Rule 3 and may remove other edges from the graph.
**Reduction Rule 4.** Let $G$ be a graph and let $S \subseteq V$ induce a clique in $G$. If $|S|$ is odd or $S$ has more than two internal vertices, an edge between two internal vertices of $S$ is removable. That is, for such an edge $e$ and $G' = (V, E \setminus \{e\})$, $\beta(G) = \beta(G')$.
Proof.
Follows from the correctness of Reduction Rule 3. ∎
The following reduction rule is closely related to the upcoming generalization of Reduction Rule 8 by Crowston et al. [10]. It is able to further reduce the case where for a clique of . In comparison, the generalization of Reduction Rule 8 from [10] is able to handle the case . Due to the degree by which these rules are similar, they are also merged together in our implementation, as the techniques to handle both are the same.
**Reduction Rule 5.** Let induce a clique in a graph , where and for all . Create from by removing an arbitrary vertex of . Then .
Proof.
Let $X' = N_G(X)$ and let $\delta$ be any 2-coloring of $X'$. Note that the removal of $X'$ disconnects $X$ from the remainder of the graph.
Define $V_c$ as the set of vertices of $X'$ with color $c$, for $c \in \{0, 1\}$. We distribute the vertices in $X$ among $V_0$ and $V_1$ such that the cut is maximized. Notice that every vertex in $X$ is connected to all other vertices in $X \cup X'$. The number of cut edges incident to $X$ is therefore $x_0 x_1 + x_0 |V_1| + x_1 |V_0|$, where $x_0$ and $x_1$ denote the number of vertices from $X$ that we want to insert into $V_0$ and $V_1$, respectively. This can be rewritten as $(|V_0| + x_0)(|V_1| + x_1) - |V_0||V_1|$. As all other parts are constant, this reduces to maximizing $(|V_0| + x_0)(|V_1| + x_1)$. As $|V_0| + x_0 + |V_1| + x_1$ is constant, the product is maximized when $\big|(|V_0| + x_0) - (|V_1| + x_1)\big|$ is minimized.
Because , it is always possible to distribute the vertices of such that , which then maximizes . Removing any vertex from will change the cut by : without loss of generality, let . Then is odd and , which maximizes the cut. Then, . ∎
The following algorithm identifies all candidates of Reduction Rule 5 in linear time. First, we order the adjacencies of all vertices. That is, for every vertex $v \in V$, the vertices in $N_G(v)$ are sorted according to a numeric identifier assigned to every vertex. For this, we create an auxiliary array of empty lists of size $|V|$. We then traverse the vertices $w \in N_G(v)$ for every vertex $v \in V$ and insert each pair $(v, w)$ in a list identified by indexing the auxiliary array with $w$. We then iterate once over the array from the lowest identifier to the highest and recreate the graph with sorted adjacencies. In total, this process takes $O(|V| + |E|)$ time.
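The adjacency-sorting step is a bucket pass over all edges, which can be sketched as follows. The function name `sort_adjacencies` and the list-of-lists graph representation with vertex identifiers `0..n-1` are assumptions made for this illustration only.

```python
def sort_adjacencies(adj):
    """Return adjacency lists sorted by vertex id using one bucket pass (no comparison sort)."""
    n = len(adj)
    buckets = [[] for _ in range(n)]      # auxiliary array, one list per vertex id
    for v in range(n):
        for w in adj[v]:
            buckets[w].append(v)          # pair (v, w) is stored under index w
    sorted_adj = [[] for _ in range(n)]
    for w in range(n):                    # lowest identifier first
        for v in buckets[w]:
            sorted_adj[v].append(w)       # neighbors of v arrive in increasing order
    return sorted_adj

unsorted_triangle = [[2, 1], [0, 2], [1, 0]]
sorted_triangle = sort_adjacencies(unsorted_triangle)
```

Every edge is touched a constant number of times, so the pass is linear in the number of vertices plus edges.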
For any clique $X$ of $G$, we have to check whether $N_G(x_1) \cup \{x_1\} = N_G(x_2) \cup \{x_2\}$ holds for all pairs of vertices $x_1, x_2$ from $X$ (neighborhood condition). Our algorithm uses tries [20, 12] to find all candidates. A trie supports two operations, Insert(key,val) and Retrieve(key). The key parameter is an array of integers and val is a single integer. Function Retrieve returns all values inserted by Insert that have the same key. Internally, a trie stores the inserted elements as a tree, where every node corresponds to one integer of the key and every prefix is stored only once. That means that two keys sharing a prefix share the same path through the trie until the position where they differ.
For each vertex $v \in V$, we use the ordered set $N_G(v) \cup \{v\}$ as key and $v$ as the val parameter. Notice that $N_G(v)$ is already sorted. The key can then be computed through an insertion of $v$ into the sequence in time . After Insert($N_G(v) \cup \{v\}$, $v$) is done for every vertex $v$, each trie leaf contains all vertices that satisfy the condition of Reduction Rule 5; that is, for every vertex pair of a trie leaf, the neighborhood condition is met. We then verify whether the vertex set of a leaf is a clique, in time. As each such set is considered exactly once and the graph is fully partitioned, this requires time in total. As a last step, we check whether by using the observation that . In Sect. 4, we describe a timestamping system that assists the above procedure in not having to repeatedly check the same structures after any number of vertices and edges are added or removed from $G$. However, in those later applicability checks, we do not sort the adjacencies of all vertices in linear time again. Rather, we simply use a comparison-based sort on the adjacencies.
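The grouping step can be sketched with a dictionary standing in for the trie: vertices are bucketed by their sorted closed neighborhood, so each bucket plays the role of a trie leaf. This swaps in hashing for the trie described above, and the name `group_by_closed_neighborhood` is hypothetical. The input is assumed to have sorted adjacency lists.

```python
import bisect
from collections import defaultdict

def group_by_closed_neighborhood(sorted_adj):
    """Group vertices whose sorted closed neighborhoods (neighbors plus self) coincide."""
    groups = defaultdict(list)
    for v, nbrs in enumerate(sorted_adj):
        key = list(nbrs)
        bisect.insort(key, v)             # insert v into the already sorted neighbor list
        groups[tuple(key)].append(v)      # dict lookup replaces Insert(key, val) on a trie
    return list(groups.values())

# Triangle {0, 1, 2} fully attached to vertex 3, which also has a pendant neighbor 4.
adj = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2, 4], [3]]
groups = group_by_closed_neighborhood(adj)
```

Vertices 0, 1 and 2 share the closed neighborhood {0, 1, 2, 3} and land in one group, while vertices 3 and 4 each form a group of their own.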
The next reduction rule is our only rule whose application turns unweighted instances into instances of Weighted Max Cut. Our experiments show that this can reduce the kernel size significantly. This is noteworthy, given that existing solvers for Max Cut usually support weighted instances.
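**Reduction Rule 6.** Let $G$ be a graph, $\omega$ a weight function, and $(a, b, c)$ be an induced 2-path with $N_G(b) = \{a, c\}$. Let $e_1$ be the edge between vertex $a$ and $b$; let $e_2$ be the one between $b$ and $c$. Construct $G'$ from $G$ by deleting vertex $b$ and adding a new edge $\{a, c\}$ with $\omega'(\{a, c\}) = \max\{\omega(e_1), \omega(e_2)\} - \max\{0, \omega(e_1) + \omega(e_2)\}$. Then $\beta(G, \omega) = \beta(G', \omega') + \max\{0, \omega(e_1) + \omega(e_2)\}$.
Proof.
Let $\delta$ be a 2-coloring corresponding to a maximum cut of $G$ and consider the following two cases:
- $\delta(a) = \delta(c)$: If $\delta(b) = \delta(a)$, then neither edge of the path is cut. Otherwise, both edges are cut and contribute $\omega(e_1) + \omega(e_2)$. In total, the path contributes $\max\{0, \omega(e_1) + \omega(e_2)\}$ to the cut. In $G'$, the edge between $a$ and $c$ is not cut, so $\beta(G, \omega) = \beta(G', \omega') + \max\{0, \omega(e_1) + \omega(e_2)\}$.
- $\delta(a) \neq \delta(c)$: If $\delta(b) = \delta(a)$, then only $e_2$ is cut. Otherwise, only $e_1$ is cut. In total, the path contributes $\max\{\omega(e_1), \omega(e_2)\}$ to the cut. In $G'$, the edge between $a$ and $c$ is cut and contributes $\omega'(\{a, c\})$ to the cut, so again $\beta(G, \omega) = \beta(G', \omega') + \max\{0, \omega(e_1) + \omega(e_2)\}$. ∎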
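The case analysis in the proof determines the replacement weight: the path contributes $\max\{0, \omega(e_1) + \omega(e_2)\}$ when its endpoints are on the same side and $\max\{\omega(e_1), \omega(e_2)\}$ otherwise, so the new edge carries the difference and the first term becomes the offset. The sketch below checks this reading by brute force on a weighted four-cycle; `compress_two_path` and `max_cut_weight` are hypothetical names and the rule's preconditions are not verified.

```python
from itertools import product

def max_cut_weight(n, wedges):
    """Brute-force maximum cut weight for weighted edges (u, v, w) on vertices 0..n-1."""
    return max(
        sum(w for u, v, w in wedges if side[u] != side[v])
        for side in product((0, 1), repeat=n)
    )

def compress_two_path(wedges, a, b, c):
    """Replace the path a-b-c (b has no other neighbors) by one weighted edge a-c."""
    w1 = next(w for u, v, w in wedges if {u, v} == {a, b})
    w2 = next(w for u, v, w in wedges if {u, v} == {b, c})
    offset = max(0, w1 + w2)                       # contribution when a and c share a side
    kept = [(u, v, w) for u, v, w in wedges if b not in (u, v)]
    return kept + [(a, c, max(w1, w2) - offset)], offset

graph = [(0, 1, 3), (1, 2, -1), (2, 3, 2), (3, 0, 1)]
reduced, offset = compress_two_path(graph, 0, 1, 2)
```

For two unit-weight edges the new edge gets weight $-1$ with offset 2, which shows how an unweighted instance turns into one with negative weights.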
Our next two rules (Reduction Rules $7^{w=1}$ and 7) generalize Reduction Rule 8 by Crowston et al. [10], which we restate for completeness.
Reduction Rule 8. ([10], Reduction Rule 8)
Let be a signed graph, a set of vertices such that is a clique forest, and a block in . If there is a such that , and for all . Construct the graph from by removing any two vertices , then .
Note that, for unsigned graphs, $N^+_G(v) = \emptyset$ and $N^-_G(v) = N_G(v)$ for every vertex $v$.
Here, different choices of $S$ lead to different applications of this rule. Our generalizations do not require such a set anymore and can find all possible applications for any choice of $S$.
Reduction Rule $7^{w=1}$.
Let be the vertex set of a clique in with and for all . Construct the graph by deleting two arbitrary vertices from . Then .
We show the correctness of Reduction Rule $7^{w=1}$ by reducing it to Reduction Rule 8 by Crowston et al. [10].
Proof.
Let and . Since is a clique, is a clique forest. From it follows that . Also, and , so all conditions for Reduction Rule 8 are satisfied.
It remains to show that . Note that and . By Reduction Rule 8, we know that , therefore we have that
[TABLE]
where (1) follows from . ∎
**Reduction Rule 7.** Let induce a clique in a signed graph such that $l(e) = $ "−" for all $e \in E(X)$ and , , and for all . Construct by deleting two arbitrary vertices from . Then .
Proof (Sketch).
The proof for this rule is almost identical to the proof of Reduction Rule $7^{w=1}$. ∎
Using an approach almost equivalent to the one for Reduction Rule 5, we can find all candidates of this reduction rule in linear time.
In order to also reduce weighted instances to some degree, we use a simple weighted scaling of two reduction rules. That is, we extend their applicability from an unweighted subgraph to a subgraph where all edges have the same weight $c$. We do this for Reduction Rules 1 and 3.
Reduction Rule $1^{w=c}$.
Let be a weighted graph and let induce a clique with for every edge for some constant . Let with for every . If , then .
Reduction Rule $3^{w=c}$.
Let be a weighted graph and let induce a near-clique in . Furthermore, let for every edge for some constant . Let be the graph obtained from by adding the edge so that induces a clique in . Set , and for . If is odd or , then .
4 Implementation
4.1 Kernelization Framework
We now discuss our overall kernelization framework in detail. Our algorithm begins by generating an unweighted instance by replacing every weighted edge by an unweighted subgraph with a specific structure. Afterwards, we apply our full set of unweighted reduction rules: 1, 1 (together with 1), 1, and 1. As already mentioned earlier, Reduction Rule $7^{w=1}$ is the unweighted version of Reduction Rule 7. We then create a signed instance of the graph by exhaustively executing weighted path compression using Reduction Rule 6 with the restriction that the resulting weights are $-1$ or $+1$. We then exhaustively apply Reduction Rule 7. Once the signed reductions are done, we apply Reduction Rule 6 to fully compress all paths into weighted edges. This is followed by Reduction Rules $1^{w=c}$ and $3^{w=c}$. We then transform the instance into an unweighted one and apply Reduction Rule 4 at this point in order to avoid cyclic interactions between itself and Reduction Rule 3. Finally, if a weighted solver is to be used on the kernel, we exhaustively perform Reduction Rule 6 to produce a weighted kernel. Note that different permutations of the order in which reduction rules are applied can lead to different results.
4.2 Timestamping
Next we describe how to avoid unnecessary checks for the applicability of reduction rules. For this purpose, let the time of the most recent change in the neighborhood of a vertex $v$ be $T(v)$ and let the variable $T$ describe the current time. Initially, and . Every time a reduction rule performs a change on the neighborhood of $v$, set $T(v) = T$ and increment $T$. For each individual Reduction Rule $r$, we also maintain a timestamp $T_r$ (initialized with [math]), indicating the upper bound up to which all vertices have already been processed. Hence, all vertices $v$ with $T(v) \leq T_r$ do not need to be checked again by Reduction Rule $r$. Note that timestamping only works for "local" reduction rules, that is, the rules whose applicability can be determined by investigating the neighborhood of a vertex. Therefore, we only use this technique for Reduction Rules 1 and 1.
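The bookkeeping can be sketched as a small class. Since the initial values are not recoverable from the text, this sketch assumes vertex timestamps start at 0, the clock starts at 1, and each rule's timestamp starts at -1 so that every vertex is checked once; a vertex is rechecked when its timestamp exceeds the rule's. All names (`Timestamps`, `touch`, `pending`, `finish_pass`) are hypothetical.

```python
class Timestamps:
    """Track which vertices a local reduction rule still has to inspect."""

    def __init__(self, num_vertices, rules):
        self.now = 1                                  # current time (assumed start value)
        self.changed = [0] * num_vertices             # last change per vertex neighborhood
        self.done = {rule: -1 for rule in rules}      # per-rule processed-up-to timestamp

    def touch(self, v):
        """Record that the neighborhood of v changed."""
        self.changed[v] = self.now
        self.now += 1

    def pending(self, rule):
        """Vertices changed after the rule's last completed pass."""
        return [v for v, t in enumerate(self.changed) if t > self.done[rule]]

    def finish_pass(self, rule, upto):
        """Mark every change with timestamp <= upto as processed by the rule."""
        self.done[rule] = upto

ts = Timestamps(4, ["clique_rule"])
first = ts.pending("clique_rule")
ts.finish_pass("clique_rule", ts.now - 1)
second = ts.pending("clique_rule")
ts.touch(2)
third = ts.pending("clique_rule")
```

In a real pass, the value `ts.now - 1` would be captured before the rule starts modifying the graph, so that changes made during the pass are picked up by the next one.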
5 Experimental Evaluation
5.1 Methodology and Setup
All of our experiments were run on a machine with four Octa-Core Intel Xeon E5-4640 processors running at 2.40 GHz with GB of main memory. The machine runs Ubuntu 18.04. All algorithms were implemented in C++ and compiled using gcc version 7.3.0 with optimization flag -O3. We use the following state-of-the-art Weighted Max Cut solvers for comparisons: the exact solvers LocalSolver [8] (heuristically finds a large cut, and can then verify if it is maximum) and Biq Mac [31], as well as the heuristic solver MqLib [14]. MqLib is unable to determine on its own when it reaches a maximum cut and always exhausts the given time limit. We also evaluated an implementation of the reduction rules used by Etscheid and Mnich [17]; however, preliminary experiments indicated that it performs worse than current state-of-the-art solvers. In the following, for a graph $G$, $G_{ker}$ denotes the graph after all reductions have been applied exhaustively. To measure the effect of kernelization, we examine the following efficiency metric: we denote the kernelization efficiency by $e(G) = 1 - |V(G_{ker})| / |V(G)|$. Note that $e(G)$ is $1$ when all vertices are removed after applying all reduction rules, and $0$ if no vertices are removed.
For our experiments we use four different datasets: First, we use random instances from four different graph models that were generated using the KaGen graph generator [21, 33]. In particular, we used Erdős-Rényi graphs (GNM), random geometric graphs (RGG2D), random hyperbolic graphs (RHG) and Barabási-Albert graphs (BA). The main purpose of these instances is to study the effectiveness of individual reduction rules for a variety of graph densities and degree distributions. To analyze the practical impact of our algorithm on current state-of-the-art solvers we use a selection of sparse real-world instances by Rossi and Ahmed [32], as well as instances from VLSI design (g00*) and image segmentation (imgseg-*) by Dunning et al. [14]. Note that the original instances by Dunning et al. [14] use floating-point weights that we scaled to integer weights. Finally, we evaluate denser instances taken from the rudy category of the Biq Mac Library [1]. We further subdivide these instances into medium- and large-sized instances.
5.2 Performance of Individual Rules
To analyze the impact of each individual reduction rule, we measure the size of the kernel our algorithm produces before and after its removal. Fig. 1 shows our results on RGG2D and GNM graphs with vertices and varying density. We have settled on those two types of graphs as they represent different ends of the spectrum of kernelization efficiency. In particular, kernelization performs well on instances that are sparse and have a non-uniform degree distribution. Such properties are given by the random geometric graph model used for generating the RGG2D instances. Likewise, kernelization performs poorly on the uniform random graphs that make up the GNM instances. We excluded Reduction Rule 4 from these experiments as it only removes edges and thus leads to no difference in the kernelization efficiency.
Looking at Fig. 1, we can see that Reduction Rule 1 gives the most significant reduction in size. Its absence always diminishes the result more than any other rule. In particular, we see a difference in efficiency of up to (RGG2D) and (GNM) when removing Reduction Rule 1. The second most impactful rule for the RGG2D instances is Reduction Rule 1 with a difference of only up to . For the GNM instances Reduction Rule 1 is second with a difference of up to . However, note that Reduction Rules 1 and 1 lead to no difference in efficiency on these instances. Thus, we can conclude that depending on the graph type, different reduction rules have varying importance. Furthermore, our simple Reduction Rule 1 seems to have the most significant impact on the overall kernelization efficiency. Note that this is in line with the theoretical results from Table 1, which states that Reduction Rule 1 covers most of the previously published reduction rules and Reduction Rule 1 still covers many but fewer rules from previous work.
5.3 Exactly Computing a Maximum Cut
To examine the improvements kernelization brings for medium-sized instances, we compare the time required to obtain a maximum cut for both the kernelized and the original instance. We performed these experiments using both LocalSolver and Biq Mac. Note that we did not use MqLib as it is not able to verify the optimality of the cut it computes. The results of our experiments for our set of real-world instances are given in Table 2 (with weighted path compression) and Table 3 (without weighted path compression). Since the image segmentation instances are already weighted, they are omitted from Table 3. It is noteworthy that we do not include the results for the rudy instances from the Biq Mac Library. These instances feature a uniform edge distribution and an overall average degree of at least . Our preliminary experiments indicated that kernelization provides little to no reduction in size for these instances. Therefore, we omit them from further evaluation and focus on sparser graphs.
First, we notice that kernelization is able to provide moderate to significant speedups for all instances that we have tested. In particular, we are able to achieve a speedup between and for instances that were previously solvable by LocalSolver. Likewise, for the instances that Biq Mac is able to process, we achieve a speedup of up to three orders of magnitude. Furthermore, kernelization now allows these solvers to compute a maximum cut for a majority of the previously infeasible instances in less than minutes.
To examine the impact when allowing a weighted kernel, we now compare the performance of our algorithm using weighted path compression (Table 2) with the unweighted version (Table 3). We can see that by including weighted path compression we can achieve significantly better speedups, especially for the sparse real-world instances by Rossi and Ahmed [32]. For example, on ego-facebook we achieve a speedup of with compression and without.
Finally, it is also noteworthy that we get significant improvements for the weighted instances from VLSI design and image segmentation. By examining the performance of each individual reduction rule, we can see that this is solely due to Reduction Rule 1. These findings could improve the work by de Sousa et al. [13], which also affects the work by Dunning et al. [14]. In conclusion, our novel reduction rules give us a simple but powerful tool for speeding up existing state-of-the-art solvers for computing maximum cuts. Moreover, as mentioned previously, even our simple weighted path compression by itself is able to have a significant impact.
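To give an intuition for why allowing weights helps, the following is a minimal sketch of a single compression step for a degree-2 vertex. It is not necessarily the exact rule used in the paper; it is a standard derivation under the assumption that edges are stored as sorted pairs (u, v) with u < v, and all function names are hypothetical. For a vertex v with neighbors a and b and edge weights w1 and w2, v contributes max(0, w1 + w2) if a and b are on the same side and max(w1, w2) otherwise. The vertex can therefore be replaced by an edge {a, b} of weight max(w1, w2) - max(0, w1 + w2) plus a constant offset of max(0, w1 + w2).

```python
from itertools import product


def cut_value(edges, side):
    """Total weight of edges whose endpoints lie on different sides."""
    return sum(w for (u, v), w in edges.items() if side[u] != side[v])


def max_cut_brute_force(vertices, edges):
    """Exact maximum cut by enumeration; only suitable for tiny graphs."""
    return max(
        cut_value(edges, dict(zip(vertices, bits)))
        for bits in product((0, 1), repeat=len(vertices))
    )


def compress_degree_two(edges, v):
    """Remove degree-2 vertex v, returning (reduced_edges, offset).

    The maximum cut of the original graph equals the offset plus the
    maximum cut of the reduced graph.
    """
    incident = [(e, w) for e, w in edges.items() if v in e]
    if len(incident) != 2:
        raise ValueError("vertex must have degree exactly 2")
    (e1, w1), (e2, w2) = incident
    a = e1[0] if e1[1] == v else e1[1]
    b = e2[0] if e2[1] == v else e2[1]
    same = max(0, w1 + w2)   # best contribution of v if a, b share a side
    diff = max(w1, w2)       # best contribution of v if a, b are separated
    reduced = {e: w for e, w in edges.items() if v not in e}
    key = (min(a, b), max(a, b))
    # Merge with an existing edge {a, b}; cut contributions are additive.
    reduced[key] = reduced.get(key, 0) + (diff - same)
    return reduced, same


if __name__ == "__main__":
    # A 4-cycle 0-1-2-3-0 with chord {0, 2}, unit weights; vertex 1 has degree 2.
    edges = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (0, 3): 1, (0, 2): 1}
    reduced, offset = compress_degree_two(edges, 1)
    original = max_cut_brute_force([0, 1, 2, 3], edges)
    kernel = max_cut_brute_force([0, 2, 3], reduced)
    assert original == offset + kernel
```

Note that even for a unit-weight input the new edge can receive a non-unit (here negative or zero) weight, which is why this kind of compression is only available when a weighted kernel is allowed.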
5.4 Analysis on Large Instances
We now examine the performance of our kernelization framework and its impact on existing solvers for large graph instances with up to millions of vertices. For this purpose, we compared the cut size over time achieved by LocalSolver and MqLib with and without our kernelization. Note that we did not use Biq Mac as it was not able to handle instances with more than 3 000 vertices. Our results using a three-hour time limit for each solver are given in Table 4. Furthermore, we present convergence plots in Fig. 2.
First, we note that the time to compute the actual kernel is relatively small. In particular, we are able to compute a kernel for a graph with million vertices and edges in just over six minutes. Furthermore, we achieve an efficiency between and across all tested instances. When looking at the convergence plots (Fig. 2), we can observe that the additional preprocessing time of kernelization is quickly compensated by a significantly steeper increase in cut size compared to the unkernelized version. Furthermore, for instances where a kernel can be computed very quickly, such as web-google, we find a better solution almost instantaneously. In general, the results achieved by kernelization followed by the local search heuristic are always better than those of the local search heuristic alone. However, the final improvement in the size of the largest cut found by LocalSolver and MqLib is generally small for the given time limit of three hours.
6 Conclusions
We engineered new efficient data reduction rules for Max Cut and showed that these rules subsume most existing rules. Our extensive experiments show that kernelization has a significant impact in practice. In particular, our experiments reveal that current state-of-the-art solvers can be sped up by up to multiple orders of magnitude when combined with our data reduction rules.
Developing new reduction rules is an important direction for future research. Of particular interest are reduction rules for Weighted Max Cut, where reduction rules yield a weighted kernel.
References
- [1] Biq Mac Library. http://biqmac.aau.at/biqmaclib.html, 2018. [Online; accessed 2-September-2018].
- [2] Faisal N. Abu-Khzam, Michael R. Fellows, Michael A. Langston, and W. Henry Suters. Crown structures for vertex cover kernelization. Theory Comput. Syst., 41(3):411–430, 2007. doi:10.1007/s00224-007-1328-0.
- [3] Emely Arráiz and Oswaldo Olivo. Competitive simulated annealing and tabu search algorithms for the Max-Cut problem. In Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation, GECCO '09, pages 1797–1798, New York, NY, USA, 2009. ACM. doi:10.1145/1569901.1570167.
- [4] Francisco Barahona. On the computational complexity of Ising spin glass models. J. Phys. A: Mathematical and General, 15(10):3241, 1982. doi:10.1088/0305-4470/15/10/028.
- [5] Francisco Barahona. Network design using cut inequalities. SIAM J. Optim., 6(3):823–837, 1996. doi:10.1137/S1052623494279134.
- [6] Francisco Barahona, Martin Grötschel, Michael Jünger, and Gerhard Reinelt. An application of combinatorial optimization to statistical physics and circuit layout design. Oper. Res., 36(3):493–513, 1988. doi:10.1287/opre.36.3.493.
- [7] Una Benlic and Jin-Kao Hao. Breakout local search for the Max-Cut problem. Engineering Applications of Artificial Intelligence, 26(3):1162–1173, 2013. doi:10.1016/j.engappai.2012.09.001.
- [8] Thierry Benoist, Bertrand Estellon, Frédéric Gardi, Romain Megel, and Karim Nouioua. Localsolver 1.x: a black-box local-search solver for 0-1 programming. 4OR, 9(3):299, 2011. [used in this work: Localsolver 8.0]. URL: https://www.localsolver.com/, doi:10.1007/s10288-011-0165-9.
