Rounding semidefinite programs for large-domain problems via Brownian motion
Kevin L. Chang, Alantha Newman

TL;DR
This paper introduces a novel rounding method for semidefinite programming relaxations in large-domain problems, leveraging Brownian motion to improve approximate solutions for angular synchronization.
Contribution
It proposes a simple, Brownian motion-based rounding scheme for SDP relaxations, specifically applied to angular synchronization problems, with conjectured near-optimal guarantees.
Findings
The rounding scheme is feasible and effective based on computational evidence.
It achieves approximation guarantees close to the best possible under the Unique-Games Conjecture.
The method simplifies the rounding process for large-domain SDP problems.
Kevin L. Chang. Research done at Max-Planck-Institut für Informatik, Saarbrücken, Germany.
Alantha Newman. Research done at Max-Planck-Institut für Informatik, Saarbrücken, Germany. CNRS-Université Grenoble Alpes.
Abstract
We present a new simple method for rounding a semidefinite programming relaxation of a constraint satisfaction problem. We apply it to the problem of approximate angular synchronization studied in [MN11]. Specifically, we are given directed distances on a circle (i.e., directed angles) between pairs of elements and our goal is to assign the elements to positions on a circle so as to preserve these distances as much as possible. The feasibility of our rounding scheme is based on properties of the well-known stochastic process called Brownian motion. Based on computational and other evidence, we conjecture that this rounding scheme yields an approximation guarantee that is very close to the best-possible guarantee (assuming the Unique-Games Conjecture).
1 Introduction
We present an alternate approach to the Relaxed Linear Equations (Rel-Lin-Eq) problem studied in [MN11]. Our new approach is also based on rounding a semidefinite programming relaxation but uses a different rounding technique. Based on computational evidence and other justification, we believe this approach has essentially the same approximation guarantee of for Rel-Lin-Eq as proven for a different algorithm presented in [MN11].
We are given a set of equations in the form of . Let be the set of elements and let . We assign each element in a (integral) value from the set . For a fixed assignment, an equation has value , for . Since can have a nonnegative value of at most , we divide by in order to obtain a normalized value between 0 and 1. Our goal is to find an assignment that maximizes the sum . More precisely, we formulate our objective function as follows.
[TABLE]
This problem generalizes the Max-Cut problem: For any edge in a graph, we write the constraint . Then an -approximation to Rel-Lin-Eq yields an -approximation for Max-Cut. It can be viewed as an approximate version of the angular synchronization problem studied by Singer [Sin11]. Originally, our motivation was to develop rounding methods for constraint satisfaction problems whose solutions to assignment-constraint based SDPs can have the property that no pair of assignment vectors has a high dot product despite the solution having a high objective value. (For example, Maximum Acyclic Subgraph has this property; it has since been proven to be Unique-Games hard to approximate to within a factor greater than [GHM+11]. On the other hand, Unique Games does not have this property.)
Our rounding scheme is based on properties of the well-known stochastic process called Brownian motion. Theoretically, this procedure could be applied to other problems that can be modeled using the standard assignment-constraint based SDP framework (i.e., SDP formulation in Section 2.1). However, it does seem tailor-made for our particular objective function. It is also reminiscent of “sticky” random walks used in constructive approaches to discrepancy minimization [Ban10], although these results focus on assigning each element a binary value and one of our main motivations was to study how to approximate large-domain problems.
1.1 Organization
After presenting our quadratic formulations and relaxations in Section 2, we present our rounding procedure in Section 3. In Section 4, we discuss Brownian motion and how it relates to our rounding procedure. Then we prove that our rounding procedure is feasible most of the time; precisely, at least of the time, it assigns a (non-random) position to a variable. First, we prove this for a continuous process (Section 5) and then for a discrete process (Section 6). Finally, in Section 7, we state a conjecture regarding the correlation of two random walks, which is supported by extensive computational experiments. A positive resolution to this conjecture would be one way to prove that this rounding procedure has a guarantee close to the best-known guarantee of [MN11] (and close to the best-possible guarantee of under the Unique-Games Conjecture).
2 Quadratic Programs
For each variable , we have a set of unit vectors, , for a total of vectors. For , let . Note that . Let denote a particular (fixed) set of vectors with the property that for , . For example, if , then the set can be the following eight vectors.
[TABLE]
This formula for creating such a set of vectors can be generalized for any even value of (where the absolute value of each coordinate of each vector is ). We obtain the following quadratic program for Rel-Lin-Eq. Let denote the set of integers in .
A Quadratic Program ():
(1)
(2)
(3)
(4)
For each variable , the corresponding set of vectors has the same configuration up to rotation, reflection and translation. This is enforced by Constraints (1) and (3). In an integral solution, the set of vectors corresponding to variable is identical to the set of vectors corresponding to the variable , for all variables . This follows from the fact that all vectors belong to . The only difference is that vectors in the two sets may have different labels (i.e., one set of vectors can be viewed as a rotation of the other set). Then the relative values or positions of two variables depend only on the rotations of the labels. In other words, if for variables and , the same vectors have the same labels, then the two variables will be assigned to the same position. If and is in position , should be in position . Given an integral solution, we can determine the position of each variable by picking a vector, and assigning each variable to the label to which that vector corresponds for that variable.
2.1 Semidefinite Relaxations
To obtain a semidefinite relaxation of (), we remove Constraint (4) and require only that each . Note that even in the semidefinite relaxation, the set of vectors corresponding to a particular variable have the same configuration for each variable up to rotation, reflection and translation. We refer to the set of vectors corresponding to a variable as a constellation . We show that certain properties hold for each constellation.
A Semidefinite Program ():
(5)
(6)
(7)
(8)
Let be a vector in in which each entry is . Let be the indicator vector which has a in the position and 0 elsewhere. For such that , we define as follows:
[TABLE]
Definition 1.
Let the constellation be the set of unit vectors defined as follows. For such that , define . For such that , let .
Lemma 1.
For any , the constellation is equivalent to the constellation up to rotation, reflection and translation.
Proof.
Let . Without loss of generality, let us assume that for . We can assume this since for all , we have (i) (Lemma 2) and (ii) for all such that (Lemma 3).
Note that . This implies that and that . Thus, there is some rotation of the vectors in such that the resulting set of vectors is equivalent to .∎
Lemma 2.
For all and , .
Proof.
[TABLE]
∎
Lemma 3.
For and for , the vectors if and are non-overlapping intervals.
Proof.
[TABLE]
This equals 0 if the intervals and are non-overlapping, since then .∎
Another way to write a semidefinite program is to use a standard formulation based on assignment constraints (e.g., see Quadratic Program in [MN11]).
A Semidefinite Program ():
(9)
(10)
(11)
(12)
(13)
(14)
Given a solution for , we can construct a solution for as follows.
[TABLE]
It is not difficult to see that the transformation in (15) preserves the objective value. In our computational experiments, we used solutions for , which are more constrained than solutions for (e.g., Constraint (9) is not implied by the constraints in ). However, we feel it is somewhat clearer to present the rounding algorithm in the next section based on a solution for .
2.2 Relaxation on an Arbitrarily Large Domain
Note that we can replace with an arbitrarily large constant . Suppose is a multiple of (i.e., ). Then we can scale each constraint so that becomes . The optimal objective values of the original and the scaled problems are the same. Moreover, given a solution for on the domain of size , we can create a solution for the scaled problem on the domain of size with the same objective value without re-solving the relaxation . Thus, we can assume that is an extremely large constant. We assume this since our rounding algorithms work best on a large domain.
3 Rounding the Relaxation
Our algorithm for Rel-Lin-Eq is based on rounding a solution for the semidefinite relaxation () presented in Section 2.1. The first issue is, how do we use the constellation of vectors to determine the position or value of variable ? We will consider the following random process with steps. Let be a random vector in which each coordinate is chosen according to the normal distribution . We can view the values as a discrete random process in which the expected correlation of and is given by the dot product .
Let us view these values as a discrete random process on the interval . For a subinterval , we say time step is in the interval if and .
Given such a random process, we say that there is an extreme sign change with threshold between times and if , and for all . Our algorithm is based on the observation that in this random process, it is very likely that there is exactly one extreme sign change for the threshold (i.e., there do not exist two disjoint intervals that both contain extreme sign changes). This is stated in Theorem 1. If this random process has exactly one extreme sign change, then we say that the process first reaches a threshold at time if there is an interval such that , , and for all . Note that these intervals are taken modulo (i.e., they are intervals on a circle).
Definition 2.
An extreme sign change with threshold in the sequence occurs when and and for no value of is .
Theorem 1.
If is a sufficiently large constant, then with probability at least , the random process with steps has exactly one extreme sign change with threshold 1.
3.1 Rounding Algorithm
Given Theorem 1, we present the following rounding algorithm.
(i) Solve the semidefinite relaxation ().
(ii) Choose a random vector where each coordinate for .
(iii) For each variable , consider the sequence for all .
    (a) If there is one extreme sign change: place at the where first reaches threshold 1.
    (b) If there are no extreme sign changes: assign a random position in .
For each , we associate the random walk . Each walk has the same expected behaviour. This follows from the fact that for each , there is a rotation matrix such that the set of vectors is equal to a canonical set of vectors, as stated in Lemma 1 (i.e., for a fixed vertex , the pairwise (setwise) relationships of the vectors are exactly prescribed by constraints (2) and (4) of the SDP). We prove Theorem 1 in Section 5. First, we briefly discuss Brownian motion, which we will use in the proof of Theorem 1.
To measure the quality of the solution produced by this rounding algorithm, one must analyze the correlation of two random walks. In Section 7, we state a conjecture regarding this correlation, which is supported by extensive computational experiments.
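The per-variable step of the rounding algorithm can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes one particular reading of the extreme-sign-change definition: the circular sequence has a value at most -t at some time i, a value at least t at a later time j, and every value strictly between those times has absolute value below t. The function names are hypothetical, and sending sequences with more than one extreme sign change to a random position is also an assumption, since the algorithm as stated only covers the cases of one and zero extreme sign changes.

```python
import random


def extreme_sign_changes(x, t=1.0):
    """Return the times j at which the circular sequence x completes an
    upward extreme sign change (assumed reading: x[i] <= -t, x[j] >= t,
    and |x[l]| < t for every l strictly between i and j on the circle)."""
    # Times at which the sequence is at or beyond either threshold.
    extreme = [i for i, v in enumerate(x) if abs(v) >= t]
    hits = []
    # Pair each extreme time with the next one, wrapping around the circle.
    for a, b in zip(extreme, extreme[1:] + extreme[:1]):
        if x[a] <= -t and x[b] >= t:
            hits.append(b)
    return hits


def round_variable(x, rng, t=1.0):
    """Assign a position in {0, ..., len(x) - 1} to one variable."""
    hits = extreme_sign_changes(x, t)
    if len(hits) == 1:
        # Exactly one extreme sign change: use the time the threshold is reached.
        return hits[0]
    # No extreme sign change (or, by assumption, several): random position.
    return rng.randrange(len(x))


if __name__ == "__main__":
    x = [0.2, -1.5, -0.3, 0.4, 1.2, 0.5, -0.1]
    print(round_variable(x, random.Random(0)))  # prints 4
```

In the algorithm, the sequence x for a variable would be obtained by taking the dot product of each vector in its constellation with the shared random Gaussian vector; here it is passed in directly so that the detection logic can be read in isolation.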
4 Brownian Motion
In order to analyze the randomized rounding scheme presented above, we will interpret the sequence of vectors corresponding to any fixed variable, as a random walk. We will show that this random walk is a discrete sampling of a fundamental continuous stochastic process called Brownian motion. We will then use properties of Brownian motion to prove properties of our discrete walk. For background in Brownian motion, we refer the reader to the textbook [KS88].
A stochastic process for is a Brownian motion if it satisfies the following properties:
1. For times , .
2. For all choices of times , is independent of for all choices of .
3. .
4. has continuous sample paths with probability 1.
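For reference, the standard textbook definition of Brownian motion (see, e.g., [KS88]) reads as follows. The symbol $B_t$ and the time indices are notation assumed here and may differ from the notation used in the rest of the paper.

```latex
\begin{enumerate}
  \item For times $0 \le s < t$: \quad $B_t - B_s \sim \mathcal{N}(0,\; t - s)$.
  \item For times $0 \le t_0 < t_1 < \dots < t_n$, the increments
        $B_{t_1} - B_{t_0},\ \dots,\ B_{t_n} - B_{t_{n-1}}$ are mutually independent.
  \item $B_0 = 0$.
  \item The map $t \mapsto B_t$ is continuous with probability 1.
\end{enumerate}
```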
Our proofs will rely on two basic properties of Brownian motion: the distributions of hitting times and the Reflection Principle.
The first hitting time for level , denoted by , is defined to be the first time at which takes the value : . is a random variable whose distribution has the density function:
[TABLE]
Roughly stated, the Reflection Principle is the intuitive property that once a Brownian motion hits a level , it is equally likely to be above and below the level in the future. More precisely, it states that if is a Brownian motion and its hitting time to , then the process defined by
[TABLE]
(which is the formula for “reflected” about the horizontal line after it hits ) is also a Brownian motion. We will use the reflection principle many times in our calculations.
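The two properties above are linked by a classical identity that can be checked numerically. The sketch below assumes a standard Brownian motion started at 0 and a level a > 0; the function names are hypothetical. It uses the textbook hitting-time density a / sqrt(2 pi t^3) * exp(-a^2 / (2t)) and the consequence of the Reflection Principle that the probability of hitting a by time t equals twice the probability of being above a at time t.

```python
import math


def hitting_time_density(a, t):
    """Textbook density of the first hitting time of level a > 0."""
    return a / math.sqrt(2.0 * math.pi * t ** 3) * math.exp(-a * a / (2.0 * t))


def hitting_cdf_numeric(a, t, n=20000):
    """P(T_a <= t) by midpoint-rule integration of the density over (0, t]."""
    h = t / n
    return h * sum(hitting_time_density(a, (i + 0.5) * h) for i in range(n))


def hitting_cdf_reflection(a, t):
    """P(T_a <= t) = 2 P(B_t >= a) = erfc(a / sqrt(2 t)), by reflection."""
    return math.erfc(a / math.sqrt(2.0 * t))


if __name__ == "__main__":
    for a, t in [(1.0, 1.0), (0.5, 2.0), (2.0, 1.0)]:
        print(a, t, hitting_cdf_numeric(a, t), hitting_cdf_reflection(a, t))
```

The two columns agree up to the integration error, which illustrates why reflection arguments give closed forms for barrier-crossing probabilities.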
4.1 Mapping Our Process to Brownian Motion
Given a constellation of vectors , we can assume by Lemma 1 that the vectors have the following configuration. Each vector in this configuration has dimension . Note that in order to make each vector a unit vector, we can multiply each entry by .
[TABLE]
Suppose that is a vector such that . Let . Define the process as . Thus, we have that:
[TABLE]
Let the vector represent the vector with each entry multiplied by , so that is a set of unit vectors. Let , and define the process as . Thus, we have that:
[TABLE]
Note that when or , it is the case that and , respectively. If we define to be a Brownian motion on the continuous interval , then is a discretization of this continuous process. In the remainder of the paper, when we refer to a particular process , we will drop the superscript when it is clear from the context, or when we are just referring to a single process.
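To see concretely why such a process is a sampled Gaussian random walk, consider the sketch below. The "staircase" constellation it builds is a hypothetical construction, assumed here because it is consistent with the stated properties (unit vectors whose coordinates all have the same absolute value, with dimension k/2 assumed); it is not taken from the paper's table. Under this assumption, consecutive values of the process differ by an independent scaled Gaussian coordinate, which is exactly the increment structure of a discretized Brownian motion.

```python
import numpy as np


def staircase_constellation(k):
    """Hypothetical constellation of k unit vectors in dimension d = k/2.

    For i < d, vector i has its first i coordinates equal to -1/sqrt(d) and
    the rest equal to +1/sqrt(d); vector i + d is the negation of vector i.
    """
    if k % 2 != 0 or k <= 0:
        raise ValueError("k must be a positive even integer")
    d = k // 2
    half = np.ones((d, d))
    for i in range(d):
        half[i, :i] = -1.0
    half /= np.sqrt(d)
    return np.vstack([half, -half])


def walk(constellation, g):
    """Project every constellation vector onto the Gaussian vector g."""
    return constellation @ g


if __name__ == "__main__":
    k = 8
    d = k // 2
    rng = np.random.default_rng(0)
    g = rng.standard_normal(d)
    x = walk(staircase_constellation(k), g)
    # Each step flips the sign of one coordinate, so the increment of the
    # process is -2 * g[i] / sqrt(d): independent Gaussian increments.
    print(np.allclose(np.diff(x[: d + 1]), -2.0 * g / np.sqrt(d)))
```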
5 Probability of Exactly One Extreme Sign Change
In this section, we prove Theorem 1. We adopt standard statistical notation and denote the density function of a continuous random variable by . For example, if , then . (Heuristically, denotes a very small region around the value .) We will furthermore write to denote the density function of when takes the specific value of 1.
We compute the probability of an event conditioned on a random variable taking value by applying the formula:
[TABLE]
5.1 Probability of at Least One Sign Change
Suppose Brownian motion begins at and at satisfies . Let be the minimum time that reaches the threshold , and be the minimum time that reaches . Note that and that . These hitting times depend on the value of at time , and are therefore not stopping times. In order to calculate their distributions, we will first fix and then calculate the distributions of and , conditioned on the value of .
For our proof, we will find it helpful to generalize our definition of and as follows. Suppose a Brownian motion satisfies . Define to be the first time that finishes hitting the following sequence of barriers: , then , then and so forth until it has crossed barriers, alternating between the upper and lower barriers. As an example, would be the first time that the path hits after having first hit , and then . Similarly, define to be the first time that finishes hitting the sequence of barriers: below , above , and so forth until it has crossed barriers.
Lemma 4.
.
Our approach will be to consider the decomposition of the total probability into probabilities conditioned on :
[TABLE]
and calculate the integral on the right-hand-side.
We break the domain of the integral into three parts.
5.1.1 Case (i):
In this case, the upper barrier is . The condition implies that crosses this barrier with probability 1 (i.e. ). Therefore,
[TABLE]
5.1.2 Case (ii):
Analogous to Case (i).
5.1.3 Case (iii):
From an application of the inclusion-exclusion principle, note that:
[TABLE]
5.1.4 An example of applying the reflection principle
We now sketch the reasoning behind a standard calculation involving the reflection principle of Brownian motion and apply it to calculating for the case . We will use this sort of calculation many times in our proofs.
Fix a value of . Suppose a Brownian motion (but not restricted to satisfy ) hits the value at time (i.e., ). Define to be the process for and for . By the reflection principle, the random process is also a Brownian motion.
Now consider the subset of Brownian motions (i.e., they satisfy ). Note that these processes correspond exactly to reflected processes that satisfy . Thus, the elements in the set correspond exactly to elements in the set . Then
[TABLE]
For a more rigorous justification of these calculations, see [Cha01].
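The generic form of this reflection calculation can be written out as a worked equation. The notation is assumed here and is not the paper's: $B$ is a standard Brownian motion started at 0, $T_a$ is its first hitting time of a level $a > 0$, $t$ is the time horizon, and $x \le a$ is the conditioned endpoint. The paper's specific barrier and horizon are not reproduced.

```latex
\begin{align*}
\Pr[T_a \le t,\; B_t \in dx]
  &= \Pr[B_t \in d(2a - x)]
   = \frac{1}{\sqrt{2\pi t}} \exp\!\left(-\frac{(2a - x)^2}{2t}\right) dx, \\[4pt]
\Pr[T_a \le t \mid B_t = x]
  &= \frac{\exp\!\left(-\frac{(2a - x)^2}{2t}\right)}{\exp\!\left(-\frac{x^2}{2t}\right)}
   = \exp\!\left(-\frac{(2a - x)^2 - x^2}{2t}\right)
   = \exp\!\left(-\frac{2a(a - x)}{t}\right).
\end{align*}
```

The first line is the reflection step: paths that hit $a$ and end at $x$ correspond one-to-one to reflected paths that end at $2a - x$. The second line divides by the density of $B_t$ at $x$.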
Similarly, one can show that: . Therefore
[TABLE]
5.1.5 An example applying the reflection principle twice
Note that the event corresponds to processes that either cross above the barrier then below or vice versa, i.e. processes that satisfy or . Therefore, from the inclusion-exclusion principle we have:
[TABLE]
In order to calculate , we apply the reflection principle twice. First, we reflect about the line when it first hits . Call this reflected process . A process that hits , then hits , then satisfies will correspond exactly to a reflected process that first hits then hits then achieves .
Next, we reflect the process the first time it hits about the line ; call this new process . It is easy to verify that a process (prior to reflection) that hits , then , then achieves will correspond exactly to a reflected process that satisfies . Therefore,
[TABLE]
In order to calculate , we apply the reflection principle twice. After the process first hits and then hits , we reflect the process about the line . Call this reflected process . A process that hits , then hits , then achieves will be reflected to a process that first hits and then achieves . Next, we reflect the process the first time it hits about the line ; call this new process . It is easy to verify that a process (prior to reflection) that hits , then , then achieves will correspond exactly to a twice reflected process that achieves . Therefore,
[TABLE]
Therefore,
[TABLE]
Similarly, one can show
[TABLE]
Note that the event corresponds to the event . From the calculations in Section 5.6, the following bound can be easily derived:
[TABLE]
Combining these calculations, we arrive at:
[TABLE]
5.2 Totals
Combining the results of the three cases, we obtain:
[TABLE]
5.3 Probability of Three or More Sign Changes
In this section, we prove the following Lemma:
Lemma 5.
Since the barriers in and depend on the value of , as in the previous section, it will be necessary to decompose the total probability into probabilities conditioned on :
[TABLE]
We partition the domain of the integral into three cases, and calculate the probabilities in each case using the reflection principle.
A Brownian motion, , begins at 0 and after time steps achieves the value . Then let be the minimum time that finishes reaching the thresholds in that order. (Let be the minimum time that reaches the thresholds in that order.) We define another process , which is a reflection of the process over certain thresholds (depending on the case). There are three cases.
5.4 Case (i):
In this case, we only need to calculate the probability that occurs, since if occurs, then must also occur.
To obtain , the process is reflected the first time it hits , then the first time this reflected process hits . Using reasoning similar to Section 5.1.5, it can be shown that if the process (prior to reflection) hits , then , then , then satisfies (i.e., it satisfies for fixed ), then it will correspond exactly to a process that achieves . Therefore,
[TABLE]
Thus, we have:
[TABLE]
5.5 Case (ii):
Analogous to Case (i).
5.6 Case (iii):
By the inclusion-exclusion principle, we have:
[TABLE]
First, we calculate . The process is obtained by reflecting the first time it hits , then the first time the reflected process hits , then the first time the twice reflected process hits . If (prior to reflection) hits , then , then , then achieves , then it will correspond exactly to a thrice reflected process that satisfies .
We want to calculate:
[TABLE]
We have:
[TABLE]
Thus, we have:
[TABLE]
Now we calculate . In this case, the process is obtained by reflecting the first time it hits , then the first time the reflected process hits , then the first time the twice reflected process hits . A process that hits , then , then , then satisfies (i.e., it satisfies for fixed ) will correspond exactly to a thrice reflected process that satisfies . We want to compute:
[TABLE]
We have:
[TABLE]
Thus, we have:
[TABLE]
Thus, a naive bound on the probability of three sign changes would be to add expressions (17) and (18):
[TABLE]
The above bound is an overestimate of the probability, because the event is contained in both (17) and (18).
We now calculate . Note that the event occurs when there are at least four sign changes; either or occurs, or possibly both. Using the same argument as we did for , , it can be shown that:
[TABLE]
and that
[TABLE]
Again applying the inclusion-exclusion principle, we have:
[TABLE]
Therefore:
[TABLE]
5.7 Totals
Combining the results of the three cases, we arrive at:
[TABLE]
6 From Brownian Motion to Discrete Random Walks
The randomized rounding procedure for our algorithm involves a discrete random walk; we have proven Lemmas 4 and 5 for the continuous process, Brownian motion. We show in this section that the discretized random walk of the rounding procedure will also satisfy Lemmas 4 and 5.
Suppose is a Brownian motion. As we showed earlier, the discretized random walk of steps, , can be modeled as the sequence: .
First, consider the question of whether Lemma 5 implies that also does not touch the sequence of barriers before time . Certainly, if does not hit this sequence of barriers, then its discretized version also does not hit these three barriers, since . Therefore Lemma 5 holds for the discrete random walk as well.
Now consider the question of whether Lemma 4 implies that hits either of the barriers or . Note that if hits the barrier at time , it is not necessarily true that there exists an such that , since could have hit at some time between the steps of the discretized walk. Therefore, Lemma 4 cannot be immediately adapted to proving properties of the discretized walk. We now prove that the Lemma is true for the discretized walk when the number of steps is a sufficiently large constant.
Recall that random variables and were defined as the first times that the Brownian motion hits the barrier defined by and , respectively. We slightly strengthen these conditions by defining random variables and to be the first times that hits the barriers and , respectively, for some very small constant .
Since will be chosen to be very small, it will not have a large impact on the distributions of and relative to and . The proof of the following lemma involves the same calculations as in the proof of Lemma 4.
Lemma 6.
For chosen sufficiently small,
[TABLE]
We use the above Lemma to prove that if the continuous process hits the barrier , then the discrete random walk will hit the barrier with high probability. The case for the barrier is similar.
Lemma 7.
If the number of steps of the discretized random walk satisfies for some constant , then:
[TABLE]
where and is the time the continuous process hits the barrier .
Proof.
Let be the barrier of interest. Since depends on the value of , as in Sections 5.1 and 5.3, we will work with probabilities conditioned on the event .
Note that (a) ; therefore, with probability at least , and . Also, (b) the probability that , for some constant , conditioned on , is at least 0.999. This is because the density function of conditioned on is given by:
[TABLE]
Therefore,
[TABLE]
for appropriately chosen . In particular is sufficiently small.
If is the time that the process hits the barrier , let the index denote the step in the discretized random walk that immediately follows . The value of this step is . Intuitively, this value should be very close to if the number of steps is sufficiently large. Indeed, we will prove the lemma by showing that if the number of steps in the discrete random walk satisfies , then
[TABLE]
Suppose that is conditioned on reaching the barrier at time and that is restricted to satisfying . We use basic properties of the distribution of the increments of a Brownian Bridge (see [Cha01] for details) to show that the value of a Brownian motion at time , under the condition that and , has the following distribution:
[TABLE]
Note that is the index of the closest step in the discretization to and that . If and , then Equation (24) implies that is distributed with mean at least and variance at most . Thus, if ,
[TABLE]
The Lemma follows. ∎
Lemma 7 can thus be applied to prove Theorem 1.
7 Correlated walks
To prove an approximation ratio for our rounding algorithm, we need to show that the positions of and (corresponding to the constraint ), determined by the random walks and , are close to the required distance if the vectors and are close. In other words, without loss of generality, let us assume that for a fixed constraint, we have . Then our goal is to show that the distance between the two positions assigned by our rounding procedure to and is small if the vectors and are close. After extensive computational investigation (on solutions obeying the constraints of ), we believe the following conjecture holds.
Conjecture 1.
In our rounding scheme, the expected distance between and is bounded above by if both and each have exactly one extreme sign change.
Proving the above conjecture would lead to an approximation guarantee slightly below , because we do not have an extreme sign change with probability 1.
We can show that if and have a small angle, then the two walks are (globally) close to each other in the sense that the area between the two walks is small. However, this does not immediately lead to a proof that the positions of their extreme sign changes are close.
Lemma 8.
Given two unit vectors and with angle , and a vector with each coordinate drawn from , then,
[TABLE]
Proof.
Let and . Let .
[TABLE]
The expected value of given that it is non-negative is . Since is always non-negative for from 0 to , the above statement follows by linearity of expectation. ∎
If we consider the random walks on the interval (i.e. we map the interval to the smaller interval ), then the expected area between the two walks is . Thus, as the contribution to the objective function increases, the two walks converge and the positions assigned to them by the rounding procedure should converge to one another.
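The single-step quantity behind area bounds of this kind can be checked numerically. The sketch below is an illustration under stated assumptions, not a reproduction of Lemma 8: it takes two unit vectors u and v at angle theta and a standard Gaussian vector g, and compares a Monte Carlo estimate of the expected absolute difference between g·u and g·v with the closed form 2 sin(theta/2) sqrt(2/pi). The closed form follows because the difference is Gaussian with variance 2 - 2 cos(theta), and a centered Gaussian with standard deviation s has expected absolute value s sqrt(2/pi). The function names, dimension, sample count and seed are all hypothetical choices.

```python
import math

import numpy as np


def expected_gap_closed_form(theta):
    """E|g.u - g.v| for unit vectors u, v at angle theta and standard Gaussian g."""
    return 2.0 * math.sin(theta / 2.0) * math.sqrt(2.0 / math.pi)


def expected_gap_monte_carlo(theta, dim=6, samples=200_000, seed=0):
    """Monte Carlo estimate of the same expectation."""
    rng = np.random.default_rng(seed)
    u = np.zeros(dim)
    u[0] = 1.0
    v = np.zeros(dim)
    v[0], v[1] = math.cos(theta), math.sin(theta)
    g = rng.standard_normal((samples, dim))
    return float(np.mean(np.abs(g @ u - g @ v)))


if __name__ == "__main__":
    for theta in (0.1, 0.5, 1.0):
        print(theta, expected_gap_closed_form(theta), expected_gap_monte_carlo(theta))
```

As theta shrinks, the expected gap shrinks roughly linearly in theta, which matches the intuition that walks driven by nearby vectors stay close to one another.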
Acknowledgements
We would like to thank Martin Becker and Larry Shepp for helpful discussions about Brownian motion. Most of this work was done in 2007 at the Max-Planck-Institut für Informatik in Saarbrücken, Germany.
References
- [Ban10] Nikhil Bansal. Constructive algorithms for discrepancy minimization. In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 3–10, 2010.
- [Cha01] Joe Chang. Brownian motion. Lecture notes for Statistics 251/551, Yale University, 2001.
- [GHM+11] Venkatesan Guruswami, Johan Håstad, Rajsekar Manokaran, Prasad Raghavendra, and Moses Charikar. Beating the random ordering is hard: Every ordering CSP is approximation resistant. SIAM Journal on Computing, 40(3):878–914, 2011.
- [KS88] Ioannis Karatzas and Steven E. Shreve. Brownian Motion and Stochastic Calculus. Springer-Verlag, New York, 1988.
- [MN11] Konstantin Makarychev and Alantha Newman. Complex semidefinite programming revisited and the assembly of circular genomes. In Innovations in Computer Science (ICS), pages 444–459, 2011.
- [Sin11] Amit Singer. Angular synchronization by eigenvectors and semidefinite programming. Applied and Computational Harmonic Analysis, 30(1):20, 2011.
