Randomized Bicriteria Approximation Algorithm for Minimum Submodular Cost Partial Multi-Cover Problem
Yishuo Shi, Zhao Zhang, Ding-Zhu Du

TL;DR
This paper introduces a randomized bicriteria approximation algorithm for the complex minimum submodular cost partial multi-cover problem, achieving near-complete coverage with provable performance bounds under certain conditions.
Contribution
It presents the first randomized bicriteria algorithm for SCPMC with guarantees, assuming constant maximum covering requirement and submodular cost functions.
Findings
Achieves $(q-\varepsilon)$-coverage with high probability.
Performance ratio is $O(b/\varepsilon)$, where $b=\max_e \binom{f}{r_e}$.
Provides a bicriteria $O(f/\varepsilon)$-approximation for the case $r \equiv 1$.
Abstract
This paper studies a randomized approximation algorithm for a variant of the set cover problem called minimum submodular cost partial multi-cover (SCPMC), in which each element $e$ has a covering requirement $r_e$ and a profit $p_e$, and the cost function on sub-collections of sets is submodular. The goal is to find a minimum cost sub-collection of sets which fully covers at least a $q$-percentage of the total profit, where an element $e$ is fully covered by a sub-collection $\mathcal{S}'$ if and only if it belongs to at least $r_e$ sets of $\mathcal{S}'$. Previous work shows that such a combination enormously increases the difficulty of the problem, even when the cost function is linear. In this paper, assuming that the maximum covering requirement is a constant and the cost function is nonnegative, monotone nondecreasing, and submodular, we give the first randomized bicriteria algorithm…
Yishuo Shi1, Zhao Zhang2, Ding-Zhu Du3
1 College of Mathematics and System Sciences, Xinjiang University
Urumqi, Xinjiang, 830046, China.
2 College of Mathematics Physics and Information Engineering, Zhejiang Normal University
Jinhua, Zhejiang, 321004, China.
3 Department of Computer Science, University of Texas at Dallas
Richardson, Texas, 75080, USA. Corresponding Author: Zhao Zhang, [email protected].
Abstract
This paper studies a randomized approximation algorithm for a variant of the set cover problem called minimum submodular cost partial multi-cover (SCPMC).
In a partial set cover problem, the goal is to find a minimum cost sub-collection of sets covering at least a required fraction of elements. In a multi-cover problem, each element $e$ has a covering requirement $r_e$, and the goal is to find a minimum cost sub-collection $\mathcal{S}'$ of sets which fully covers all elements, where an element $e$ is fully covered by $\mathcal{S}'$ if $e$ belongs to at least $r_e$ sets of $\mathcal{S}'$. In a minimum submodular cost set cover problem (SCSC), the cost function on sub-collections of sets is submodular and the goal is to find a set cover with the minimum cost.
The SCPMC problem studied in this paper is a combination of the above three problems, in which the cost function on sub-collections of sets is submodular and the goal is to find a minimum cost sub-collection of sets which fully covers at least a $q$-percentage of all elements. Previous work shows that such a combination enormously increases the difficulty of the problem, even when the cost function is linear.
In this paper, assuming that the maximum covering requirement is a constant and the cost function is nonnegative, monotone nondecreasing, and submodular, we give the first randomized bicriteria algorithm for SCPMC, the output of which fully covers at least a $(q-\varepsilon)$-percentage of all elements and whose performance ratio is $O(b/\varepsilon)$ with high probability, where $b=\max_e\binom{f}{r_e}$ and $f$ is the maximum number of sets containing a common element. The algorithm is based on a novel non-linear program. Furthermore, in the case when the covering requirement $r\equiv 1$, a bicriteria $O(f/\varepsilon)$-approximation can be achieved even when the monotonicity requirement is dropped from the cost function.
Keywords: partial cover, multi-cover, submodular cover, Lovász extension, randomized algorithm, approximation algorithm, bicriteria.
1 Introduction
Set Cover is one of the most important combinatorial optimization problems in both theory and applications; its goal is to find a sub-collection of sets with the minimum cost to cover all elements. There are many variants of the set cover problem. The minimum partial set cover problem (PSC) is to find a minimum cost sub-collection of sets to cover at least a $q$-percentage of all elements. One motivation of PSC comes from the phenomenon that in the real world, “satisfying all requirements” will be too costly or even impossible, because of resource limitations or political policy. Another variant is the minimum multi-cover problem (MC), which comes from the requirement of fault tolerance in practice. In MC, each element $e$ has a covering requirement $r_e$, and the goal is to find a minimum cost sub-collection $\mathcal{S}'$ to fully cover all elements, where element $e$ is fully covered by $\mathcal{S}'$ if $e$ belongs to at least $r_e$ sets of $\mathcal{S}'$. Another generalization of set cover is submodular cost set cover (SCSC), in which the cost function on sub-collections of sets is submodular and the goal is to find a set cover with the minimum cost. Submodular functions have a natural diminishing returns property which finds wide applications in the real world, including economics, game theory, machine learning, and computer vision.
In this paper, we consider a problem which is a combination of the above three problems. In the minimum submodular cost partial multi-cover problem (SCPMC), each element has a profit as well as a covering requirement, and the goal is to find a minimum submodular cost sub-collection of sets such that the profit of fully covered elements is at least a fixed percentage of the total profit.
1.1 Related Work
For Set Cover, Hochbaum [9] gave an $f$-approximation algorithm based on LP rounding, where $f$ is the maximum number of sets containing a common element. Khot and Regev [13] showed that the set cover problem cannot be approximated within $f-\varepsilon$ for any constant $\varepsilon>0$ assuming that the unique games conjecture is true. Another classic result on Set Cover is that the greedy strategy yields an $O(\ln\Delta)$-approximation [5, 11, 17], where $\Delta$ is the maximum cardinality of a set. Dinur and Steurer [4] showed that the set cover problem cannot be approximated to $(1-o(1))\ln n$ unless $P=NP$, where $n$ is the size of the ground set.
For MC, Dobson [6] gave an -approximation algorithm for the minimum multi-set multi-cover problem (MSMC), where is the maximum size of a multi-set and is the harmonic number (recall that ). Rajagopalan and Vazirani [18] gave a greedy algorithm achieving the same performance ratio, using dual fitting analysis. For the minimum set -cover problem in which the covering requirement of every element is , Berman et al. [2] gave a randomized algorithm achieving expected performance ratio at most .
For PSC, Kearns [12] gave the first greedy algorithm achieving performance ratio $2H(n)+3$. Refining the greedy algorithm, Slavík [21] improved the ratio to $\min\{H(\Delta),H(\lceil qn\rceil)\}$, where $q$ is the desired covering ratio. Using the primal-dual method, Gandhi et al. [8] obtained an $f$-approximation. Bar-Yehuda [1] studied a generalized version of the partial cover problem in which each element has a profit. Using the local ratio method, he also obtained an $f$-approximation. Proposing a Lagrangian relaxation framework, Könemann et al. [14] gave a $(\frac{4}{3}+\varepsilon)H(\Delta)$-approximation for the generalized partial cover problem.
From the above related work, it can be seen that both PSC and MC admit performance ratios which match those best ratios for the classic set cover problem. However, combining partial cover with multi-cover seems to enormously increase the difficulty of studies. Ran et al. [19] were the first to study approximation algorithm for the minimum partial multi-cover problem (PMC). Using greedy strategy and a delicate dual fitting analysis, they gave a -approximation algorithm, where , , and , are the maximum and the minimum cost of set, , are the maximum and the minimum covering requirement of element, respectively. This ratio is meaningful only when the covering percentage is very close to 1. In [20], Ran et al. presented a simple greedy algorithm achieving performance ratio . Recall that in terms of , greedy algorithm for Set Cover achieves performance ratio . So, ratio for PMC is exponentially larger than the one for Set Cover. In the same paper, they also presented a local ratio algorithm which reveals an interesting “shock wave” phenomenon: their performance ratio is for both PSC (that is, when which is the partial single cover problem) and MC (that is, when which is the full multi-cover problem); however, when is smaller than 1 by a very small constant, the ratio jumps abruptly to .
The submodular cost set cover problem was first proposed by Iwata and Nagano [10]. They gave an $f$-approximation algorithm for nonnegative submodular functions. In [15], Koufogiannakis and Young generalized the set cover constraint to arbitrary covering constraints and gave an $f$-approximation algorithm for monotone nondecreasing nonnegative submodular functions.
In this paper we combine submodular cost function with partial multi-cover constraint. As one can see from previous results on PMC, even when the cost function is linear, the partial multi-cover problem is already very difficult.
1.2 Our Contribution
The major contribution of this paper is a randomized bicriteria $O(b/\varepsilon)$-approximation algorithm for SCPMC; that is, the algorithm produces a solution fully covering at least a $(q-\varepsilon)$-percentage of the total profit, and achieves performance ratio $O(b/\varepsilon)$ with high probability, where $b=\max_e\binom{f}{r_e}$ and $f$ is the maximum number of sets containing a common element.
Before presenting this algorithm, we show that a natural integer program for SCPMC does not work since its integrality gap is arbitrarily large. Hence, to obtain a good approximation, we propose a novel integer program. The relaxation of the integer program uses Lovász extension [16]. Our algorithm consists of two stages of rounding. The first stage is a deterministic rounding. The second stage is a random rounding, the analysis of which is based on an equivalent expression of Lovász extension [3] in view of expectation.
As far as we know, this is the first approximation algorithm for a partial version of the submodular multi-cover problem. Furthermore, we show that for the special case when the covering requirement $r\equiv 1$ (the special case is abbreviated as SCPSC), our method can be adapted to yield an $O(f/\varepsilon)$-approximation with high probability, even when monotonicity is dropped from the requirements on the cost function.
This paper is organized as follows. In Section 2, we introduce formal definitions of problems considered in this paper, as well as some technical results. The bicriteria randomized algorithm for SCPMC is presented and analyzed in Section 3. In Section 4, we show how to adapt our algorithm to deal with SCPSC. The last section concludes the paper and discusses some future work.
2 Preliminaries
Definition 2.1 (Submodular Cost Partial Multi-Cover (SCPMC)).
Suppose $E$ is an element set and $\mathcal{S}$ is a collection of subsets of $E$ with $\bigcup_{S\in\mathcal{S}}S=E$; each element $e\in E$ has a positive covering requirement $r_e$ and a positive profit $p_e$; the cost function $\rho:2^{\mathcal{S}}\to\mathbb{R}$ is defined on sub-collections of $\mathcal{S}$ and is nonnegative, monotone nondecreasing, and submodular. Given a constant $q\in(0,1]$ called the covering ratio, the SCPMC problem is to find a minimum cost sub-collection $\mathcal{F}\subseteq\mathcal{S}$ such that $\sum_{e\sim\mathcal{F}}p_e\ge qP$, where $P=\sum_{e\in E}p_e$ is the total profit and $e\sim\mathcal{F}$ means that $e$ is fully covered by $\mathcal{F}$, that is, $|\{S\in\mathcal{F}:e\in S\}|\ge r_e$. An instance of SCPMC is denoted as $(E,\mathcal{S},r,p,q,\rho)$.
In particular, when $r\equiv 1$, we call the problem a submodular cost partial set cover problem (SCPSC). When the cost function is linear, that is, every set $S\in\mathcal{S}$ has a cost $c_S$ and the cost of a sub-collection $\mathcal{F}$ is $\sum_{S\in\mathcal{F}}c_S$, the problem is exactly the minimum partial multi-cover problem (PMC).
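To make the feasibility condition of SCPMC concrete, here is a minimal Python sketch that computes the elements fully covered by a chosen sub-collection and checks the profit constraint. The function names and the dictionary-based encoding of an instance are assumptions made for illustration only; they do not come from the paper.

```python
from typing import Dict, Hashable, Iterable, Set


def fully_covered(requirement: Dict[Hashable, int],
                  chosen_sets: Iterable[Set[Hashable]]) -> Set[Hashable]:
    """Return the elements e that belong to at least r_e of the chosen sets."""
    chosen = list(chosen_sets)
    covered = set()
    for e, r in requirement.items():
        # Count how many chosen sets contain e.
        if sum(1 for s in chosen if e in s) >= r:
            covered.add(e)
    return covered


def is_feasible(requirement: Dict[Hashable, int],
                profit: Dict[Hashable, float],
                chosen_sets: Iterable[Set[Hashable]],
                q: float) -> bool:
    """Check that the fully covered profit is at least q times the total profit."""
    total = sum(profit.values())
    gained = sum(profit[e] for e in fully_covered(requirement, chosen_sets))
    return gained >= q * total
```

For instance, with requirements `{'a': 2, 'b': 1, 'c': 1}`, unit profits, and sets `{'a','b'}` and `{'a','c'}`, choosing both sets fully covers every element, while choosing only the first fully covers `'b'` alone.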
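A submodular function has many equivalent definitions. We only introduce the following one, which is convenient for this paper.
Definition 2.2 (submodular function).
Given a ground set $U$, a set function $\rho:2^U\to\mathbb{R}$ is submodular if for any $A\subseteq B\subseteq U$ and $e\in U\setminus B$, we have
$$\rho(A\cup\{e\})-\rho(A)\ \ge\ \rho(B\cup\{e\})-\rho(B). \tag{1}$$
Notice that a nonnegative submodular function satisfies subadditivity: for any sets $A,B\subseteq U$,
$$\rho(A\cup B)\ \le\ \rho(A)+\rho(B). \tag{2}$$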
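On small ground sets, the diminishing-returns form of submodularity (one of the equivalent definitions) can be checked by brute force. The sketch below is illustrative only; `is_submodular` and `powerset` are hypothetical helper names, and the exponential enumeration is meant for toy instances.

```python
from itertools import combinations


def powerset(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def is_submodular(f, ground, tol=1e-12):
    """Check f(A+e) - f(A) >= f(B+e) - f(B) for all A <= B and e outside B."""
    ground = frozenset(ground)
    subsets = list(powerset(ground))
    for a in subsets:
        for b in subsets:
            if not a <= b:
                continue
            for e in ground - b:
                gain_a = f(a | {e}) - f(a)
                gain_b = f(b | {e}) - f(b)
                if gain_a < gain_b - tol:
                    return False
    return True
```

A capped cardinality function such as `lambda A: min(len(A), 2)` passes the check, whereas `lambda A: len(A) ** 2` fails it because its marginal gains increase.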
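Notice that a set $A\subseteq U$ can be indicated by its characteristic vector $x^A\in\{0,1\}^n$, where $n=|U|$, and $x^A_i=1$ if $i\in A$ and $x^A_i=0$ if $i\notin A$. So, in the following, we shall use the notation $\rho:\{0,1\}^n\to\mathbb{R}$ to refer to a set function. The relationship between submodularity and convexity can be formulated in terms of the Lovász extension.
Definition 2.3 (Lovász extension [16]).
For a set function $\rho:\{0,1\}^n\to\mathbb{R}$, the Lovász extension $\hat{\rho}:\mathbb{R}^n_{\ge 0}\to\mathbb{R}$ is defined as follows. For any vector $z\in\mathbb{R}^n_{\ge 0}$, order elements as $i_1,\dots,i_n$ such that $z_{i_1}\ge z_{i_2}\ge\dots\ge z_{i_n}$, where $z_{i_j}$ is the coordinate of $z$ indexed by $i_j$. Let $z_{i_{n+1}}=0$. The value of $\hat{\rho}$ at $z$ is
$$\hat{\rho}(z)=\sum_{j=1}^{n}\left(z_{i_j}-z_{i_{j+1}}\right)\rho(\{i_1,\dots,i_j\}).$$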
The above definition implies that the Lovász extension is positively homogeneous, that is, $\hat{\rho}(\alpha z)=\alpha\hat{\rho}(z)$ for any $\alpha\ge 0$. The following result reveals the relationship between submodularity and convexity.
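The sorting-based definition translates directly into code. The following minimal sketch evaluates the Lovász extension in its standard telescoped form, where each coordinate is multiplied by the marginal value of adding its element to the prefix set; for a set function with value zero on the empty set this agrees with the usual sum over prefix sets. The function name and the dictionary encoding of vectors are assumptions for illustration.

```python
def lovasz_extension(f, z):
    """Evaluate the Lovász extension of set function f at vector z.

    f maps a frozenset of elements to a number, with f(frozenset()) == 0.
    z maps each element to a nonnegative coordinate.
    """
    # Sort elements by non-increasing coordinate value.
    order = sorted(z, key=z.get, reverse=True)
    value = 0.0
    prefix = set()
    prev = f(frozenset())
    for i in order:
        prefix.add(i)
        cur = f(frozenset(prefix))
        # Weight the marginal gain of element i by its coordinate.
        value += z[i] * (cur - prev)
        prev = cur
    return value
```

On an indicator vector the function returns the value of the underlying set, and scaling `z` by a nonnegative factor scales the result by the same factor, which is the positive homogeneity noted above.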
Theorem 2.4.
A set function $\rho$ is submodular if and only if its Lovász extension $\hat{\rho}$ is convex.
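The following is an equivalent expression of the Lovász extension in the range $[0,1]^n$.
Theorem 2.5 ([3]).
Let $\rho$ be a set function with $\rho(\emptyset)=0$. The Lovász extension of $\rho$ in the range $[0,1]^n$ can be equivalently expressed as
$$\hat{\rho}(z)=\mathbb{E}_{\theta\in[0,1]}\left[\rho(z^{\theta})\right]=\int_0^1\rho(z^{\theta})\,d\theta,$$
where $z^{\theta}_i=1$ if $z_i\ge\theta$, otherwise $z^{\theta}_i=0$.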
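The threshold view of the Lovász extension can be verified numerically on small examples. The sketch below computes the expectation over a uniformly random threshold exactly, by integrating over the intervals between consecutive coordinate values, and compares it with the sorting-based formula. Both helper names are hypothetical, and the code assumes coordinates in $[0,1]$ and a set function that vanishes on the empty set.

```python
def lovasz_sorted(f, z):
    """Sorting-based evaluation of the Lovász extension (f(empty set) == 0)."""
    order = sorted(z, key=z.get, reverse=True)
    value, prefix, prev = 0.0, set(), f(frozenset())
    for i in order:
        prefix.add(i)
        cur = f(frozenset(prefix))
        value += z[i] * (cur - prev)
        prev = cur
    return value


def threshold_expectation(f, z):
    """Exact E[f({i : z_i >= theta})] for theta uniform on [0, 1]."""
    breakpoints = sorted(set(z.values()) | {0.0, 1.0})
    total = 0.0
    for lo, hi in zip(breakpoints, breakpoints[1:]):
        # For every theta in (lo, hi] the level set is {i : z_i >= hi}.
        level = frozenset(i for i, v in z.items() if v >= hi)
        total += (hi - lo) * f(level)
    return total
```

Drawing a single random threshold and keeping the elements whose coordinate is at least that threshold therefore yields a random set whose expected cost equals the value of the extension, which is what makes threshold rounding convenient to analyze.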
In this paper, we study the SCPMC problem under the following assumptions.
(Assumption 1) The maximum covering requirement has a constant upper bound.
(Assumption 2) Since the submodular cost (full) multi-cover problem is already studied in [10, 15], we only consider the partial version, that is, it is assumed that $q<1$.
3 Approximation Algorithm for SCPMC
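A natural idea to model the SCPMC problem is to use the following integer program:
$$\begin{array}{lll}\min & \rho(x) & \\ \text{s.t.} & \sum_{e\in E}p_e y_e\ge qP, & \\ & \sum_{S:\,e\in S}x_S\ge r_e y_e, & \forall e\in E,\\ & x_S\in\{0,1\}, & \forall S\in\mathcal{S},\\ & y_e\in\{0,1\}, & \forall e\in E.\end{array}$$
Here $x_S$ indicates whether set $S$ is selected and $y_e$ indicates whether element $e$ is fully covered; $p_e$ is the profit of $e$, $P$ is the total profit, and $q$ is the covering ratio. The second constraint says that if $y_e=1$ then at least $r_e$ sets containing $e$ must be selected and thus $e$ is fully covered. Relaxing (3), we have the following convex program:
$$\begin{array}{lll}\min & \hat{\rho}(x) & \\ \text{s.t.} & \sum_{e\in E}p_e y_e\ge qP, & \\ & \sum_{S:\,e\in S}x_S\ge r_e y_e, & \forall e\in E,\\ & 0\le x_S\le 1, & \forall S\in\mathcal{S},\\ & 0\le y_e\le 1, & \forall e\in E.\end{array}$$
However, based on such a program, one cannot find a good approximation. The following example shows that the integrality gap between the integer program (3) and its relaxation can be arbitrarily large, even when the profit function is a constant and the cost function is linear.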
Example 3.1.
Let , with , , , , where is a large positive number, , , , and the cost function . Then , , form a feasible solution to (3) with objective value 2, while any integral feasible solution to (3) has cost at least .
Hence, to obtain a good approximation, we need to find another program.
3.1 Integer Program and Convex Relaxation
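For an element $e\in E$, an $e$-cover is a sub-collection $\mathcal{A}\subseteq\mathcal{S}$ with $|\mathcal{A}|=r_e$ such that $e\in S$ for every $S\in\mathcal{A}$. Denote by $\Omega_e$ the family of all $e$-covers and $\Omega=\bigcup_{e\in E}\Omega_e$. The following example illustrates these concepts.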
Example 3.2.
Let . with , , , , and , . For this example, , , and .
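The notion of an $e$-cover can be illustrated by enumerating, for each element, all sub-collections that consist of exactly as many sets as its covering requirement, each containing the element. The sketch below is a minimal illustration on a hypothetical toy instance (it is not the instance of Example 3.2); the function name and the encoding of sets by name are assumptions.

```python
from itertools import combinations


def e_covers(sets, requirement):
    """Enumerate the e-covers of every element.

    sets maps a set name to its members; requirement maps an element e to r_e.
    Returns a dict mapping e to a list of frozensets of set names.
    """
    covers = {}
    for e, r in requirement.items():
        # Only sets that contain e can take part in an e-cover.
        containing = sorted(name for name, members in sets.items() if e in members)
        covers[e] = [frozenset(c) for c in combinations(containing, r)]
    return covers


# Hypothetical toy instance.
toy_sets = {"S1": {1, 2}, "S2": {1, 3}, "S3": {1, 2, 3}}
toy_requirement = {1: 2, 2: 1, 3: 2}
toy_covers = e_covers(toy_sets, toy_requirement)
```

Since an element lies in at most $f$ sets, the number of its covers is at most $\binom{f}{r_e}$, which is where the quantity $b=\max_e\binom{f}{r_e}$ comes from and why a constant bound on the covering requirements keeps the number of covers polynomial.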
Let : be the function on sub-families of defined by
[TABLE]
for . For example, . The SCPMC problem can be modeled as an integer program as follows.
[TABLE]
Here, $x_A$ indicates whether cover $A$ is selected and $y_e$ indicates whether element $e$ is fully covered. The second constraint says that if $y_e=1$, then at least one $e$-cover must be selected and thus $e$ is fully covered.
Example 3.3.
For the example in Example 3.2, suppose for and . Consider a feasible solution to (3.1): for , , and for all other , we have and . This feasible solution to (3.1) has objective value , which corresponds to a feasible solution to SCPMC with the same cost. Conversely, for the feasible solution to SCPMC, it is natural to set and all other to be zeros. However, this is not a feasible solution to (3.1). Nevertheless, one can construct a feasible solution to (3.1) having the same cost by setting and all other to be zeros.
In general, for a feasible solution to SCPMC, one can construct a feasible solution to (3.1) as follows: for each element which is fully covered by , let and let for exactly one -cover which contains subsets of (such exists since is fully covered by ); all other variables are set to be zeros. Such a construction clearly results in a feasible solution to (3.1) whose objective value is at most (by the monotonicity of ). So, (3.1) is indeed a characterization of the SCPMC problem.
The following lemma shows that function is nonnegative, monotone nondecreasing, and submodular.
Lemma 3.4.
If is nonnegative, monotone nondecreasing, and submodular, then the function defined in (7) is also nonnegative, monotone nondecreasing, and submodular.
Proof.
The nonnegativity and the monotonicity are obvious. To prove the submodularity, by Definition 2.2, it is sufficient to show that for any and ,
[TABLE]
Denote and . Since , we have . Denote and . Then . Combining this with the observation that , we have
[TABLE]
It follows that
[TABLE]
where the first inequality uses submodularity of and (10), and the second inequality uses the monotonicity of and (10). Inequality (9), and thus the lemma, is proved. ∎
Remark 3.5.
If is nonnegative and submodular but is not monotone nondecreasing, then is not necessarily submodular. Consider the following example. Let with and for any other sub-collection . It can be verified that is nonnegative and submodular. Consider sub-families and , it can be calculated that
[TABLE]
So, is not submodular.
Let be the Lovász extension of . By Theorem 2.4, is convex. Relaxing (3.1), we have the following convex program:
[TABLE]
Lemma 3.6.
Convex program (3.1) is polynomial-time solvable.
Proof.
It is known that (see [7]) for a submodular function , its Lovász extension for any , where is the convex closure of defined as follows. For each sub-family of , denote by as the indicator vector of . The convex closure of is the function : such that for any vector , . Hence can be rewritten as:
[TABLE]
Notice that this is a linear program. For each element , . Since in Assumption 1, we have assumed that is upper bounded by a constant, the number of variables in the form of or is polynomial. However, the number of variables in the form of is exponential.
Consider the dual program of (3.1):
[TABLE]
Since both and are polynomial, to solve (3.1), it suffices to construct a separation oracle for the first set of constraints.
Define for any . Since is obtained by subtracting a modular function from a submodular function, is also a submodular function. Hence, by finding a minimizer of , which can be done in polynomial time, and then check whether its -value is at least , we can either claim the validity of the first set of constraints or find out a violated constraint. ∎
Since the convex program (3.1) is a relaxation of the integer program (3.1), we have $opt_f\le opt$, where $opt_f$ is the optimal value of the convex program and $opt$ is the optimal value of the integer program (which is also the optimal value of SCPMC).
3.2 Rounding Algorithm
For a sub-collection , denote by the set of elements fully covered by . Two parameters are needed which are chosen in Theorem 3.11 to guarantee the desired ratio with high probability. The rounding algorithm consists of two phases. In the first phase, a deterministic rounding is executed to form a sub-collection . In the second phase, a randomized rounding is executed to form a sub-collection . The output is the union of and .
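The listing of Algorithm 1 is not reproduced in this text, so the following is only a minimal sketch of the general two-phase pattern described above: a deterministic threshold step, followed by repeated random-threshold steps in which a cover is picked whenever its fractional value is at least a uniformly random threshold. The function name, the `threshold` and `rounds` parameters, and the encoding of covers as frozensets of set names are all assumptions for illustration; the paper's actual parameter choices and any rescaling of the fractional solution are not shown here.

```python
import random


def two_phase_rounding(x, threshold, rounds, rng=None):
    """Round a fractional solution over covers into a collection of sets.

    x maps each cover (a frozenset of set names) to a value in [0, 1].
    threshold and rounds are hypothetical tuning parameters.
    """
    rng = rng or random.Random()
    # Phase 1 (deterministic): keep every cover with a large fractional value.
    chosen_covers = {a for a, v in x.items() if v >= threshold}
    # Phase 2 (randomized): in each round draw theta uniformly at random
    # and keep every cover whose value is at least theta.
    for _ in range(rounds):
        theta = rng.random()
        chosen_covers |= {a for a, v in x.items() if v >= theta}
    # The output is the union of the sets appearing in the chosen covers.
    selected = set()
    for cover in chosen_covers:
        selected |= set(cover)
    return selected
```

Each random round is a threshold rounding of the kind whose expected cost equals the Lovász extension at the rounded vector, so the expected cost of the second phase grows at most linearly with the number of rounds.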
3.3 Approximation Analysis
Lemma 3.7.
For the collection of sets computed by Algorithm 1, . Furthermore, all elements with are fully covered by .
Proof.
Let be the vector defined after Line 6 of Algorithm 1, and let be the vector with for .
Recall that Lovász extension in Definition 2.3 requires an ordering of elements in a non-increasing manner. By the definition of and by the nonnegativity of , we can take the ordering of elements defining and to be the same and
[TABLE]
We claim that holds for any index . This is clearly true if . For an index with , we have (by Line 4 of Algorithm 1), which implies . The claim is proved. It follows that for any and for any index , (recall the notation defined in Theorem 2.5). Then, by the monotonicity of , we have
[TABLE]
Combining (14), (15) with the positive homogeneous property of Lovász extension,
[TABLE]
Next, consider the second half of the lemma. For each element with , by the second constraint of (3.1), and by the observation that , we have
[TABLE]
Hence there is at least one -cover with value , and thus . That is, after the deterministic rounding, at least one -cover is chosen into , and thus is fully covered. ∎
Lemma 3.8.
For the collection of sets computed by Algorithm 1, the expected cost of satisfies .
Proof.
Observe that each of the second “for” loop of Algorithm 1 is in fact a realization of Lovász extension in Theorem 2.5 (one may refer to [3]). So the expectation of the cost of those sets in each iteration is . Since is the union of these sets, so after iterations, . ∎
In the following, when we say that element is fully covered by , it means that the remaining covering requirement of is satisfied by . Using such a convention, we denote by the set of elements fully covered by , and let . Notice that is in fact a random sub-collection, and thus is a random value. To be more strict, let be the random variable which takes value if is fully covered by , and takes value $0$ otherwise. Then
[TABLE]
The next lemma gives an upper bound for the expected value of .
Lemma 3.9.
For the collection of sets computed by Algorithm 1, the expected profit of satisfies .
Proof.
Since and
[TABLE]
it suffices to prove that for each ,
[TABLE]
Notice that for each , . Since we have assumed , so . Then, proving (18) is equivalent to proving
[TABLE]
In a “for” loop with a uniformly randomly chosen , an -cover is chosen into if and only if . For an element , it is not fully covered by those sets chosen into in this “for” loop if and only if . This occurs with probability . Since (see (16)), we have
[TABLE]
So, after iterations,
[TABLE]
where the second inequality uses the fact that . Denote and . Notice that is a convex function and is a linear function. Furthermore, , . So in interval . Since for each , . So, . Property (19) is proved, and the lemma follows. ∎
Remark 3.10.
One may be wondering what if is larger than the profit of those remaining elements which are not fully covered by . This cannot happen because after the first stage of deterministic rounding, the total profits of remaining elements is . Since it is required that , we have .
Now we will show that by choosing suitable parameters and , Algorithm 1 produces a feasible solution with performance ratio with high probability.
Theorem 3.11.
Setting and , Algorithm 1 produces a feasible solution to SCPMC with high probability whose cost is , where .
Proof.
Notice that for the above and , we have
The outline of the proof is as follows: we first show that the sum of the probabilities for the following two events is a constant strictly smaller than 1; then a feasible solution with desired performance ratio can be achieved with high probability by repeating Algorithm 1 times. The two events are:
, where ;
.
For event , using Markov inequality and Lemma 3.8, we have
[TABLE]
For event , since by Algorithm 1, by Remark 3.10, and by Lemma 3.9, using Markov inequality,
[TABLE]
Adding inequalities (20) and (21), the probability that either event occurs or event occurs is upper bounded by a constant which is strictly smaller than $1$. Hence, by repeating Algorithm 1 times, with a high probability, and . Combining these with Lemma 3.7, with high probability, and
[TABLE]
where the first inequality uses (2) and the constant in big O is. The theorem is proved. ∎
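The repetition step in the proof is the standard amplification argument: if a single run fails with probability at most a constant $c<1$, then $k$ independent runs all fail with probability at most $c^k$. The small sketch below computes how many runs suffice for a target failure probability; `runs_needed` is a hypothetical helper and the numbers in the usage note are arbitrary.

```python
import math


def runs_needed(failure_bound: float, target: float) -> int:
    """Smallest k with failure_bound ** k <= target, for bounds in (0, 1)."""
    if not (0.0 < failure_bound < 1.0 and 0.0 < target < 1.0):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    # c ** k <= target  <=>  k >= log(target) / log(c), since log(c) < 0.
    return max(1, math.ceil(math.log(target) / math.log(failure_bound)))
```

For example, when a single run fails with probability at most 0.5, seven independent runs push the probability that all of them fail below 0.01.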
4 Approximation Algorithm for SCPSC
As a corollary of Theorem 3.11, the minimum submodular cost partial set cover problem (SCPSC for short, in which the covering requirement for each element is one) admits a bicriteria randomized $O(f/\varepsilon)$-approximation. In the following, we show that an adaptation of our method can yield the same approximation for SCPSC even if the submodular function is non-monotone. The idea behind the adaptation is that in this case, a natural constraint is sufficient (we do not need to use the more complicated $e$-covers), and thus a technique similar to that in [10] for dealing with non-monotone submodular functions can be used.
The SCPSC problem can be modelled as the following integer program:
[TABLE]
Its relaxation is a convex program:
[TABLE]
Notice that since we can use $\hat{\rho}$ as the objective function here, the convexity follows directly from the submodularity of $\rho$. For program (3.1), by contrast, convexity is guaranteed by Lemma 3.4, which no longer holds if $\rho$ is non-monotone (see Remark 3.5).
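Define a new function $\bar{\rho}$ by $\bar{\rho}(\mathcal{A})=\min\{\rho(\mathcal{B}):\mathcal{A}\subseteq\mathcal{B}\subseteq\mathcal{S}\}$. Then $\bar{\rho}$ is a nonnegative monotone nondecreasing submodular function (see [10]). For any sub-collection $\mathcal{A}$, the value of $\bar{\rho}(\mathcal{A})$ can be determined in polynomial time by an algorithm for submodular function minimization. Let $\mathcal{B}_{\mathcal{A}}$ be the minimizer, that is, $\mathcal{A}\subseteq\mathcal{B}_{\mathcal{A}}$ and $\bar{\rho}(\mathcal{A})=\rho(\mathcal{B}_{\mathcal{A}})$. It should be noticed that $\mathcal{B}_{\mathcal{A}}$ fully covers all those elements which are fully covered by $\mathcal{A}$ (since $\mathcal{A}\subseteq\mathcal{B}_{\mathcal{A}}$).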
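The monotone closure used here, namely the minimum cost over all supersets of a given sub-collection, can be illustrated by brute force on a tiny ground set. The sketch below is exponential and purely illustrative; a polynomial-time implementation would call a submodular function minimization routine instead. The function name and the example cost function are assumptions.

```python
from itertools import combinations


def monotone_closure(rho, ground, a):
    """Return (value, minimizer) of min{rho(B) : a <= B <= ground} by brute force."""
    a = frozenset(a)
    rest = sorted(frozenset(ground) - a)
    best_value, best_set = None, None
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            candidate = a | frozenset(extra)
            value = rho(candidate)
            if best_value is None or value < best_value:
                best_value, best_set = value, candidate
    return best_value, best_set


# Hypothetical non-monotone, nonnegative, submodular cost on three sets.
def example_cost(b):
    return len(b) * (3 - len(b))
```

With `example_cost`, a single set costs 2 but the whole collection costs 0, so the closure of any nonempty sub-collection has value 0 and its minimizer is the whole collection. The minimizer always contains the original sub-collection, which is why replacing a sub-collection by its minimizer never loses coverage.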
Our algorithm for SCPSC is similar to Algorithm 1 with the following two differences. First, replace convex program (3.1) by (4). Second, having obtained , compute and replace by in the remaining part of Algorithm 1.
Notice that in the analysis, monotonicity is used only in Lemma 3.7. So, to obtain the desired result, we only need to prove the following lemma.
Lemma 4.1.
.
Proof.
Let be the indicator vector of . By the monotonicity of , the Lovász extension is also monotone nondecreasing. Hence it follows from that
[TABLE]
For any sub-collection , by Definition 2.3, . By the definition of Lovász extension in Definition 2.3, we have
[TABLE]
Combining (23), (24) with the positive homogeneous property of Lovász extension,
[TABLE]
The lemma is proved. ∎
From the above argument, we have the following result.
Theorem 4.2.
For any nonnegative submodular cost function, the SCPSC problem has a bicriteria randomized $O(f/\varepsilon)$-approximation with high probability.
5 Conclusion
By introducing a novel convex program describing the minimum submodular cost partial multi-cover problem (SCPMC), we give a randomized bicriteria $O(b/\varepsilon)$-approximation algorithm for SCPMC, where $b=\max_e\binom{f}{r_e}$. Since PMC is a special case of SCPMC, the PMC problem also has a bicriteria randomized $O(b/\varepsilon)$-approximation algorithm with high probability. We show that in the case when the covering requirement for each element is one, the monotonicity requirement can be dropped from the cost function. It should be noticed that if we only care about an expected result, then we may obtain a randomized algorithm producing a sub-collection with and . This can be achieved by modifying in Line 8 of Algorithm 1 into .
One question is whether the same result can be obtained for SCPMC without the monotonicity requirement. Another question is what happens if the maximum covering requirement is not upper bounded by a constant.
Acknowledgements
This research is supported by NSFC (11531011, 61222201).
[1] Bar-Yehuda R (2001) Using homogeneous weights for approximating the partial cover problem. Journal of Algorithms 39: 137–144.
[2] Berman P, DasGupta B, Sontag E (2007) Randomized approximation algorithms for set multicover problems with applications to reverse engineering of protein and gene networks. Discrete Applied Mathematics 155(6-7): 733–749.
[3] Chekuri C, Ene A (2011) Submodular cost allocation problem and applications. International Colloquium on Automata, Languages, and Programming, 354–366.
[4] Dinur I, Steurer D (2014) Analytical approach to parallel repetition. STOC 2014, 624–633.
[5] Chvátal V (1979) A greedy heuristic for the set covering problem. Mathematics of Operations Research 4(3): 233–235.
[6] Dobson G (1982) Worst-case analysis of greedy heuristics for integer programming with nonnegative data. Mathematics of Operations Research 7: 515–531.
[7] Dughmi S (2009) Submodular functions: extensions, distributions, and algorithms. A survey. arXiv:0912.0322 [cs.DS].
[8] Gandhi R, Khuller S, Srinivasan A (2004) Approximation algorithms for partial covering problems. Journal of Algorithms 53(1): 55–84.
