Deterministic & Adaptive Non-Submodular Maximization via the Primal Curvature
J. David Smith, My T. Thai

TL;DR
This paper introduces a new technique for analyzing the performance of greedy algorithms in maximizing non-submodular functions, extending classical guarantees to adaptive and stochastic settings.
Contribution
It presents a novel curvature-based method to bound greedy algorithm performance for non-submodular functions, including adaptive and stochastic cases.
Findings
Provides a curvature-based approximation ratio for non-submodular maximization
Extends classical ratios to adaptive greedy algorithms
Supports applications with incomplete data and uncertainty
Deterministic & Adaptive Non-Submodular Maximization
via the Primal Curvature
J. David Smith
CISE Department, University of Florida, Gainesville, Florida 32611
and
My T. Thai
CISE Department, University of Florida, Gainesville, Florida 32611
Abstract.
While greedy algorithms have long been observed to perform well on a wide variety of problems, up to now approximation ratios have only been known for their application to problems having submodular objective functions $f$. Since many practical problems have non-submodular $f$, there is a critical need to devise new techniques to bound the performance of greedy algorithms in the case of non-submodularity.
Our primary contribution is the introduction of a novel technique for estimating the approximation ratio of the greedy algorithm for maximization of monotone non-decreasing functions based on the curvature of $f$ without relying on the submodularity constraint. We show that this technique reduces to the classical $1 - 1/e$ ratio for submodular functions. Furthermore, we develop an extension of this ratio to the adaptive greedy algorithm, which allows applications to non-submodular stochastic maximization problems. This notably extends support to applications modeling incomplete data with uncertainty.
1. Introduction
It is well-known that greedy approximation algorithms perform remarkably well, especially when the traditional $1 - 1/e$ ratio (Nemhauser et al., 1978) for maximization of submodular objective functions is considered. Over the four decades since the proof of this ratio, the use of greedy approximations has become widespread due to several factors. First, many interesting problems satisfy the property of submodularity, which states that the marginal gain of an element never increases. If this condition is satisfied, and the set of possible solutions can be phrased as a uniform matroid, then one of the highest general-purpose approximation ratios is available “for free” with the use of the greedy algorithm. Second, the greedy algorithm is exceptionally simple both to understand and to implement.
A concrete example of this is the Influence Maximization problem, to which the greedy algorithm was applied with great success – ultimately leading to an empirical demonstration that it performed near-optimally on real-world data (Li et al., 2017). Kempe et al. showed this problem to be submodular under a broad class of influence diffusion models known as Triggering Models (Kempe et al., ). This led to a number of techniques being developed to improve the efficiency of the sampling needed to construct the problem instance (see e.g. (Borgs et al., ; Tang et al., ; Nguyen et al., ) and references therein) while maintaining a ratio as a result of the greedy algorithm. This line of work ultimately led to a -approximation by taking advantage of the dramatic advances in sampling efficiency to construct an IP that can be solved in reasonable time (Li et al., 2017). In testing this method, it was found that greedy solutions performed near-optimally – an unexpected result given the worst-case.
For non-submodular problems, no general approximation ratio for greedy algorithms is known. However, due to their simplicity they frequently see use as simple baselines for comparison. On the Robust Influence Maximization problem proposed by He & Kempe, the simple greedy method was used in this manner (He and Kempe, 2016). This problem consists of a non-submodular combination of Influence Maximization sub-problems and aims to address uncertainty in the diffusion model. Yet despite the non-submodularity of the problem, the greedy algorithm performed no worse than the bi-criteria approximation (He and Kempe, 2016).
Another recent example of this phenomenon is the socialbot reconnaissance attack studied by Li et al. (Li et al., 2016). They consider a minimization problem that seeks to answer how long a bot must operate to extract a certain level of sensitive information, and find that the objective function is (adaptive) submodular only in a scenario where users disregard network topology. In this scenario, the corresponding maximization problem, Max-Crawling, has a ratio due to the work of Golovin & Krause (Golovin and Krause, 2011). However, this constraint does not align with observed user behaviors. They give a model based on the work of Boshmaf et al. (Boshmaf et al., ), who observed that the number of mutual friends with the bot strongly correlates with friending acceptance rate. Although this model is no longer adaptive submodular, the greedy algorithm still exhibited excellent performance. Thus we see that while submodularity is sufficient to imply good performance, it is not necessary for the greedy algorithm to perform well.
This, in turn, leads us to ask: is there any tool to theoretically bound the performance of greedy maximization with non-submodularity? Unfortunately, this condition has seen little study. Wang et al. give a ratio for it in terms of the worst-case rate of change in marginal gain (the elemental curvature $\alpha$) (Wang et al., 2014). This suffices to construct bounds for non-submodular greedy maximization, though for non-trivial problem sizes they quickly approach 0. We note, however, that the ratio still encodes strong assumptions about the worst case: that the global maximum rate of change can occur an arbitrary number of times.
Motivated by the unlikeliness of this scenario, our proposed bound instead works with an estimate of how much change can occur during the steps taken by the greedy algorithm.
The remainder of this paper is arranged as follows. First, we briefly cover the preliminary material needed for the proofs and define the class of problems to which they apply (Sec. 1.1). Next, we define the notion of curvature used, develop a proof of the ratio based on it, extend it to adaptive greedy algorithms, and show that it is equivalent to the traditional ratio for submodular objectives (Sec. 2). We conclude with a reflection on the contributions and a discussion of future work (Sec. 3).
Contributions.
- A technique for estimating the approximation ratio of greedy maximization of non-submodular monotone non-decreasing objectives on uniform matroids.
- An extension of this technique to adaptive greedy optimization, where future greedy steps depend on the success or failure of prior steps.
1.1. Background & Related Work
To understand both the state of the art and the advancements of this work, we first briefly cover each constraint required by the classical $1 - 1/e$ ratio (Nemhauser et al., 1978).
1.1.1. Constraints on the Ratio
Uniform Matroids. A matroid defines a notion of dependence between the elements of a set and is denoted by $\mathcal{M} = (U, \mathcal{I})$, where $\mathcal{I}$ is the set of independent subsets of the universe $U$. (For a complete treatment of matroids and the associated theory, see Oxley (Oxley, 1992).) For our purposes, it will suffice to cover the semantic meaning of $k$-uniform matroids, which is codified as follows:
- (1) All subsets of a feasible solution must also be feasible solutions.
- (2) Every $S \subseteq U$ with $|S| = k$ is a feasible solution and is maximal in the sense that no superset is feasible.
For general matroids, there exists a $1/2$ ratio for greedy maximization of submodular functions due to Fisher et al. (Fisher et al., ). This is a special case of their $1/(p+1)$ ratio for the intersection of $p$ matroids.
Submodularity. The submodularity condition states that given any subsets $A \subseteq B$ of a universe $U$, the marginal gain of any $u \in U \setminus B$ does not increase as the cardinality increases:
$$f(A \cup \{u\}) - f(A) \;\geq\; f(B \cup \{u\}) - f(B).$$
This formally encodes the idea of diminishing returns. Leskovec et al. exploited this property to show a data-dependent bound in terms of the marginal gain of the top-$k$ un-selected elements (Leskovec et al., ), which was generalized to the adaptive case (Golovin and Krause, 2011).
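As a small illustration of the diminishing-returns condition, the following sketch checks it by brute force on a tiny universe. The helper names (`marginal_gain`, `is_submodular`) and the two toy objectives are hypothetical and chosen only for this example; the exhaustive check is exponential and is meant for intuition, not for real problem sizes.

```python
from itertools import combinations

def marginal_gain(f, S, u):
    """Gain in f from adding element u to the set S."""
    return f(S | {u}) - f(S)

def subsets(universe):
    """Yield every subset of the universe as a frozenset."""
    items = sorted(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_submodular(f, universe, tol=1e-12):
    """Check f_u(A) >= f_u(B) for all A subset of B and u outside B."""
    for B in subsets(universe):
        for A in subsets(B):
            for u in universe - B:
                if marginal_gain(f, A, u) + tol < marginal_gain(f, B, u):
                    return False
    return True

# Toy coverage objective (assumed data): each item covers some labels.
COVER = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}

def coverage(S):
    covered = set()
    for i in S:
        covered |= COVER[i]
    return len(covered)

def squared_size(S):
    """Monotone but not submodular: marginal gains grow with |S|."""
    return len(S) ** 2

UNIVERSE = frozenset(COVER)
print(is_submodular(coverage, UNIVERSE))      # coverage functions satisfy the condition
print(is_submodular(squared_size, UNIVERSE))  # gains 1, 3, 5 increase, so this fails
```

The second objective is monotone non-decreasing yet violates the inequality, which is the regime the rest of the paper targets.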
To the best of our knowledge, the only generally applicable relaxation of this constraint is the work of Wang et al. (Wang et al., 2014), who define a ratio in terms of the elemental curvature $\alpha$ of a function, which encodes the degree to which a function may break submodularity.
1.1.2. Alternate Problems & Algorithms
The $1 - 1/e$ ratio has shown surprising generality, with proofs that it holds for maximization of sequence functions (Zhang et al., 2016) (and references) and adaptive stochastic maximization of functions that are submodular in expectation (Golovin and Krause, 2011), among others. However, not all adjacent work relies on the same naïve greedy method. To obtain a bound on the relaxation of monotonicity, Buchbinder et al. (Buchbinder et al., ) proposed a “double-greedy” algorithm with a $1/3$ (deterministic) or $1/2$ (randomized) ratio. For maximization on an intersection of $p$ matroids, Lee et al. showed a $1/(p + \epsilon)$ ratio, $p \geq 2$, for a local search method (Lee et al., ).
Vondrák et al. proposed a continuous greedy algorithm with a $\frac{1}{c}(1 - e^{-c})$ ratio for general matroids (Vondrák, 2010), where $c$ is the total curvature of the function. An augmentation of this method has been shown to obtain a $(1 - c/e)$-approximation for single matroids (Sviridenko et al., ), along with an analogue for supermodular minimization. We remark that, while it exhibits a better ratio, this comes with a corresponding increase in complexity of the algorithm.
1.1.3. Curvature-Based Ratios
Conforti & Cornuéjols (Conforti and Cornuéjols, 1984) introduced the idea of total curvature later used by Sviridenko et al. for their ratio.
Definition 1 (Total Curvature).
Given a monotone non-decreasing submodular function $f$ defined on a matroid $\mathcal{M} = (U, \mathcal{I})$, the total curvature of $f$ is
$$c = 1 - \min_{j \in U} \frac{f(U) - f(U \setminus \{j\})}{f(\{j\}) - f(\emptyset)}.$$
Using this definition, they arrived at a $1/(1 + c)$ approximation for general matroids, which reduces to $\frac{1}{c}(1 - e^{-c})$ for maximization on uniform matroids. Recently, Wang et al. (Wang et al., 2014) extended this idea by introducing the elemental curvature $\alpha$ of a function $f$:
Definition 2 (Elemental Curvature).
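The elemental curvature of a monotone non-decreasing function $f$ is defined as
$$\alpha = \max_{S \subset U,\; i, j \in U \setminus S,\; i \neq j} \frac{f_i(S \cup \{j\})}{f_i(S)},$$
where $f_i(S) = f(S \cup \{i\}) - f(S)$.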
While the resulting ratio (Theorem 1.1) is not as clean as that of prior work, this ratio is well-defined for non-submodular functions.
Theorem 1.1 (Wang et al. (Wang et al., 2014)).
For a monotone non-decreasing function $f$ defined on a $k$-uniform matroid, the greedy algorithm on maximizing $f$ produces a solution satisfying
[TABLE]
where is the greedy solution, is the optimal solution, and is the elemental curvature of .
Corollary 1.2 (Wang et al. (Wang et al., 2014)).
When $\alpha = 1$, the ratio given by Theorem 1.1 converges to $1 - 1/e$ as $k \to \infty$.
However, the ratios produced based on the elemental curvature rapidly converge to $0$ for non-submodular functions. This behavior is shown in Figure 1. Even for , the ratio is effectively zero and therefore uninformative. In contrast, we show that our ratio produces significant bounds for two non-submodular functions, while still converging to the $1 - 1/e$ ratio for submodular functions.
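The collapse described above can be seen numerically. The displayed form of Theorem 1.1 is missing from this copy, so the sketch below assumes the ratio has the form $1 - (1 - 1/K_\alpha)^k$ with $K_\alpha = \sum_{i=0}^{k-1} \alpha^i$; this form is an assumption recalled from the literature, chosen because it agrees with Corollary 1.2 at $\alpha = 1$ and tends to zero for $\alpha > 1$. The function name `elemental_ratio` is hypothetical.

```python
import math

def elemental_ratio(alpha, k):
    """Assumed form of the elemental-curvature ratio: 1 - (1 - 1/K)^k,
    where K is the geometric sum 1 + alpha + ... + alpha^(k-1)."""
    if alpha == 1:
        k_alpha = float(k)
    else:
        k_alpha = (1 - alpha ** k) / (1 - alpha)
    return 1 - (1 - 1 / k_alpha) ** k

# With alpha = 1 the ratio approaches 1 - 1/e as k grows.
print(elemental_ratio(1.0, 1000), 1 - 1 / math.e)

# With alpha > 1 the geometric sum grows exponentially in k,
# so the guaranteed fraction shrinks toward zero.
for k in (5, 10, 20, 50):
    print(k, elemental_ratio(1.5, k))
```

Under this assumed form, even a modest curvature such as $\alpha = 1.5$ leaves almost no guarantee by $k = 20$.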
2. A Ratio for Non-Submodular $f$
In this section, we introduce a further extension to the notion of curvature: primal curvature. We derive a bound based on this and prove its equivalence to $1 - 1/e$ for submodular functions. Then, we extend the ratio to the adaptive case, which allows direct application to a number of problems modeled under incomplete knowledge. We adopt a problem definition similar to that of Wang et al. Specifically, our ratio applies to any problem that can be phrased as $k$-Uniform Matroid Maximization.
Problem 1 ($k$-Uniform Matroid Maximization).
Given a $k$-uniform matroid $\mathcal{M} = (U, \mathcal{I})$ and a monotone non-decreasing function $f$, find
$$\arg\max_{S \in \mathcal{I}} f(S).$$
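For reference, the greedy procedure analyzed throughout can be sketched as follows. This is the standard textbook form for a cardinality constraint, written under the assumption that the objective is given as a Python callable on frozensets; the names `greedy_max` and `coverage` and the toy data are hypothetical.

```python
def greedy_max(f, universe, k):
    """Build a solution of at most k elements, adding at each step the
    element with the largest marginal gain with respect to the current set."""
    solution = frozenset()
    for _ in range(min(k, len(universe))):
        # Sorting makes tie-breaking deterministic (first maximal element wins).
        candidates = sorted(universe - solution)
        best = max(candidates, key=lambda u: f(solution | {u}) - f(solution))
        solution = solution | {best}
    return solution

# Toy coverage objective (assumed data).
COVER = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}

def coverage(S):
    covered = set()
    for i in S:
        covered |= COVER[i]
    return len(covered)

chosen = greedy_max(coverage, frozenset(COVER), k=2)
print(sorted(chosen), coverage(chosen))
```

Nothing in the procedure relies on submodularity; only the guarantee attached to its output does.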
2.1. Construction of the Ratio
As noted previously, the ratio given by elemental curvature rapidly converges to zero for non-submodular functions. We observe that this is due to the definition of $\alpha$ encoding the worst-case potential, and address this limitation by introducing the primal curvature of a function. Our definition separates the notion of rate-of-change from the global perspective imposed by elemental curvature. (The term primal is adopted primarily to distinguish this definition from prior work.)
Definition 3 (Primal Curvature).
The primal curvature of a set function is defined as
[TABLE]
The global maximum primal curvature is equivalent to the elemental curvature of a function.
This shift from global to local perspective allows focus on the patterns present in real-world problem instances rather than limiting our attention to the worst-case scenarios.
A key observation of Wang et al’s work is that the elemental curvature defines an upper bound on the change between and , for some , in terms of and the marginal gain at . The definition of primal curvature improves on this, giving an equivalence in terms of the total primal curvature .
Definition 4 (Total Primal Curvature).
The total primal curvature of between two sets with is
[TABLE]
where the ’s form an arbitrary ordering of and .
We note that can be interpreted as the total change in the marginal value of from point to point . The following lemma illustrates this, as well as providing a useful identity.
Lemma 2.1.
[TABLE]
Proof.
First, expand the product into its constituent terms:
[TABLE]
After cancelling, the statement immediately follows. ∎
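The displayed steps of this proof did not survive in this copy. As a hedged sketch, assume the primal curvature is the ratio of marginal gains $\nabla(i, j \mid S) = f_i(S \cup \{j\}) / f_i(S)$ with $f_i(S) = f(S \cup \{i\}) - f(S) \neq 0$, and write $Z_j = \{z_1, \dots, z_j\}$ with $Z_0 = \emptyset$. The symbols $\nabla$, $\Gamma$, and $Z_j$ are assumed notation, not confirmed by the source. The cancellation then reads:

```latex
\begin{align*}
\Gamma(i \mid Z, S)
  &= \prod_{j=1}^{|Z|} \nabla(i, z_j \mid S \cup Z_{j-1}) \\
  &= \prod_{j=1}^{|Z|} \frac{f_i(S \cup Z_j)}{f_i(S \cup Z_{j-1})} \\
  &= \frac{f_i(S \cup Z)}{f_i(S)}.
\end{align*}
```

Each numerator cancels against the denominator of the following factor, so only the first denominator and the last numerator remain.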
From this identity, we gain one further insight: the order in which elements are considered in does not matter.
Corollary 2.2.
The product is order-independent.
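Order-independence can be checked numerically on a small non-submodular objective. The sketch assumes the primal curvature is the ratio of marginal gains $f_i(S \cup \{j\}) / f_i(S)$ and that the total primal curvature is the product of these ratios along an ordering; both are reconstructions from the surrounding prose, and the function names and weights are hypothetical.

```python
from itertools import permutations

WEIGHTS = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}  # assumed toy data

def f(S):
    """Monotone, non-submodular objective: squared total weight."""
    return sum(WEIGHTS[x] for x in S) ** 2

def gain(f, S, i):
    return f(S | {i}) - f(S)

def primal_curvature(f, i, j, S):
    """Assumed definition: change in the gain of i caused by adding j to S."""
    return gain(f, S | {j}, i) / gain(f, S, i)

def total_primal_curvature(f, i, ordering, S):
    """Product of primal curvatures while adding `ordering` to S one by one."""
    product, current = 1.0, frozenset(S)
    for z in ordering:
        product *= primal_curvature(f, i, z, current)
        current = current | {z}
    return product

S, i, Z = frozenset(), 0, (1, 2, 3)
closed_form = gain(f, S | set(Z), i) / gain(f, S, i)
for order in permutations(Z):
    print(order, total_primal_curvature(f, i, order, S), closed_form)
```

Every ordering yields the same product, matching the ratio of the marginal gain of the element after and before the whole set is added.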
Using this, we can prove an equivalence between the change in total benefit and the sum of marginal gains taken with respect to .
Lemma 2.3.
For a set function and a pair of sets ,
[TABLE]
where , is the marginal gain and .
Proof.
Let be an arbitrary labeling of . Then we have:
[TABLE]
By the identity given in Lemma 2.1, we can write
[TABLE]
Noting that . Thus, the statement is proven. ∎
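Since the displays of this proof are missing here, the following is a reconstruction from the prose rather than the paper's own equations. Assume $S \subseteq T$, label $T \setminus S = \{t_1, \dots, t_r\}$, write $T_j = \{t_1, \dots, t_j\}$ with $T_0 = \emptyset$, and assume the identity $\Gamma(i \mid Z, S) = f_i(S \cup Z) / f_i(S)$ with nonzero marginal gains at $S$ (all notation assumed):

```latex
\begin{align*}
f(T) - f(S)
  &= \sum_{j=1}^{r} \bigl[ f(S \cup T_j) - f(S \cup T_{j-1}) \bigr]
   = \sum_{j=1}^{r} f_{t_j}(S \cup T_{j-1}) \\
  &= \sum_{j=1}^{r} \Gamma(t_j \mid T_{j-1}, S)\, f_{t_j}(S),
\end{align*}
```

where the first term uses the empty product $\Gamma(t_1 \mid \emptyset, S) = 1$. The change in total value is thus a sum of marginal gains measured at $S$, each rescaled by the curvature accumulated before that element is added.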
With this lemma, we can now construct the ratio.
Theorem 2.4.
For a monotone non-decreasing function , the greedy algorithm on a -uniform matroid maximizing produces a solution satisfying
[TABLE]
where is the greedy solution, is the greedy solution for an identical problem if a -uniform supermatroid of is well-defined, is the optimal solution on , and is an estimator satisfying:
[TABLE]
where
Proof.
To begin, note that due to monotone non-decreasing. Then, by Lemma 2.3 we have:
[TABLE]
We observe that any ratio that requires knowing is of little practical value: if is known, we can simply compute . Therefore, we relax our assumptions in three key ways to go from Eqn. (2), which assumes that we know exactly, to Eqn. (1), which requires no knowledge of the optimal.
First, we partly remove the assumption on knowledge of by substituting with , where .
[TABLE]
Next, we apply the upper bound as defined above to both remove the remaining dependence on knowledge of and to eliminate the requirement of knowing .
[TABLE]
Then, rearranging terms we get
[TABLE]
where . Then, dividing through by and cross-multiplying, we get:
[TABLE]
∎
When compared to traditional approximation ratios, this ratio has several obvious differences. First, it has dependencies on both the greedy solution and an extension of it to elements. This is both a strength and fundamental limitation of Theorem 2.4: it takes into account how much the greedy solution has converged toward negligible marginal gains, but also inhibits general analysis over all potential problem instances. Further, it requires that the supermatroid be well-defined, though we remark that this is generally not a problem. In practice, most problems solved with greedy algorithms are -element solutions on -element spaces, with typically much less than .
2.2. Equivalence to the $1 - 1/e$ Ratio
We next show that under assumptions encoding the submodularity condition, the above is equivalent to the $1 - 1/e$ ratio as $k \to \infty$.
Lemma 2.5.
Given a satisfying , the greedy algorithm produces a -element solution satisfying
[TABLE]
Proof.
We begin with Eqn. (3):
[TABLE]
for each , where denotes the -element greedy solution. Substitute for . Multiplying both sides by and summing from to . The left-hand side becomes:
[TABLE]
To obtain the right-hand side, separate into the marginal gain terms to produce the following in the body of the summation:
[TABLE]
Summing this over and employing the identity of the geometric series, this reduces to on the right-hand side. Thus, we obtain the relation
[TABLE]
∎
Corollary 2.6.
For a submodular monotone non-decreasing function $f$, the following relation holds as $k \to \infty$:
[TABLE]
Proof.
For a submodular function, the primal curvature of any two elements at any point satisfies by the definition of submodularity. Thus, we obtain directly that satisfies the requisite relation. Then, the limit of as is , leading directly to the statement above. ∎
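The limiting step can be illustrated numerically. The expression from Lemma 2.5 is missing in this copy, so the sketch uses the standard finite-$k$ form of the classical guarantee, $1 - (1 - 1/k)^k$, as a stand-in; the function name `classical_ratio` is hypothetical.

```python
import math

def classical_ratio(k):
    """Standard finite-k greedy guarantee for submodular objectives."""
    return 1 - (1 - 1 / k) ** k

# The guarantee decreases with k and approaches 1 - 1/e from above.
for k in (1, 2, 10, 100, 10**6):
    print(k, classical_ratio(k))
print("limit", 1 - 1 / math.e)
```

The guarantee is strongest for small $k$ and settles at $1 - 1/e$, which is why the classical ratio is described as the worst case in the limit.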
Thus, we see that this ratio is a generalization of the classical approximation ratio that allows specialization of a ratio to the particular kind of problem instances being operated on. Further, the definition of total primal curvature illuminates why this ratio is capable of producing more useful bounds for non-submodular objectives than that of Wang et al: the values encode a product of values that may converge to a limit, depending on problem instance, while the bound uses which does not converge for any (a condition which is implied by non-submodularity).
2.3. The Adaptive Ratio
We conclude this section by extending this ratio to the adaptive case where the decision made at each greedy step takes into account the outcomes of previous decisions. Briefly: in an adaptive algorithm, at each step the algorithm has a partial realization consistent with the true realization (Golovin and Krause, 2011). After each step, this partial realization is updated with the outcome of that step to form . The method for deciding the steps to take is termed a policy, with the greedy algorithm encoded as the greedy policy.
This representation supports the study of algorithms that operate with incomplete information and gradual revelation of the data. The initial motivation was described in terms of placement of sensors that may fail, and this technique has seen further use in studying networks with incomplete topology (Li et al., 2016; Seeman and Singer, ), active learning under noise (Golovin et al., ), and distributed representative subset mining (Mirzasoleiman et al., ).
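A minimal sketch of a greedy policy in the sensor-placement setting mentioned above may help fix ideas. It assumes each sensor works independently with a known probability and that its state is observed right after it is selected; the sensor names, coverage sets, probabilities, and function names are all hypothetical.

```python
def adaptive_greedy(items, k, expected_gain, observe):
    """Greedy policy: pick the item with the largest conditional expected
    marginal gain, observe its outcome, and update the partial realization."""
    partial = {}  # partial realization: selected item -> observed state
    for _ in range(min(k, len(items))):
        candidates = [e for e in sorted(items) if e not in partial]
        best = max(candidates, key=lambda e: expected_gain(e, partial))
        partial[best] = observe(best)
    return partial

# Assumed toy instance: regions covered by each sensor, and working probabilities.
COVER = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {1, 2, 5}}
P_WORK = {"s1": 0.9, "s2": 0.9, "s3": 0.9}

def expected_gain(sensor, partial):
    """Expected number of newly covered regions, given the observed states."""
    covered = set()
    for chosen, works in partial.items():
        if works:
            covered |= COVER[chosen]
    return P_WORK[sensor] * len(COVER[sensor] - covered)

def run(true_state, k=2):
    return adaptive_greedy(COVER, k, expected_gain, lambda s: true_state[s])

print(run({"s1": False, "s2": True, "s3": True}))  # s1 fails, so s3 re-covers its regions
print(run({"s1": True, "s2": True, "s3": True}))   # s1 works, so the policy moves to s2
```

The second selection depends on the observed outcome of the first, which is the behavior a non-adaptive greedy algorithm cannot express.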
We generalize our ratio to this case by defining the adaptive primal curvature of a function in terms of the partial realizations.
Definition 5 (Adaptive Primal Curvature).
The primal curvature of an adaptive monotone non-decreasing function is
[TABLE]
where is the set of possible states of and is the conditional expected marginal gain (Golovin and Krause, 2011).
Definition 6 (Adaptive T.P.C.).
Let and represent the set of possible state sequences leading from to . Then the adaptive total primal curvature is
[TABLE]
This definition leads to the following theorem by similar arguments as Thm. 2.4. However, the operations within expectation require additional care.
Lemma 2.7.
[TABLE]
Proof.
Fix a sequence of length . Then, expanding the product we obtain
[TABLE]
If we take the expectation of this w.r.t. the possible sequences , we obtain the same ratio regardless of , and therefore the claim holds trivially. ∎
Corollary 2.8.
Suppose that . Then
[TABLE]
where is the partial realization resulting from application of the -element greedy policy, , , and is the next element that would be selected by the greedy policy.
Proof.
By Lemma 2.7,
[TABLE]
and thus the statement holds. ∎
Lemma 2.9.
[TABLE]
where is the -truncation of with , selects exactly elements, is the maximum over all possible realizations resulting from applying policy , and .
Proof.
By Corollary 2.8, we have
[TABLE]
where the first equality uses the definition and the second uses the definition of . ∎
Theorem 2.10.
Define . Then
[TABLE]
Proof.
By Lemma 2.9, we have
[TABLE]
Multiply both sides by and sum from to . We get that the left hand side reduces to
[TABLE]
and the right hand side reduces to by employing the identity for partial sums of a geometric series to find that each term of the outer sum has coefficient . Combining these, we directly obtain the statement of the theorem. ∎
3. Conclusion & Future Work
In this paper, we presented a method for estimating the approximation ratio of greedy maximization that works transparently for both submodular and non-submodular functions, in addition to a variant supporting adaptive greedy algorithms. This ratio reduces to $1 - 1/e$ at worst as $k \to \infty$ for submodular functions, and is shown to provide performance bounds for non-submodular maximization.
While we have demonstrated the utility of our technique for understanding the performance of non-submodular maximization, there remains room for further development. Relaxations of the uniformity and monotonicity conditions have found widespread use for submodular functions, and we expect that relaxing them for this ratio would likewise be generally useful.
References
- [2] Christian Borgs, Michael Brautbar, Jennifer Chayes, and Brendan Lucier. 2014. Maximizing Social Influence in Nearly Optimal Time. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA ’14). SIAM.
- [3] Yazan Boshmaf, Ildar Muslukhov, Konstantin Beznosov, and Matei Ripeanu. 2011. The Socialbot Network: When Bots Socialize for Fame and Money. In Proceedings of the 27th Annual Computer Security Applications Conference (ACSAC ’11). ACM, 93–102.
- [4] N. Buchbinder, M. Feldman, J. Naor, and R. Schwartz. 2012. A Tight Linear Time (1/2)-Approximation for Unconstrained Submodular Maximization. In 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science (FOCS).
- [5] Michele Conforti and Gérard Cornuéjols. 1984. Submodular Set Functions, Matroids and the Greedy Algorithm: Tight Worst-Case Bounds and Some Generalizations of the Rado-Edmonds Theorem. Discrete Applied Mathematics 7 (1984).
- [6] Marshall L. Fisher, George L. Nemhauser, and Laurence A. Wolsey. An Analysis of Approximations for Maximizing Submodular Set Functions—II. In Polyhedral Combinatorics. Springer, 73–87.
- [7] Daniel Golovin and Andreas Krause. 2011. Adaptive Submodularity: Theory and Applications in Active Learning and Stochastic Optimization. Journal of Artificial Intelligence Research 42 (2011), 427–486.
- [8] Daniel Golovin, Andreas Krause, and Debajyoti Ray. Near-Optimal Bayesian Active Learning with Noisy Observations. In Advances in Neural Information Processing Systems 23, J. D. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. S. Zemel, and A. Culotta (Eds.). Curran Associates, Inc., 766–774.
