TL;DR
This paper introduces formal fairness definitions in influence maximization, proposes algorithms to ensure fair influence spread across groups, and demonstrates their effectiveness on real-world social network data.
Contribution
It provides a novel fairness framework for influence maximization and develops algorithms that balance influence spread with fairness constraints.
Findings
Fair algorithms significantly reduce disparities among groups.
Standard methods often neglect smaller groups, leading to unfair outcomes.
Proposed methods improve fairness without substantial utility loss.
Group-Fairness in Influence Maximization
Alan Tsang1∗
Bryan Wilder2∗
Eric Rice2 & Milind Tambe2
Yair Zick1 ∗Equal contribution
1Department of Computer Science, National University of Singapore
2Center for AI in Society, University of Southern California
[email protected], {bwilder, ericr, tambe}@usc.edu, [email protected]
Abstract
Influence maximization is a widely used model for information dissemination in social networks. Recent work has employed such interventions across a wide range of social problems, spanning public health, substance abuse, and international development (to name a few examples). A critical but understudied question is whether the benefits of such interventions are fairly distributed across different groups in the population; e.g., avoiding discrimination with respect to sensitive attributes such as race or gender. Drawing on legal and game-theoretic concepts, we introduce formal definitions of fairness in influence maximization. We provide an algorithmic framework to find solutions which satisfy fairness constraints, and in the process improve the state of the art for general multi-objective submodular maximization problems. Experimental results on real data from an HIV prevention intervention for homeless youth show that standard influence maximization techniques oftentimes neglect smaller groups which contribute less to overall utility, resulting in a disparity which our proposed algorithms substantially reduce.
1 Introduction
Influence maximization in social networks is a well-studied problem with applications in a broad range of domains. Consider, for example, a group of at-risk youth; outreach programs try to provide as many people as possible with useful information (e.g., HIV safety, or available health services). Since resources (e.g., social workers) are limited, it is not possible to personally reach every at-risk individual. It is thus important to target key community figures who are likely to spread vital information to others. Formally, individuals are nodes in a social network, and we would like to influence or activate as many of them as possible. This can be done by initially seeding $k$ nodes (where $k$ is smaller than the number of nodes). The seed nodes activate their neighbors with some probability, who in turn activate their neighbors, and so forth. Our goal is to identify $k$ seeds such that the maximal number of nodes is activated. This is the classic influence maximization problem Kempe et al. (2003), which has received much attention in the literature.
In recent years, the influence maximization framework has seen application to many social problems, such as HIV prevention for homeless youth Yadav et al. (2018); Wilder et al. (2018b), public health awareness Valente and Pumpuang (2007), financial inclusion Banerjee et al. (2013), and more. Frequently, small and marginalized groups within a larger community are those who benefit the most from attention and assistance. It is important, then, to ensure that the allocation of resources reflects and respects the diverse composition of our communities, and that each group receives a fair allocation of the community’s resources. For instance, in the HIV prevention domain we may wish to ensure that members of racial minorities or of LGBTQ identity are not disproportionately excluded; this is where our work comes in.
Our Contributions:
This paper introduces the problem of fair resource allocation in influence maximization. Our first contribution is to propose fairness concepts for influence maximization. We start with a maximin concept inspired by the legal notion of disparate impact; formally, it requires us to maximize the minimum fraction of nodes within each group that are influenced. While intuitive and well-motivated, this definition suffers from shortcomings that lead us to introduce a second concept, diversity constraints. Roughly, diversity constraints guarantee that every group receives influence commensurate with its “demand”, i.e., the influence it could have generated on its own. To compute a group’s demand, we allow it a number of seeds proportional to its size, but require that it spread influence using only nodes in the group. Hence, a small but well-connected group may have a better claim for influence than a large but sparsely connected group.
Our second contribution is an algorithmic framework for finding solutions that satisfy either fairness concept. While the classical influence maximization objective is submodular (and hence can be approximately maximized with a greedy algorithm), fairness considerations produce strongly non-submodular objectives. This renders standard techniques inapplicable. We show that both fairness concepts can be reduced to multi-objective submodular optimization problems, which are substantially more complex. Our key algorithmic contribution is a new method for general multi-objective submodular optimization which has a substantially better approximation guarantee than the current best algorithm Udwani (2018), and often better runtime as well. This result may be of independent interest.
Our third contribution is an analytical exploration of the price of group fairness in influence maximization, i.e., the reduction in social welfare with respect to the unconstrained influence maximization problem due to imposing a fairness concept. We show that the price of diversity can be high in general for both concepts and under a range of settings.
Our fourth contribution is an empirical study on real-world social networks that have been used for a socially critical application: HIV prevention for homeless youth. Our results show that standard influence maximization techniques often cause substantial fairness violations by neglecting small groups. Our proposed algorithm substantially reduces such violations at relatively small cost to overall utility.
Related Work:
Kempe et al. (2003) introduced influence maximization and proved that since the objective is submodular, greedily selecting nodes gives a $(1 - 1/e)$-optimal solution. There has since been substantial interest among the AI community both in developing more scalable algorithms (see Li et al. (2018) for a recent survey) and in addressing the challenges of deployment in public health settings Yadav et al. (2016); Wilder et al. (2018a). Recently, such algorithms have been used in real-world pilot tests for HIV prevention amongst homeless youth Yadav et al. (2018); Wilder et al. (2018b), driving home the need to consider fairness as influence maximization is applied in socially sensitive domains. To our knowledge, no previous work considers fairness specifically for influence maximization. The techniques we introduce to optimize fairness metrics are related to research on multi-objective submodular maximization (outside the context of fairness), and we improve existing theoretical guarantees for this general problem Chekuri et al. (2010); Udwani (2018).
Outside of influence maximization, the general idea of diversity as an optimization constraint has received considerable attention in recent years; it has been studied in multiwinner elections (see Bredereck et al. (2018); Faliszewski et al. (2017) for an overview), resource allocation Benabbou et al. (2018); Aghaei et al. (2019), and matching problems Ahmed et al. (2017); Hamada et al. (2017). We note that some of the above works (e.g. Ahmed et al. (2017) and Schumann et al. (2017)) use a submodular objective function as a means of achieving diversity; interestingly, while the classic influence maximization target function is submodular, it is no longer so under diversity constraints. Group fairness has been studied extensively in the voting theory literature, where the objective is to identify a committee of candidates that will satisfy subsets of voters (see a comprehensive overview in Faliszewski et al. (2017)). There have also been several works on group fairness in fair division, defining notions of group envy-freeness Conitzer et al. (2019); Fain et al. (2018); Segal-Halevi and Suksompong (2018); Todo et al. (2011), and a group maximin share guarantee Barman et al. (2019); Suksompong (2018).
2 Model
Agents are embedded in a social network $G = (V, E)$. An edge $(u, v) \in E$ represents the ability of agent $u$ to influence or activate agent $v$. $G$ may be undirected or directed.
Diversity:
Each agent in our network may identify with one or more groups within the larger population. These represent different ethnicities, genders, sexual orientations, or other groups for which fair treatment is important. Our goal is to maximize influence in a way such that each group receives at least a “fair” share of influence (more on this below). Let us designate these groups as $\mathcal{C} = \{C_1, \dots, C_m\}$. Each group $C_i$ represents a non-empty subset of $V$, $\emptyset \neq C_i \subseteq V$. Each agent must belong to at least one group, but may belong to multiple groups; i.e., $\bigcup_{i=1}^{m} C_i = V$. In particular, this allows for the expression of intersectionality, where an individual may be part of several minority groups.
Influence maximization:
We model influence using the independent cascade model Kempe et al. (2003), the most common model in the literature. All nodes begin in the inactive state. The decision maker then selects $k$ seed nodes to activate. Each node that is activated makes one attempt to activate each of its inactive neighbors; each attempt succeeds independently with probability $p$. Newly activated nodes attempt to activate their neighbors and so on, with the process terminating once there are no new activations.
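The cascade process described above can be sketched in a few lines. This is a minimal illustration, assuming an adjacency-list graph and a single propagation probability shared by all edges; the function names and the toy path network are hypothetical and not taken from the paper.

```python
import random
from collections import deque

def simulate_cascade(graph, seeds, p, rng):
    """Run one independent cascade and return the set of activated nodes."""
    active = set(seeds)
    frontier = deque(seeds)
    while frontier:
        u = frontier.popleft()
        for v in graph.get(u, ()):
            # Each activated node gets exactly one attempt per inactive neighbor.
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return active

def estimate_influence(graph, seeds, p, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    total = sum(len(simulate_cascade(graph, seeds, p, rng)) for _ in range(runs))
    return total / runs

# Hypothetical toy network: the undirected path 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

Because the spread is random, the expected influence of a seed set has to be estimated by averaging many simulated cascades, which is why the experiments later average over repeated runs.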
We define the influence of a set of nodes $A \subseteq V$, denoted $I_G(A)$, as the expected number of nodes activated by seeding $A$. Of these, let $I_{G,C_i}(A)$ be the expected number of activated vertices from $C_i$. Traditional influence maximization seeks a set $A \subseteq V$, $|A| \le k$, maximizing $I_G(A)$. Using a slight abuse of notation, let $I_G(k)$ be the maximum influence that can be achieved by selecting $k$ seed nodes. That is, $I_G(k) = \max_{|A| \le k} I_G(A)$. Analogously, we define $I_{G,C_i}(k)$ as the maximum expected number of vertices from $C_i$ that can be activated by $k$ seeds. We now propose two means of capturing group fairness in influence maximization.
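For reference, the standard greedy approach to the traditional problem can be sketched as follows. To keep the example deterministic it assumes that every activation attempt succeeds (propagation probability 1), so influence reduces to counting reachable nodes; in general the influence function would be a Monte Carlo estimate. The helper names and the toy graph are hypothetical.

```python
def reachable(graph, seeds):
    """Nodes reachable from the seeds (influence when every attempt succeeds)."""
    seen, stack = set(seeds), list(seeds)
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def greedy_seeds(nodes, k, influence):
    """Repeatedly add the node with the largest marginal gain in influence."""
    chosen = []
    for _ in range(k):
        base = influence(chosen)
        best = max((v for v in nodes if v not in chosen),
                   key=lambda v: influence(chosen + [v]) - base)
        chosen.append(best)
    return chosen

# Hypothetical toy graph: a star centered at 0 plus a separate pair 4 - 5.
toy = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0], 4: [5], 5: [4]}

def spread(seeds):
    return len(reachable(toy, seeds))

print(greedy_seeds(list(toy), 2, spread))
```

Greedy only looks at total spread, so nothing in this procedure accounts for which groups the activated nodes belong to.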
Maximin Fairness:
Maximin Fairness captures the straightforward goal of improving the conditions for the least well-off groups. That is, we want to maximize the minimum influence received by any of the groups, as proportional to their population. This leads to the following utility function:
$$U^{Maximin}(A) = \min_{i} \frac{I_{G,C_i}(A)}{|C_i|}$$
Subject to this maximin constraint, we seek to maximize overall influence. Thus, we define $I_G^{Maximin} = I_G(B)$ with $B = \arg\max_{A \subseteq V, |A| \le k} U^{Maximin}(A)$. That is, $I_G^{Maximin}$ is the expected number of nodes activated by a seed configuration that maximizes the minimum proportional influence received by any group. This corresponds to the legal concept of disparate impact, which roughly states that a group has been unfairly treated if their “success rate” under a policy is substantially worse than other groups (see Barocas and Selbst (2016) for an overview). Therefore, maximin fairness may be significant to governmental or community organizations which are constrained to avoid this form of disparity. However, optimizing for equality of outcomes may be undesirable when some groups are simply much better suited than others to a network intervention. For instance, if one group is very poorly connected, maximin fairness would require that a large number of seeds be spent trying to reach this group, even though additional seeds have relatively small impact.
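As a small illustration of the maximin objective, the sketch below computes the minimum fraction influenced across groups from already-estimated group influence values. The group names, sizes, and influence numbers are hypothetical.

```python
def maximin_utility(group_influence, group_sizes):
    """Smallest fraction of any group that is influenced in expectation."""
    return min(group_influence[g] / group_sizes[g] for g in group_sizes)

# Hypothetical numbers: expected activated nodes per group under two seedings.
sizes = {"A": 40, "B": 10}
influence_unconstrained = {"A": 12.0, "B": 1.0}  # total 13.0, favors group A
influence_balanced = {"A": 10.0, "B": 2.5}       # total 12.5, evens out fractions

print(maximin_utility(influence_unconstrained, sizes))
print(maximin_utility(influence_balanced, sizes))
```

In this made-up case the balanced seeding gives up half a node of total expected influence but raises the worst-off group's fraction from 0.1 to 0.25.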
Diversity Constraints:
We now propose an alternate fairness concept by extending the notion of individual rationality to Group Rationality. The key idea is that no group should be better off by leaving the (influence maximization) game with their proportional allocation of resources and allocating them internally. For each group $C_i$, let $k_i = \lceil k |C_i| / |V| \rceil$ be the number of seeds that would be fairly allocated to the group based on the group’s size within the larger population, rounded up to remove any doubt that this group receives a fair share. We call $k_i$ the fair allocation of seeds to the group.
Let $G[C_i]$ be the subgraph induced from $G$ by the nodes $C_i$. This represents the network formed by group $C_i$ if they were to separate from the original network. Now, we define the group rational influence that each group can expect to receive as the number of nodes they expect to activate if they left the network, with their fair allocation of seeds. We denote this group rational influence for $C_i$ as $I_{G[C_i]}(k_i)$. Then, we devise a set of diversity constraints that any group rational seeding configuration $A$ with $k$ seeds must satisfy: $I_{G,C_i}(A) \ge I_{G[C_i]}(k_i)$ for all $i$. That is, the influence received by each group is at least equal to what each group may accomplish on its own when given its fair share of seed nodes.
The diversity constraint objective function is to maximize the expected number of nodes activated, subject to the above diversity constraint. The utility for selecting seed nodes is:
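$$U^{Rational}(A) = \begin{cases} I_G(A) & \text{if } I_{G,C_i}(A) \ge I_{G[C_i]}(k_i) \text{ for all } i, \\ 0 & \text{otherwise.} \end{cases}$$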
The maximum expected influence obtained via a group rational seeding configuration is called the rational influence $I_G^{Rational}$, where $I_G^{Rational} = \max_{|A| \le k} U^{Rational}(A)$.
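The fair seed allocation and the diversity-constraint utility can be sketched as below. This assumes the per-group demands (what each group could achieve alone with its fair share of seeds) have already been estimated separately; all names and numbers are hypothetical.

```python
def fair_allocation(k, group_sizes, n):
    """Seeds per group, proportional to group size and rounded up."""
    return {g: (k * size + n - 1) // n for g, size in group_sizes.items()}

def rational_utility(total_influence, group_influence, group_demand):
    """Total influence if every group meets its demand, otherwise zero."""
    satisfied = all(group_influence[g] >= group_demand[g] for g in group_demand)
    return total_influence if satisfied else 0.0

# Hypothetical population of 60 nodes split into two groups, with 5 seeds.
allocation = fair_allocation(5, {"A": 45, "B": 15}, 60)
print(allocation)

# Hypothetical demands and achieved influence for one seeding.
demand = {"A": 10.0, "B": 4.0}
print(rational_utility(20.0, {"A": 12.0, "B": 3.0}, demand))  # B falls short
print(rational_utility(19.0, {"A": 11.0, "B": 4.5}, demand))  # both satisfied
```

Because each allocation is rounded up, the per-group allocations can sum to more than the overall budget; they are only used to compute each group's demand.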
Price of Fairness:
To measure the cost of ensuring a fair outcome for the diverse population, we will measure the Price of Fairness, the ratio of optimal influence to the best achievable influence under our two fairness criteria. Here optimal influence is $I_G^{OPT} = I_G(k)$, which is the maximum amount of expected influence that can be obtained using any choice of $k$ seed nodes. We omit the subscript $G$ where the context is clear.
$$PoF^{Rational} = \frac{I^{OPT}}{I^{Rational}}, \qquad PoF^{Maximin} = \frac{I^{OPT}}{I^{Maximin}}$$
3 Optimization
The standard approach to influence maximization is based on submodularity. Formally, a set function $f$ on ground set $V$ is submodular if for every $A \subseteq B \subseteq V$ and $x \in V \setminus B$, $f(A \cup \{x\}) - f(A) \ge f(B \cup \{x\}) - f(B)$. This captures the intuition that additional seeds provide diminishing returns. However, both of our fairness concepts are easily shown to violate this property (proofs are deferred to the appendix):
Theorem 3.1.
$U^{Maximin}$ and $U^{Rational}$ are not submodular.
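The failure of submodularity can be checked by brute force on a tiny instance. The sketch below assumes a deliberately trivial setting with no network effects (each seed activates only itself) and two single-node groups; it is a toy illustration and not the counterexample used in the appendix.

```python
from itertools import combinations

def subsets(items):
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_submodular(f, ground):
    """Check f(A + x) - f(A) >= f(B + x) - f(B) for all A within B, x outside B."""
    ground = frozenset(ground)
    for B in subsets(ground):
        for A in subsets(B):
            for x in ground - B:
                if f(A | {x}) - f(A) < f(B | {x}) - f(B) - 1e-12:
                    return False
    return True

# Hypothetical groups: node "a" forms one group, node "b" the other.
group_1, group_2 = {"a"}, {"b"}

def total(S):
    return len(S & group_1) / len(group_1) + len(S & group_2) / len(group_2)

def maximin(S):
    return min(len(S & group_1) / len(group_1), len(S & group_2) / len(group_2))

print(is_submodular(total, {"a", "b"}))    # sum of group fractions
print(is_submodular(maximin, {"a", "b"}))  # minimum of group fractions
```

The minimum stays at zero until both groups are covered, so the second seed has a larger marginal gain than the first, which is exactly the pattern submodularity forbids.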
Hence, we cannot apply the greedy heuristic to group-fair influence maximization. However, we now show that optimizing either utility function reduces to multiobjective submodular maximization, for which we give an improved algorithm below. Consider the following generic problem: given monotone submodular functions $f_1, \dots, f_m$ and corresponding target values $W_1, \dots, W_m$, find a set $S$ satisfying $|S| \le k$ with $f_i(S) \ge W_i$ for all $i$ (under the promise that such an $S$ exists). Roughly, $f_i$ will be group $i$’s utility, and $W_i$ will be the utility that we want to guarantee for $i$. Suppose that we have an algorithm for the above multiobjective problem. Then, we can optimize the maximin objective by letting $f_i = I_{G,C_i} / |C_i|$ and binary searching for the largest $W$ such that $W_i = W$ is feasible for all groups $i$. For diversity constraints, we let $f_i = I_{G,C_i}$ and set the target $W_i = I_{G[C_i]}(k_i)$. We then add another objective function representing the combined utility, $I_G$, and binary search for the highest target value for it such that all targets are feasible. This represents the largest achievable total utility, subject to diversity constraints. Having reduced both fairness concepts to multiobjective submodular maximization, we now give an improved algorithm for this core problem.
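The maximin reduction above can be sketched as a binary search around a feasibility routine. In this sketch the feasibility routine is an exhaustive search, used purely as a stand-in for the multiobjective algorithm, and the objectives are toy coverage functions normalized to lie between 0 and 1; every name and number here is hypothetical.

```python
from itertools import combinations

def feasible(objectives, targets, items, k):
    """Brute-force stand-in: return a k-set meeting every target, or None."""
    for S in combinations(items, k):
        if all(f(set(S)) >= w for f, w in zip(objectives, targets)):
            return set(S)
    return None

def maximin_by_binary_search(objectives, items, k, tol=1e-3):
    """Binary search for the largest common target that remains feasible."""
    lo, hi = 0.0, 1.0
    best = feasible(objectives, [lo] * len(objectives), items, k)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        S = feasible(objectives, [mid] * len(objectives), items, k)
        if S is not None:
            lo, best = mid, S
        else:
            hi = mid
    return lo, best

# Hypothetical coverage instance: each item covers some elements.
covers = {"x": {1, 2, 3}, "y": {4, 5}, "z": {6}}
groups = [{1, 2, 3, 4}, {5, 6}]

def make_objective(group):
    def f(S):
        covered = set().union(*(covers[i] for i in S))
        return len(covered & group) / len(group)
    return f

objectives = [make_objective(g) for g in groups]
value, chosen = maximin_by_binary_search(objectives, list(covers), k=2)
print(value, chosen)
```

Replacing the exhaustive search with an approximation algorithm changes only the `feasible` call; the outer binary search stays the same.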
The multiobjective submodular problem was introduced by Chekuri et al. (2010), who gave an algorithm which guarantees $f_i(S) \ge (1 - 1/e - \epsilon) W_i$ for all $i$ provided that the number of objectives $m$ is smaller than the budget $k$ (when $m = \Omega(k)$, the problem is provably inapproximable Krause et al. (2008)). Unfortunately, this algorithm is of mostly theoretical interest because its running time is prohibitive in practice. Udwani (2018) recently introduced a practically efficient algorithm; however, it obtains an asymptotic $(1 - 1/e)^2$-approximation instead of the optimal $(1 - 1/e)$. We close this gap by providing a practical algorithm obtaining an asymptotic $(1 - 1/e)$-approximation (Algorithm 1). Its runtime is comparable to, and under many conditions faster than, the algorithm of Udwani (2018). We present the high-level idea behind the algorithm here, with additional details in the appendix.
Previous algorithms Chekuri et al. (2010); Udwani (2018) start from a common template in submodular optimization, which we also build on. The main idea is to relax the discrete problem to a continuous space. For a given submodular function $f$, its multilinear extension $F$ is defined on $n$-dimensional vectors $x$ where $0 \le x_j \le 1$ for all $j$. $x_j$ represents the probability that item $j$ is included in the set. Formally, let $S \sim x$ denote a random set which includes each item $j$ independently with probability $x_j$. Then, we define $F(x) = \mathbb{E}_{S \sim x}[f(S)]$, which can be evaluated using random samples.
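The sampling-based evaluation of the multilinear extension mentioned above can be sketched as follows. The sample count, the dictionary representation of the fractional point, and the toy set function are assumptions made for illustration.

```python
import random

def multilinear_extension(f, x, samples=2000, seed=0):
    """Estimate the expected value of f on a random set drawn according to x.

    x maps each item to its inclusion probability; items are included
    independently of one another.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        S = {j for j, prob in x.items() if rng.random() < prob}
        total += f(S)
    return total / samples

# Hypothetical set function: the number of selected items.
print(multilinear_extension(len, {"a": 1.0, "b": 0.0}))  # integral point
print(multilinear_extension(len, {"a": 0.5, "b": 0.5}))  # fractional point
```

At an integral point the estimate coincides with the value of the original set function, which is what makes rounding a fractional solution back to a discrete set meaningful.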
The main challenge is to solve the continuous optimization problem, which is where our technical contribution lies. Algorithm 1 describes the high-level procedure, which runs our continuous optimization subroutine (line 2) and then rounds the output to a discrete set (line 3). Line 1, which ensures that all items with value above a threshold are included in the solution, is a technical detail needed to ensure the rounding succeeds. The rounding process captured in lines 1 and 3 is fairly standard and used by both previous algorithms Chekuri et al. (2010); Udwani (2018). Our main novelty lies in an improved algorithm for the continuous problem, MultiFW.
MultiFW implements a Frank-Wolfe style algorithm to simultaneously optimize the multilinear extensions of the discrete objectives. The algorithm proceeds over iterations. Each iteration first identifies , a good feasible point in continuous space (Algorithm 2, line 3). Then, the current solution is updated to add (line 4). The final output is an approximate decomposition of into integral points, produced using the algorithm of Mirrokni et al. (2017). This is a technical step required for the rounding procedure.
The key challenge is to efficiently find a that makes sufficient progress towards every objective simultaneously. We accomplish this by introducing the subroutine S-SP-MD (lines 6-12), which runs a carefully constructed version of stochastic saddle-point mirror descent Nemirovski et al. (2009). The idea is to find a for which is large enough for all objectives . We convert this into the saddle point problem of maximizing . denotes the set of objectives where (i.e., those where we still need to make progress). We let denote the set of all distributions over . Our approach only requires stochastic gradients, a necessary feature since computing exactly may be intractable when the objective itself is randomized (as in influence maximization).
Specifically, we assume access to two gradient oracles. First, a stochastic gradient oracle for each multilinear extension . Given a point , satisfies . Second, a stochastic gradient oracle corresponding to each item (in influence maximization, the items are the potential seed nodes). satisfies . We assume that for some constant . Linear-time oracles are available for many common submodular maximization problems (e.g., coverage functions and facility location Karimi et al. (2017)). Given such oracles, we implement a stochastic mirror descent algorithm for the maximin problem. We can interpret the algorithm as solving a game between the max player and the min player. The max player controls , while the min player controls a variable representing the weight put on each objective. Intuitively, the min player will put large weights where the max player is doing badly, forcing the max player to improve . Formally, in each iteration, the players take exponentiated gradient updates (lines 8-12). The max player obtains a stochastic gradient by sampling an objective with probability proportional to the current weights , while the min player samples an item proportional to and uses that item’s contribution to estimate the max player’s current performance on each objective. We prove that these updates converge rapidly to the optimal . With the subroutine in hand, our main algorithmic result is the following guarantee for Algorithm 1. Here, is the maximum value of a single item.
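The min player's behavior can be illustrated with a single exponentiated (multiplicative-weights) update. This is a simplified, deterministic sketch: it uses exact per-objective values in place of the stochastic estimates described above, and the step size and values are hypothetical. It is not the S-SP-MD subroutine itself.

```python
import math

def exponentiated_update(weights, values, eta):
    """Shift weight toward objectives on which the max player is doing badly."""
    scaled = [w * math.exp(-eta * v) for w, v in zip(weights, values)]
    total = sum(scaled)
    return [s / total for s in scaled]

# Hypothetical state: two objectives, equal weights, progress only on the first.
weights = [0.5, 0.5]
progress = [1.0, 0.0]
new_weights = exponentiated_update(weights, progress, eta=1.0)
print(new_weights)
```

After the update the neglected second objective carries more weight, so a max player responding to the weighted objective is pushed to improve it.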
Theorem 3.2.
Given a feasible set of target values , Algorithm 1 outputs a set such that with probability at least . Asymptotically as , the approximation ratio can be set to approach so long as . The algorithm requires -accurate value oracle calls, -accurate value oracle calls, calls to and , and additional work.
This says that Algorithm 1 asymptotically converges to a $(1 - 1/e)$-approximation when the budget is larger than the number of objectives (i.e., the conditions under which the problem is approximable). All terms in the approximation ratio are identical to those of Udwani (2018), except that we improve their $(1 - 1/e)^2$ factor to $(1 - 1/e)$. The runtime is also identical apart from the time to solve the continuous problem (MultiFW vs their corresponding subroutine). This is difficult to compare since our respective algorithms use different oracles to access the functions. However, both kinds of oracles can typically be (approximately) implemented in time . Udwani’s algorithm uses oracle calls, while ours requires . For large-scale problems, typically grows much faster than , , and (all of which are often constants, or near-so). Hence, trading runtime for can represent a substantial improvement. We present a more detailed discussion in the appendix.
To instantiate Algorithm 1 for influence maximization, we just need to supply appropriate stochastic gradient oracles. To our knowledge, no such oracles were previously known for influence maximization, which is substantially more complicated than other submodular problems because of additional randomness in the objective; naive extensions of previous methods require time. We provide efficient time stochastic gradient oracles by introducing a randomized method to simultaneously estimate many entries of the gradient at once (details may be found in the appendix).
4 Price of Fairness
In this section, we show that the Price of Fairness can be unbounded under both fairness definitions; moreover, allowing nodes to join multiple groups can, counter-intuitively, worsen the PoF. The proofs in this section use undirected graphs. As undirected graphs are the more restrictive setting, the results naturally hold for directed graphs as well.
Theorem 4.1.
As and , .
Proof.
We construct a graph with two parts. In Part , we have vertices all disjoint except for two vertices; label one of these . In Part , we have a star with nodes. Label a leaf node and the central node . We define two groups: is comprised of the degree-1 vertices of , and for the remaining vertices, which includes the vertices of and the central vertex of the star. There are seeds, and since , they each have a fair allocation of seeds. The figure below illustrates this network.
We are interested in two seeding configurations: and . We can verify that configuration is fair. The activates nodes in Part , and in Part , for a total of .
Now consider configuration . receives influence, and since , does not receive its group rational share of influence. However, we can verify that this seeding is optimal. Part receives influence, and Part receives . Therefore, .
We may then calculate our Price of Fairness:
[TABLE]
And if we take the limit as , , . Finally, as as , . ∎
The appendix details a similar result for Maximin Fairness:
Theorem 4.2.
$PoF^{Maximin}$ is unbounded.
Frequently, an individual may identify with multiple groups. Intuitively, we might expect such multi-group membership to improve the influence received by different groups and make group fairness easier to achieve (see the appendix for an example). However, we now show that this is not always true: giving even a single node membership in a second group can cause the Price of Fairness to worsen by an arbitrarily large amount.
Theorem 4.3.
Given graphs with groups and , and with groups and , where , and is obtained from by the addition of one vertex (, ). It is possible for .
Proof.
Consider a graph with two components: one component contains 2 vertices joined by an edge; the other component is a star with vertices (). There are two groups: contains all degree-1 vertices from and one vertex from ; contains the other vertex from and the central vertex from . There is one seed ($k = 1$), and the fair allocation of seeds to each group is $1$.
Since the induced subgraphs for both groups consist only of isolated nodes, the group rational influence for each group is $1$. Therefore, the seed set is both fair and optimal, giving an expected influence of .
Now, let us modify by letting belong to both communities to obtain , and communities and . The group rational influence for remains the same (its members have not changed) but has increased to (by seeding ). In fact, this forces the fair allocation to seed instead of , for a fair influence of .
As , . ∎
A more technical construction can demonstrate a similar result for Maximin Fairness, but only as ; that is, as approaches . The proof is provided in the appendix.
Theorem 4.4.
Given graphs with groups and , and with groups and , where and is obtained from by the addition of one vertex (, ). It is possible for .
5 Experimental results
We now investigate the empirical impact of considering fairness in influence maximization. We start with experiments on a set of four real-world social networks which have been previously used for a socially critical application: HIV prevention for homeless youth. Each network has 60-70 nodes, and represents the real-world social connections between a set of homeless youth surveyed in a major US city. Each node in the network is associated with demographic information: their birth sex, gender identity, race, and sexual orientation. Each demographic attribute gives a partition of the network into anywhere from 2 to 6 different groups. For each partition, we compare three algorithms: the standard greedy algorithm for influence maximization, which maximizes the total expected influence (Greedy), Algorithm 1 used to enforce diversity constraints (DC), and Algorithm 1 used to find a maximin fair solution (Maximin). We set the propagation probability to be and fixed seeds (varying these parameters had little impact). We average over 30 runs of the algorithms on each network (since all of the algorithms use random simulations of influence propagation), with error bars giving bootstrapped 95% confidence intervals.
Figure 1 (top) shows that the choice of solution concept has a substantial impact on the results. For the diversity constraints case, we summarize the performance of each algorithm by the mean percentage violation of the constraints over all groups. For the maximin case, we directly report the minimum fraction influenced over all groups. We see that greedy generates substantial unfairness according to either metric: it generates the highest violations of diversity constraints, and has the smallest minimum fraction influenced. Greedy actually obtains near-zero maximin value with respect to sexual orientation. This results from it assigning one seed to a minority group in a single run and zero in others.
DC performs well across the board: it reduces constraint violations by approximately 55-65% while also performing competitively with respect to the maximin metric (even without explicitly optimizing for it). As expected, the Maximin algorithm generally obtains the best maximin value. DC actually attains slightly better maximin value for one attribute (birthsex); however, the difference is within the confidence intervals and reflects slight fluctuations in the approximation quality of the algorithms. However, Maximin performs surprisingly poorly with respect to diversity constraint violations. This indicates that optimizing exclusively for equal influence spread may force the algorithm to focus on poorly connected groups which exhibit severe diminishing returns. DC is able to attain almost as much influence in such groups but is then permitted to focus its remaining budget for higher impact. Interestingly, the price of fairness is relatively small for both solution concepts, in the range 1.05-1.15 (though it is higher for maximin than for DC). This indicates that while standard influence maximization techniques can introduce substantial fairness violations, mitigating such violations may be substantially less costly in real world networks than the theoretical worst case would suggest.
Finally, the rightmost plot in the top row of Figure 1 explores an example with overlapping groups. Specifically, we consider the race and birthsex attributes so that each node belongs to two groups. Constraint violations are somewhat higher than for either attribute individually, but the price of fairness remains small (1.07 for DC and 1.13 for Maximin).
In Figure 1 (bottom), we examine 20 synthetic networks used by Wilder et al. (2018c) to model an obesity prevention intervention in the Antelope Valley region of California. Each node in the network has a geographic region, ethnicity, age, and gender, and nodes are more likely to connect to those with similar attributes. Each network has nodes and we set . Overall the results are similar to the homeless youth networks. One exception is the high price of fairness that maximin suffers with respect to the “region” attribute (over 1.4), but the other values are relatively low (below 1.2). We also observe that greedy obtains the (slightly) best maximin performance for gender, likely because the network is sufficiently well-mixed across genders that fairness is not a significant concern (as confirmed by the extremely low DC violations). Absent true fairness concerns, greedy may perform slightly better since it solves a simpler optimization problem. However, in the last figure, we examine overlapping groups given by region and ethnicity and observe that greedy actually obtains zero maximin value, indicating that there is one group that it never reached across any run.
6 Conclusions
In this paper, we examine the problem of selecting key figures in a population to ensure the fair spread of vital information across all groups. This problem modifies the classic influence maximization problem with additional fairness provisions based on legal and game theoretic concepts. We examine two methods for determining these provisions, and show that the “Price of Fairness” for these provisions can be unbounded. We propose an improved algorithm for multiobjective maximization to examine this problem on real world data sets. We show that standard influence maximization techniques often neglect smaller groups, and a diversity constraint based algorithm can ensure these groups receive a fair allocation of resources at relatively little cost. As automated techniques become increasingly prevalent in society and governance, our technique will help ensure that small and marginalized groups are fairly treated.
7 Appendix
Appendix A Price of fairness
Theorem 4.2.
* is unbounded.*
Proof.
Consider a graph with two components: which consists of 2 connected vertices, and which is a star with nodes. Let the first group have only one node in . All remaining nodes belong to the second group , including one node in and the central node of the star . We have seed.
It is clear that the optimal seeding configuration is to seed , which gives . However, this is not a maximin fair seeding, as receives 0 influence. Instead, seeding is maximin fair, giving influence and influence, giving a maximin utility . In this case, .
As , becomes unboundedly large. ∎
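The construction in this proof can be checked numerically. The following is a minimal sketch under assumptions that are not visible in the text above: propagation is deterministic (every edge is live), the star has n - 2 vertices so that the graph has n vertices in total, and the price of fairness is taken as the ratio of the optimal total influence to the total influence of the maximin-fair seeding. The function name `price_of_fairness` is hypothetical.

```python
from fractions import Fraction


def price_of_fairness(n):
    """Price of maximin fairness on the two-component example (assumes p = 1)."""
    # Component C1: two connected vertices, one in group 1 and one in group 2.
    # Component C2: a star on n - 2 vertices, all of them in group 2.
    group_sizes = {1: 1, 2: n - 1}
    # With deterministic propagation a single seed activates its whole component.
    # reach[seed][g] = number of nodes of group g influenced by that seed.
    reach = {
        "star_center": {1: 0, 2: n - 2},
        "c1_vertex": {1: 1, 2: 1},
    }

    def total(seed):
        return sum(reach[seed].values())

    def maximin(seed):
        # Smallest fraction of any group that is influenced.
        return min(Fraction(reach[seed][g], group_sizes[g]) for g in group_sizes)

    optimal_seed = max(reach, key=total)
    fair_seed = max(reach, key=maximin)
    return Fraction(total(optimal_seed), total(fair_seed))


for n in (10, 102, 1002):
    print(n, price_of_fairness(n))
```

Under these assumptions the ratio equals (n - 2) / 2, so it grows without bound as the star grows, which is the behaviour the theorem describes.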
Theorem 3.1.
* and are not submodular.*
We divide the proof of this theorem into two parts:
Conjecture A.
Maximin utility is not submodular.
Proof.
Let us consider a graph with 4 nodes where form community and form community . Let and be two possible seeding configurations.
Notice that receives influence in both configurations, which is weakly less than the influence received by , and so, .
Now, consider adding to the and . since remains incompletely seeded. But since both groups are fully seeded. ∎
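A brute-force check makes this kind of counterexample easy to reproduce. The sketch below is illustrative only: it assumes an edgeless 4-node graph (so only seeded nodes are influenced) with hypothetical communities `C1 = {a, b}` and `C2 = {c, d}`, which need not match the graph used in the proof. The helper `find_submodularity_violation` is also hypothetical; it searches for sets A ⊆ B and an element v outside B whose marginal gain at B exceeds its gain at A.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def find_submodularity_violation(f, ground):
    """Return (A, B, v) with A a subset of B and gain at B > gain at A, else None."""
    for B in subsets(ground):
        for A in subsets(sorted(B)):
            for v in ground:
                if v in B:
                    continue
                gain_at_A = f(A | {v}) - f(A)
                gain_at_B = f(B | {v}) - f(B)
                if gain_at_B > gain_at_A:
                    return A, B, v
    return None


# Assumed toy instance: no edges, so the influenced set equals the seed set.
GROUPS = {"C1": {"a", "b"}, "C2": {"c", "d"}}
GROUND = ["a", "b", "c", "d"]


def maximin_utility(seeds):
    # Fraction of the worst-off group that is influenced.
    return min(Fraction(len(seeds & members), len(members))
               for members in GROUPS.values())


def total_utility(seeds):
    # Total influence; modular (hence submodular) on an edgeless graph.
    return len(seeds)


print(find_submodularity_violation(maximin_utility, GROUND))
print(find_submodularity_violation(total_utility, GROUND))
```

On this toy instance the search finds a violation for the maximin utility and none for the total influence, since taking a minimum over groups can make a seed more valuable once another group has already been reached.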
Conjecture B.
Group rational utility is not submodular.
Proof.
Recall the definition of group rational utility:
[TABLE]
Let us consider the same graph as in Conjecture A with 4 nodes where form community , and forms community . seeds are available, so the group rational constraints are satisfied only by seeding all vertices.
Let and . It is easy to verify that since none of these satisfy all group rational constraints. However, , and therefore for , which contradicts the definition of submodularity.
∎
Theorem 4.4.
Given graphs with groups and , and with groups and , where and is obtained from by the addition of one vertex (, . It is possible for .
Proof.
Consider a graph with two star components: with vertices with a central node , and with vertices with central node (). There are two groups: contains 2 vertices, and a non-central node from ; contains remaining vertices, including . There is one seed (), and a total of nodes.
It is easy to see that the Maximin configuration is to seed , which gives influence, and influence. This gives a Maximin influence . (Footnote 1: We do not need to calculate explicitly at any point in this proof, as it is not required for the proof to work.)
Now, consider a modified graph , but with our groups modified by allowing to belong to both communities. That is, and . The Maximin configuration has two possibilities: either remains the Maximin configuration, or becomes the new Maximin configuration. In order for the latter case to be true, seeding must provide higher proportional influence to the least well-off group than seeding .
Seeding generates influence for , and influence for . Seeding generates influence for , and . It can be shown that for and , these conditions are satisfied and is the Maximin configuration, generating a total of .
Then,
[TABLE]
And therefore, as , i.e. approaches from the left, the addition of a node to a second group may cause the Price of Maximin Fairness to worsen by an arbitrarily large amount. ∎
Appendix B Analysis of multiobjective submodular maximization problem
Consider a collection of monotone submodular functions with corresponding multilinear extensions . We will assume that the maximum singleton value of any item in the ground set is bounded as for all . Suppose that we are given a target value for each and would like to find a set with which guarantees for all . We are promised that such an exists. We will give an approximation algorithm for this problem which improves in terms of both runtime and approximation ratio on the best current algorithms, given by Udwani [2018], who in turn builds on the work of Chekuri et al. [2010].
Our algorithm follows the overall template of Udwani [2018], which carries out three steps (given a precision level ).
1. Make a pass over the ground set, maintaining a set . Add to every item which has value at least for some .
2. Define to be the uniform matroid polytope for budget . Use a subroutine to find a point satisfying for all and some approximation ratio . This is the key step where we improve the runtime and approximation ratio.
3. Round to a set using the swap rounding algorithm of Chekuri et al. [2010]. Output .
Our primary technical contribution is an algorithm for the second step which guarantees . It uses access to three kinds of stochastic oracles for the functions and their multilinear extensions:
1. A stochastic value oracle for singletons corresponding to each . Given an item , this oracle returns a value with and .
2. A stochastic gradient oracle for each multilinear extension . Given a point , satisfies and .
3. A stochastic gradient oracle corresponding to each item . Given a point , satisfies and . Note that this can be simulated from the above oracle, but may sometimes admit more efficient implementations.
We now analyze this algorithm. We start by recalling a technical lemma on the smoothness of the multilinear extension:
Lemma A (Hassani et al. [2017], Lemma C.1).
For any monotone submodular set function and its multilinear extension , where . That is, is -smooth with respect to the norm.
Lemma B.
* is -Lipschitz in the norm.*
Proof.
Recall that CITE, where denotes including each in independently with probability . By submodularity, . Hence, which proves the lemma. ∎
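The definition recalled in this proof, the multilinear extension as the expected value of the set function on a random set that includes each item independently with its coordinate's probability, can be estimated by plain Monte Carlo sampling. The sketch below is only an illustration: the function name `multilinear_extension`, the sample count, and the small coverage function used as the submodular objective are all assumptions, not part of the paper.

```python
import random


def multilinear_extension(f, x, num_samples=1000, rng=None):
    """Monte Carlo estimate of F(x) = E[f(R)], where item i is in R w.p. x[i]."""
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(num_samples):
        # Include each item independently with probability x[i].
        sample = {i for i, p in enumerate(x) if rng.random() < p}
        total += f(sample)
    return total / num_samples


# Hypothetical monotone submodular function: size of the union of covered sets.
COVER = [{1, 2}, {2, 3}, {3, 4}]


def coverage(items):
    covered = set()
    for i in items:
        covered |= COVER[i]
    return len(covered)


print(multilinear_extension(coverage, [1.0, 0.0, 1.0]))   # integral point
print(multilinear_extension(coverage, [0.5, 0.5, 0.5]))   # fractional point
```

At an integral point the random set is deterministic, so the estimate coincides with the set function itself; at fractional points it is a noisy estimate whose accuracy improves with the number of samples.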
Next, we show a guarantee for the output of mirror descent in step 2(a).
Lemma C.
For some , suppose that there exists a such that for all . Then, S-SP-MD returns a satisfying for all with probability . There are iterations, each requiring one call to oracles and for some and , and additional work.
Proof.
Our objective is to find a satisfying , under the guarantee that such a exists. Note that we call S-SP-MD only on the set of indices where . For all other indices, where the current solution is already within of the target, monotonicity of the guarantees that .
The feasibility problem on the groups in is equivalent to solving the maxmin problem
[TABLE]
To see this, let denote the optimal value for the maxmin problem; we are guaranteed . If we have with maxmin value at least , then satisfies
[TABLE]
We now prove that S-SP-MD produces a with maxmin value at least . Let be a matrix where column is for each , and define . Let be the -dimensional probability simplex. We would like to solve the problem
[TABLE]
which is easily seen to be equivalent to the original maxmin problem.
We will solve the above saddle point problem by running stochastic saddle point mirror descent with the negative entropy mirror map on the function . We obtain stochastic estimates of and via calls to the input oracles. First, note that
[TABLE]
where denotes drawing index with probability (recall that is a probability distribution). Hence, we can obtain an estimate of by sampling and returning . We are guaranteed . We take a similar strategy for : (since is a probability distribution). Hence, we can sample and return . This satisfies .
Note that we can bound the diameter of with respect to the mirror map by (see Hassani et al. [2017]) and the diameter of by (see Nemirovski et al. [2009]). We will run mirror descent for iterations. Let and . Now applying Proposition 3.2 of Nemirovski et al. [2009] implies that after iterations we have
[TABLE]
and so taking ensures that
[TABLE]
holds with probability at least .
∎
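For the simplex-constrained player, mirror descent with the negative entropy mirror map reduces to the familiar multiplicative (exponentiated gradient) update. The sketch below shows only that generic update on a fixed, made-up gradient vector; the step size, the iteration count, and the name `entropic_md_step` are assumptions, and the step sizes and stochastic gradients actually used by S-SP-MD are not reproduced here.

```python
import math


def entropic_md_step(y, grad, step_size):
    """One mirror descent step on the probability simplex (negative entropy map)."""
    # Multiplicative update followed by renormalisation onto the simplex.
    weights = [yi * math.exp(-step_size * gi) for yi, gi in zip(y, grad)]
    normaliser = sum(weights)
    return [w / normaliser for w in weights]


# Hypothetical per-group values; the minimising player shifts weight
# toward the group with the smallest value.
group_values = [0.9, 0.2, 0.5]
y = [1.0 / 3] * 3
for _ in range(50):
    y = entropic_md_step(y, group_values, step_size=0.5)
print(y)
```

With a fixed gradient the distribution concentrates on the coordinate with the smallest value, which matches the intuition that the weight vector in the saddle point problem focuses on the worst-off group.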
Theorem D.
Suppose that there exists some satisfying for all . Then, after iterations, the algorithm returns a point satisfying for all . Each iteration requires one call to mirror descent at success probability and precision level , -accurate value oracle calls, and additional work.
Proof.
We analyze the progress that the algorithm makes with respect to each over a single step . Using the guarantee for the subroutine mirror descent (run with a precision level to be set below), and assuming that the values are feasible, we have with probability at least
[TABLE]
which implies
[TABLE]
and so after steps
[TABLE]
holds with probability at least via union bound. Taking , , and running mirror descent with success probability at each iteration ensures that
[TABLE]
holds for all with probability at least , which completes the guarantee for the solution quality. To obtain the bound on additional work done by the algorithm, we note that the only operation performed besides calling mirror descent is adding to the current iterate, which takes time . ∎
Theorem E.
Given a feasible set of target values , Algorithm 1 outputs a set such that with probability at least . Asymptotically as , the approximation ratio can be set to approach so long as . The algorithm requires -accurate value oracle calls, -accurate value oracle calls, calls to and , and additional work.
Proof.
ThresholdInclude produces a set for which each item satisfies for some , and any satisfies for all . Note that there can be at most items with for any given (combining submodularity with our WLOG assumption that is upper bounded by ). Hence, . Define .
Now we lower bound the marginal gain of the fractional vector returned by MultiobjectiveFW. So long as the target values are feasible, we are guaranteed that . for all . To see feasibility, let be the promised set satisfying the overall feasibility problem (i.e., for all ). Let denote the indicator vector of the set . We have that , and . Using Corollary 3 of Udwani [2018], the point satisfies . is also feasible for the continuous problem since . Now applying Theorem D guarantees that with probability at least .
Lastly, we need to handle the rounding process. We first take the point and approximately decompose it into a convex combination of integral points of . This is done using the algorithm of Mirrokni et al. [2017], which produces a point satisfying along with a decomposition of into integral points of (Mirrokni et al. [2017], Proposition 5.1). If we run this algorithm with precision level , Lemma B guarantees that for all and hence . Applying Lemma 2 of Udwani [2018] (who summarizes the guarantee for swap rounding proved by Chekuri et al. [2010]), carrying out iterations of swap rounding and taking the best outcome produces a set which satisfies with probability at least , provided that the best outcome is determined by calling a value oracle with precision level . Adding up the final guarantee, we have
[TABLE]
and now rescaling by a factor gives the final approximation guarantee. The asymptotic approximation follows by setting as in Udwani [2018].
We now add up the final runtime. The first thresholding step requires value oracle calls to each of the objectives at precision level . MultiobjectiveFW requires iterations, each of which calls mirror descent once. Each invocation of mirror descent requires a total of oracle calls. Recalling that , this is upper bounded by . Each iteration of MultiobjectiveFW also uses value oracle calls at precision level . Finally, each iteration uses additional overhead, for a total of . In the rounding procedure, we first need to invoke ApproximateCaratheodory with precision level , which per Proposition 5.1 of Mirrokni et al. [2017] requires iterations, and one linear maximization over per iteration. Since is the uniform matroid polytope, each linear maximization takes time , and so this stage contributes time . Lastly, we have the iterations of swap rounding. Since was decomposed into integral points, swap rounding takes time for each iteration (Chekuri et al. [2010]). We also need one -accurate value oracle call to each of the objective functions per iteration so that we can select the (approximately) best set. Combining these bounds results in the final stated runtime. ∎
Appendix C Efficient stochastic gradient estimates
We now give efficient implementations for the oracles and . They run in combined time , where the operation succeeds with probability . Our implementations guarantee whenever they succeed.
We use a representation of the influence maximization objective as the expectation over a set of deterministic submodular functions. Specifically, we can view the independent cascade model as specifying a distribution over live-edge graphs (Kempe et al. [2003]) where each edge is present with probability and absent otherwise (where all events are independent). Let denote a graph realized from this process, which we will denote . For a fixed , the influence spread of a given seed set is just the number of nodes which are reachable from via only the edges present in . We will denote this quantity by , where .
The starting point is to recall that for any group’s utility function , the gradients of the multilinear extension satisfy
[TABLE]
which follows from the definition of the multilinear extension (Chekuri et al. [2010]). Note that for any fixed and , we can obtain a stochastic estimate of this quantity in time by first drawing a set , simulating the cascade process, and counting the number of nodes reached with and without item . By submodularity, the resulting estimate satisfies for any and . Naively repeating this process over all would hence require time . We now show how to implement the required oracles by drawing a number of samples that scales only with instead of .
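The naive single-coordinate estimate described here (draw a random seed set, simulate the cascade, and count the group's nodes reached with and without the item) can be sketched as follows. This is an illustration under stated assumptions, not the efficient oracle developed in this appendix: edges are taken to be directed triples `(u, v, p)`, the cascade is simulated through one sampled live-edge graph, and all function names are hypothetical.

```python
import random
from collections import deque


def sample_live_edges(edges, rng):
    """Keep each directed edge (u, v, p) independently with probability p."""
    live = {}
    for u, v, p in edges:
        if rng.random() < p:
            live.setdefault(u, []).append(v)
    return live


def reachable(live, seeds):
    """Breadth-first search: nodes reachable from the seeds via live edges."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        u = queue.popleft()
        for v in live.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def gradient_sample(edges, group, x, j, rng):
    """One sample of the marginal gain of j for `group` at the fractional point x."""
    live = sample_live_edges(edges, rng)
    # Random seed set: every node other than j is included with probability x[v].
    seeds = {v for v, p in x.items() if v != j and rng.random() < p}
    without_j = len(reachable(live, seeds) & group)
    with_j = len(reachable(live, seeds | {j}) & group)
    return with_j - without_j


edges = [(0, 1, 1.0), (1, 2, 1.0)]   # assumed toy path 0 -> 1 -> 2
group = {1, 2}
rng = random.Random(0)
print(gradient_sample(edges, group, {0: 0.0, 1: 0.0, 2: 0.0}, 0, rng))
```

Averaging many such samples estimates one coordinate of the gradient; repeating this for every node is what the strategy in the remainder of this appendix is designed to avoid.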
Implementing is simpler because we only need to estimate for a single fixed . Hence, we can draw a single , count the number of nodes reachable in each group under with set , and then count the number of nodes reachable with set . This takes time .
Efficiently implementing is more difficult since we need to simultaneously estimate with respect to every ; hence, naive enumeration would take time. We now detail our strategy. We start by considering a given sample and show how to estimate the marginal contribution for a given and and all in total runtime . We first remove all nodes from that are reachable from under , which takes time . Any node removed in this stage has marginal contribution 0. Next, we remove all nodes that are isolated in the remaining subgraph and assign them marginal contribution 1 if they are part of group . This stage takes time .
Now we deal with the remaining nodes. Here, determining the marginal contribution of node to group amounts to estimating the number of nodes of group which are reachable from in . We use the size estimation framework of Cohen [1997], which allows us to simultaneously produce an unbiased estimate of every remaining node’s contribution to group in time . We apply the weighted version of the estimator, where every node in group has weight 1 and all other nodes have weight 0. We take independent repetitions of the estimation process, resulting in runtime. For a given group , and using repetitions, Cohen’s estimator produces an estimate for each node which satisfies
2. for any
We fix as an arbitrary constant and use . This allows us to use union bound combined with the second property of the estimator to argue that over all nodes combined
[TABLE]
and so the resulting gradients will satisfy our stated bounds on and with high probability.
Our overall strategy is to generate enough samples that every node is missing from in at least one of them. Then, we can use a node’s marginal contribution in the sample from which it is missing as its gradient estimate. Note that a node is absent from any given sample with probability . Given budget , at most nodes can have . For any such node, we can explicitly estimate a sample of Equation 1 using time per node, for total. For the remaining nodes, a simple argument shows that taking samples is sufficient to ensure that each node is missing from at least one sample with combined probability . Summing up, the total runtime to implement is .
Appendix D Runtime comparison with previous work
The best previous algorithm for multiobjective submodular maximization (Udwani [2018]) uses the same overall framework as us, but uses a MWU algorithm for the second stage (the continuous maximization problem). The MWU algorithm runs iterations, where each iteration requires a call to a greedy algorithm that maximizes a weighted combination of the . Using the best implementation of the greedy algorithm (Badanidiyuru and Vondrák [2014]) requires value oracle calls, for such calls in total. (Footnote 2: While there are efficient special-purpose techniques for influence maximization on a given graph, it is not obvious how to adapt them to deal with the weighted combination of group objectives.) By comparison, our algorithm accesses the function through calls to the gradient oracles and . It makes a number of calls to these oracles which is only logarithmic in , scaling as . Since gradient oracle calls can typically be implemented in similar asymptotic runtime to value oracle calls for common classes of functions (as we have demonstrated for influence maximization), our algorithm effectively saves a factor runtime in exchange for worse dependence on and . Since we expect to grow much faster than or (in many typical applications, is a small constant; see Hassani et al. [2017]), this is often an improvement in asymptotic runtime. For influence maximization in particular, it is easy to see that a value oracle call for a given group cannot be implemented in less than time, which matches (up to log factors) our stochastic gradient oracle’s dependence on the graph size.
References
- [1] Aghaei et al. [2019]: S. Aghaei, M. J. Azizi, and P. Vayanos. Learning optimal and fair decision trees for non-discriminative decision-making. In Proc. of the 33rd AAAI, 2019.
- [2] Ahmed et al. [2017]: F. Ahmed, J. P. Dickerson, and M. Fuge. Diverse weighted bipartite b-matching. In Proc. of the 26th IJCAI, pages 35–41, 2017.
- [3] Badanidiyuru and Vondrák [2014]: A. Badanidiyuru and J. Vondrák. Fast algorithms for maximizing submodular functions. In Proc. of the 25th SODA, pages 1497–1514, 2014.
- [4] Banerjee et al. [2013]: A. Banerjee, A. Chandrasekhar, E. Duflo, and M. O. Jackson. The diffusion of microfinance. Science, 341(6144), 2013.
- [5] Barman et al. [2019]: S. Barman, A. Biswas, S. K. Krishnamurthy, and Y. Narahari. Groupwise maximin fair allocation of indivisible goods. In Proc. of the 32nd AAAI, 2019.
- [6] Barocas and Selbst [2016]: S. Barocas and A. Selbst. Big data’s disparate impact. California Law Review, 104:671, 2016.
- [7] Benabbou et al. [2018]: N. Benabbou, M. Chakraborty, V. Ho, J. Sliwinski, and Y. Zick. Diversity constraints in public housing allocation. In Proc. of the 17th AAMAS, pages 973–981, 2018.
- [8] Bredereck et al. [2018]: R. Bredereck, P. Faliszewski, A. Igarashi, M. Lackner, and P. Skowron. Multiwinner elections with diversity constraints. In Proc. of the 32nd AAAI, pages 933–940, 2018.
