Group-Fairness in Influence Maximization

Alan Tsang; Bryan Wilder; Eric Rice; Milind Tambe; Yair Zick

arXiv:1903.00967·cs.GT·March 27, 2019

TL;DR

This paper introduces formal fairness definitions in influence maximization, proposes algorithms to ensure fair influence spread across groups, and demonstrates their effectiveness on real-world social network data.

Contribution

It provides a novel fairness framework for influence maximization and develops algorithms that balance influence spread with fairness constraints.

Findings

01

Fair algorithms significantly reduce disparities among groups.

02

Standard methods often neglect smaller groups, leading to unfair outcomes.

03

Proposed methods improve fairness without substantial utility loss.

Abstract

Influence maximization is a widely used model for information dissemination in social networks. Recent work has employed such interventions across a wide range of social problems, spanning public health, substance abuse, and international development (to name a few examples). A critical but understudied question is whether the benefits of such interventions are fairly distributed across different groups in the population; e.g., avoiding discrimination with respect to sensitive attributes such as race or gender. Drawing on legal and game-theoretic concepts, we introduce formal definitions of fairness in influence maximization. We provide an algorithmic framework to find solutions which satisfy fairness constraints, and in the process improve the state of the art for general multi-objective submodular maximization problems. Experimental results on real data from an HIV prevention…

Equations

$U^{\mathrm{Maximin}}(A)=\min_{i}\frac{\mathcal{I}_{G,C_{i}}(A)}{|C_{i}|}$

$U^{\mathrm{Rational}}(A)=\begin{cases}\mathcal{I}_{G}(A),&\text{if }\mathcal{I}_{G,C_{i}}(A)\geq\mathcal{I}_{G[C_{i}]}(k_{i})\ \forall i\\0,&\text{otherwise}\end{cases}$

$PoF^{\mathrm{Rational}}=\frac{\mathcal{I}^{\mathrm{OPT}}}{\mathcal{I}^{\mathrm{Rational}}},\qquad PoF^{\mathrm{Maximin}}=\frac{\mathcal{I}^{\mathrm{OPT}}}{\mathcal{I}^{\mathrm{Maximin}}}$

$PoF^{\mathrm{Rational}}=\frac{\mathcal{I}_{G}^{\mathrm{OPT}}}{\mathcal{I}_{G}^{\mathrm{Rational}}}=\frac{2+p+ps}{2+2p+(s-1)p^{2}}$

$U^{\mathrm{Rational}}(A)=\begin{cases}\mathcal{I}_{G}(A),&\text{if constraints satisfied}\\0,&\text{otherwise}\end{cases}$

$\lim_{n\to\infty}\frac{PoF^{\mathrm{Maximin}}(G^{\prime})}{PoF^{\mathrm{Maximin}}(G)}=\lim_{n\to\infty}\frac{1+sp}{1+p(t+1)}=\lim_{n\to\infty}\frac{1+sp}{1+p\left(\frac{s}{1-3p}+1\right)}=1-3p$

$\max_{v\in\mathcal{P}}\min_{i\in\mathcal{I}}\frac{v\cdot\nabla F_{i}(x)}{W_{i}-F_{i}(x)}$

$v\cdot\nabla F_{i}(x)\geq(1-\epsilon)(W_{i}-F_{i}(x))\quad\forall i\in\mathcal{I}$

$\max_{v\in\mathcal{P}}\min_{y\in\Delta(\mathcal{I})}g(v,y)$

$\nabla_{v}g(v,y)$

$\Pr\left[\max_{v\in\mathcal{P}}g(v,\bar{y})-\min_{y\in\Delta(\mathcal{I})}g(\bar{v},y)\geq\frac{(8+2\Omega)\sqrt{5}\left(c_{\text{grad}}\sqrt{k\log n}+kc_{\text{item}}\sqrt{\log n}\right)}{\epsilon\sqrt{T}}\right]\leq 2\exp(-\Omega)$

$\min_{y\in\Delta(\mathcal{I})}g(\bar{v},y)\geq\max_{v\in\mathcal{P}}\min_{y\in\Delta(\mathcal{I})}g(v,y)-\epsilon$

$\begin{aligned}F_{i}(x^{t})-F_{i}(x^{t-1})&\geq\frac{1}{T}\left[\nabla F_{i}(x^{t-1})\cdot v^{t}\right]-\frac{b}{2}\|x^{t}-x^{t-1}\|_{1}^{2}&&\text{(Lemma \ref{lemma:smooth})}\\&\geq\frac{1}{T}\left[\nabla F_{i}(x^{t-1})\cdot v^{t}\right]-\frac{bk^{2}}{2T^{2}}&&\text{($\ell_{1}$ diameter of $\mathcal{P}$)}\\&\geq\frac{1}{T}\left((1-\epsilon_{1})(W_{i}-F_{i}(x^{t-1}))-\epsilon_{1}\right)-\frac{bk^{2}}{2T^{2}}&&\text{(Lemma \ref{lemma:mirror})}\end{aligned}$

$W_{i}-F_{i}(x^{t})\leq\left(1-\frac{1-\epsilon_{1}}{T}\right)\left[W_{i}-F_{i}(x^{t-1})\right]+\frac{\epsilon_{1}}{T}+\frac{bk^{2}}{2T^{2}}$

$W_{i}-F_{i}(x^{T})\leq\frac{1}{e^{1-\epsilon_{1}}}W_{i}+\epsilon_{1}+\frac{bk^{2}}{2T}$

$F_{i}(x^{T})\geq(1-\epsilon)\left(1-\frac{1}{e}\right)W_{i}-\epsilon$

$\begin{aligned}f(S)&=f(S_{1})+f(S_{2}\mid S_{1})\\&\geq(1-\epsilon)\frac{k_{1}}{k}\left(1-\frac{1}{e}\right)W_{1}-3\epsilon\\&\geq(1-\epsilon)\left(1-\frac{m}{k\epsilon^{3}(1+\epsilon^{\prime})}\right)\left(1-\frac{1}{e}\right)W_{1}-2\epsilon\end{aligned}$

$\nabla_{x_{j}}F_{i}=\operatorname*{\mathbb{E}}_{S\sim x,\ \xi\sim P}\left[f_{i}(S\cup\{j\},\xi)-f_{i}(S\setminus\{j\},\xi)\right]$

$\Pr[\Delta(v)\geq 2b]\leq\Pr[\Delta(v)\geq 2f_{i}(\{v\}\mid S)]\leq\delta$

Code & Models

Repositories

bwilder0/fair_influmax_code_release

Full text

Group-Fairness in Influence Maximization

Alan Tsang1∗, Bryan Wilder2∗, Eric Rice2, Milind Tambe2, Yair Zick1

∗Equal contribution

1Department of Computer Science, National University of Singapore

2Center for AI in Society, University of Southern California

[email protected], {bwilder, ericr, tambe}@usc.edu, [email protected]

Abstract

Influence maximization is a widely used model for information dissemination in social networks. Recent work has employed such interventions across a wide range of social problems, spanning public health, substance abuse, and international development (to name a few examples). A critical but understudied question is whether the benefits of such interventions are fairly distributed across different groups in the population; e.g., avoiding discrimination with respect to sensitive attributes such as race or gender. Drawing on legal and game-theoretic concepts, we introduce formal definitions of fairness in influence maximization. We provide an algorithmic framework to find solutions which satisfy fairness constraints, and in the process improve the state of the art for general multi-objective submodular maximization problems. Experimental results on real data from an HIV prevention intervention for homeless youth show that standard influence maximization techniques oftentimes neglect smaller groups which contribute less to overall utility, resulting in a disparity which our proposed algorithms substantially reduce.

1 Introduction

Influence maximization in social networks is a well-studied problem with applications in a broad range of domains. Consider, for example, a group of at-risk youth; outreach programs try to provide as many people as possible with useful information (e.g., HIV safety, or available health services). Since resources (e.g., social workers) are limited, it is not possible to personally reach every at-risk individual. It is thus important to target key community figures who are likely to spread vital information to others. Formally, individuals are nodes $V$ in a social network, and we would like to influence or activate as many of them as possible. This can be done by initially seeding $k$ nodes (where $k\ll|V|$). The seed nodes activate their neighbors with some probability, who in turn activate their neighbors, and so forth. Our goal is to identify $k$ seeds such that the maximal number of nodes is activated. This is the classic influence maximization problem Kempe et al. (2003), which has received much attention in the literature.

In recent years, the influence maximization framework has seen application to many social problems, such as HIV prevention for homeless youth Yadav et al. (2018); Wilder et al. (2018b), public health awareness Valente and Pumpuang (2007), financial inclusion Banerjee et al. (2013), and more. Frequently, small and marginalized groups within a larger community are those who benefit the most from attention and assistance. It is important, then, to ensure that the allocation of resources reflects and respects the diverse composition of our communities, and that each group receives a fair allocation of the community’s resources. For instance, in the HIV prevention domain we may wish to ensure that members of racial minorities or of LGBTQ identity are not disproportionately excluded; this is where our work comes in.

Our Contributions:

This paper introduces the problem of fair resource allocation in influence maximization. Our first contribution is to propose fairness concepts for influence maximization. We start with a maximin concept inspired by the legal notion of disparate impact; formally it requires us to maximize the minimum fraction of nodes within each group that are influenced. While intuitive and well-motivated, this definition suffers from shortcomings that lead us to introduce a second concept, diversity constraints. Roughly, diversity constraints guarantee that every group receives influence commensurate with its “demand”, i.e., what it could have generated on its own, based on a number of seeds proportional to its size. Here, to compute a group’s demand, we allow it a number of seeds proportional to its size, but require that it spreads influence using only nodes in the group. Hence, a small but well connected group may have a better claim for influence than a large but sparsely connected group.

Our second contribution is an algorithmic framework for finding solutions that satisfy either fairness concept. While the classical influence maximization objective is submodular (and hence easily approximated with a greedy algorithm), fairness considerations produce strongly non-submodular objectives. This renders standard techniques inapplicable. We show that both fairness concepts can be reduced to multi-objective submodular optimization problems, which are substantially more complex. Our key algorithmic contribution is a new method for general multi-objective submodular optimization which has a substantially better approximation guarantee than the current best algorithm Udwani (2018), and often better runtime as well. This result may be of independent interest.

Our third contribution is an analytical exploration of the price of group fairness in influence maximization, i.e., the reduction in social welfare with respect to the unconstrained influence maximization problem due to imposing a fairness concept. We show that the price of diversity can be high in general for both concepts and under a range of settings.

Our fourth contribution is an empirical study on real-world social networks that have been used for a socially critical application: HIV prevention for homeless youth. Our results show that standard influence maximization techniques often cause substantial fairness violations by neglecting small groups. Our proposed algorithm substantially reduces such violations at relatively small cost to overall utility.

Related Work:

Kempe et al. (2003) introduced influence maximization and proved that, since the objective is submodular, greedily selecting nodes gives a $\left(1-\frac{1}{\mathrm{e}}\right)$-optimal solution. There has since been substantial interest among the AI community both in developing more scalable algorithms (see Li et al. (2018) for a recent survey) and in addressing the challenges of deployment in public health settings Yadav et al. (2016); Wilder et al. (2018a). Recently, such algorithms have been used in real-world pilot tests for HIV prevention amongst homeless youth Yadav et al. (2018); Wilder et al. (2018b), driving home the need to consider fairness as influence maximization is applied in socially sensitive domains. To our knowledge, no previous work considers fairness specifically for influence maximization. The techniques we introduce to optimize fairness metrics are related to research on multi-objective submodular maximization (outside the context of fairness), and we improve existing theoretical guarantees for this general problem Chekuri et al. (2010); Udwani (2018).

Outside of influence maximization, the general idea of diversity as an optimization constraint has received considerable attention in recent years; it has been studied in multiwinner elections (see Bredereck et al. (2018); Faliszewski et al. (2017) for an overview), resource allocation Benabbou et al. (2018); Aghaei et al. (2019), and matching problems Ahmed et al. (2017); Hamada et al. (2017). We note that some of the above works (e.g. Ahmed et al. (2017) and Schumann et al. (2017)) use a submodular objective function as a means of achieving diversity; interestingly, while the classic influence maximization target function is submodular, it is no longer so under diversity constraints. Group fairness has been studied extensively in the voting theory literature, where the objective is to identify a committee of $k$ candidates that will satisfy subsets of voters (see a comprehensive overview in Faliszewski et al. (2017)). There have also been several works on group fairness in fair division, defining notions of group envy-freeness Conitzer et al. (2019); Fain et al. (2018); Segal-Halevi and Suksompong (2018); Todo et al. (2011), and a group maximin share guarantee Barman et al. (2019); Suksompong (2018).

2 Model

Agents are embedded in a social network $G=(V,E)$ . An edge $(i,j)\in E$ represents the ability for agent $v_{i}$ to influence or activate $v_{j}$ . $G$ may be undirected or directed.

Diversity:

Each agent in our network may identify with one or more groups within the larger population. These represent different ethnicities, genders, sexual orientations, or other groups for which fair treatment is important. Our goal is to maximize influence in a way such that each group receives at least a “fair” share of influence (more on this below). Let us designate these groups as $\mathcal{C}=\{C_{1},\ldots C_{m}\}$ . Each group $C_{i}$ represents a non-empty subset of V, $\emptyset\neq C_{i}\subseteq V$ . Each agent must belong to at least one group, but may belong to multiple groups; i.e. $C_{1}\cup C_{2}\cup\ldots C_{m}=V$ . In particular, this allows for the expression of intersectionality, where an individual may be part of several minority groups.

Influence maximization:

We model influence using the independent cascade model Kempe et al. (2003), the most common model in the literature. All nodes begin in the inactive state. The decision maker then selects $k$ seed nodes to activate. Each node that is activated makes one attempt to activate each of its inactive neighbors; each attempt succeeds independently with probability $p$ . Newly activated nodes attempt to activate their neighbors and so on, with the process terminating once there are no new activations.
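The cascade process described above can be estimated by Monte Carlo simulation. The following is a minimal sketch, assuming an adjacency-list graph and a single activation probability $p$; the names `simulate_cascade` and `estimate_influence` are illustrative and are not taken from the paper's code release.

```python
import random
from collections import deque

def simulate_cascade(adj, seeds, p, rng):
    """Run one independent-cascade trial and return the set of activated nodes."""
    active = set(seeds)
    frontier = deque(seeds)
    while frontier:
        u = frontier.popleft()
        for w in adj.get(u, ()):
            # A newly activated node makes one attempt per inactive neighbor.
            if w not in active and rng.random() < p:
                active.add(w)
                frontier.append(w)
    return active

def estimate_influence(adj, seeds, p, trials=2000, seed=0):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    total = sum(len(simulate_cascade(adj, seeds, p, rng)) for _ in range(trials))
    return total / trials

# Toy star graph (an assumption for illustration): node 0 is the center.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
print(estimate_influence(star, [0], p=0.3))
```

Seeding the center of the star should give an estimate near $1+4p$, since each of the four leaves is activated independently with probability $p$.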

We define the influence of nodes $A\subseteq V$, denoted $\mathcal{I}_{G}(A)$, as the expected number of nodes activated by seeding $A$. Of these, let $\mathcal{I}_{G,C_{i}}(A)$ be the expected number of activated vertices from $C_{i}$. Traditional influence maximization seeks a set $A$, $|A|\leq k$, maximizing $\mathcal{I}_{G}(A)$. Using a slight abuse of notation, let $\mathcal{I}_{G}(k)$ be the maximum influence that can be achieved by selecting $k$ seed nodes. That is, $\mathcal{I}_{G}(k)=\max_{|A|=k}\mathcal{I}_{G}(A)$. Analogously, we define $\mathcal{I}_{G,C_{i}}(k)$ as the maximum expected number of vertices from $C_{i}$ that can be activated by $k$ seeds. We now propose two means of capturing group fairness in influence maximization.

Maximin Fairness:

Maximin Fairness captures the straightforward goal of improving the conditions for the least well-off groups. That is, we want to maximize the minimum influence received by any of the groups, as proportional to their population. This leads to the following utility function:

$U^{\mathrm{Maximin}}(A)=\min_{i}\frac{\mathcal{I}_{G,C_{i}}(A)}{|C_{i}|}$

Subject to this maximin constraint, we seek to maximize overall influence. Thus, we define $\mathcal{I}_{G}^{\mathrm{Maximin}}=\mathcal{I}_{G}(B)$ with $B=\operatorname*{argmax}_{A\subseteq V,|A|=k}U^{\mathrm{Maximin}}(A)$. That is, $\mathcal{I}_{G}^{\mathrm{Maximin}}$ is the expected number of nodes activated by a seed configuration that maximizes the minimum proportional influence received by any group. This corresponds to the legal concept of disparate impact, which roughly states that a group has been unfairly treated if their “success rate” under a policy is substantially worse than that of other groups (see Barocas and Selbst (2016) for an overview). Therefore, maximin fairness may be significant to governmental or community organizations which are constrained to avoid this form of disparity. However, optimizing for equality of outcomes may be undesirable when some groups are simply much better suited than others to a network intervention. For instance, if one group is very poorly connected, maximin fairness would require that a large number of seeds be spent trying to reach this group, even though additional seeds have relatively small impact.
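To make the maximin utility concrete, the sketch below computes per-group expected influence and the minimum influenced fraction from a list of sampled cascades. It assumes the samples (sets of activated nodes) were produced elsewhere, for example by simulating the cascade model; the function names and the toy data are hypothetical.

```python
def group_influence(samples, groups):
    """Average number of activated nodes per group over sampled cascades."""
    n = len(samples)
    return {
        name: sum(len(active & members) for active in samples) / n
        for name, members in groups.items()
    }

def maximin_utility(samples, groups):
    """Minimum, over groups, of the expected fraction of the group influenced."""
    infl = group_influence(samples, groups)
    return min(infl[name] / len(members) for name, members in groups.items())

# Toy data (illustrative only). Node 3 belongs to both groups,
# so it counts toward the influence of each.
groups = {"C1": {0, 1, 2, 3}, "C2": {3, 4}}
samples = [{0, 1, 3}, {0, 4}, {0, 1, 2, 3}, {1}]

print(group_influence(samples, groups))   # {'C1': 2.25, 'C2': 0.75}
print(maximin_utility(samples, groups))   # 0.375
```

Here the smaller group $C_{2}$ determines the utility: its influenced fraction is $0.75/2=0.375$, below the $2.25/4=0.5625$ of $C_{1}$.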

Diversity Constraints:

We now propose an alternate fairness concept by extending the notion of individual rationality to Group Rationality. The key idea is that no group should be better off by leaving the (influence maximization) game with their proportional allocation of resources and allocating them internally. For each group $C_{i}$ , let $k_{i}=\lceil k|C_{i}|/|V|\rceil$ be the number of seeds that would be fairly allocated to the group $C_{i}$ based on the group’s size within the larger population, rounded up to remove any doubt that this group receives a fair share. $k_{i}$ is the fair allocation of seeds to the group.

Let $G[C_{i}]$ be the subgraph induced from $G$ by the nodes $C_{i}$ . This represents the network formed by group $C_{i}$ if they were to separate from the original network. Now, we define the group rational influence that each group $C_{i}$ can expect to receive as the number of nodes they expect to activate if they left the network, with their fair allocation of $k_{i}$ seeds. We denote this group rational influence for $C_{i}$ as $\mathcal{I}_{G[C_{i}]}(k_{i})$ . Then, we devise a set of diversity constraints that any group rational seeding configuration $A$ with $k$ seeds must satisfy: $\mathcal{I}_{G,C_{i}}(A)\geq\mathcal{I}_{G[C_{i}]}(k_{i}),\forall i$ . That is, the influence received by each group is at least equal to what each group may accomplish on its own when given its fair share of $k_{i}$ seed nodes.
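As a small illustration of the definitions above, the sketch below computes the fair seed allocation $k_{i}=\lceil k|C_{i}|/|V|\rceil$ and checks the diversity constraints given influence values. It assumes the per-group influence and group rational influence have already been estimated; the function names and numbers are hypothetical.

```python
def fair_seed_allocation(k, group_sizes, n):
    """k_i = ceil(k * |C_i| / |V|), computed with integer arithmetic."""
    return {g: -(-k * size // n) for g, size in group_sizes.items()}

def satisfies_diversity(group_influence, rational_influence, tol=1e-9):
    """True if every group receives at least its group rational influence."""
    return all(
        group_influence[g] + tol >= rational_influence[g]
        for g in rational_influence
    )

# Illustrative numbers: 100 nodes, two disjoint groups, budget of 5 seeds.
alloc = fair_seed_allocation(5, {"C1": 70, "C2": 30}, 100)
print(alloc)  # {'C1': 4, 'C2': 2}

# Group C2 falls short of what it could achieve alone, so this fails.
print(satisfies_diversity({"C1": 10.0, "C2": 3.0}, {"C1": 8.0, "C2": 3.5}))
```

Because each $k_{i}$ is rounded up, the allocations can sum to more than $k$ (here $4+2=6>5$); they define each group's benchmark rather than a partition of the budget.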

The diversity constraint objective function is to maximize the expected number of nodes activated, subject to the above diversity constraint. The utility for selecting seed nodes $A$ is:

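$U^{\mathrm{Rational}}(A)=\begin{cases}\mathcal{I}_{G}(A),&\text{if }\mathcal{I}_{G,C_{i}}(A)\geq\mathcal{I}_{G[C_{i}]}(k_{i})\ \forall i\\0,&\text{otherwise}\end{cases}$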

The maximum expected influence obtained via a group rational seeding configuration $A$ is called the rational influence $\mathcal{I}_{G}^{\mathrm{Rational}}=\mathcal{I}_{G}(B)$ , where $B=\operatorname*{argmax}_{A\subseteq V,|A|=k}U^{\mathrm{Rational}}(A)$ .

Price of Fairness:

To measure the cost of ensuring a fair outcome for the diverse population, we will measure the Price of Fairness, the ratio of optimal influence to the best achievable influence under our two fairness criteria. Here optimal influence $\mathcal{I}_{G}^{\mathrm{OPT}}=\mathcal{I}_{G}(k)$ , which is the maximum amount of expected influence that can be obtained using any choice of $k$ seed nodes. We omit the subscript where the context is clear.

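$PoF^{\mathrm{Rational}}=\frac{\mathcal{I}^{\mathrm{OPT}}}{\mathcal{I}^{\mathrm{Rational}}},\qquad PoF^{\mathrm{Maximin}}=\frac{\mathcal{I}^{\mathrm{OPT}}}{\mathcal{I}^{\mathrm{Maximin}}}$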

3 Optimization

The standard approach to influence maximization is based on submodularity. Formally, a set function $f$ on ground set $V$ is submodular if for every $A\subseteq B\subseteq V$ and $x\in V\setminus B$ , $f(A\cup\{x\})-f(A)\geq f(B\cup\{x\})-f(B)$ . This captures the intuition that additional seeds provide diminishing returns. However, both of our fairness concepts are easily shown to violate this property (proofs are deferred to the appendix):
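The diminishing-returns definition can be checked by brute force on small instances. The sketch below is a toy illustration, not the paper's appendix proof: it uses a deterministic coverage function as a stand-in for influence, and shows that total coverage passes the check while the minimum of two per-group coverage fractions (a maximin-style objective) fails it. All names and the instance are assumptions.

```python
from itertools import combinations

def powerset(items):
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_submodular(f, ground):
    """Check f(A+x) - f(A) >= f(B+x) - f(B) for all A <= B and x not in B."""
    ground = frozenset(ground)
    subsets = list(powerset(ground))
    for A in subsets:
        for B in subsets:
            if not A <= B:
                continue
            for x in ground - B:
                if f(A | {x}) - f(A) < f(B | {x}) - f(B) - 1e-12:
                    return False
    return True

# Toy instance: each item covers some elements; two singleton groups.
covers = {"a": {1}, "b": {2}, "c": {1, 2}}
group1, group2 = {1}, {2}

def coverage(S, target):
    covered = set().union(*(covers[j] for j in S))
    return len(covered & target)

total = lambda S: coverage(S, group1 | group2)
maximin = lambda S: min(coverage(S, group1) / len(group1),
                        coverage(S, group2) / len(group2))

print(is_submodular(total, covers))    # True
print(is_submodular(maximin, covers))  # False
```

The violation comes from items `a` and `b`: adding `b` to the empty set gains nothing for the minimum, but adding `b` to `{a}` raises it from 0 to 1, so returns increase rather than diminish.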

Theorem 3.1.

$U^{\mathrm{Maximin}}$ and $U^{\mathrm{Rational}}$ are not submodular.

Hence, we cannot apply the greedy heuristic to group-fair influence maximization. However, we now show that optimizing either utility function reduces to multiobjective submodular maximization, for which we give an improved algorithm below. Consider the following generic problem: given monotone submodular functions $f_{1}...f_{m}$ and corresponding target values $W_{1}...W_{m}$, find a set $S$ satisfying $|S|\leq k$ with $f_{i}(S)\geq W_{i}$ for all $i$ (under the promise that such an $S$ exists). Roughly, $f_{i}$ will be group $i$’s utility, and $W_{i}$ will be the utility that we want to guarantee for $i$. Suppose that we have an algorithm for the above multiobjective problem. Then, we can optimize the maximin objective by letting $f_{i}=\frac{\mathcal{I}_{G,C_{i}}}{|C_{i}|}$ and binary searching for the largest $W$ such that $f_{i}\geq W$ is feasible for all groups $i$. For diversity constraints, we let $f_{i}=\mathcal{I}_{G,C_{i}}$ and set the target $W_{i}=\mathcal{I}_{G[C_{i}]}(k_{i})$. We then add another objective function $f_{\text{total}}=\mathcal{I}_{G}$ representing the combined utility and binary search for the highest value $W_{\text{total}}$ such that the targets $W_{1}...W_{m},W_{\text{total}}$ are feasible. This represents the largest achievable total utility, subject to diversity constraints. Having reduced both fairness concepts to multiobjective submodular maximization, we now give an improved algorithm for this core problem.
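The binary-search reduction for the maximin objective can be sketched as follows. This is an illustration under stated assumptions: the feasibility oracle here is an exhaustive search over a tiny deterministic coverage instance, standing in for the approximate multiobjective algorithm the paper develops, and all names are hypothetical.

```python
from itertools import combinations

def largest_feasible_target(is_feasible, lo=0.0, hi=1.0, tol=1e-6):
    """Binary search for the largest W in [lo, hi] that the oracle accepts.

    Assumes monotonicity: if W is feasible, every smaller W is feasible too.
    """
    if not is_feasible(lo):
        raise ValueError("lower bound is infeasible")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if is_feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo

# Toy instance (assumption): items cover elements; two groups; budget k = 2.
covers = {"a": {1, 2}, "b": {3}, "c": {4}}
groups = [{1, 2, 3}, {4, 5}]
k = 2

def group_fraction(S, members):
    covered = set().union(*(covers[j] for j in S))
    return len(covered & members) / len(members)

def brute_force_feasible(W):
    """Stand-in oracle: is there S with |S| <= k and f_i(S) >= W for all i?"""
    for r in range(k + 1):
        for S in combinations(covers, r):
            if all(group_fraction(S, g) >= W for g in groups):
                return True
    return False

best_W = largest_feasible_target(brute_force_feasible)
print(round(best_W, 4))  # 0.5
```

In this instance the best set is `{a, c}`, which covers 2/3 of the first group and 1/2 of the second, so the search settles at $W=0.5$. With an approximate oracle in place of the exhaustive one, the value found would be correspondingly approximate.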

The multiobjective submodular problem was introduced by Chekuri et al. (2010), who gave an algorithm which guarantees $f_{i}\geq(1-\frac{1}{e})W_{i}$ for all $i$ provided that the number of objectives $m$ is smaller than the budget $k$ (when $m=\Omega(k)$, the problem is provably inapproximable Krause et al. (2008)). Unfortunately, this algorithm is of mostly theoretical interest since it runs in time $O(n^{8})$. Udwani (2018) recently introduced a practically efficient algorithm; however, it obtains an asymptotic $(1-\frac{1}{e})^{2}$-approximation instead of the optimal $\left(1-\frac{1}{e}\right)$. We close this gap by providing a practical algorithm obtaining an asymptotic $\left(1-\frac{1}{e}\right)$-approximation (Algorithm 1). Its runtime is comparable to, and under many conditions faster than, the algorithm of Udwani (2018). We present the high-level idea behind the algorithm here, with additional details in the appendix.

Previous algorithms Chekuri et al. (2010); Udwani (2018) start from a common template in submodular optimization, which we also build on. The main idea is to relax the discrete problem to a continuous space. For a given submodular function $f$ , its multilinear extension $F$ is defined on $n$ -dimensional vectors $x$ where $0\leq x_{j}\leq 1$ for all $j$ . $x_{j}$ represents the probability that item $j$ is included in the set. Formally, let $S\sim x$ denote a set which includes each $j$ independently with probability $x_{j}$ . Then, we define $F(x)=\operatorname*{\mathbb{E}}_{S\sim x}[f(S)]$ , which can be evaluated using random samples.
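The multilinear extension defined above can be estimated by sampling. The following minimal sketch assumes a set function `f` over items indexed `0..n-1`; the function name and sample count are illustrative choices.

```python
import random

def multilinear_extension(f, x, samples=5000, seed=0):
    """Monte Carlo estimate of F(x) = E_{S ~ x}[f(S)].

    Each item j is included in S independently with probability x[j].
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        S = {j for j, xj in enumerate(x) if rng.random() < xj}
        total += f(S)
    return total / samples

# At an integral point, F(x) agrees with f on the corresponding set.
print(multilinear_extension(len, [1.0, 0.0, 1.0]))  # 2.0

# For the modular function f(S) = |S|, F(x) equals the sum of the x_j.
print(multilinear_extension(len, [0.5, 0.5]))  # close to 1.0
```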

The main challenge is to solve the continuous optimization problem, which is where our technical contribution lies. Algorithm 1 describes the high-level procedure, which runs our continuous optimization subroutine (line 2) and then rounds the output to a discrete set (line 3). Line 1, which ensures that all items with value above a threshold $\tau$ are included in the solution, is a technical detail needed to ensure the rounding succeeds. The rounding process captured in lines 1 and 3 is fairly standard and used by both previous algorithms Chekuri et al. (2010); Udwani (2018). Our main novelty lies in an improved algorithm for the continuous problem, MultiFW.

MultiFW implements a Frank-Wolfe style algorithm to simultaneously optimize the multilinear extensions $F_{1}...F_{m}$ of the discrete objectives. The algorithm proceeds over $T$ iterations. Each iteration first identifies $v^{t}$ , a good feasible point in continuous space (Algorithm 2, line 3). Then, the current solution $x^{t}$ is updated to add $\frac{1}{T}v^{t}$ (line 4). The final output is an approximate decomposition of $x^{T}$ into integral points, produced using the algorithm of Mirrokni et al. (2017). This is a technical step required for the rounding procedure.

The key challenge is to efficiently find a $v^{t}$ that makes sufficient progress towards every objective simultaneously. We accomplish this by introducing the subroutine S-SP-MD (lines 6-12), which runs a carefully constructed version of stochastic saddle-point mirror descent Nemirovski et al. (2009). The idea is to find a $v$ for which $v\cdot\nabla_{i}F_{i}(x^{t-1})$ is large enough for all objectives $i$ . We convert this into the saddle point problem of maximizing $\min_{i\in\mathcal{I}}v\cdot\nabla_{i}F_{i}(x^{t-1})$ . $\mathcal{I}$ denotes the set of objectives $i$ where $W_{i}-F_{i}(x^{t-1})\geq\epsilon$ (i.e., those where we still need to make progress). We let $\Delta(\mathcal{I})$ denote the set of all distributions over $i$ . Our approach only requires stochastic gradients, a necessary feature since computing $\nabla_{i}F(x^{t-1})$ exactly may be intractable when the objective itself is randomized (as in influence maximization).
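To give a feel for the saddle-point view, the sketch below solves a small max-min problem with exponentiated-gradient (multiplicative weights) updates for both players. It is a deliberate simplification and not the S-SP-MD subroutine itself: it uses exact rather than stochastic gradients, restricts $v$ to a probability simplex rather than the polytope $\mathcal{P}$, and takes a fixed payoff matrix `M` whose entry `M[j][i]` plays the role of item $j$'s contribution to objective $i$. The step size and iteration count are arbitrary choices.

```python
import math

def normalize(w):
    total = sum(w)
    return [wi / total for wi in w]

def solve_maximin(M, steps=4000, eta=0.05):
    """Approximate max_v min_y v^T M y over probability simplices.

    Returns the averaged strategies (v_bar, y_bar) of the two players.
    """
    n_rows, n_cols = len(M), len(M[0])
    v = [1.0 / n_rows] * n_rows
    y = [1.0 / n_cols] * n_cols
    v_sum = [0.0] * n_rows
    y_sum = [0.0] * n_cols
    for _ in range(steps):
        # Payoff of each item against the current objective weights.
        gain_v = [sum(M[r][c] * y[c] for c in range(n_cols)) for r in range(n_rows)]
        # Current performance of v on each objective.
        loss_y = [sum(v[r] * M[r][c] for r in range(n_rows)) for c in range(n_cols)]
        v_sum = [a + b for a, b in zip(v_sum, v)]
        y_sum = [a + b for a, b in zip(y_sum, y)]
        # Max player moves toward high-payoff items; min player shifts
        # weight toward objectives where the max player is doing badly.
        v = normalize([vi * math.exp(eta * g) for vi, g in zip(v, gain_v)])
        y = normalize([yi * math.exp(-eta * l) for yi, l in zip(y, loss_y)])
    return [a / steps for a in v_sum], [a / steps for a in y_sum]

M = [[2.0, 0.0], [0.0, 1.0]]
v_bar, y_bar = solve_maximin(M)
worst_case = min(sum(v_bar[r] * M[r][c] for r in range(2)) for c in range(2))
best_reply = max(sum(M[r][c] * y_bar[c] for c in range(2)) for r in range(2))
print(worst_case, best_reply)
```

For this matrix the max-min value is $2/3$, attained at $v=(1/3,2/3)$; `worst_case` and `best_reply` bracket that value, and the gap between them shrinks as the number of steps grows and the step size decreases.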

Specifically, we assume access to two gradient oracles. First, a stochastic gradient oracle $\mathcal{A}^{i}_{\text{grad}}$ for each multilinear extension $F_{i}$ . Given a point $x$ , $\mathcal{A}^{i}_{\text{grad}}(x)$ satisfies $\operatorname*{\mathbb{E}}[\mathcal{A}^{i}_{\text{grad}}]=\nabla_{x}F_{i}(x)$ . Second, a stochastic gradient oracle $\mathcal{A}^{j}_{\text{item}}$ corresponding to each item $j\in[n]$ (in influence maximization, the items are the potential seed nodes). $\mathcal{A}^{j}_{\text{item}}(x)$ satisfies $\operatorname*{\mathbb{E}}[\mathcal{A}^{j}_{\text{item}}(x)]=\left[\nabla_{x_{j}}F_{1}(x)...\nabla_{x_{j}}F_{m}(x)\right]$ . We assume that $||\mathcal{A}^{i}_{\text{grad}}(x)||_{\infty},||\mathcal{A}^{j}_{\text{item}}(x)||_{\infty}\leq c$ for some constant $c$ . Linear-time oracles are available for many common submodular maximization problems (e.g., coverage functions and facility location Karimi et al. (2017)). Given such oracles, we implement a stochastic mirror descent algorithm for the maximin problem. We can interpret the algorithm as solving a game between the max player and the min player. The max player controls $v$ , while the min player controls a variable $y$ representing the weight put on each objective. Intuitively, the min player will put large weights where the max player is doing badly, forcing the max player to improve $v$ . Formally, in each iteration, the players take exponentiated gradient updates (lines 8-12). The max player obtains a stochastic gradient by sampling an objective with probability proportional to the current weights $y$ , while the min player samples an item proportional to $v$ and uses that item’s contribution to estimate the max player’s current performance on each objective. We prove that these updates converge rapidly to the optimal $v$ . With the subroutine in hand, our main algorithmic result is the following guarantee for Algorithm 1. 
Here, $b=\max_{i,j}f_{i}(\{j\})$ is the maximum value of a single item.

Theorem 3.2.

Given a feasible set of target values $W_{1}...W_{m}$, Algorithm 1 outputs a set $S$ such that $f_{i}(S)\geq(1-\epsilon)\left(1-\frac{m}{k(1+\epsilon^{\prime})\epsilon^{3}}\right)\left(1-\frac{1}{e}\right)W_{i}-\epsilon$ with probability at least $1-\delta$. Asymptotically as $k\to\infty$, the approximation ratio can be set to approach $1-1/e$ so long as $m=o(k\log^{3}k)$. The algorithm requires $O(nm)$ $\epsilon^{\prime}$-accurate value oracle calls, $O(m\frac{bk^{2}}{\epsilon}\log\frac{1}{\delta})$ $\epsilon$-accurate value oracle calls, $O\left(\frac{bk^{4}c^{2}}{\epsilon^{5}}\log\left(n+\frac{bk}{\delta\epsilon}\right)\right)$ calls to $\mathcal{A}_{\text{grad}}$ and $\mathcal{A}_{\text{item}}$, and $O\left(\frac{nk^{2}b^{2}}{\epsilon^{2}}+\frac{mk^{2}b}{\epsilon}+\frac{k^{3}b^{2}}{\epsilon^{2}}\right)$ additional work.

This says that Algorithm 1 asymptotically converges to a $\left(1-\frac{1}{e}\right)$-approximation when the budget $k$ is larger than the number of objectives $m$ (i.e., the conditions under which the problem is approximable). All terms in the approximation ratio are identical to those of Udwani (2018), except that we improve their factor $\left(1-\frac{1}{e}\right)^{2}$ to $\left(1-\frac{1}{e}\right)$. The runtime is also identical apart from the time to solve the continuous problem (MultiFW vs. their corresponding subroutine). This is difficult to compare since our respective algorithms use different oracles to access the functions. However, both kinds of oracles can typically be (approximately) implemented in time $O(n)$. Udwani’s algorithm uses $O(n)$ oracle calls, while ours requires $O(bk^{4}c^{2}\log n)$. For large-scale problems, $n$ typically grows much faster than $k$, $b$, and $c$ (all of which are often constants, or nearly so). Hence, trading $O(n^{2})$ runtime for $O(n\log n)$ can represent a substantial improvement. We present a more detailed discussion in the appendix.

To instantiate Algorithm 1 for influence maximization, we just need to supply appropriate stochastic gradient oracles. To our knowledge, no such oracles were previously known for influence maximization, which is substantially more complicated than other submodular problems because of additional randomness in the objective; naive extensions of previous methods require $O(n^{2})$ time. We provide efficient $O(kn\log n)$ time stochastic gradient oracles by introducing a randomized method to simultaneously estimate many entries of the gradient at once (details may be found in the appendix).
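For intuition about what a stochastic gradient oracle returns, the sketch below uses the standard identity $\nabla_{x_{j}}F(x)=\operatorname*{\mathbb{E}}_{S\sim x}[f(S\cup\{j\})-f(S\setminus\{j\})]$ for the multilinear extension. It is the naive estimator, which evaluates `f` twice per coordinate per sample, and is not the efficient randomized oracle the paper describes in its appendix; the names are illustrative.

```python
import random

def sample_gradient(f, x, rng):
    """One unbiased sample of the gradient of the multilinear extension."""
    S = {j for j, xj in enumerate(x) if rng.random() < xj}
    return [f(S | {j}) - f(S - {j}) for j in range(len(x))]

def estimate_gradient(f, x, samples=1000, seed=0):
    """Average several samples to reduce variance."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(samples):
        g = sample_gradient(f, x, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / samples for t in total]

# Modular function: every coordinate has marginal value exactly 1.
print(estimate_gradient(len, [0.2, 0.5, 0.9]))  # [1.0, 1.0, 1.0]

# f(S) = min(|S|, 1): once item 0 is surely included, item 1 adds nothing.
capped = lambda S: min(len(S), 1)
print(estimate_gradient(capped, [1.0, 0.0]))  # [1.0, 0.0]
```

The second example shows diminishing returns in gradient form: the partial derivative for item 1 drops to zero once item 0 is included with certainty.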

4 Price of Fairness

In this section, we show that both definitions for the Price of Fairness can be unbounded; moreover, allowing nodes to join multiple groups can, counter-intuitively, worsen the PoF. The proofs in this section use undirected graphs. As undirected graphs are the more restrictive setting, the results naturally hold for directed graphs as well.

Theorem 4.1.

As $n\to\infty$ and $p\to 0$ , $PoF^{\mathrm{Rational}}\to\infty$ .

Proof.

We construct a graph $G$ with two parts. In Part $L$, we have $s-1$ vertices, all isolated except for two that are joined by an edge; label one of these two $x_{3}$. In Part $S$, we have a star with $s+1$ nodes. Label a leaf node $x_{1}$ and the central node $x_{2}$. We define two groups: $C_{1}$ comprises the $s$ degree-1 vertices of $S$, and $C_{2}$ comprises the remaining $s$ vertices, namely the vertices of $L$ and the central vertex $x_{2}$ of the star. There are $k=2$ seeds, and since $|C_{1}|=|C_{2}|$, each group has a fair allocation of $k_{1}=k_{2}=1$ seeds. The figure below illustrates this network.

We are interested in two seeding configurations: $A=\{x_{1},x_{3}\}$ and $B=\{x_{2},x_{3}\}$. We can verify that configuration $A$ is fair. Configuration $A$ activates $1+p$ nodes in Part $L$ and $1+p+(s-1)p^{2}$ in Part $S$, for a total of $\mathcal{I}_{G}(A)=2+2p+(s-1)p^{2}$.

Now consider configuration $B$ . $C_{1}$ receives $ps$ influence, and since $p<\frac{2}{n}=\frac{1}{s}$ , $C_{1}$ does not receive its group rational share of influence. However, we can verify that this seeding is optimal. Part $L$ receives $(1+p)$ influence, and Part $S$ receives $1+ps$ . Therefore, $\mathcal{I}_{G}(B)=2+p+ps$ .

We may then calculate our Price of Fairness:

$$PoF^{\mathrm{Rational}}=\frac{\mathcal{I}_{G}(B)}{\mathcal{I}_{G}(A)}=\frac{2+p+ps}{2+2p+(s-1)p^{2}}$$

If we take the limit as $n\to\infty$ (and hence $s\to\infty$ ), $PoF\to 1/p$ . Finally, as $p\to 0$ , $PoF\to\infty$ . ∎

The appendix details a similar result for Maximin Fairness:

Theorem 4.2.

$PoF^{\mathrm{Maximin}}$ is unbounded.

Frequently, an individual may identify with multiple groups. Intuitively, we might expect such multi-group membership to improve the influence received by different groups and make group fairness easier to achieve (see the appendix for an example). However, in this section, we show that this is not always true, and giving even a single node membership in a second group can cause the Price of Fairness to worsen by an arbitrarily large amount.

Theorem 4.3.

Given graphs $G$ with groups $C_{1}$ and $C_{2}$ , and $G^{\prime}$ with groups $C^{\prime}_{1}$ and $C^{\prime}_{2}$ , where $G^{\prime}=G$ , $C^{\prime}_{1}=C_{1}$ and $C^{\prime}_{2}$ is obtained from $C_{2}$ by the addition of one vertex $x_{1}$ ( $x_{1}\in C_{1}$ , $x_{1}\notin C_{2}$ ). It is possible for $\lim\limits_{n\to\infty}\frac{PoF^{\mathrm{Rational}}_{G^{\prime}}}{PoF^{\mathrm{Rational}}_{G}}=\infty$ .

Proof.

Consider a graph $G$ with two components: one component $K$ contains 2 vertices joined by an edge, the other component $S$ is a star with $s+1$ vertices ( $s\geq 1/p$ ). There are two groups: $C_{1}$ contains all degree-1 vertices from $S$ and one vertex from $K$ ; $C_{2}$ contains the other vertex $x_{1}$ from $K$ and the central vertex $x_{2}$ from $S$ . There is one seed ( $k=1$ ), and the fair allocation of seeds to each group is $k_{1}=k_{2}=1$ .

Since the induced subgraphs for both groups consist only of isolated nodes, the group rational influence for each group is $\mathcal{I}_{G[C_{1}]}=\mathcal{I}_{G[C_{2}]}=1$ . Therefore, the seed set $\{x_{2}\}$ is both fair and optimal, giving an expected influence of $\mathcal{I}_{G}(\{x_{2}\})=1+ps$ .

Now, let us modify $G$ by letting $x_{1}$ belong to both communities to obtain $G^{\prime}$ , and communities $C^{\prime}_{1}$ and $C^{\prime}_{2}$ . The group rational influence for $C^{\prime}_{2}$ remains the same (its members have not changed) but $\mathcal{I}_{G^{\prime}[C^{\prime}_{1}]}$ has increased to $1+p$ (by seeding $x_{1}$ ). In fact, this forces the fair allocation to seed $x_{1}$ instead of $x_{2}$ , for a fair influence of $\mathcal{I}_{G^{\prime}}(\{x_{1}\})=1+p$ .

As $n\to\infty$ , $\lim\limits_{n\to\infty}\frac{PoF^{\mathrm{Rational}}_{G^{\prime}}}{PoF^{\mathrm{Rational}}_{G}}=\lim\limits_{s\to\infty}\frac{1+ps}{1+p}=\infty$ . ∎
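The two influence values used in this proof can be checked numerically. The following Python sketch computes the exact expected influence of a single seed in a forest under the independent cascade model, using the fact that a node at distance $d$ from the seed is reached with probability $p^{d}$ because the connecting path is unique. The function names, the node labels, and the helper `build_example` are hypothetical and serve only this illustration.

```python
from collections import deque


def single_seed_influence_forest(adj, seed, p):
    """Exact expected influence of one seed in a forest.

    A node at distance d from the seed is activated exactly when all d
    edges on the unique path to it are live, which has probability p**d.
    """
    dist = {seed: 0}
    queue = deque([seed])
    total = 0.0
    while queue:
        u = queue.popleft()
        total += p ** dist[u]
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return total


def build_example(s):
    """Component K (edge x1 - y) plus a star with centre x2 and s leaves."""
    adj = {"x1": ["y"], "y": ["x1"], "x2": []}
    for i in range(s):
        leaf = f"leaf{i}"
        adj["x2"].append(leaf)
        adj[leaf] = ["x2"]
    return adj


if __name__ == "__main__":
    p = 0.1
    for s in (10, 100, 1000):
        adj = build_example(s)
        opt = single_seed_influence_forest(adj, "x2", p)   # 1 + p*s
        fair = single_seed_influence_forest(adj, "x1", p)  # 1 + p
        print(s, opt / fair)
```

The printed ratio grows linearly in $s$ for fixed $p$ , matching the expression $\frac{1+ps}{1+p}$ above.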

A more technical construction can demonstrate a similar result for Maximin Fairness, but only as $p\to\frac{1}{3}^{-}$ ; that is, $p<\frac{1}{3}$ as $p$ approaches $\frac{1}{3}$ . The proof is provided in the appendix.

Theorem 4.4.

Given graphs $G$ with groups $C_{1}$ and $C_{2}$ , and $G^{\prime}$ with groups $C^{\prime}_{1}$ and $C^{\prime}_{2}$ , where $G^{\prime}=G$ , $C^{\prime}_{1}=C_{1}$ and $C^{\prime}_{2}$ is obtained from $C_{2}$ by the addition of one vertex $x_{1}$ ( $x_{1}\in C_{1}$ , $x_{1}\notin C_{2}$ ). It is possible for $\lim\limits_{\begin{subarray}{c}n\to\infty\\ p\to\frac{1}{3}^{-}\end{subarray}}\frac{PoF^{\mathrm{Maximin}}_{G^{\prime}}}{PoF^{\mathrm{Maximin}}_{G}}=\infty$ .

5 Experimental results

We now investigate the empirical impact of considering fairness in influence maximization. We start with experiments on a set of four real-world social networks which have been previously used for a socially critical application: HIV prevention for homeless youth. Each network has 60-70 nodes, and represents the real-world social connections between a set of homeless youth surveyed in a major US city. Each node in the network is associated with demographic information: their birth sex, gender identity, race, and sexual orientation. Each demographic attribute gives a partition of the network into anywhere from 2 to 6 different groups. For each partition, we compare three algorithms: the standard greedy algorithm for influence maximization, which maximizes the total expected influence (Greedy), Algorithm 1 used to enforce diversity constraints (DC), and Algorithm 1 used to find a maximin fair solution (Maximin). We set the propagation probability to be $p=0.1$ and fixed $k=15$ seeds (varying these parameters had little impact). We average over 30 runs of the algorithms on each network (since all of the algorithms use random simulations of influence propagation), with error bars giving bootstrapped 95% confidence intervals.

Figure 1 (top) shows that the choice of solution concept has a substantial impact on the results. For the diversity constraints case, we summarize the performance of each algorithm by the mean percentage violation of the constraints over all groups. For the maximin case, we directly report the minimum fraction influenced over all groups. We see that greedy generates substantial unfairness according to either metric: it generates the highest violations of diversity constraints, and has the smallest minimum fraction influenced. Greedy actually obtains near-zero maximin value with respect to sexual orientation. This results from it assigning one seed to a minority group in a single run and zero in others.

DC performs well across the board: it reduces constraint violations by approximately 55-65% while also performing competitively with respect to the maximin metric (even without explicitly optimizing for it). As expected, the Maximin algorithm generally obtains the best maximin value. DC actually attains slightly better maximin value for one attribute (birthsex); however, the difference is within the confidence intervals and reflects slight fluctuations in the approximation quality of the algorithms. However, Maximin performs surprisingly poorly with respect to diversity constraint violations. This indicates that optimizing exclusively for equal influence spread may force the algorithm to focus on poorly connected groups which exhibit severe diminishing returns. DC is able to attain almost as much influence in such groups but is then permitted to focus its remaining budget for higher impact. Interestingly, the price of fairness is relatively small for both solution concepts, in the range 1.05-1.15 (though it is higher for maximin than for DC). This indicates that while standard influence maximization techniques can introduce substantial fairness violations, mitigating such violations may be substantially less costly in real world networks than the theoretical worst case would suggest.

Finally, the rightmost plot in the top row of Figure 1 explores an example with overlapping groups. Specifically, we consider the race and birthsex attributes so that each node belongs to two groups. Constraint violations are somewhat higher than for either attribute individually, but the price of fairness remains small (1.07 for DC and 1.13 for Maximin).

In Figure 1 (bottom), we examine 20 synthetic networks used by Wilder et al. (2018c) to model an obesity prevention intervention in the Antelope Valley region of California. Each node in the network has a geographic region, ethnicity, age, and gender, and nodes are more likely to connect to those with similar attributes. Each network has $500$ nodes and we set $k=25$ . Overall the results are similar to the homeless youth networks. One exception is the high price of fairness that maximin suffers with respect to the “region” attribute (over 1.4), but the other $PoF$ values are relatively low (below 1.2). We also observe that greedy obtains the (slightly) best maximin performance for gender, likely because the network is sufficiently well-mixed across genders that fairness is not a significant concern (as confirmed by the extremely low DC violations). Absent true fairness concerns, greedy may perform slightly better since it solves a simpler optimization problem. However, in the last figure, we examine overlapping groups given by region and ethnicity and observe that greedy actually obtains zero maximin value, indicating that there is one group that it never reached across any run.

6 Conclusions

In this paper, we examine the problem of selecting key figures in a population to ensure the fair spread of vital information across all groups. This problem modifies the classic influence maximization problem with additional fairness provisions based on legal and game theoretic concepts. We examine two methods for determining these provisions, and show that the “Price of Fairness” for these provisions can be unbounded. We propose an improved algorithm for multiobjective maximization to examine this problem on real world data sets. We show that standard influence maximization techniques often neglect smaller groups, and a diversity constraint based algorithm can ensure these groups receive a fair allocation of resources at relatively little cost. As automated techniques become increasingly prevalent in society and governance, our technique will help ensure that small and marginalized groups are fairly treated.

7 Appendix

Appendix A Price of fairness

Theorem 4.2.

$PoF^{\mathrm{Maximin}}$ is unbounded.

Proof.

Consider a graph $G$ with two components: $K$ which consists of 2 connected vertices, and $S$ which is a star with $s+1$ nodes. Let the first group $C_{1}$ have only one node in $K$ . All remaining nodes belong to the second group $C_{2}$ , including one node $x_{1}$ in $K$ and the central node of the star $x_{2}$ . We have $k=1$ seed.

It is clear that the optimal seeding configuration is to seed $x_{2}$ , which gives $\mathcal{I}^{\mathrm{OPT}}=1+ps$ . However, this is not a maximin fair seeding, as $C_{1}$ receives 0 influence. Instead, seeding $x_{1}$ is maximin fair: it gives $C_{1}$ influence $p$ and $C_{2}$ influence $1$ , for a maximin utility of $U^{\mathrm{Maximin}}(\{x_{1}\})=\min(p,\frac{1}{s+2})$ . In this case, $\mathcal{I}^{\mathrm{Maximin}}=1+p$ .

As $s\to\infty$ , $PoF^{\mathrm{Maximin}}=\frac{1+ps}{1+p}$ becomes unboundedly large. ∎

Theorem 3.1.

$U^{\mathrm{Maximin}}$ and $U^{\mathrm{Rational}}$ are not submodular.

We divide the proof of this theorem into two parts:

Conjecture A.

Maximin utility $U^{\mathrm{Maximin}}$ is not submodular.

Proof.

Let us consider a graph with 4 nodes $\{x,a,b,c\}$ where $\{x,a\}$ form community $C_{1}$ and $\{b,c\}$ form community $C_{2}$ . Let $A=\{a,b\}$ and $B=\{a,b,c\}$ be two possible seeding configurations.

Notice that $C_{1}$ receives $1$ influence in both configurations, which is weakly less than the influence received by $C_{2}$ , and so, $U^{\mathrm{Maximin}}(A)=U^{\mathrm{Maximin}}(B)=1/2$ .

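Now, consider adding $x$ to both $A$ and $B$ . $U^{\mathrm{Maximin}}(A\cup\{x\})=1/2$ since $C_{2}$ remains incompletely seeded. But $U^{\mathrm{Maximin}}(B\cup\{x\})=1$ since both groups are fully seeded. Hence the marginal gain of $x$ is $0$ at $A$ but $1/2$ at $B\supseteq A$ , which violates submodularity. ∎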
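The counterexample can be checked mechanically. The Python sketch below evaluates the maximin utility on the four-node example under the assumption that the graph has no edges, so that a node is influenced only if it is seeded; the proof does not state the edge set explicitly, and this choice is merely one that reproduces the utilities quoted above. All names are hypothetical.

```python
def maximin_utility(seeds, groups):
    """Minimum over groups of the fraction of the group that is influenced.

    Assumption: the graph is edgeless, so the influence received by a
    group equals the number of its members that are seeded.
    """
    return min(len(seeds & group) / len(group) for group in groups)


C1 = frozenset({"x", "a"})
C2 = frozenset({"b", "c"})
GROUPS = (C1, C2)

A = frozenset({"a", "b"})
B = frozenset({"a", "b", "c"})
X = frozenset({"x"})

# Marginal gain of adding x to the smaller set A and to the larger set B.
gain_at_A = maximin_utility(A | X, GROUPS) - maximin_utility(A, GROUPS)
gain_at_B = maximin_utility(B | X, GROUPS) - maximin_utility(B, GROUPS)

if __name__ == "__main__":
    # Submodularity would require gain_at_A >= gain_at_B because A is a subset of B.
    print(gain_at_A, gain_at_B)
```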

Conjecture B.

Group rational utility $U^{\mathrm{Rational}}$ is not submodular.

Proof.

Recall the definition of group rational utility:

$$U^{\mathrm{Rational}}(A)=\begin{cases}\mathcal{I}_{G}(A),&\text{if }\mathcal{I}_{G,C_{i}}(A)\geq\mathcal{I}_{G[C_{i}]}(k_{i})\ \forall i,\\ 0,&\text{otherwise.}\end{cases}$$

Let us consider the same graph as in Conjecture A with 4 nodes $\{x,a,b,c\}$ where $\{x,a\}$ form community $C_{1}$ , and $\{b,c\}$ form community $C_{2}$ . $k=4$ seeds are available, so the group rational constraints are only satisfied by seeding all vertices.

Let $A=\{a,b\}$ and $B=\{a,b,c\}$ . It is easy to verify that $U^{\mathrm{Rational}}(A)=U^{\mathrm{Rational}}(B)=U^{\mathrm{Rational}}(A\cup\{x\})=0$ since none of these satisfy all group rational constraints. However, $U^{\mathrm{Rational}}(B\cup\{x\})>0$ , so writing $f=U^{\mathrm{Rational}}$ we have $f(A\cup\{x\})-f(A)<f(B\cup\{x\})-f(B)$ for $A\subseteq B$ , which contradicts the definition of submodularity.

∎
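A similar mechanical check applies to the group rational utility. The sketch below again assumes an edgeless four-node graph and a proportional seed allocation of $k_{1}=k_{2}=2$ , so that each group's rational share $\mathcal{I}_{G[C_{i}]}(k_{i})$ equals $2$ ; both are assumptions made for illustration, chosen so that only seeding every vertex satisfies the constraints, as the proof states. The names are hypothetical.

```python
def rational_utility(seeds, groups, shares):
    """Total influence if every group meets its rational share, else 0.

    Assumption: the graph is edgeless, so influence equals the number of
    seeded nodes, both in total and within each group.
    """
    if all(len(seeds & g) >= share for g, share in zip(groups, shares)):
        return float(len(seeds))
    return 0.0


C1 = frozenset({"x", "a"})
C2 = frozenset({"b", "c"})
GROUPS = (C1, C2)
SHARES = (2, 2)  # assumed I_{G[C_i]}(k_i) with k_i = 2 on an edgeless graph

A = frozenset({"a", "b"})
B = frozenset({"a", "b", "c"})
X = frozenset({"x"})

gain_at_A = rational_utility(A | X, GROUPS, SHARES) - rational_utility(A, GROUPS, SHARES)
gain_at_B = rational_utility(B | X, GROUPS, SHARES) - rational_utility(B, GROUPS, SHARES)

if __name__ == "__main__":
    # A is a subset of B, yet the gain at B exceeds the gain at A.
    print(gain_at_A, gain_at_B)
```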

Theorem 4.4.

Given graphs $G$ with groups $C_{1}$ and $C_{2}$ , and $G^{\prime}$ with groups $C^{\prime}_{1}$ and $C^{\prime}_{2}$ , where $G^{\prime}=G$ , $C^{\prime}_{1}=C_{1}$ and $C^{\prime}_{2}$ is obtained from $C_{2}$ by the addition of one vertex $x_{1}$ ( $x_{1}\in C_{1}$ , $x_{1}\notin C_{2}$ ). It is possible for $\lim\limits_{\begin{subarray}{c}n\to\infty\\ p\to\frac{1}{3}^{-}\end{subarray}}\frac{PoF^{\mathrm{Maximin}}_{G^{\prime}}}{PoF^{\mathrm{Maximin}}_{G}}=\infty$ .

Proof.

Consider a graph $G$ with two star components: $S_{1}$ with $s+1$ vertices with a central node $x_{1}$ , and $S_{2}$ with $t+2$ vertices with central node $x_{2}$ ( $s>t$ ). There are two groups: $C_{1}$ contains 2 vertices, $x_{1}$ and a non-central node from $S_{2}$ ; $C_{2}$ contains $s+t+1$ remaining vertices, including $x_{2}$ . There is one seed ( $k=1$ ), and a total of $n=s+t+3$ nodes.

It is easy to see that the Maximin configuration is to seed $x_{1}$ , which gives $C_{1}$ $1$ influence, and $C_{2}$ $ps$ influence. This gives a Maximin influence $\mathcal{I}^{\mathrm{Maximin}}_{G}=1+ps$ . (Footnote 1: We do not need to calculate $U^{\mathrm{Maximin}}$ explicitly at any point in this proof as it is not required for the proof to work.)

Now, consider a modified graph $G^{\prime}=G$ , but with our groups modified by allowing $x_{2}$ to belong to both communities. That is, $C^{\prime}_{1}=C_{1}$ and $C^{\prime}_{2}=C_{2}\cup\{x_{2}\}$ . The Maximin configuration has two possibilities: either $\{x_{1}\}$ remains the Maximin configuration, or $\{x_{2}\}$ becomes the new Maximin configuration. In order for the latter case to be true, seeding $\{x_{2}\}$ must provide higher proportional influence to the least well-off group than seeding $\{x_{1}\}$ .

Seeding $\{x_{2}\}$ generates $\frac{1+p}{3}$ influence for $C_{1}$ , and $\frac{1+pt}{s+t+1}$ influence for $C_{2}$ . Seeding $\{x_{1}\}$ generates $\frac{1}{3}$ influence for $C_{1}$ , and $\frac{ps}{s+t+1}$ influence for $C_{2}$ . It can be shown that for $p<\frac{1}{3}$ and $t=\frac{s}{1-3p}$ , these conditions are satisfied and $\{x_{2}\}$ is the Maximin configuration, generating a total of $\mathcal{I}^{\mathrm{Maximin}}_{G^{\prime}}=1+p(t+1)$ .

Then,

[TABLE]

And therefore, as $p\to\frac{1}{3}^{-}$ , i.e. $p$ approaches $\frac{1}{3}$ from the left, the addition of a node to a second group may cause the Price of Maximin Fairness to worsen by an arbitrarily large amount. ∎

Appendix B Analysis of multiobjective submodular maximization problem

Consider a collection of monotone submodular functions $f_{1}...f_{m}$ with corresponding multilinear extensions $F_{1}...F_{m}$ . We will assume that the maximum singleton value of any item in the ground set $V$ is bounded as $f_{i}(\{v\})\leq b$ for all $i\in[m],v\in V$ . Suppose that we are given a target value $W_{i}$ for each $f_{i}$ and would like to find a set $S$ with $|S|\leq k$ which guarantees $f_{i}(S)\geq W_{i}$ for all $i$ . We are promised that such an $S$ exists. We will give an approximation algorithm for this problem which improves in terms of both runtime and approximation ratio on the best current algorithms, given by Udwani [2018], who in turn builds on the work of Chekuri et al. [2010].

Our algorithm follows the overall template of Udwani [2018], which carries out three steps (given a precision level $\epsilon$ ).

1. Make a pass over the ground set, maintaining a set $S_{1}$ . Add to $S_{1}$ every item which has value at least $\epsilon^{3}W_{i}$ for some $f_{i}$ .

2. Define $\mathcal{P}$ to be the uniform matroid polytope for budget $k-|S_{1}|$ . Use a subroutine to find a point $x\in\mathcal{P}$ satisfying $F_{i}(x|x_{S_{1}})\geq\alpha\left(W_{i}-f_{i}(S_{1})\right)-\epsilon$ for all $i$ and some approximation ratio $\alpha$ . This is the key step where we improve the runtime and approximation ratio.

3. Round $x$ to a set $S_{2}$ using the swap rounding algorithm of Chekuri et al. [2010]. Output $S_{1}\cup S_{2}$ .
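The first of these steps is simple enough to sketch directly. The Python function below is an illustration only: it assumes exact singleton values are available through ordinary callables, whereas the algorithm analyzed here works with stochastic value oracles, and the function and parameter names are hypothetical.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, Sequence, Set

SetFunction = Callable[[FrozenSet[Hashable]], float]


def threshold_include(
    ground_set: Iterable[Hashable],
    objectives: Sequence[SetFunction],
    targets: Sequence[float],
    eps: float,
) -> Set[Hashable]:
    """Return S_1: every item whose singleton value reaches eps**3 * W_i
    for at least one objective f_i.

    Assumption: objectives[i] evaluates f_i exactly on a singleton.
    """
    threshold = eps ** 3
    s1: Set[Hashable] = set()
    for v in ground_set:
        singleton = frozenset({v})
        if any(f(singleton) >= threshold * w for f, w in zip(objectives, targets)):
            s1.add(v)
    return s1


if __name__ == "__main__":
    weights = {"a": 2.0, "b": 0.5, "c": 1.0}

    def modular(s: FrozenSet[Hashable]) -> float:
        # A toy modular objective used only to exercise the function.
        return sum(weights[v] for v in s)

    # eps = 0.5 and W = 8 give the threshold 0.125 * 8 = 1.
    print(sorted(threshold_include(weights, [modular], [8.0], 0.5)))
```

The remaining budget $k-|S_{1}|$ is then handed to the continuous subroutine of the second step.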

Our primary technical contribution is an algorithm for the second step which guarantees $\alpha=\left(1-\frac{1}{e}\right)$ . It uses access to three kinds of stochastic oracles for the functions and their multilinear extensions:

1. A stochastic value oracle for singletons $\mathcal{A}^{i}_{\text{val}}$ corresponding to each $f_{i}$ . Given an item $v$ , this oracle returns a value $\mathcal{A}^{i}_{\text{val}}(v)$ with $\operatorname*{\mathbb{E}}\left[\mathcal{A}^{i}_{\text{val}}(v)\right]=f_{i}(\{v\})$ and $\text{Var}\left[\mathcal{A}^{i}_{\text{val}}(v)\right]\leq c_{\text{val}}$ .

2. A stochastic gradient oracle $\mathcal{A}^{i}_{\text{grad}}$ for each multilinear extension $F_{i}$ . Given a point $x\in\mathcal{P}$ , $\mathcal{A}^{i}_{\text{grad}}(x)$ satisfies $\operatorname*{\mathbb{E}}\left[\mathcal{A}^{i}_{\text{grad}}(x)\right]=\nabla_{x}F_{i}(x)$ and $\left\lVert\mathcal{A}^{i}_{\text{grad}}(x)\right\rVert_{\infty}\leq c_{\text{grad}}$ .

3. A stochastic gradient oracle $\mathcal{A}^{j}_{\text{item}}$ corresponding to each item $j\in[n]$ . Given a point $x\in\mathcal{P}$ , $\mathcal{A}^{j}_{\text{item}}(x)$ satisfies $\operatorname*{\mathbb{E}}\left[\mathcal{A}^{j}_{\text{item}}(x)\right]=\left[\nabla_{x_{j}}F_{1}(x)...\nabla_{x_{j}}F_{m}(x)\right]$ and $\left\lVert\mathcal{A}^{j}_{\text{item}}(x)\right\rVert_{\infty}\leq c_{\text{item}}$ . Note that this can be simulated from the above oracle, but may sometimes admit more efficient implementations.

We now analyze this algorithm. We start by recalling a technical lemma on the smoothness of the multilinear extension:

Lemma A (Hassani et al. [2017], Lemma C.1).

For any monotone submodular set function $f$ and its multilinear extension $F$ , $||\nabla F(x)-\nabla F(y)||_{\infty}\leq b||x-y||_{1}$ where $b=\max_{v\in V}f(\{v\})$ . That is, $F$ is $b$ -smooth with respect to the $\ell_{1}$ norm.

Lemma B.

$F$ is $b$ -Lipschitz in the $\ell_{1}$ norm.

Proof.

Recall that $\nabla_{x_{j}}F(x)=\operatorname*{\mathbb{E}}_{S\sim x}[f(S\cup\{j\})-f(S\setminus\{j\})]$ CITE, where $S\sim x$ denotes including each $j$ in $S$ independently with probability $x_{j}$ . By submodularity, $\operatorname*{\mathbb{E}}_{S\sim x}[f(S\cup\{j\})-f(S\setminus\{j\})]\leq f(\{j\})\leq b$ . Hence, $||\nabla_{x_{j}}F(x)||_{\infty}\leq b$ which proves the lemma. ∎
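The expectation used in this proof also suggests a direct sampling estimate of a single gradient coordinate. The following Python sketch is a minimal illustration with a generic set function supplied as a callable; the names `sample_set` and `grad_coordinate_estimate` are hypothetical, and the toy coverage objective in the usage example is not taken from the paper.

```python
import random


def sample_set(x, rng):
    """Draw S ~ x: include each item j independently with probability x[j]."""
    return {j for j, xj in x.items() if rng.random() < xj}


def grad_coordinate_estimate(f, x, j, rng, num_samples=1):
    """Unbiased estimate of the j-th partial derivative of the multilinear
    extension F at x, namely E_{S~x}[f(S + j) - f(S - j)]."""
    total = 0.0
    for _ in range(num_samples):
        s = sample_set(x, rng)
        total += f(s | {j}) - f(s - {j})
    return total / num_samples


if __name__ == "__main__":
    # Toy monotone submodular objective: size of the union of covered elements.
    cover = {"a": {1, 2, 3}, "b": {3, 4}, "c": {1, 4, 5}}

    def coverage(s):
        return len(set().union(*(cover[v] for v in s)))

    rng = random.Random(0)
    x = {"a": 0.5, "b": 0.5, "c": 0.5}
    print(grad_coordinate_estimate(coverage, x, "a", rng, num_samples=1000))
```

For a monotone submodular $f$ , every individual sample lies between $0$ and $f(\{j\})\leq b$ , which is the bound the lemma relies on.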

Next, we show a guarantee for the output of mirror descent in step 2(a).

Lemma C.

For some $x\in\mathcal{P}$ , suppose that there exists a $v^{*}\in\mathcal{P}$ such that $v^{*}\cdot\nabla F_{i}(x)\geq W_{i}-F_{i}(x)$ for all $i=1...m$ . Then, S-SP-MD returns a $v$ satisfying $v\cdot\nabla F_{i}(x)\geq(1-\epsilon)(W_{i}-F_{i}(x))-\epsilon$ for all $i$ with probability $1-\delta$ . There are $O\left(\frac{\left(c_{\text{grad}}\sqrt{k\log n}+kc_{\text{item}}\sqrt{\log n}\right)^{2}}{\epsilon^{4}}\log\frac{1}{\delta}\right)$ iterations, each requiring one call to oracles $\mathcal{A}^{i}_{\text{grad}}$ and $\mathcal{A}^{j}_{\text{item}}$ for some $i$ and $j$ , and $O(n+m)$ additional work.

Proof.

Our objective is to find a $v$ satisfying $v\cdot\nabla F_{i}(x)\geq(1-\epsilon)(W_{i}-F_{i}(x))-\epsilon$ , under the guarantee that such a $v$ exists. Note that we call S-SP-MD only on the set of indices $\mathcal{I}$ where $W_{i}-F_{i}(x)\geq\epsilon$ . For all other indices, where the current solution is already within $\epsilon$ of the target, monotonicity of the $F_{i}$ guarantees that $v\cdot\nabla F_{i}(x)\geq 0\geq W_{i}-F_{i}(x)-\epsilon$ .

The feasibility problem on the groups in $\mathcal{I}$ is equivalent to solving the maxmin problem

[TABLE]

To see this, let $OPT$ denote the optimal value for the maxmin problem; we are guaranteed $OPT\geq 1$ . If we have $v$ with maxmin value at least $OPT-\epsilon$ , then $v$ satisfies

[TABLE]

We now prove that S-SP-MD produces a $v$ with maxmin value at least $OPT-\epsilon$ . Let $A$ be a matrix where column $i$ is $\frac{\nabla F_{i}(x)}{W_{i}-F_{i}(x)}$ for each $i\in\mathcal{I}$ , and define $g(v,y)=v^{\top}Ay$ . Let $\Delta(\mathcal{I})$ be the $|\mathcal{I}|$ -dimensional probability simplex. We would like to solve the problem

[TABLE]

which is easily seen to be equivalent to the original maxmin problem.

We will solve the above saddle point problem by running stochastic saddle point mirror descent with the negative entropy mirror map on the function $g$ . We obtain stochastic estimates of $\nabla_{v}g(v,y)$ and $\nabla_{y}g(v,y)$ via calls to the input oracles. First, note that

[TABLE]

where $i\sim y$ denotes drawing index $i$ with probability $y_{i}$ (recall that $y\in\Delta(\mathcal{I})$ is a probability distribution). Hence, we can obtain an estimate $\hat{\nabla_{v}}$ of $\nabla_{v}g(v,y)$ by sampling $i\sim y$ and returning $\frac{1}{W_{i}-F_{i}(x)}\mathcal{A}^{i}_{\text{grad}}$ . We are guaranteed $\left\lVert\hat{\nabla_{v}}\right\rVert_{\infty}\leq\frac{c_{\text{grad}}}{W_{i}-F_{i}(x)}\leq\frac{c_{\text{grad}}}{\epsilon}$ . We take a similar strategy for $\nabla_{y}g(v,y)$ : $v^{\top}A=k\left(\frac{1}{k}v\right)\hat{A}=k\operatorname*{\mathbb{E}}_{j\sim\frac{1}{k}v}[v_{j}A_{j}]$ (since $\frac{1}{k}v_{j}$ is a probability distribution). Hence, we can sample $j\sim\frac{1}{k}v$ and return $\hat{\nabla}_{y}=k\cdot\text{diag}\left(\frac{1}{\vec{W}-\vec{F}(x)}\right)\mathcal{A}^{j}_{\text{item}}(x)$ . This satisfies $\left\lVert\hat{\nabla}_{y}\right\rVert_{\infty}\leq\frac{k}{\epsilon}c_{\text{item}}$ .

Note that we can bound the diameter of $\mathcal{P}$ with respect to the mirror map by $\sqrt{k\log n}$ (see Hassani et al. [2017]) and the diameter of $\Delta^{m}$ by $\sqrt{\log m}$ (see Nemirovski et al. [2009]). We will run mirror descent for $T^{\prime}$ iterations. Let $\bar{x}=\frac{1}{T^{\prime}}\sum_{t=1}^{T^{\prime}}x^{t}$ and $\bar{y}=\frac{1}{T^{\prime}}\sum_{t=1}^{T^{\prime}}y^{t}$ . Now applying Proposition 3.2 of Nemirovski et al. [2009] implies that after $T^{\prime}$ iterations we have

[TABLE]

and so taking $T^{\prime}=O\left(\frac{\left(c_{\text{grad}}\sqrt{k\log n}+kc_{\text{item}}\sqrt{\log n}\right)^{2}\log\frac{1}{\delta}}{\epsilon^{4}}\right)$ ensures that

[TABLE]

holds with probability at least $1-\delta$ .

∎

Theorem D.

Suppose that there exists some $x\in\mathcal{P}$ satisfying $F_{i}(x)\geq W_{i}$ for all $i=1...m$ . Then, after $T=\frac{bk^{2}}{\epsilon}$ iterations, the algorithm returns a point $x^{T}$ satisfying $F_{i}(x^{T})\geq\left(1-\epsilon\right)\left(1-\frac{1}{e}\right)W_{i}-\epsilon$ for all $i$ . Each iteration requires one call to mirror descent at success probability $\delta^{\prime}=\frac{\delta\epsilon}{bk^{2}}$ and precision level $\epsilon^{\prime}=\frac{\epsilon}{2}$ , $O(m)$ $\epsilon$ -accurate value oracle calls, and $O(n)$ additional work.

Proof.

We analyze the progress that the algorithm makes with respect to each $F_{i}$ over a single step $t$ . Using the guarantee for the subroutine mirror descent (run with a precision level $\epsilon_{1}$ to be set below), and assuming that the values $\{W_{i}\}$ are feasible, we have with probability at least $1-\delta$

[TABLE]

which implies

[TABLE]

and so after $T$ steps

[TABLE]

holds with probability at least $1-T\delta$ via union bound. Taking $\epsilon_{1}=\frac{\epsilon}{2}$ , $T=\frac{bk^{2}}{\epsilon}$ , and running mirror descent with success probability $\frac{\delta}{T}$ at each iteration ensures that

[TABLE]

holds for all $i$ with probability at least $1-\delta$ , which completes the guarantee for the solution quality. To obtain the bound on additional work done by the algorithm, we note that the only operation performed besides calling mirror descent is adding $v^{t}$ to the current iterate, which takes time $O(n)$ . ∎

Theorem E.

Given a feasible set of target values $W_{1}...W_{n}$ , Algorithm 1 outputs a set $S$ such that $f_{i}(S)\geq(1-\epsilon)\left(1-\frac{m}{k(1+\epsilon^{\prime})\epsilon^{3}}\right)\left(1-\frac{1}{e}\right)W_{i}-\epsilon$ with probability at least $1-\delta$ . Asymptotically as $k\to\infty$ , the approximation ratio can be set to approach $1-1/e$ so long as $m=o(k\log^{3}k)$ . The algorithm requires $O(nm)$ $\epsilon^{\prime}$ -accurate value oracle calls, $O(m\frac{bk^{2}}{\epsilon}\log\frac{1}{\delta})$ $\epsilon$ -accurate value oracle calls, $O\left(\frac{bk^{4}c^{2}}{\epsilon^{5}}\log\left(n+\frac{bk}{\delta\epsilon}\right)\right)$ calls to $\mathcal{A}_{\text{grad}}$ and $\mathcal{A}_{\text{item}}$ , and $O\left(\frac{nk^{2}b^{2}}{\epsilon^{2}}+\frac{mk^{2}b}{\epsilon}+\frac{k^{3}b^{2}}{\epsilon^{2}}\right)$ additional work.

Proof.

ThresholdInclude produces a set $S_{1}$ for which each item $j\in S_{1}$ satisfies $f_{i}(\{j\})\geq W_{i}(1+\epsilon^{\prime})\epsilon^{3}$ for some $i$ , and any $j\not\in S_{1}$ satisfies $f_{i}(\{j\})\leq W_{i}\epsilon^{3}$ for all $i$ . Note that there can be at most $\frac{1}{(1+\epsilon^{\prime})\epsilon^{3}}$ items with $f_{i}(\{j\})\geq W_{i}(1+\epsilon^{\prime})\epsilon^{3}$ for any given $i$ (combining submodularity with our WLOG assumption that $f_{i}$ is upper bounded by $W_{i}$ ). Hence, $|S_{1}|\leq\frac{m}{(1+\epsilon^{\prime})\epsilon^{3}}$ . Define $k_{1}=k-|S_{1}|$ .

Now we lower bound the marginal gain of the fractional vector $x$ returned by MultiobjectiveFW. So long as the target values $\{\frac{k_{1}}{k}\left(W_{i}-f_{i}(S_{1})\right)\}$ are feasible, we are guaranteed that $F_{i}(x|S_{1})\geq\frac{k_{1}}{k}\left(1-\frac{1}{e}\right)\left(W_{i}-f_{i}(S_{1})\right)-\epsilon$ for all $i$ . To see feasibility, let $S^{*}$ be the promised set satisfying the overall feasibility problem (i.e., $f_{i}(S^{*})\geq W_{i}$ for all $i$ ). Let $x_{S}$ denote the indicator vector of the set $S$ . We have that $|S^{*}\setminus S_{1}|\leq k$ , and $F_{i}(x_{S^{*}\setminus S_{1}}|x_{S_{1}})=f_{i}(S^{*}|S_{1})\geq W_{i}-f_{i}(S_{1})$ . Using Corollary 3 of Udwani [2018], the point $x^{\prime}=\frac{k_{1}}{k}x_{S^{*}\setminus S_{1}}$ satisfies $F_{i}(x^{\prime}|x_{S_{1}})\geq\frac{k_{1}}{k}(W_{i}-f_{i}(S_{1}))$ . $x^{\prime}$ is also feasible for the continuous problem since $||x^{\prime}||_{1}\leq k_{1}$ . Now applying Theorem D guarantees that $F_{i}(x|S_{1})\geq\frac{k_{1}}{k}\left(1-\frac{1}{e}\right)\left(W_{i}-f_{i}(S_{1})\right)-\epsilon$ with probability at least $1-\delta$ .

Lastly, we need to handle the rounding process. We first take the point $x$ and approximately decompose it into a convex combination of integral points of $\mathcal{P}$ . This is done using the algorithm of Mirrokni et al. [2017], which produces a point $x_{\text{int}}$ satisfying $||x_{\text{int}}-x||_{1}\leq\epsilon$ along with a decomposition of $x_{\text{int}}$ into $O(\frac{k^{2}}{\epsilon^{2}})$ integral points of $\mathcal{P}$ (Mirrokni et al. [2017], Proposition 5.1). If we run this algorithm with precision level $\frac{\epsilon}{b}$ , Lemma B guarantees that $|F_{i}(x_{\text{int}})-F_{i}(x)|\leq\epsilon$ for all $i$ and hence $F_{i}(x_{\text{int}}|S_{1})\geq\frac{k_{1}}{k}\left(1-\frac{1}{e}\right)\left(W_{i}-f_{i}(S_{1})\right)-2\epsilon$ . Applying Lemma 2 of Udwani [2018] (who summarizes the guarantee for swap rounding proved by Chekuri et al. [2010]), carrying out $O\left(\log\frac{1}{\delta}\right)$ iterations of swap rounding and taking the best outcome produces a set $S_{2}$ which satisfies $f_{i}(S_{2}|S_{1})\geq(1-\epsilon)\frac{k_{1}}{k}\left(1-\frac{1}{e}\right)\left(W_{i}-f_{i}(S_{1})\right)-3\epsilon$ with probability at least $1-\delta$ , provided that the best outcome is determined by calling a value oracle with precision level $\epsilon$ . Adding up the final guarantee, we have

[TABLE]

and now rescaling $\epsilon$ by a factor $\frac{1}{3}$ gives the final approximation guarantee. The asymptotic $1-1/e$ approximation follows by setting $\epsilon$ as in Udwani [2018].

We now add up the final runtime. The first thresholding step requires $n$ value oracle calls to each of the $m$ objectives at precision level $\epsilon^{\prime}$ . MultiobjectiveFW requires $\frac{bk^{2}}{\epsilon}$ iterations, each of which calls mirror descent once. Each invocation of mirror descent requires a total of $O\left(\frac{1}{\epsilon^{4}}\left(c_{\text{grad}}\sqrt{k\log n}+c_{\text{item}}k\sqrt{\log n}\right)^{2}\log\frac{bk}{\delta\epsilon}\right)$ oracle calls. Recalling that $c=\max\{c_{\text{item}},c_{\text{grad}}\}$ , this is upper bounded by $O\left(\frac{c^{2}k^{2}}{\epsilon^{4}}\log\left(n+\frac{bk}{\delta\epsilon}\right)\right)$ . Each iteration of MultiobjectiveFW also uses $m$ value oracle calls at precision level $\epsilon$ . Finally, each iteration uses additional $O(n+m)$ overhead, for a total of $O\left(\frac{(n+m)k^{2}b}{\epsilon}\right)$ . In the rounding procedure, we first need to invoke ApproximateCaratheodory with precision level $\frac{\epsilon}{b}$ , which per Proposition 5.1 of Mirrokni et al. [2017] requires $\frac{k^{2}b^{2}}{\epsilon}$ iterations, and one linear maximization over $\mathcal{P}$ per iteration. Since $\mathcal{P}$ is the uniform matroid polytope, each linear maximization takes time $O(n)$ , and so this stage contributes time $O\left(\frac{nk^{2}b^{2}}{\epsilon}\right)$ . Lastly, we have the $O\left(\log\frac{1}{\delta}\right)$ iterations of swap rounding. Since $x_{\text{int}}$ was decomposed into $\frac{k^{2}b^{2}}{\epsilon^{2}}$ integral points, swap rounding takes time $\frac{k^{3}b^{2}}{\epsilon^{2}}$ for each iteration Chekuri et al. [2010]. We also need one $\epsilon$ -accurate value oracle call to each of the objective functions per iteration so that we can select the (approximately) best set. Combining these bounds results in the final stated runtime. ∎

Appendix C Efficient stochastic gradient estimates

We now give efficient implementations for the oracles $\mathcal{A}_{\text{grad}}$ and $\mathcal{A}_{\text{item}}$ . They run in combined time $O\left(k\left(|V|+|E|\right)\log^{2}\frac{|V|}{\delta}\right)$ and succeed with probability $1-\delta$ . Our implementations guarantee $c\leq 2b$ whenever they succeed.

We use a representation of the influence maximization objective as the expectation over a set of deterministic submodular functions. Specifically, we can view the independent cascade model as specifying a distribution over live-edge graphs Kempe et al. [2003] where each edge is present with probability $p$ and absent otherwise (where all events are independent). Let $\xi$ denote a graph realized from this process, which we will denote $\xi\sim P$ . For a fixed $\xi$ , the influence spread of a given seed set $S$ is just the number of nodes which are reachable from $S$ via only the edges present in $\xi$ . We will denote this quantity by $f(S,\xi)$ , where $f(S)=\operatorname*{\mathbb{E}}_{\xi\sim P}[f(S,\xi)]$ .
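The live-edge view translates directly into a simulation. The Python sketch below samples a realization $\xi$ by keeping each edge with probability $p$ , evaluates $f(S,\xi)$ by breadth-first search, and averages over samples to estimate $f(S)$ . It assumes an undirected graph given as an edge list, and the function names are hypothetical.

```python
import random
from collections import deque


def sample_live_edges(edges, p, rng):
    """Draw xi ~ P: keep each edge independently with probability p."""
    return [edge for edge in edges if rng.random() < p]


def spread(nodes, live_edges, seeds):
    """f(S, xi): number of nodes reachable from the seeds via live edges."""
    adjacency = {v: [] for v in nodes}
    for u, v in live_edges:
        adjacency[u].append(v)
        adjacency[v].append(u)  # assumption: the graph is undirected
    reached = set(seeds)
    queue = deque(reached)
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in reached:
                reached.add(w)
                queue.append(w)
    return len(reached)


def estimate_influence(nodes, edges, seeds, p, num_samples, rng):
    """Monte Carlo estimate of f(S) = E_xi[f(S, xi)]."""
    total = 0
    for _ in range(num_samples):
        total += spread(nodes, sample_live_edges(edges, p, rng), seeds)
    return total / num_samples


if __name__ == "__main__":
    nodes = ["a", "b", "c", "d"]
    edges = [("a", "b"), ("b", "c"), ("c", "d")]
    rng = random.Random(0)
    print(estimate_influence(nodes, edges, {"a"}, 0.1, 10000, rng))
```

Each call to `spread` costs $O(|V|+|E|)$ , in line with the per-sample cost quoted in this appendix.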

The starting point is to recall that for any group’s utility function $f_{i}$ , the gradients of the multilinear extension $F_{i}$ satisfy

[TABLE]

which follows from the definition of the multilinear extension Chekuri et al. [2010]. Note that for any fixed $i$ and $x_{j}$ , we can obtain a stochastic estimate of this quantity in time $O(|V|+|E|)$ by first drawing a set $S\sim x$ , simulating the cascade process, and counting the number of nodes reached with and without item $j$ . By submodularity, the resulting estimate satisfies $f(S\cup\{j\},\xi)-f(S\setminus\{j\},\xi)\leq b$ for any $S$ and $\xi$ . Naively repeating this process over all $i,j$ would hence require time $O(|V|(|V|+|E|)m)$ . We now show how to implement the required oracles by drawing a number of samples that scales only with $k\log|V|$ instead of $|V|$ .

Implementing $\mathcal{A}_{\text{item}}$ is simpler because we only need to estimate $\left[\nabla_{x_{j}}F_{1}(x)...\nabla_{x_{j}}F_{m}(x)\right]$ for a single fixed $x_{j}$ . Hence, we can draw a single $S,\xi$ , count the number of nodes reachable in each group under $\xi$ with set $S\setminus\{j\}$ , and then count the number of nodes reachable with set $S\cup\{j\}$ . This takes time $O\left(|V|+|E|\right)$ .
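A single-sample version of this oracle might look as follows. This is a sketch under stated assumptions rather than the implementation used in the experiments: the graph is undirected and given as an edge list, groups are given as sets of nodes, and the names `reachable` and `item_oracle` are hypothetical.

```python
import random
from collections import deque


def reachable(nodes, live_edges, seeds):
    """Set of nodes reachable from the seeds using only live edges."""
    adjacency = {v: [] for v in nodes}
    for u, v in live_edges:
        adjacency[u].append(v)
        adjacency[v].append(u)  # assumption: the graph is undirected
    reached = set(seeds)
    queue = deque(reached)
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in reached:
                reached.add(w)
                queue.append(w)
    return reached


def item_oracle(nodes, edges, groups, x, j, p, rng):
    """One-sample estimate of [dF_1/dx_j, ..., dF_m/dx_j] at the point x.

    Draws one seed set S ~ x and one live-edge graph, then compares the
    per-group reach of S with j added against S with j removed.
    """
    s = {v for v in nodes if rng.random() < x[v]}
    live = [e for e in edges if rng.random() < p]
    with_j = reachable(nodes, live, s | {j})
    without_j = reachable(nodes, live, s - {j})
    return [len(with_j & g) - len(without_j & g) for g in groups]


if __name__ == "__main__":
    nodes = ["a", "b", "c"]
    edges = [("a", "b"), ("b", "c")]
    groups = [{"a", "b"}, {"c"}]
    x = {"a": 0.2, "b": 0.2, "c": 0.2}
    print(item_oracle(nodes, edges, groups, x, "a", 0.5, random.Random(0)))
```

Both reachability computations are linear in the size of the graph, so the sketch respects the $O\left(|V|+|E|\right)$ bound stated above.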

Efficiently implementing $\mathcal{A}_{\text{grad}}$ is more difficult since we need to simultaneously estimate $\nabla F_{i}$ with respect to every $x_{j}$ ; hence, naive enumeration would take $O(|V|^{2})$ time. We now detail our strategy. We start by considering a given sample $(S,\xi)$ and show how to estimate the marginal contribution $f_{i}(S\cup\{j\},\xi)-f_{i}(S,\xi)$ for a given $i$ and and all $j\not\in S$ in total runtime $O\left(\left(|V|+|E|\right)\log\frac{|V|}{\delta}\right)$ . We first remove all nodes from $G$ that are reachable from $S$ under $\xi$ , which takes time $O\left(|V|+|E|\right)$ . Any node removed in this stage has marginal contribution 0. Next, we remove all nodes that are isolated in the remaining subgraph and assign them marginal contribution 1 if they are part of group $i$ . This stage takes time $O(|V|)$ .

Now we deal with the remaining nodes. Here, determining the marginal contribution of a node $v$ to group $i$ amounts to estimating the number of nodes of group $i$ which are reachable from $v$ in $\xi$. We use the size estimation framework of Cohen [1997], which allows us to simultaneously produce an unbiased estimate of every remaining node’s contribution to group $i$ in time $O\left(|E|\right)$. We apply the weighted version of the estimator, where every node in group $i$ has weight 1 and all other nodes have weight 0. We take $O\left(\log\frac{|V|}{\delta}\right)$ independent repetitions of the estimation process, resulting in $O\left(|E|\log\frac{|V|}{\delta}\right)$ runtime. For a given group $i$, and using $\ell$ repetitions, Cohen’s estimator produces an estimate $\Delta(v)$ for each node which satisfies

1. $\operatorname*{\mathbb{E}}[\Delta(v)]=f_{i}(\{v\}|S)$

2. $\Pr\left[|\Delta(v)-f_{i}(\{v\}|S)|\geq\epsilon f_{i}(\{v\}|S)\right]\leq e^{-\Omega\left(\epsilon^{2}\ell\right)}$ for any $0\leq\epsilon\leq 1$

We fix $\epsilon=1$ as an arbitrary constant and use $\ell=O\left(\log\frac{|V|}{\delta}\right)$. This allows us to use a union bound combined with the second property of the estimator to argue that over all nodes combined

[TABLE]

and so the resulting gradients will satisfy our stated bounds on $c_{\text{item}}$ and $c_{\text{grad}}$ with high probability.
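For intuition, the weighted size-estimation step can be sketched in Python. This is a simplified sketch of a least-rank estimator in the style of Cohen [1997], not the authors' code. It assumes the input is the live-edge graph after nodes reachable from $S$ have been removed, draws exponential ranks only for group-$i$ nodes (weight 1), and uses $(\ell-1)/\sum_{t}\text{least}_{t}(v)$ as the estimate, which requires $\ell\geq 2$. The names `least_ranks` and `estimate_group_reach` are hypothetical.

```python
import random
from collections import deque

INF = float("inf")

def least_ranks(nodes, rev_adj, weight, rng):
    """One repetition: for each v, the least rank among weighted nodes reachable from v."""
    # Weight-0 nodes get no rank, so they never count toward any estimate.
    rank = {u: rng.expovariate(weight[u]) for u in nodes if weight[u] > 0}
    least = {v: INF for v in nodes}
    for u in sorted(rank, key=rank.get):        # increasing rank order
        if least[u] != INF:
            continue                            # u already reaches a lower-ranked node
        least[u] = rank[u]
        queue = deque([u])
        while queue:                            # pruned reverse BFS: each node labeled once
            a = queue.popleft()
            for b in rev_adj.get(a, []):        # live edge b -> a, so b reaches u
                if least[b] == INF:
                    least[b] = rank[u]
                    queue.append(b)
    return least

def estimate_group_reach(nodes, live_edges, in_group, reps, seed=0):
    """Estimate, for every node, how many group members it reaches (itself included)."""
    rng = random.Random(seed)
    rev_adj = {}
    for a, b in live_edges:
        rev_adj.setdefault(b, []).append(a)
    weight = {v: 1.0 if in_group[v] else 0.0 for v in nodes}
    total = {v: 0.0 for v in nodes}
    for _ in range(reps):
        least = least_ranks(nodes, rev_adj, weight, rng)
        for v in nodes:
            total[v] += least[v]
    # The least rank is exponential with rate equal to the reachable weight,
    # so (reps - 1) / sum is unbiased; an infinite sum yields exactly 0.
    return {v: (reps - 1) / total[v] for v in nodes}
```

Each repetition labels every node at most once, so apart from sorting the ranks its cost is linear in the graph size; the number of repetitions plays the role of $\ell$ in the concentration bound.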

Our overall strategy is to generate enough samples that every node is missing from $S$ in at least one of them. Then, we can use a node’s marginal contribution in the sample from which it is missing as its gradient estimate. Note that a node $j$ is absent from any given sample with probability $1-x_{j}$. Given budget $k$, at most $\frac{k}{1-\frac{1}{k+1}}=k+1$ nodes can have $x_{j}\geq 1-\frac{1}{k+1}$. For any such node, we can explicitly estimate a sample of Equation 1 using $O\left(|V|+|E|\right)$ time per node, for $O\left(k\left(|V|+|E|\right)\right)$ total. For the remaining nodes, a simple argument shows that taking $(k+1)\log\frac{|V|}{\delta}$ samples is sufficient to ensure that each node is missing from at least one sample with combined probability $1-\delta$. Summing up, the total runtime to implement $\mathcal{A}_{\text{grad}}$ is $O\left(k\left(|V|+|E|\right)\log^{2}\frac{|V|}{\delta}\right)$.
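The sample-count argument can be made explicit. The following is a reconstruction from the stated bounds rather than a quotation from the paper; it assumes that $\log$ is the natural logarithm and that the $N=(k+1)\log\frac{|V|}{\delta}$ samples $S_{1},\ldots,S_{N}$ are drawn independently. For a node $j$ with $x_{j}<1-\frac{1}{k+1}$:

```latex
\Pr\left[\, j \in S_{t}\ \text{for all}\ t = 1,\ldots,N \,\right]
  = x_{j}^{N}
  < \left(1-\frac{1}{k+1}\right)^{N}
  \le e^{-N/(k+1)}
  = \frac{\delta}{|V|},
\qquad
\Pr\left[\, \exists\, j :\ j \in S_{t}\ \text{for all}\ t \,\right]
  \le |V| \cdot \frac{\delta}{|V|}
  = \delta .
```

The second inequality is a union bound over at most $|V|$ such nodes, so with probability at least $1-\delta$ every one of them is absent from some sample.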

Appendix D Runtime comparison with previous work

The best previous algorithm for multiobjective submodular maximization (Udwani [2018]) uses the same overall framework as ours, but uses an MWU algorithm for the second stage (the continuous maximization problem). The MWU algorithm runs $O\left(\frac{\log m}{\epsilon^{2}}\right)$ iterations, where each iteration requires a call to a greedy algorithm that maximizes a weighted combination of the $f_{i}$. Using the best implementation of the greedy algorithm (Badanidiyuru and Vondrák [2014]) requires $O\left(\frac{n}{\epsilon}\log\frac{n}{\epsilon}\right)$ value oracle calls, for $O\left(\frac{n}{\epsilon^{3}}\log m\log\frac{n}{\epsilon}\right)$ such calls in total. (While there are efficient special-purpose techniques for influence maximization on a given graph, it is not obvious how to adapt them to deal with the weighted combination of group objectives.) By comparison, our algorithm accesses the function through calls to the gradient oracles $\mathcal{A}_{\text{item}}$ and $\mathcal{A}_{\text{grad}}$. It makes a number of calls to these oracles which is only logarithmic in $n$, scaling as $O\left(\frac{bc^{2}k^{4}}{\epsilon^{3}}\log\left(n+\frac{bk}{\delta\epsilon}\right)\right)$. Since gradient oracle calls can typically be implemented in similar asymptotic runtime to value oracle calls for common classes of functions (as we have demonstrated for influence maximization), our algorithm effectively saves a factor $O(n)$ runtime in exchange for worse dependence on $k$ and $b$. Since we expect $n$ to grow much faster than $k$ or $b$ (in many typical applications, $b$ is a small constant; see Hassani et al. [2017]), this is often an improvement in asymptotic runtime. For influence maximization in particular, it is easy to see that a value oracle call for a given group cannot be implemented in less than $O(|V|+|E|)$ time, which matches (up to log factors) our stochastic gradient oracle’s dependence on the graph size.

Bibliography

1. Aghaei et al. [2019] S. Aghaei, M.J. Azizi, and P. Vayanos. Learning optimal and fair decision trees for non-discriminative decision-making. In Proc. of the 33rd AAAI, 2019.
2. Ahmed et al. [2017] F. Ahmed, J. P. Dickerson, and M. Fuge. Diverse weighted bipartite $b$-matching. In Proc. of the 26th IJCAI, pages 35–41, 2017.
3. Badanidiyuru and Vondrák [2014] A. Badanidiyuru and J. Vondrák. Fast algorithms for maximizing submodular functions. In Proc. of the 25th SODA, pages 1497–1514, 2014.
4. Banerjee et al. [2013] A. Banerjee, A. Chandrasekhar, E. Duflo, and M. O. Jackson. The diffusion of microfinance. Science, 341(6144), 2013.
5. Barman et al. [2019] S. Barman, A. Biswas, S. K. Krishnamurthy, and Y. Narahari. Groupwise maximin fair allocation of indivisible goods. In Proc. of the 32nd AAAI, 2019.
6. Barocas and Selbst [2016] S. Barocas and A. Selbst. Big data’s disparate impact. California Law Review, 104:671, 2016.
7. Benabbou et al. [2018] N. Benabbou, M. Chakraborty, V. Ho, J. Sliwinski, and Y. Zick. Diversity constraints in public housing allocation. In Proc. of the 17th AAMAS, pages 973–981, 2018.
8. Bredereck et al. [2018] R. Bredereck, P. Faliszewski, A. Igarashi, M. Lackner, and P. Skowron. Multiwinner elections with diversity constraints. In Proc. of the 32nd AAAI, pages 933–940, 2018.
