Election Control through Social Influence with Unknown Preferences
Mohammad Abouei Mehrizi, Federico Corò, Emilio Cruciani, Gianlorenzo D’Angelo

TL;DR
This paper explores election control via social influence with voters having uncertain preferences, proposing models where influence alters probability distributions and analyzing the approximability of maximizing a candidate’s victory chances.
Contribution
It introduces two models for social influence with unknown voter preferences and analyzes their computational complexity and approximation possibilities.
Findings
First model is hard to approximate under Gap-ETH
Second model admits a constant factor approximation
Provides theoretical bounds for election control with uncertain preferences
Mohammad Abouei Mehrizi
Gran Sasso Science Institute
L’Aquila, Italy
Federico Corò
Sapienza University of Rome
Rome, Italy
Emilio Cruciani
Inria, I3S Lab, UCA, CNRS
Sophia Antipolis, France
Gianlorenzo D’Angelo
Gran Sasso Science Institute
L’Aquila, Italy
Abstract
The election control problem through social influence asks to find a set of nodes in a social network of voters to be the starters of a political campaign aiming at supporting a given target candidate. Voters reached by the campaign change their opinions on the candidates. The goal is to shape the diffusion of the campaign in such a way that the chances of victory of the target candidate are maximized. Previous work shows that the problem can be approximated within a constant factor in several models of information diffusion and voting systems, assuming that the controller, i.e., the external agent that starts the campaign, has full knowledge of the preferences of voters. However, this information is not always available since some voters might not reveal it. Herein we relax this assumption by considering that each voter is associated with a probability distribution over the candidates. We propose two models in which, when an electoral campaign reaches a voter, the voter modifies its probability distribution according to the amount of influence it received from its neighbors in the network. We then study the election control problem through social influence on the new models: in the first model, under the Gap-ETH, election control cannot be approximated within a factor better than $1/n^{o(1)}$, where $n$ is the number of voters; in the second model, which is a slight relaxation of the first one, the problem admits a constant factor approximation algorithm.
1 Introduction
Social media play a fundamental role in everyone’s life, providing information, entertainment, and learning. Many social media users prefer to access social network platforms such as Facebook or Twitter before news websites, as they provide faster means for information diffusion [MS18]. As a consequence, online social networks are also exploited as a tool to alter users’ opinions. The extent to which the opinions of an individual are conditioned by social interactions is called social influence. It has been observed that social influence starting from a small set of individuals may generate a cascade effect that reaches a large part of the network. Recently, this capability has been used to affect the outcome of political elections. There is evidence of political intervention showing the effect of social media manipulation on election outcomes, e.g., by spreading fake news [PCR18]. A real-life example is the 2016 US election, where a study showed that on average 92% of people remembered pro-Trump fake news and 23% of them remembered pro-Clinton fake news [AG17]. Several other cases have been studied [BFJ+12, Fer17, Kre16, SBLS18].
There exists a wide literature about manipulation of voting systems; we point the reader to a recent survey [FR16]. Despite that, only a few studies focus on the problem of controlling the outcome of political elections through the spread of information in social networks. The election control problem [WV18] consists in selecting a set of nodes of a network to be the starters of a diffusion with the aim of maximizing the chances for a target candidate to win an election. In particular, in the constructive election control problem, the goal is to maximize the Margin of Victory (MoV) of the target candidate on its most critical opponent, i.e., the difference of votes (or score, depending on the voting system) between the two candidates after the effect of social influence. A variation of the problem, known as destructive election control, aims at making a target candidate lose. Both problems were originally analyzed under the Independent Cascade Model (ICM) [KKT15], considering plurality voting; approximation and hardness-of-approximation results are provided in [WV18]. Corò et al. [CCDP19a, CCDP19b] analyzed the problem in arbitrary scoring rule voting systems under the Linear Threshold Model (LTM) [KKT15], providing constant factor approximation algorithms. It was later shown that it is NP-hard to find any constant factor approximation in the multi-winner scenario [AD20].
Faliszewski et al. [FGKT18] examine bribery in an opinion diffusion process with voter clusters: each node is a cluster of voters, represented as a weight, with a specific list of candidates; there is an edge between two nodes if they differ by the ordering of a single pair of adjacent candidates. The authors show that making a specific candidate win in their model is NP-hard and fixed-parameter tractable with respect to the number of candidates. Bredereck et al. [BE17] studied the problem of manipulating diffusion on social networks, though not specifically in the context of elections. They show that identifying successful manipulation via bribing, adding/deleting edges, or controlling the order of asynchronous updates are all computationally hard problems. A similar approach is taken by Apt et al. [AM14], where the authors introduce a threshold model for social networks in order to characterize the role of social influence in the global adoption of a commercial product.
Contribution.
In all previous works it is assumed that the controller knows the preference list of each voter. However, this assumption is not always satisfied in realistic scenarios as voters may not reveal their preferences to the controller. Herein, in Section 2, we introduce two new models, Probabilistic Linear Threshold Ranking (PLTR) and Relaxed-PLTR (R-PLTR), that encompass scenarios where the preference lists of the voters are not fully revealed. Specifically, we use an uncertain model in which the controller only knows, for each voter, a probability distribution over the candidates. In fact, in applied scenarios, the probability distribution could be inferred by analyzing previous social activity of the voters, e.g., re-tweets or likes of politically oriented posts. We envision that some given focused news about a target candidate spread through the network as a message. We model such a diffusion via the LTM [KKT15]. The message will have an impact on the opinions of voters who received it from their neighbors, leading to a potential change of their vote if the neighbors exercise a strong influence on them. With this intuition in mind, in our models, the probability distribution of the voters reached by the message is updated as a function of the degree of influence that the senders of the message have on them. The rationale is that the controller, without knowing the exact preference list which is kept hidden, can just update its estimation on it by considering the mutual degree of influence among voters. We acknowledge that our models do not cover all scenarios that can arise in election control, e.g., messages about multiple candidates. However they represent a first step towards modeling uncertainty.
We study both the constructive and destructive election control problems on our models. We show in Section 3 that the election control problem in PLTR is at least as hard to approximate as the Densest-$k$-Subgraph problem [Man17]. This result implies several conditional hardness of approximation bounds for our problem: for example, it cannot be approximated within any constant factor if the Unique Games with Small Set Expansion conjecture holds, and it cannot be approximated to within any polynomial factor if the Exponential Time Hypothesis holds. However, these hardness of approximation bounds do not hold for the election control problem in R-PLTR, for which we can show that the problem remains NP-hard.
In Section 4 we provide an algorithm that guarantees a constant factor approximation to the constructive and destructive election control problems in R-PLTR. In the relaxed model, R-PLTR, also “partially-influenced” nodes change their probability distribution. Although this simple modification is enough to make the problem substantially easier, preliminary experimental results show that the hardness of approximation for PLTR is purely theoretical and is due to hard instances in the reduction.
In Section 5 we present the simulation of our models and algorithm on two real-world datasets.
2 Influence Models and Problem Statement
Background.
Influence Maximization is the problem of finding a subset of the most influential users in a social network with the aim of maximizing the spread of information given a particular diffusion model. In this work, we focus on the diffusion model known as Linear Threshold Model (LTM) [KKT15]. Given a graph $G=(V,E)$, each edge $(u,v)\in E$ has a weight $b_{uv}>0$, each node $v\in V$ has a threshold $t_v\in[0,1]$ sampled uniformly at random and independently from the others, and the sum of the weights of the incoming edges of $v$ is $\sum_{u:(u,v)\in E} b_{uv}\le 1$. Each node can be either active or inactive. Let $A_0$ be a set of initially active nodes and $A_t$ be the set of nodes active at time $t$. A node $v$ becomes active if the sum of the incoming active weights at time $t-1$ is greater than or equal to its threshold $t_v$, i.e., $v\in A_t$ if and only if $v\in A_{t-1}$ or $\sum_{u\in A_{t-1}:(u,v)\in E} b_{uv}\ge t_v$.
The process terminates at the first time $\tilde{t}$ in which the set of active nodes would not change in the next round, i.e., $A_{\tilde{t}}=A_{\tilde{t}+1}$. We define the eventual set of active nodes as $A:=A_{\tilde{t}}$ and the expected size of $A$ as $\sigma(A_0)$. Given a budget $B$, the influence maximization problem consists in finding a set of nodes $A_0$ of size $B$, called seeds, in such a way that $\sigma(A_0)$ is maximum.
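The diffusion rule above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the dictionary layout (`in_edges[v]` maps each in-neighbor `u` to the weight of edge `(u, v)`) are assumptions made for this example.

```python
import random


def random_thresholds(nodes, rng):
    """Draw one threshold per node uniformly at random in [0, 1) (hypothetical helper)."""
    return {v: rng.random() for v in nodes}


def simulate_ltm(in_edges, seeds, thresholds):
    """Run one LTM diffusion for fixed thresholds and return the final active set.

    in_edges[v] is a dict {u: weight of edge (u, v)}; every non-seed node that
    can become active must appear as a key of in_edges.
    """
    active = set(seeds)
    while True:
        # Synchronous round: a node activates when the weight arriving from
        # currently active in-neighbors reaches its threshold.
        newly = {
            v
            for v, nbrs in in_edges.items()
            if v not in active
            and sum(w for u, w in nbrs.items() if u in active) >= thresholds[v]
        }
        if not newly:  # the active set would not change in the next round
            return active
        active |= newly
```

Averaging `len(simulate_ltm(...))` over many threshold draws from `random_thresholds` would give a Monte Carlo estimate of the expected spread for a fixed seed set.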
Kempe et al. [KKT15] showed that the distribution of active nodes $A$, for any set $A_0$, is equal to the distribution of the sets of nodes that are reachable from $A_0$ in the set of random graphs called live-edge graphs. A live-edge graph is a subgraph in which each node has at most one incoming edge. Even if the number of live-edge graphs is exponential, by using standard Chernoff-Hoeffding bounds, it is possible to compute a $(1\pm\epsilon)$-approximation of $\sigma(A_0)$, for a given $\epsilon>0$, with high probability by sampling a polynomial number of live-edge graphs. Moreover, $\sigma$ is monotone and submodular w.r.t. the initial set $A_0$; hence, an optimal solution can be approximated to a factor of $1-1/e$ using a simple greedy algorithm [NWF78]. There has been intensive research on the problem in the last decade. We point the reader to a recent survey on the topic [LFWT18].
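As a rough illustration of the live-edge view, the sketch below samples live-edge graphs following the construction of Kempe et al. (each node keeps at most one incoming edge, chosen with probability equal to its weight, and keeps none with the remaining probability) and averages the number of reachable nodes. The function names, the data layout, and the default sample count are assumptions of this example, not values from the paper.

```python
import random


def sample_live_edge_graph(in_edges, rng):
    """Return {v: u} where (u, v) is the single live incoming edge of v, if any."""
    live = {}
    for v, nbrs in in_edges.items():
        r, acc = rng.random(), 0.0
        for u, weight in nbrs.items():
            acc += weight
            if r < acc:  # edge (u, v) is selected with probability `weight`
                live[v] = u
                break
    return live


def reachable(live, seeds):
    """Nodes reachable from the seeds by following live edges forward."""
    reached = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, u in live.items():
            if v not in reached and u in reached:
                reached.add(v)
                changed = True
    return reached


def estimate_spread(in_edges, seeds, samples=1000, seed=0):
    """Monte Carlo estimate of the expected number of active nodes."""
    rng = random.Random(seed)
    total = sum(
        len(reachable(sample_live_edge_graph(in_edges, rng), seeds))
        for _ in range(samples)
    )
    return total / samples
```

The number of samples needed for a given accuracy depends on the instance; the default here is only a placeholder.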
Notation.
Let $G=(V,E)$ be a directed graph representing a social network of voters and their interactions. We denote the set of candidates running for the election as $C=\{c_1,\dots,c_m\}$ and the target candidate as $c_\star\in C$. Each node $v\in V$ has a probability distribution $\pi_v$ over the candidates, where $\pi_v(c_i)$ is the probability that $v$ votes for candidate $c_i$; then for each $v\in V$ we have that $\pi_v(c_i)\ge 0$ for each candidate $c_i$ and $\sum_{i=1}^{m}\pi_v(c_i)=1$. Moreover, we denote by $N_v^i$ and $N_v^o$, respectively, the sets of incoming and outgoing neighbors for each node $v\in V$. For each candidate $c_i$, we assume that $\pi_v(c_i)$ is at least a polynomial fraction of the number of voters, i.e., $\pi_v(c_i)\in\Omega(1/|V|^{\gamma})$ for some constant $\gamma>0$. (Footnote 1: The assumption is used in the approximation results, since the Influence Maximization problem with exponential (or exponentially small) weights on nodes is an open problem. However, the assumption is realistic: current techniques to estimate such parameters generate values linear in the number of messages shared by a node.) Let $X_v(c_i)$ be an indicator random variable, where $X_v(c_i)=1$ if $v$ votes for $c_i$, with probability $\pi_v(c_i)$, and $X_v(c_i)=0$ otherwise. We define the expected score of a candidate $c_i$ as the expected number of votes that $c_i$ obtains from the voters:
$$F(c_i,\emptyset)=\mathbb{E}\Big[\sum_{v\in V}X_v(c_i)\Big]=\sum_{v\in V}\pi_v(c_i).$$
PLTR Model.
As in LTM, each node $v\in V$ has a threshold $t_v\in[0,1]$; each edge $(u,v)\in E$ has a weight $b_{uv}$, that models the influence of node $u$ on $v$, with the constraint that, for each node $v\in V$, $\sum_{u:(u,v)\in E} b_{uv}\le 1$. We assume the weight of each existing edge not to be too small, i.e., $b_{uv}\in\Omega(1/|V|^{\gamma})$ for some constant $\gamma>0$.
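Given an initial set of seed nodes $A_0$, the diffusion process proceeds as in LTM: inactive nodes become active if the sum of the weights of incoming edges from active neighbors is greater than or equal to their threshold. Mainly, we are modeling the spread of some ads/news about the target candidate: active nodes receive the message and spread it to their neighbors. Moreover, in PLTR, active nodes are influenced by the message, increasing their probability of voting for the target candidate. In particular, an active node $v$ increases the probability of voting for $c_\star$ by an amount equal to the sum of the weights of its edges incoming from other active nodes, i.e., it adds $\sum_{u\in A\cap N_v^i} b_{uv}$ to the initial probability $\pi_v(c_\star)$, where $N_v^i$ is the set of incoming neighbors of $v$. Then it normalizes to maintain $\pi_v$ as a probability distribution. Formally, for each node $v\in A$, where $A$ is the set of active nodes at the end of LTM, the updated probability distribution of $v$ is denoted as $\tilde{\pi}_v$ and it is equal to:
$$\tilde{\pi}_v(c_\star)=\frac{\pi_v(c_\star)+\sum_{u\in A\cap N_v^i} b_{uv}}{1+\sum_{u\in A\cap N_v^i} b_{uv}},\qquad \tilde{\pi}_v(c_i)=\frac{\pi_v(c_i)}{1+\sum_{u\in A\cap N_v^i} b_{uv}} \qquad (1)$$
for each $c_i\neq c_\star$. All inactive nodes $v\in V\setminus A$ will have $\tilde{\pi}_v(c_i)=\pi_v(c_i)$ for all candidates, including $c_\star$. As for the expected score before the process, we can compute the expected final score of a candidate $c_i$ as
$$F(c_i,A_0)=\mathbb{E}\Big[\sum_{v\in V}\tilde{X}_v(c_i)\Big], \qquad (2)$$
where $\tilde{X}_v(c_i)$ is the indicator random variable after the process, i.e., $\tilde{X}_v(c_i)=1$ if $v$ votes for $c_i$, with probability $\tilde{\pi}_v(c_i)$, and $\tilde{X}_v(c_i)=0$ otherwise.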
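Let us denote by $\mathcal{G}$ the set of all possible live-edge graphs sampled from $G$. We can also compute the expected final score $F(c_i,A_0)$ of a candidate $c_i$ by means of live-edge graphs used in the LTM model as
$$F(c_i,A_0)=\sum_{G'\in\mathcal{G}}F_{G'}(c_i,A_0)\cdot\mathbb{P}(G'), \qquad (3)$$
where $F_{G'}(c_i,A_0)$ is the score of $c_i$ in $G'$ and $\mathbb{P}(G')$ is the probability of sampling live-edge graph $G'$. More precisely, for the target candidate we have
$$F_{G'}(c_\star,A_0)=\sum_{v\in R_{G'}(A_0)}\frac{\pi_v(c_\star)+\sum_{u\in R_{G'}(A_0)\cap N_v^i} b_{uv}}{1+\sum_{u\in R_{G'}(A_0)\cap N_v^i} b_{uv}}+\sum_{v\in V\setminus R_{G'}(A_0)}\pi_v(c_\star),$$
where $R_{G'}(A_0)$ is the set of nodes reachable from $A_0$ in $G'$ and $N_v^i$ is the set of incoming neighbors of $v$ in $G$. A similar formulation can be derived for $c_i\neq c_\star$.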
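The PLTR update described above (add the incoming active weight to the target's probability, then renormalize) and the resulting score for one fixed outcome of the diffusion can be sketched in Python as follows. This is a minimal illustration under assumed names and data layout (`pi[v]` is a dict from candidate to probability, `in_edges[v]` maps in-neighbors to edge weights); it is not the authors' code.

```python
def pltr_update(pi_v, target, active_in_weight):
    """Add the active incoming weight to the target's probability and renormalize."""
    norm = 1.0 + active_in_weight
    return {
        c: (p + active_in_weight if c == target else p) / norm
        for c, p in pi_v.items()
    }


def pltr_scores(pi, in_edges, active, target):
    """Expected score of every candidate for one fixed final active set.

    In PLTR only active nodes update their distribution; inactive nodes keep
    their original one.
    """
    scores = {}
    for v, pi_v in pi.items():
        if v in active:
            weight = sum(b for u, b in in_edges.get(v, {}).items() if u in active)
            pi_v = pltr_update(pi_v, target, weight)
        for c, p in pi_v.items():
            scores[c] = scores.get(c, 0.0) + p
    return scores
```

Averaging `pltr_scores` over active sets obtained from sampled live-edge graphs would approximate the expected final scores.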
R-PLTR Model.
In the next section we prove that the election control problem in PLTR is hard to approximate to within a polynomial fraction of the optimum (Theorem 1). However, we show that a small relaxation of the model allows us to approximate it to within a constant factor. In the relaxed model, which we call Relaxed Probabilistic Linear Threshold Ranking (R-PLTR), the probability distribution of a node is updated if it has at least one active incoming neighbor, even if the node is not active itself. More formally, every node that is active or has at least one active incoming neighbor (and not just every active node as in PLTR) changes its preference by updating its probability distribution via Eq. (1). The rationale is that a voter might slightly change its opinion about the target candidate if it receives some influence from its active incoming neighbors, even if the received influence is not enough to activate it (and thus to make it propagate the information to its outgoing neighbors). Therefore, we include this small amount of influence in the objective function. In the next section, we show that election control in R-PLTR is still NP-hard, and then we give an algorithm that guarantees a constant approximation ratio in this setting.
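The difference between the two models can be made concrete with a small Python sketch that computes the target candidate's score for one fixed active set under either rule. The function name, the `relaxed` flag, and the data layout are assumptions made for this illustration.

```python
def final_target_score(pi_target, in_edges, active, relaxed):
    """Score of the target candidate for one fixed final active set.

    pi_target[v] is the initial probability that v votes for the target.
    With relaxed=False (PLTR) only active nodes update; with relaxed=True
    (R-PLTR) every node with some active incoming weight updates as well.
    """
    score = 0.0
    for v, p in pi_target.items():
        weight = sum(b for u, b in in_edges.get(v, {}).items() if u in active)
        if v in active or (relaxed and weight > 0):
            p = (p + weight) / (1.0 + weight)
        score += p
    return score
```

On the same active set, the relaxed score is never smaller than the PLTR score, since partially influenced nodes can only gain probability for the target.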
Problem Statement.
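In the constructive election control problem we maximize the expected Margin of Victory (MoV) of the target candidate w.r.t. its most voted opponent, akin to [CCDP19a, WV18]. We define the $\mathrm{MoV}(A_0)$ obtained starting from $A_0$ as the expected increase, w.r.t. the value before the process, of the difference between the score of $c_\star$ and that of the most voted opponent. (Footnote 2: The increment in margin of victory, instead of just the margin, cannot be negative and gives well defined approximation ratios.) Formally, let $F(c_i,A_0)$ denote the expected score of candidate $c_i$ after the diffusion started from $A_0$, and $F(c_i,\emptyset)$ its score before the process. If $c$ and $\hat{c}$ are respectively the candidates different from $c_\star$ with the highest score before and after the diffusion process, then
$$\mathrm{MoV}(A_0)=\big(F(c_\star,A_0)-F(\hat{c},A_0)\big)-\big(F(c_\star,\emptyset)-F(c,\emptyset)\big).$$
Given a budget $B$, the constructive election control problem asks to find a set of seed nodes $A_0$, of size at most $B$, that maximizes $\mathrm{MoV}(A_0)$. It is worth noting that MoV can also be expressed as a function of the score gained by candidate $c_\star$ and the score lost by its most voted opponent at the end of the process. We define the score gained and lost by a candidate $c_i$ as
$$g^{+}(c_i,A_0)=F(c_i,A_0)-F(c_i,\emptyset),\qquad g^{-}(c_i,A_0)=F(c_i,\emptyset)-F(c_i,A_0).$$
Therefore, we can rewrite $\mathrm{MoV}(A_0)$ as
$$\mathrm{MoV}(A_0)=g^{+}(c_\star,A_0)+g^{-}(\hat{c},A_0)-F(\hat{c},\emptyset)+F(c,\emptyset).$$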
The destructive election control problem, instead, aims at making the target candidate lose by minimizing its MoV. In this dual scenario, the probability distributions of the voters are updated slightly differently in our models, i.e., influenced voters have a lower probability of voting for the target candidate mimicking the spread of “negative” news about .
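For the constructive objective, the change in margin of victory can be computed directly from the expected scores before and after the diffusion. The sketch below is a minimal Python illustration; the function name and the dictionary inputs are assumptions of this example.

```python
def margin_of_victory(before, after, target):
    """Increase of the target's margin over its most voted opponent.

    before/after map each candidate to its expected score before and after
    the diffusion; the most voted opponent may differ between the two.
    """
    best_before = max(s for c, s in before.items() if c != target)
    best_after = max(s for c, s in after.items() if c != target)
    return (after[target] - best_after) - (before[target] - best_before)
```

For example, with scores `{"t": 3, "a": 5, "b": 4}` before and `{"t": 4.5, "a": 4.0, "b": 3.5}` after, the margin goes from 3 - 5 = -2 to 4.5 - 4.0 = 0.5, an increase of 2.5.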
Influencing Voters About Other Candidates.
In our model the controller can send to the seed nodes a message in support of only one single candidate, e.g., latest news about the candidate. We prove that the best strategy is that of sending messages in support of the target candidate $c_\star$, i.e., if the controller wants $c_\star$ to win, then, according to our models, the direct strategy of targeting voters with news about $c_\star$ is more effective than the alternative strategy of distracting the same voters with news about other candidates.
Indeed, it is not always sufficient to maximize the score of the target candidate to ensure its victory or to maximize the margin of victory, and it is easy to find counter-examples of this strategy. Moreover, in the models of Wilder et al. [WV18] and Corò et al. [CCDP19a] it could be convenient to increase the score of a third candidate in order to make the most voted opponent w.r.t. $c_\star$ lose score and favor $c_\star$.
However, as previously claimed, in our models this does not hold. In fact, we can distinguish between the three possible strategies:
- Influencing voters about $c_\star$.
- Influencing voters about $\hat{c}$, i.e., the most voted opponent w.r.t. $c_\star$ at the end of the process.
- Influencing voters about any other candidate $c_i\notin\{c_\star,\hat{c}\}$.
Let us now analyze the MoV of in these three different cases. As described in Equation (4), a general formulation for MoV is the following
[TABLE]
where is the initial set of seed nodes and is the sum of constant terms that are not modified by the process. With some algebra, it is possible to compute the MoV of in such scenarios, getting the following formulations:
[TABLE]
We just need to observe that and that to conclude that it is always convenient to influence the voters about the target candidate whenever you want to maximize the MoV of . Therefore, in the remainder of the paper, we only focus on changing the score of the target candidate . Note that the observations above hold both for PLTR and R-PLTR.
3 Hardness Results
In this section we provide two hardness results related to election control in PLTR and R-PLTR. In Theorem 1 we show that maximizing the MoV in PLTR is at least as hard to approximate as the Densest-$k$-Subgraph problem. This implies several conditional hardness of approximation bounds for the election control problem. Indeed, it has been shown that the Densest-$k$-Subgraph problem is hard to approximate: to within any constant bound under the Unique Games with Small Set Expansion conjecture [RS10]; to within $n^{-1/(\log\log n)^{c}}$, for some constant $c$, under the exponential time hypothesis (ETH) [Man17]; to $n^{-f(n)}$ for any function $f\in o(1)$, under the Gap-ETH assumption [Man17]. Then, in Theorem 2, we show that maximizing the MoV in R-PLTR is still NP-hard.
Theorem 1.
An $\alpha$-approximation algorithm to the election control problem in PLTR gives an $\alpha\beta$-approximation to the Densest $k$-Subgraph problem, for a positive constant $\beta$.
Proof.
Given an undirected graph $G=(V,E)$ and an integer $k$, Densest $k$-Subgraph (DkS) is the problem of finding the subgraph induced by a subset of $V$ of size $k$ with the highest number of edges, given that $k$ is fixed.
The reduction works as follows: Consider the PLTR problem on , where each undirected edge is replaced with two directed edges and . Let us consider candidates and assume that all nodes initially have null probability of voting for all the candidates but one, different from , that we denote as . Formally we have that, and for each and for each . Assign to each edge a weight , for any fixed constant and .
We show the reduction considering the problem of maximizing the score, because in the instance considered in the reduction the MoV is exactly equal to twice the score. In fact, the score of after PLTR starting from any initial set is
[TABLE]
because and for each . Thus, according to the definition of MoV in Equation (6), we have that
[TABLE]
To compute the expected final score of the target candidate we average its score over all live-edge graphs, according to Formula (3). In our reduction, the empty live-edge graph is sampled with high probability, i.e., with probability at least :
[TABLE]
where in we used the binomial expansion, is due to last negative term in the lhs that does not appear in the rhs when is even, and is due to
[TABLE]
for any . Since , then . Moreover,
The score obtained by in a live-edge graph starting from any initial set of seed nodes is
[TABLE]
since for each . Also, note that is equal to the number of edges of the subgraph induced by the set of nodes reachable from in , which is not greater than , and thus
Note that in the empty live edge graph the set at the end of LTM is equal to , since the graph has no edges. Thus
[TABLE]
and since the denominator is, again, bounded by two constants we have that
[TABLE]
where is the number of edges of the subgraph induced by , i.e., the value of the objective function of DkS for solution .
Thus, the expected final score of the target candidate is
[TABLE]
Since and are in , then
[TABLE]
for any . Thus
[TABLE]
which means that We apply the Bachmann-Landau definition of notation: There exist three positive constants and such that, for all ,
[TABLE]
Note that, in this case, the constants , , and do not depend on the specific instance.
Since the previous bounds hold for any set we also have that , where OPT is the value of an optimal solution for PLTR and is the value of an optimal solution for DkS.
Suppose there exists an -approximation algorithm for PLTR, i.e., an algorithm that finds a set s.t. the value of its solution is . Then,
[TABLE]
Thus , i.e., it is an -approximation to DkS. ∎
As a corollary of Theorem 1 we get the conditional hardness of approximation bounds stated at the beginning of this section.
Theorem 2.
Election control in R-PLTR is NP-hard.
Proof.
We prove the hardness by reduction from Influence Maximization under LTM, which is known to be NP-hard [KKT15].
Consider an instance of Influence Maximization under LTM. is defined by a weighted graph with weight function and by a budget . Let be the instance that corresponds to on R-PLTR, defined by the same budget and by a graph that can be built as follows:
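1. Duplicate each vertex in the graph, i.e., we define the new set of nodes as $V'=V\cup\{v':v\in V\}$.
2. Add an edge from each vertex $v\in V$ to its copy $v'$, i.e., we define the new set of edges as $E'=E\cup\{(v,v'):v\in V\}$.
3. Keep the same weight for each edge in $E$ and set the weights of all new edges to $1$, i.e., $b'_{uv}=b_{uv}$ for each $(u,v)\in E$ and $b'_{vv'}=1$ for each $v\in V$. Note that the constraint on incoming weights required by LTM is not violated by $b'$.
4. Consider $m$ candidates, including the target $c_\star$ and an opponent $c_1$. For each $v\in V$ we set $\pi_v(c_\star)=1$ and $\pi_v(c_i)=0$ for any other candidate $c_i$. For each copy $v'$ we set $\pi_{v'}(c_1)=1$, and $\pi_{v'}(c_i)=0$ for any other candidate $c_i$.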
Let be the initial set of seed nodes of size that maximizes and let be the set of active nodes at the end of the process. The value of the MoV obtained by in is . Indeed, each node in has , because the probability of voting for the target candidate remains the same after the normalization. Moreover, each node influences its duplicate with probability 1 and therefore . Therefore, , , and .
Let be the initial set of seed nodes of size that achieves the maximum in . Without loss of generality, we can assume that , since we can replace any seed node in with its corresponding node in without decreasing the objective function. If is the set of active nodes at the end of the process, then by using similar arguments as before, we can prove that . Let us assume that does not maximize , then, would also not maximize , which is a contradiction since is an optimal solution for .
We can prove the NP-hardness for the case of maximizing the score by using the same arguments. In fact, notice that maximizing the score of $c_\star$ is exactly equivalent to maximizing the cardinality of the set of active nodes in LTM. ∎
4 Approximation Results
In this section, we first show that we can approximate the optimal MoV to within a constant factor by optimizing the increment in the score of . In detail we show that, given two solutions and such that and are maximum, then . Indeed, we show a more general statement that is: If a solution approximates within a factor , then .
Then we show that a simple greedy hill-climbing approach (Algorithm 1) gives a constant factor approximation to the problem of maximizing , where the constant is . By combining the two results, we get a -approximation algorithm for the election control problem in R-PLTR.
The next theorem generalizes [WV18, Theorem 5.2] as it holds for any scoring rule and for any model in which we have the ability to change only the position of in the lists of a subset of voters and the increment in score of is at least equal to the decrement in scoring of the other candidates.
Theorem 3.
An -approximation algorithm for maximizing the increment in score of a target candidate gives an -approximation to the election control problem.
Proof.
Let us consider two solutions and for the problem of maximizing the MoV for candidate , with as the optimal solution to this problem. These solutions arbitrarily select a subset of voters and modify their preference list changing the score of . Let us fix and , respectively, as the candidates different from with the highest score before and after the solution is applied. Assume there exists an -approximation to the problem of maximizing the increment in score of the target candidate; if we do not consider the gain given by the score lost by the most voted opponent, we have that
[TABLE]
where the last inequality holds because for any solution and candidate since modifies only the score of , increasing it, while the score of all the other candidates is decreased, and the increment in score to is equal to the sum of the decrement in score of all the other candidates. Since , we have that
[TABLE]
where is the candidate with the highest score after the solution is applied. By definition of we have that , which implies that
[TABLE]
Thus, and we conclude that . ∎
Constructive Election Control in R-PLTR.
Next theorem shows how to get a constant factor approximation to the problem of maximizing the MoV in R-PLTR by reducing the problem to an instance of the weighted version of the influence maximization problem with LTM [KKT15].
This extension of the LTM associates to each node $v$ a non-negative weight $w(v)$ that captures the importance of activating that node. The goal is to find the initial seed set in order to maximize the sum of the weights of the active nodes at the end of the process, i.e., finding $\arg\max_{A_0:|A_0|\le B}\mathbb{E}\big[\sum_{v\in A}w(v)\big]$, where $w$ is a weight function over the node set and $A$ is the final set of active nodes.
A simple hill-climbing greedy algorithm achieves a $(1-1/e)$-approximation if the weights are polynomial (or polynomially small) in the number of nodes of the graph and the number of live-edge graph samples is polynomially large in the weights [KKT15]. (Footnote 3: It is still an open question how well this value can be approximated for an influence model with arbitrary node weights. Intuitively, if a node has an exponentially small probability of being sampled in the live-edge graph associated with a high weight, then a polynomial number of samples would not be enough to consider it in the solution with non-negligible probability.) We exploit this result to approximate the MoV via Algorithm 1, reducing the problem of maximizing the score to that of maximizing the expected weight of active nodes in the weighted LTM. We define a new graph with the same sets of nodes and edges of $G$. Then, we assign a weight to each node $v$ equal to $w(v)=\sum_{u\in N_v^o}b_{vu}\,(1-\pi_u(c_\star))$, where $N_v^o$ is the set of outgoing neighbors of $v$. Note that we are able to correctly approximate the expected weight of active nodes using such weights since, by hypothesis on the model, each edge weight $b_{vu}$ is at least a polynomial fraction of the number of voters, and so is each term $1-\pi_u(c_\star)$, because every candidate other than $c_\star$ has at least a polynomial fraction of probability. By applying a multiplicative form of the Chernoff bound we can get a $(1\pm\epsilon)$-approximation of this value, with high probability [KKT15, Proposition 4.1].
Thus, we can use Algorithm 1 to maximize the influence on . The algorithm starts with an empty set and adds to it, in each of rounds, the node with maximal marginal gain w.r.t. the solution computed so far.
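The greedy procedure just described can be sketched as follows. This is not the paper's Algorithm 1 verbatim: it is a minimal Python illustration that estimates the weighted spread on a fixed pool of sampled live-edge graphs, with node weights passed in as given input. All names, the data layout, and the default sample count are assumptions of this example.

```python
import random


def sample_live_edge_graph(in_edges, rng):
    """Return {v: u}: each node keeps at most one incoming edge (u, v)."""
    live = {}
    for v, nbrs in in_edges.items():
        r, acc = rng.random(), 0.0
        for u, b in nbrs.items():
            acc += b
            if r < acc:
                live[v] = u
                break
    return live


def reached_weight(live, seeds, node_weight):
    """Total weight of the nodes reachable from the seeds in one live-edge graph."""
    reached = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, u in live.items():
            if v not in reached and u in reached:
                reached.add(v)
                changed = True
    return sum(node_weight[v] for v in reached)


def greedy_seeds(in_edges, node_weight, budget, samples=200, seed=0):
    """Pick `budget` seeds, each time adding the node with maximal marginal gain."""
    rng = random.Random(seed)
    graphs = [sample_live_edge_graph(in_edges, rng) for _ in range(samples)]

    def value(seeds):
        # Sample average of the weighted spread of `seeds`.
        return sum(reached_weight(g, seeds, node_weight) for g in graphs) / samples

    chosen = []
    for _ in range(budget):
        candidates = [v for v in node_weight if v not in chosen]
        if not candidates:
            break
        base = value(chosen)
        chosen.append(max(candidates, key=lambda v: value(chosen + [v]) - base))
    return chosen
```

Reusing one pool of sampled graphs for all marginal-gain evaluations keeps the comparison between candidates consistent; a production implementation would likely use a lazy-evaluation variant to avoid recomputing every gain in every round.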
Theorem 4.
Algorithm 1 guarantees a -approximation factor to constructive election control in R-PLTR.
Proof.
We first prove that Algorithm 1 gives an -approximation to the problem of maximizing the increment in score of the target candidate in R-PLTR. Let and respectively be the set of initial seed nodes found by the greedy algorithm and the optimal one. We have that
[TABLE]
and, since the denominator is at most 2, that
[TABLE]
where is the set of active nodes at the end of the process.
Note that is exactly the objective function that the greedy algorithm maximizes. Hence, using the result by Kempe et al. [KKT15] we know that
[TABLE]
where is the set of active nodes at the end of the process starting from .
Therefore since
[TABLE]
where the inequality holds since all the denominators in are at least 1. Thus, Algorithm 1 achieves a -approximation to the maximum increment in score. Using Theorem 3 we get a -approximation for the MoV. ∎
Destructive Election Control in R-PLTR.
The destructive election control problem is similar to the constructive problem, but in this scenario, in our models, the probability that a voter votes for decreases depending on the amount of influence received by and the loss of probability of is evenly split over all the other candidates. In this way, we avoid negative values and values that do not sum to 1. In detail, if is the set of active nodes at the end of LTM, then, for each , the preference list changes as follows:
[TABLE]
for each . We define , i.e., what we want to maximize, as
[TABLE]
where is the initial set of seed nodes and is the sum of constant terms that are not modified by the process. Note that maximizing is -hard (it can be proved with a similar argument to that of Theorem 2).
Similarly to the constructive case, we define a new graph with the same sets of nodes and edges of . Then, we assign a weight to each node equal to and we run Algorithm 1 to find a seed set that approximates the maximum expected weight of active nodes.
Theorem 5.
Algorithm 1 guarantees a -approximation factor to the destructive election control in R-PLTR.
Proof.
We first prove that Algorithm 1 achieves an approximation factor to the problem of maximizing the decrease in score of the target candidate in R-PLTR. Let and respectively be the set of initial seed nodes found by the greedy algorithm and the optimal one. Let be the decrease in score of candidate with solution , i.e., . Let be the set of active nodes at the end of the process; then we have that
[TABLE]
and, since the denominator is at most 2, that
[TABLE]
Note that is exactly the objective function of the greedy algorithm that maximizes the weighted-LTM for . Hence, using the result by Kempe et al. [KKT15], we know that
[TABLE]
where is the optimal set of active nodes, i.e., the set of active nodes at the end of the process starting from ( the optimal solution for the weighted-LTM).
Therefore
[TABLE]
because
[TABLE]
where the inequality is due to the fact that the denominator in all the terms of is at least 1. Thus we achieve a -approximation to the maximum decrease in score.
Let us fix and , respectively, as the candidates different from with the highest score before and after the solution is applied; let be the most voted opponent after the optimal solution is applied. Then we have that
[TABLE]
where the last inequality holds since, by definition of and , we have that ∎
5 Simulations
We simulate our model on two real-world social networks on which political campaigning messages could spread (the datasets are taken from http://networkrepository.com/):
- polbooks: an undirected network with 105 nodes and 882 edges where nodes are political books and edges represent co-purchasing behavior; nodes are labeled as “liberal,” “conservative,” or “neutral.”
- polblogs: a directed network with 1,224 nodes and 19,025 edges where nodes are web blogs about US politics and edges are hyperlinks connecting them; nodes are labeled as “liberal” or “conservative.”
The number of candidates in our simulations is based on the ground truth of the datasets; as mentioned earlier, polbooks has three clusters and polblogs has two clusters based on different US political parties. We set the probability of each node to vote for, say, a “liberal” candidate proportionally to the number of neighbors labeled as “liberal,” i.e., we set where is the “liberal” candidate, is the set of nodes labeled as “liberal,” and is the set of neighbors of . For each node we sampled the “non-incoming influence weight” uniformly at random in and assigned the remaining influence weight uniformly among its incoming neighbors, i.e., we assigned to each edge a weight .
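The setup above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the code used for the experiments: the sampling interval for the non-incoming influence weight is left as the hypothetical parameters `low` and `high`, and the initial preferences are taken to be the plain fraction of neighbors carrying each label, which may differ from the paper's formula (for instance by a smoothing term). The uniform fallback for nodes without neighbors is also an assumption.

```python
import random
from typing import Dict, Hashable, List, Tuple

Node = Hashable


def label_fractions(neighbor_labels: List[str], candidates: List[str]) -> Dict[str, float]:
    """Initial preferences proportional to the labels seen among the neighbors."""
    if not neighbor_labels:
        # Assumption: a node without neighbors gets a uniform distribution.
        return {c: 1.0 / len(candidates) for c in candidates}
    total = len(neighbor_labels)
    return {c: neighbor_labels.count(c) / total for c in candidates}


def assign_edge_weights(in_neighbors: Dict[Node, List[Node]], rng: random.Random,
                        low: float = 0.0, high: float = 1.0) -> Dict[Tuple[Node, Node], float]:
    """Sample a non-incoming weight per node and split the rest evenly
    over its incoming edges. `low` and `high` are placeholder bounds."""
    weights: Dict[Tuple[Node, Node], float] = {}
    for v, preds in in_neighbors.items():
        if not preds:
            continue  # no incoming edges, nothing to assign
        non_incoming = rng.uniform(low, high)
        share = (1.0 - non_incoming) / len(preds)
        for u in preds:
            weights[(u, v)] = share
    return weights
```

With this assignment the weights entering each node sum to at most 1, as the Linear Threshold Model requires.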
In our simulations, we run GreedyScore (Algorithm 1) for the election control problem in R-PLTR. Then, we measure the score and the MoV of each candidate, using as starting seed nodes the ones found by the algorithm, both in PLTR and in R-PLTR. We run the simulation considering each different candidate as the target one to cover multiple scenarios, considering as budget values the ones in . Then, as a baseline for comparison, we also considered as seed nodes the most influential ones, i.e., the nodes selected by GreedyIM, the classical greedy algorithm for Influence Maximization [KKT15].
For the implementation, we used the .NET Framework 4.6.2 and the C# programming language. We have implemented five different classes for managing the graph, the LTM process, the PLTR process, and a GUI. We execute the simulations on a system with the following specifications: CPU Intel Core i7-6700HQ 2.6 GHz, with KB 8-way L1 (data and inst) cache, and KB 4-way L2 cache, and MB 12-way L3 cache, and 16 GB of DDR4 RAM. Each simulation has a running time of approximately 40 seconds for polbooks and 140 minutes for polblogs.
The results relative to the scores are shown in Figures 1 and 2. As expected, the effect of our algorithm in R-PLTR is amplified compared to PLTR, since it affects a greater number of voters. Taking as an example the “liberal” candidate in polbooks, we need a budget to make it overtake the “conservative” candidate in PLTR, while a budget is enough in R-PLTR (Figure 1); in polblogs, instead, we are not able to make the “liberal” candidate win in PLTR with budget , but a budget is enough to make it overtake the “conservative” candidate in R-PLTR (Figure 2).
The results relative to MoV are presented in Figure 4. As a general trend, candidates with a lower probability of winning are the most affected by the influence generated by the seed nodes selected by our algorithm, both in PLTR and in R-PLTR. The “neutral” and “liberal” candidates, respectively the last and second-last voted, have the highest MoV in polbooks (see Figure 4, on the left), while the “liberal” candidate, which was losing the election, has the higher MoV in polblogs (see Figure 4, on the right).
Finally, in Figure 4 we present the difference between the MoV calculated by GreedyScore and GreedyIM. The simulations show that our algorithm outperforms GreedyIM, as expected. The only scenario in which our algorithm performs worse is that in which we influence, with a low budget, the already winning candidate (see Figure 4, on the left, red lines). GreedyScore works better than GreedyIM because it looks for seeds that will influence “critical” voters, i.e., voters on which the influence will have more impact on the global score of the candidates, while GreedyIM just looks for influential voters, regardless of their initial opinion.
6 Conclusions and Future Work
Influencing elections by means of social networks is a significant issue in modern society, and understanding this phenomenon is of crucial importance in order to protect the integrity of democracy. Our results constitute a first step towards realistic modeling of the use of social influence to control elections, as our models take into account that voters might hide their preferences from a controller. In one of our models the election control problem cannot be approximated within any reasonable bound, under some computational complexity hypothesis. For the other model we provide an approximation algorithm that guarantees a constant factor approximation ratio. The results in this paper open several research directions. We plan to study the election control problem in a variant of PLTR where multiple campaigns affect voters’ opinions on different candidates. It is also worth investigating models with uncertainty in other voting systems. Finally, it would be interesting to consider uncertainty models also for the diffusion process, e.g., as in robust influence maximization, where only a probability distribution on the edge weights is known.
- [AD20] Mohammad Abouei Mehrizi and Gianlorenzo D’Angelo. Multi-winner election control via social influence. In Proc. of SIROCCO 2020, 2020. To appear.
- [AG17] Hunt Allcott and Matthew Gentzkow. Social media and fake news in the 2016 election. J. Economic Perspectives, 31(2):211–36, 2017.
- [AM14] Krzysztof R. Apt and Evangelos Markakis. Social networks with competing products. Fundam. Inform., 129(3):225–250, 2014.
- [BE17] Robert Bredereck and Edith Elkind. Manipulating opinion diffusion in social networks. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, August 19-25, 2017, pages 894–900, 2017.
- [BFJ+12] Robert M. Bond, Christopher J. Fariss, Jason J. Jones, Adam D. I. Kramer, Cameron Marlow, Jaime E. Settle, and James H. Fowler. A 61-million-person experiment in social influence and political mobilization. Nat., 489(7415):295–298, 2012.
- [CCDP19a] Federico Corò, Emilio Cruciani, Gianlorenzo D’Angelo, and Stefano Ponziani. Exploiting social influence to control elections based on scoring rules. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019, pages 201–207, 2019.
- [CCDP19b] Federico Corò, Emilio Cruciani, Gianlorenzo D’Angelo, and Stefano Ponziani. Vote for me!: Election control via social influence in arbitrary scoring rule voting systems. In Proceedings of the 18th International Conference on Autonomous Agents and Multi Agent Systems, AAMAS ’19, Montreal, QC, Canada, May 13-17, 2019, pages 1895–1897, 2019.
- [Fer17] Emilio Ferrara. Disinformation and social bot operations in the run up to the 2017 French presidential election. First Monday, 22(8), 2017.
