Comparing Election Methods Where Each Voter Ranks Only Few Candidates
Matthias Bentert, Piotr Skowron

TL;DR
This paper investigates how well election rules like positional scoring and Minimax can be approximated using partial preferences collected through randomized or deterministic methods, providing theoretical bounds and simulation results.
Contribution
It introduces bounds on approximation ratios for election rules based on partial preferences and compares randomized versus deterministic collection methods.
Findings
Randomized approach generally yields better approximations.
Borda rule approximations are effective with just two candidates per voter.
Minimax rule requires larger candidate subsets for good approximation.
Matthias Bentert
TU Berlin
Berlin, Germany
Piotr Skowron
University of Warsaw
Warsaw, Poland
Abstract
Election rules are formal processes that aggregate voters' preferences, typically to select a single candidate, called the winner. Most of the election rules studied in the literature require the voters to rank the candidates from the most to the least preferred one. This method of eliciting preferences is impractical when the number of candidates to be ranked is large. We ask how well certain election rules (focusing on positional scoring rules and the Minimax rule) can be approximated from partial preferences collected through one of the following procedures: (i) randomized: we ask each voter to rank a random subset of $\ell$ candidates, and (ii) deterministic: we ask each voter to provide a ranking of her $\ell$ most preferred candidates (the $\ell$-truncated ballot). We establish theoretical bounds on the approximation ratios, and we complement our theoretical analysis with computer simulations. We find that mostly (apart from the cases when the preferences have no or very little structure) it is better to use the randomized approach. While we obtain fairly good approximation guarantees for the Borda rule already for $\ell = 2$, for approximating the Minimax rule one needs to ask each voter to compare a larger set of candidates in order to obtain good guarantees.
1 Introduction
An election rule is a function that takes as input a collection of voters' preferences over a given set of candidates and returns a single candidate, called the winner. There is a large variety of election rules known in the literature (we refer the reader to the survey by Zwicker [Zwi15] for an overview); most of them require the voters to provide strict linear orders over the candidates. Yet, it is often hard, or even infeasible, for a voter to provide such a preference ranking, especially when the set of candidates is large. Indeed, it is often believed that a voter can rank at most five to nine candidates [Mil56].
In this paper we ask how the quality of decisions made through voting depends on the amount of information available. Specifically, our goal is to assess the quality of outcomes of elections when each voter can be asked to rank at most $\ell$ candidates. We compare two ways of eliciting preferences. In the first approach, which we call randomized, we ask each voter to rank a random subset of $\ell$ candidates. In the second approach, which we call deterministic, we ask each voter to provide the ranking of her top $\ell$ most preferred candidates (the so-called $\ell$-truncated ballot). For a number of rules (we analyze positional scoring rules and the Minimax method), we investigate how well they can be approximated by algorithms that use one of the two elicitation methods.
Our Contribution
Our contribution is the following:
1. In Section 3.1 we identify a class of positional scoring rules that, for a given $\ell$, can be well approximated using the randomized approach. For $\ell = 2$ this class consists of a single rule, namely the Borda count; the number of rules in the class grows exponentially with $\ell$. We prove approximation guarantees for the rules from this class. These guarantees are more likely to be accurate when the number of voters is large, and we show analytically how, in the worst case, they depend on the number of voters. In Section 3.2 we provide an analogous analysis for the Minimax rule.
2. In Section 4 we prove upper bounds on the approximation ratios of an algorithm that uses $\ell$-truncated ballots; we prove these bounds both for positional scoring rules and for the Minimax rule. In both cases, we show that the algorithm that minimizes the maximal regret of Lu and Boutilier [LB11] (we recall this algorithm in Section 4.1) matches our upper bounds (for Minimax our analysis is tight up to a small constant factor).
3. We ran computer simulations in order to verify how the approximation ratio depends on the particular distribution of voters' preferences (Section 5). Our experiments confirm that in most cases (with the exception of very unstructured preferences) the randomized approach is superior. We also show that usually only a few hundred voters are required to achieve a reasonably good approximation.
Related Work
Our work contributes to the broad literature on handling incomplete information in voting; for a survey on this topic, we refer the reader to the book chapter by Boutilier and Rosenschein [BR15]. Specifically, our research is closely related to the idea of minimizing the maximal regret [LB11]. Therein, for a partial preference profile, the goal is to select a candidate whose score in the worst possible completion of the profile is maximized. In particular, algorithms minimizing the maximal regret yield the best possible approximation ratio. Our paper complements this literature by (1) providing an accurate analysis of these approximation ratios for various methods (which allows one to better judge the suitability of different methods for handling incomplete information), and (2) providing the analysis for two natural methods of preference elicitation (which also allows one to assess which of the two methods is better).
Algorithms for minimizing the maximal regret interpret the missing information in the most pessimistic way: they assume the worst-possible completion of partial preferences. Other approaches include assuming the missing pairwise preferences to be distributed uniformly (e.g., Xia and Conitzer [XC11]) and machine-learning techniques (Doucette [Dou14, Dou15]) to “reconstruct” missing information (assuming that the missing pairwise comparisons are distributed similarly to those in the observed partial rankings).
Our work is also closely related to the literature on distortion [PR06, CP11, BCH+15]. There, an underlying utility model is assumed, and the goal is to estimate how well various voting rules that have access only to ordinal preferences approximate optimal winners, i.e., candidates that maximize the total utility of the voters. The concept of distortion has recently received a lot of attention in the literature. The definition of distortion has, for example, been adapted to social welfare functions (where the goal is to output a ranking of candidates rather than a single winner) [BPQ19] and to participatory budgeting [BNPS17]. Some works also study distortion assuming a certain structure of the underlying utility model (e.g., that it can be represented as a metric space) [ABE+18, AP17, FFG16, GKM17, GAX17].
Finally, we mention that our randomized algorithms are similar to the one proposed by Hansen [Han16]. The main difference is that the rule proposed by Hansen asks each voter to compare a certain number of pairs of candidates, while in our approach we ask each voter to rank a certain fixed-size subset of them. Hansen views his algorithm as a fully-fledged standalone rule (and compares it with other election systems, mostly focusing on assessing the probability of selecting the Condorcet winner), while our primary goal is to investigate how well our rules approximate their original counterparts.
2 Preliminaries
An election is a pair $E = (V, C)$, where $V = \{v_1, \ldots, v_n\}$ and $C = \{c_1, \ldots, c_m\}$ denote the sets of voters and candidates, respectively. Each voter $v \in V$ is endowed with a preference ranking over the candidates, which is a total ordering of the candidates and which we denote by $\succ_v$. For each candidate $c \in C$, by $\mathrm{pos}_v(c)$ we denote the position of $c$ in $v$'s preference ranking. The position of the most preferred candidate is one, of the second most preferred candidate is two, etc. For example, for a voter $v$ with the preference $c_1 \succ_v c_2 \succ_v c_3$, we have $\mathrm{pos}_v(c_1) = 1$, $\mathrm{pos}_v(c_2) = 2$, and $\mathrm{pos}_v(c_3) = 3$.
For an integer $k$ we use $[k]$ to denote the set $\{1, \ldots, k\}$, and we use the Iverson bracket notation: for a logical expression $P$, the term $[P]$ means $1$ if $P$ is true and $0$ otherwise.
A voting rule is a function that, for a given election $E$, returns a subset of candidates, which we call tied winning candidates. Below we describe several (classes of) voting rules that we will focus on in this paper.
A positional scoring function is a mapping $\lambda \colon [m] \to \mathbb{R}$ that assigns to each position a real value: intuitively, $\lambda(p)$ is a score that a voter assigns to a candidate that she ranks as her $p$-th most preferred one. For each positional scoring function $\lambda$ we define the $\lambda$-score of a candidate $c$ as $\mathrm{sc}_{\lambda}(c) = \sum_{v \in V} \lambda(\mathrm{pos}_v(c))$, and the corresponding election rule selects the candidate(s) with the highest $\lambda$-score. Examples of common positional scoring rules include:
Borda rule:
Based on a linearly decreasing positional scoring function, the Borda rule is formally defined by $\lambda(p) = m - p$ for $p \in [m]$.
Plurality rule:
Being equivalent to the $1$-approval rule, the positional scoring function for the Plurality rule assigns a score of one to the first position and zero to all others.
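The two example rules can be made concrete with a minimal Python sketch. It assumes the common convention that the Borda function awards $m - p$ points to position $p$; the function names (`positional_scores`, `tied_winners`, `borda`, `plurality`) and the three-voter profile are illustrative and do not come from the paper.

```python
from typing import Callable, Dict, List, Sequence, Set

Ranking = Sequence[str]  # one voter's ballot, most preferred candidate first


def positional_scores(
    profile: List[Ranking], score_of_position: Callable[[int], float]
) -> Dict[str, float]:
    """Add up, over all voters, the score of each candidate's (1-based) position."""
    totals: Dict[str, float] = {c: 0.0 for c in profile[0]}
    for ranking in profile:
        for position, candidate in enumerate(ranking, start=1):
            totals[candidate] += score_of_position(position)
    return totals


def tied_winners(scores: Dict[str, float]) -> Set[str]:
    """Return every candidate that attains the maximum score."""
    best = max(scores.values())
    return {c for c, s in scores.items() if s == best}


def borda(m: int) -> Callable[[int], float]:
    # Assumed convention: position p is worth m - p points.
    return lambda p: float(m - p)


def plurality() -> Callable[[int], float]:
    # One point for the first position, zero for every other position.
    return lambda p: 1.0 if p == 1 else 0.0


# Hypothetical profile with three voters and three candidates.
profile = [("a", "b", "c"), ("b", "c", "a"), ("b", "a", "c")]
borda_scores = positional_scores(profile, borda(3))
plurality_scores = positional_scores(profile, plurality())
```

In this toy profile candidate `b` wins under both rules; returning a set from `tied_winners` mirrors the definition of a voting rule as returning tied winning candidates.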
Another important class of voting rules originates from the Condorcet criterion. It says that if there exists a candidate $c$ that is preferred to any other candidate by a majority of voters, then the voting rule should select $c$. We focus on one particular rule satisfying the Condorcet criterion (we chose a rule picking the candidates that maximize a certain scoring function so that we could apply to the rule the standard definition of approximation):
Minimax rule.
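For an election $E$ and two candidates $c, c' \in C$, we define $n_E(c, c')$ as the number of voters who prefer $c$ to $c'$ and we set
$$\mathrm{sc}_{M}(c) = \min_{c' \neq c} n_E(c, c').$$
The rule then selects the candidates with the highest $\mathrm{sc}_{M}$ score.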
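As an illustration, the following sketch computes Minimax scores from full rankings under the usual reading of the rule: a candidate's score is the smallest number of voters who prefer it to any single opponent. The helper names and the example profile are assumptions made for this sketch.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Tuple


def pairwise_counts(profile: List[Sequence[str]]) -> Dict[Tuple[str, str], int]:
    """counts[(a, b)] is the number of voters who rank a above b."""
    candidates = list(profile[0])
    counts = {pair: 0 for pair in permutations(candidates, 2)}
    for ranking in profile:
        position = {c: i for i, c in enumerate(ranking)}
        for a, b in counts:
            if position[a] < position[b]:
                counts[(a, b)] += 1
    return counts


def minimax_scores(profile: List[Sequence[str]]) -> Dict[str, int]:
    """Score of c = its weakest pairwise support against any other candidate."""
    candidates = list(profile[0])
    counts = pairwise_counts(profile)
    return {
        c: min(counts[(c, d)] for d in candidates if d != c)
        for c in candidates
    }


# Hypothetical profile: "a" beats every opponent with two of three voters.
profile = [("a", "b", "c"), ("a", "c", "b"), ("b", "c", "a")]
scores = minimax_scores(profile)
```

Here `a` is preferred to each opponent by two voters, so it has the highest score and is also the Condorcet winner of this profile.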
Since all rules described above select the candidates with the maximal scores (with particular rules differing in how the score should be calculated), a natural definition of approximation applies.
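Definition 1.
We say that $\mathcal{A}$ is an $\alpha$-approximation algorithm for a rule $\mathcal{R}$ if for each election instance $E$ it holds that:
$$\mathrm{sc}_{\mathcal{R}}(\mathcal{A}(E)) \geq \alpha \cdot \mathrm{sc}_{\mathcal{R}}(w) \quad \text{for each } w \in \mathcal{R}(E),$$
where $\mathrm{sc}_{\mathcal{R}}$ is a function representing the score that $\mathcal{R}$ awards each candidate, $\mathcal{R}(E)$ is the set of winners returned by $\mathcal{R}$, and $\mathcal{A}(E)$ is the candidate returned by $\mathcal{A}$.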
Later on, we will consider algorithms that have access only to certain parts of the input instances. In such cases the above definition still applies. For example, consider the $\ell$-truncated instance obtained from an election, i.e., the partial election which for each voter contains her preference ranking truncated to the top $\ell$ positions. Then we say that an algorithm is an $\alpha$-approximation algorithm for a rule on $\ell$-truncated instances when for each election instance it holds that:
[TABLE]
3 Randomized Approach
In this section we explore a randomized approach, where each voter can be asked to rank a random subset of candidates.
3.1 Scoring Rules
We start our analysis by looking at the class of positional scoring rules. For the sake of simplicity we will assume throughout this section that is divisible by . (Footnote 1: We will always implicitly assume that is much larger than , and we will use randomized algorithms only. Thus, if does not divide , then in our algorithms we can add a preliminary step that randomly selects a set of voters and ignores the remaining ones. We mention that other authors also suggested giving multiple randomized ballots to each voter.) We first present an algorithm that estimates the score of each candidate and picks the candidate with the highest score. The algorithm is parameterized with a natural number $\ell$ and a vector of $\ell$ reals; for a fixed vector we will call the algorithm -PSF-ALG. This algorithm asks each voter to rank a random set of $\ell$ candidates. We say that a candidate is ranked by a voter if it belongs to the set of candidates that the voter was asked to rank. If a candidate is the $i$-th most preferred among the candidates ranked by a voter, then it receives the $i$-th entry of the vector as its score from the voter. Such scores are summed up for each candidate, normalized by the number of voters who ranked the respective candidate, and the candidate with the highest total score is declared the winner. Pseudocode of the algorithm is given in Algorithm 1.
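The verbal description above can be sketched in Python as follows. This is not the paper's Algorithm 1: it assumes that every voter independently receives a uniformly random subset of candidates, and the names `psf_alg` and `beta` are hypothetical labels for the algorithm and its score vector.

```python
import random
from typing import Dict, List, Sequence


def psf_alg(
    profile: List[Sequence[str]], beta: Sequence[float], rng: random.Random
) -> Dict[str, float]:
    """Estimate normalized scores when each voter ranks len(beta) random candidates."""
    ell = len(beta)
    candidates = list(profile[0])
    total = {c: 0.0 for c in candidates}
    times_ranked = {c: 0 for c in candidates}
    for ranking in profile:
        # Assumption: a fresh uniformly random ell-subset for every voter.
        asked = set(rng.sample(candidates, ell))
        # The voter reports her ranking restricted to the sampled candidates.
        partial = [c for c in ranking if c in asked]
        for i, c in enumerate(partial):
            total[c] += beta[i]
            times_ranked[c] += 1
    # Normalize by the number of voters who ranked the candidate;
    # candidates nobody was asked about get no estimate.
    return {
        c: total[c] / times_ranked[c] for c in candidates if times_ranked[c] > 0
    }


# With ell = m every voter ranks everyone, so the estimate is the average score.
profile = [("a", "b", "c"), ("b", "c", "a"), ("b", "a", "c")]
full = psf_alg(profile, beta=(2.0, 1.0, 0.0), rng=random.Random(0))

# With ell = 2 and unanimous voters, "a" is first in every subset it appears in.
unanimous = [("a", "b", "c")] * 60
sampled = psf_alg(unanimous, beta=(1.0, 0.0), rng=random.Random(1))
winner = max(full, key=full.get)
```

Passing the random generator explicitly keeps runs reproducible, which is convenient when estimates are compared across different numbers of voters.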
Below, we will show that for some positional scoring rules, by choosing the vector carefully, we can find good approximations of winning candidates with high probability. First, through Theorem 1 we establish a relation between positional scoring functions and vectors that should be used to assess ; the formula is not intuitive, and we will discuss it later on. In particular, we will explain which positional scoring functions can be well approximated using this approach, that is, we will discuss the structure of the class of positional scoring functions which are covered by the following theorem.
Theorem 1.
Fix a non-increasing sequence of reals and consider the positional scoring function defined by
[TABLE]
For a candidate that is ranked by at least one voter, we denote by the random variable describing the total normalized score that was assigned by -PSF-ALG. Then, the expected value is equal to the -score of , and the probability that the score computed by -PSF-ALG for differs from its expected value by a multiplicative factor of is upper-bounded by , i.e.,
[TABLE]
Proof.
Let us fix a candidate who is ranked by at least one voter. The process of computing the score of according to Algorithm 1 can be equivalently described as follows. We first decide on the number of voters we ask to rank . Second, we pick uniformly at random a set of voters such that all voters in are asked to rank and all voters in are not asked to rank . Finally, we ask each voter from to rank and a randomly selected set of candidates. Let be a random variable describing the number of voters who rank . Further, for each voter , let denote the random variable equal 1 if is the -th candidate among those ranked by voter and zero otherwise. In particular, is zero when is not asked to rank . Observe that if , then the value of can be expressed as
[TABLE]
Further, let be 1 if each voter from ranks and 0 otherwise. Similarly, let be 1 if ranks and 0 otherwise. Let be equal to 1 if is ranked as the -th most preferred candidate among by and 0 otherwise. We next compute the conditional expected value . We first give the formal equalities and give reasoning for the more complicated ones afterwards.
[TABLE]
We will now explain some of the equalities in the above sequence. (3) is an effect of regrouping the summands; each summand in the previous line is added for each set of size which includes —there are such sets and is the same as . (5) holds for the following reason: A voter who ranked was asked to rank some set of candidates including . Each possible set has the same probability of being selected, thus this probability is . (6) is true as we will show that . Consider a fixed voter , a fixed candidate , and a set such that
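(i) $|S| = \ell$,
(ii) $c \in S$, and
(iii) $v$ considers $c$ to be her $i$-th most preferred candidate from $S$.
Each such set must consist of $i - 1$ candidates that are ranked before $c$ by $v$ and $\ell - i$ candidates that are ranked after $c$. Thus, there are $\binom{\mathrm{pos}_v(c) - 1}{i - 1}\binom{m - \mathrm{pos}_v(c)}{\ell - i}$ such sets. We refer to Fig. 1 for an illustration.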
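The counting argument can be checked by brute force. The sketch below identifies candidates with their positions $1, \ldots, m$ in one voter's ranking and counts the subsets of size `ell` that contain position `p` as their `i`-th smallest element; the closed form it is compared against is the product of two binomial coefficients obtained by choosing `i - 1` candidates ranked before and `ell - i` candidates ranked after. All names are illustrative.

```python
from itertools import combinations
from math import comb


def count_by_enumeration(m: int, p: int, ell: int, i: int) -> int:
    """Count ell-subsets of {1..m} containing p in which p is the i-th smallest."""
    others = [q for q in range(1, m + 1) if q != p]
    count = 0
    for rest in combinations(others, ell - 1):
        ranked_before = sum(1 for q in rest if q < p)
        if ranked_before == i - 1:
            count += 1
    return count


def count_by_formula(m: int, p: int, ell: int, i: int) -> int:
    """Choose i-1 of the p-1 better candidates and ell-i of the m-p worse ones."""
    return comb(p - 1, i - 1) * comb(m - p, ell - i)


# Compare both counts on every small parameter combination.
mismatches = [
    (m, p, ell, i)
    for m in range(1, 8)
    for p in range(1, m + 1)
    for ell in range(1, m + 1)
    for i in range(1, ell + 1)
    if count_by_enumeration(m, p, ell, i) != count_by_formula(m, p, ell, i)
]
```

Dividing the closed-form count by $\binom{m-1}{\ell-1}$, the number of subsets of size $\ell$ that contain the candidate, gives the probability that the candidate lands in slot $i$ of the reported partial ranking.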
Next, we will use Chernoff's inequality to assess the probability that the computed score of a candidate deviates from its true score by a relative error of at least $\epsilon$. We will first assess the conditional probability $\mathrm{P}\big(|X_c - \mathrm{E}(X_c)| \geq \epsilon\,\mathrm{E}(X_c) \mid N_c = x\big)$. Observe that the conditional variables are not independent. For instance, if , then for each . However, they are all negatively correlated, intuitively meaning that if a variable becomes 1 (resp., 0), then the other variables are less likely to become 1 (resp., 0). Thus, we can still apply the Chernoff bound [AD11, Theorem 1.16, Corollary 1.10], which states that for any negatively correlated random variables such that and any it holds that
[TABLE]
It follows immediately that .
Now, consider the variables . These variables are from and from Eqs. 1, 2, 3, 4, 5, 6 and 7 we get that
[TABLE]
This yields
[TABLE]
Finally, using the binomial identity we get
[TABLE]
This concludes the proof. ∎
Now, let us discuss the form of positional scoring functions used in the statement of Theorem 1. First, observe that for , if we set and we have that . This means that by asking each voter to rank only two candidates, we can correctly (in expectation) assess the Borda scores of the candidates.
Corollary 2.
For a candidate the expected value of the score computed by Algorithm -SEP-ALG for is the Borda score of .
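A small exact computation illustrates why two candidates per voter suffice for Borda. The sketch assumes that a voter who is asked about candidate $c$ is also asked about one other candidate chosen uniformly at random, and that the Borda function is $m - p$. The score pair $(m - 1, 0)$ is one choice that reproduces $m - p$ in expectation under these assumptions; it is not necessarily the normalization used in the paper, and the function name is hypothetical.

```python
from fractions import Fraction


def expected_pair_score(m: int, p: int, beta_first, beta_second) -> Fraction:
    """Exact expected score from a voter who ranks c at position p and is asked
    to compare c with one other candidate drawn uniformly at random."""
    total = Fraction(0)
    for q in range(1, m + 1):  # q is the position of the other candidate
        if q == p:
            continue
        # c comes first in the reported pair exactly when p < q.
        total += Fraction(beta_first if p < q else beta_second)
    return total / (m - 1)


m = 6
# With the score pair (m - 1, 0) the expectation equals m - p.
borda_like = [expected_pair_score(m, p, m - 1, 0) for p in range(1, m + 1)]

# Any other pair gives an affine transformation of m - p.
affine = [expected_pair_score(m, p, 7, 2) for p in range(1, m + 1)]
```

For a general pair $(\beta_1, \beta_2)$ the expectation works out to $\beta_2 + (\beta_1 - \beta_2)\frac{m - p}{m - 1}$, which is affine in the Borda score, in line with the remark that two-candidate ballots can only recover Borda.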
Unfortunately, not every positional scoring function can be efficiently assessed while asking each voter to rank only few candidates. For example, we can generalize Corollary 2 and show that for any vector of two elements , the algorithm -SEP-ALG can only compute scores that are affine transformations of the Borda scores (thus, for the algorithm can only be used to approximate the Borda rule).
We will now describe the class of all positional scoring functions which can be computed correctly in expectation by our algorithm for any fixed . Since each positional scoring function is based on some -dimensional vector which can be expressed as , where , and so on, these -vectors form a basis of the linear space of positional scoring functions.
Let be the set of all positional scoring functions that can be computed (correctly in expectation) by our algorithm for a fixed . Since it holds for each two -element vectors that we have that is a linear space too.
Thus, is an -dimensional linear subspace of the -dimensional space of all positional scoring functions, and so we can compactly describe it by providing scoring functions forming a basis of . Figure 2 visually illustrates the scoring functions forming a basis for . In other words, for a given value of , we can use Theorem 1 to correctly compute (in expectation) all scoring functions which can be obtained as linear combinations of the scoring functions depicted in Figure 2.
Let us also give some intuition regarding the probabilities assessed in Theorem 1. For example, for candidates and voters the Borda score of a winning candidate is at least . Assume that we want to ask each voter to compare only two candidates, and set . When assessing the score of a winning candidate, to get we need about 72 thousand voters. For one million voters, this probability drops below .
Finally, note that Theorem 1 applies to any candidate, not only to election winners. This makes the result slightly more general, since it also applies, e.g., to social welfare functions, where the goal is to output a ranking of the candidates instead of a single winner.
3.2 Minimax Rule
We will now investigate whether the Minimax rule can be well approximated when each voter is only asked to rank a few candidates. We will use an algorithm similar to Algorithm 1: each voter ranks a subset of candidates and whenever two candidates are ranked by a voter , we use her preference list to estimate . Notably, we scale the values for each two candidates by the number of times they were compared and use these normalized values to compute the Minimax winner. This algorithm is formalized in Algorithm 2.
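A hedged sketch of this idea follows; it is not the paper's Algorithm 2. It assumes independent uniformly random subsets per voter, rescales each pairwise estimate to the full electorate by multiplying the observed win fraction by the number of voters, and ignores pairs that no voter compared. The name `randomized_minimax` is hypothetical.

```python
import random
from itertools import combinations
from typing import Dict, List, Sequence


def randomized_minimax(
    profile: List[Sequence[str]], ell: int, rng: random.Random
) -> Dict[str, float]:
    """Estimate Minimax scores when each voter ranks ell random candidates."""
    n = len(profile)
    candidates = list(profile[0])
    pairs = [(a, b) for a in candidates for b in candidates if a != b]
    wins = {pair: 0 for pair in pairs}
    compared = {pair: 0 for pair in pairs}
    for ranking in profile:
        asked = set(rng.sample(candidates, ell))
        partial = [c for c in ranking if c in asked]
        # combinations keeps the ballot order: `higher` is preferred to `lower`.
        for higher, lower in combinations(partial, 2):
            wins[(higher, lower)] += 1
            compared[(higher, lower)] += 1
            compared[(lower, higher)] += 1
    scores: Dict[str, float] = {}
    for c in candidates:
        # Assumption: pairs that were never compared are simply skipped.
        estimates = [
            n * wins[(c, d)] / compared[(c, d)]
            for d in candidates
            if d != c and compared[(c, d)] > 0
        ]
        if estimates:
            scores[c] = min(estimates)
    return scores


# With ell = m all pairs are compared by everyone, so the estimates are exact.
profile = [("a", "b", "c"), ("a", "c", "b"), ("b", "c", "a")]
exact = randomized_minimax(profile, ell=3, rng=random.Random(0))
```

Because a pair is only observed when both candidates fall into the same subset, each pairwise estimate rests on far fewer ballots than a positional score does, which gives some intuition for why Minimax needs larger subsets.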
Theorem 3.
For each candidate the probability that the total normalized score computed by Algorithm 2 for differs from the true Minimax score of by a multiplicative factor of at least is upper-bounded by:
[TABLE]
Proof.
First, let us fix a pair of candidates and let be the random variable describing the value as computed by Algorithm 2. Similarly as in the proof of Theorem 1 we can express as a sum of negatively correlated random variables.
Specifically, computing according to Algorithm 2 can be equivalently described as follows: First, we decide on how many voters will be asked to compare and . Let be the random variable describing this number of voters. Second, assuming , we pick uniformly at random a set of voters, and we ask them to compare and . For each voter , let denote the random variable equal 1 if voter said that she prefers to , and 0 otherwise. In particular, is zero when is not asked to compare and . Observe that if , then , and so the value of can be expressed as
[TABLE]
We next compute the conditional expected value :
[TABLE]
Next, we will use Chernoff's inequality to upper-bound the probability that the value of the random variable deviates from its expected value by a relative error of at least $\epsilon$. We first look at the conditional probability $\mathrm{P}\big(|X_{c,c'} - \mathrm{E}(X_{c,c'})| \geq \epsilon\,\mathrm{E}(X_{c,c'}) \mid N_{c,c'} = x\big)$. As in the proof of Theorem 1, we note that the conditional variables are not independent, yet they are all negatively correlated: the fact that one variable becomes 1 (resp., 0) can only decrease the probabilities that some other becomes 1 (resp., 0). Thus, we can still apply the Chernoff bound [AD11, Theorem 1.16, Corollary 1.10], which states that for any negatively correlated random variables such that and any it holds that
[TABLE]
Since, for each , we have , and , we get that:
[TABLE]
Next, we get that:
[TABLE]
Notice that can be represented by the following. First we decide for out of voters to rank and . We then ask these voters to rank out of remaining candidates and ask all other voters to not rank and . This can be modeled by as is the total number of possible sets to ask a voter to rank and as discussed before, there are sets that contain and . Hence
[TABLE]
Again using the binomial identity , we get
[TABLE]
Finally, let . The probability that for candidate the score computed by Algorithm 2 differs from its true Minimax score by a multiplicative factor of at least is upper-bounded by
[TABLE]
Clearly, we have:
[TABLE]
Further, since by definition for each we have , it holds that:
[TABLE]
Thus we get that
[TABLE]
4 Deterministic Approach ($\ell$-Truncated Elections)
When not asking each voter about each candidate, one always has to decide whether each voter is asked about random candidates or about specific ones. On the one hand, asking about specific positions in preference rankings allows one to focus on the top ones, which seem to contain more relevant information, especially when the goal is to select the winner, who is intuitively more likely to appear in top positions. On the other hand, asking voters about random candidates might be more advantageous as the input may contain dependencies between candidates that are not known a priori.
In this section we investigate the case when each voter is asked about her $\ell$ most preferred candidates. We first describe an algorithm that is guaranteed to approximate the true score at least as well as any other algorithm, and we analyze its performance for Borda and for Minimax. We then show a general bound on the achievable approximation ratio, that is, we show that in the worst case no algorithm can approximate the true score arbitrarily well, and we see that this bound matches the one for Borda and almost matches the one we computed for Minimax.
4.1 The Best Approximation Algorithm for $\ell$-Truncated Elections
Let us start by describing the algorithm that for each $\ell$-truncated instance gives the best possible approximation guarantee, that is, the best approximation of the true winner in the worst-case full preference profile that induces the given $\ell$-truncated instance. We mention that the idea of this algorithm is very similar to the one behind the algorithms for minimizing the maximal regret [LB11], yet the analysis of the approximation ratio of the algorithm is new to this paper.
Consider an election $E$ and the $\ell$-truncated instance obtained from it. Observe that, given the truncated instance, the worst case for a chosen winner occurs if this candidate is ranked at the very last position by all voters who did not rank her among the first $\ell$ positions, and the true winner (which our algorithm did not pick) is ranked at position $\ell + 1$ by each voter who did not rank this candidate.
For each candidate $c$ we compute two scores: the worst possible score that $c$ is guaranteed to get (denoted by $\mathrm{sc}_{\min}(c)$), obtained when $c$ is ranked last whenever it is not among the top-$\ell$ positions, and the best possible score that $c$ can get (denoted by $\mathrm{sc}_{\max}(c)$), obtained by ranking $c$ at position $\ell + 1$ whenever it is not ranked among the first $\ell$ positions. Let $c_{\min}$, $c_{\max}$, and $c'_{\max}$ be the candidates with the highest $\mathrm{sc}_{\min}$, the highest $\mathrm{sc}_{\max}$, and the second highest $\mathrm{sc}_{\max}$ score, respectively. If any candidate $c \neq c_{\max}$ is declared winner, then we can guarantee an approximation ratio of $\mathrm{sc}_{\min}(c) / \mathrm{sc}_{\max}(c_{\max})$, which is clearly maximized by $c = c_{\min}$.
If candidate $c_{\max}$ is declared winner, then we can guarantee an approximation ratio of $\mathrm{sc}_{\min}(c_{\max}) / \mathrm{sc}_{\max}(c'_{\max})$. Thus, an optimal approximation ratio is achieved by an algorithm that computes all the possible scores and then checks whether $c_{\min}$ or $c_{\max}$ guarantees a better result. If $c_{\min}$ guarantees the better ratio, it declares $c_{\min}$ the winner, and otherwise it picks $c_{\max}$.
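The selection step can be sketched for the Borda rule as follows. The sketch assumes Borda points $m - p$, so an unranked candidate gets $0$ points in the worst case (last position) and $m - \ell - 1$ points in the best case (position $\ell + 1$). Instead of comparing only the two candidates singled out above, it evaluates the guaranteed ratio of every candidate and takes the best one, which covers both cases; all names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def truncated_borda_bounds(
    ballots: List[Sequence[str]], candidates: Sequence[str], ell: int
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Worst-case and best-case Borda scores consistent with top-ell ballots."""
    m = len(candidates)
    worst = {c: 0 for c in candidates}
    best = {c: 0 for c in candidates}
    for ballot in ballots:
        for position, c in enumerate(ballot, start=1):
            worst[c] += m - position
            best[c] += m - position
        for c in candidates:
            if c not in ballot:
                # Worst case: last position, 0 points. Best case: position ell + 1.
                best[c] += m - (ell + 1)
    return worst, best


def best_guarantee(worst: Dict[str, int], best: Dict[str, int]) -> Tuple[str, float]:
    """Pick the candidate whose worst score compares best to any rival's best score."""
    chosen, chosen_ratio = "", -1.0
    for c in worst:
        strongest_rival = max(best[d] for d in best if d != c)
        ratio = 1.0 if strongest_rival == 0 else min(1.0, worst[c] / strongest_rival)
        if ratio > chosen_ratio:
            chosen, chosen_ratio = c, ratio
    return chosen, chosen_ratio


# Hypothetical 1-truncated election with four candidates and three voters.
candidates = ["a", "b", "c", "d"]
ballots = [("a",), ("a",), ("b",)]
worst, best = truncated_borda_bounds(ballots, candidates, ell=1)
winner, ratio = best_guarantee(worst, best)
```

In this toy instance `a` is guaranteed 6 points while the strongest rival can reach at most 7, so declaring `a` the winner guarantees a ratio of 6/7.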
We first show that these two different cases (sometimes choosing the candidate with the highest worst-case score and other times choosing the candidate with the highest best-case score guarantees a better approximation ratio) can both occur. Afterwards we analyze the guarantees given by each of the two cases and conclude with a family of instances that prove that the obtained results are tight.
Example 1.
Consider the following two instances, each with voters , candidates , and :
[TABLE]
Assume our goal is to find the candidate that would best approximate the Borda winner. The scores are presented in the two tables below:
[TABLE]
[TABLE]
In both instances is the candidate with the highest score and is (among) the candidates with highest score. In instance , it is best to declare winner as it guarantees an approximation ratio of . In instance on the other hand, it is best to declare winner since it guarantees an approximation ratio of . Notice that the second term in each inequality is the approximation ratio guaranteed by choosing the respective other candidate.
Interestingly, while our algorithm provides the best possible approximation, it can select a candidate that is not a possible winner, i.e., that is not a winner in any profile consistent with the truncated ballot at hand.
Example 2.
Consider the following instance with candidates , four voters , and .
[TABLE]
It holds that , , and and therefore declaring winning achieves the best approximation ratio. However, in any election that is consistent with the given truncated election, at least two candidates in get at least 5 points while always gets 4 points. Thus, is not a possible winner.
4.2 Positional Scoring Rules: Approximation Guarantees for $\ell$-Truncated Elections
In this section we continue our analysis of the algorithm from Section 4.1, focusing on how well it approximates positional scoring rules when it has access only to $\ell$-truncated elections. We will now prove the guarantees that each of the two rules (choosing the candidate with the highest worst-case score or the highest best-case score, respectively) provides.
Theorem 4.
Let be a positional scoring rule defined by the scoring function . The algorithm from Section 4.1 for -truncated elections gives an approximation guarantee of
[TABLE]
Proof.
Let us first assume that the algorithm picks the candidate with the highest score as a winner. The average score each candidate gets is . Thus, . Let be the candidate, different from that has maximal score. Clearly, . Further, if we fix , then is maximized when candidate is ranked among the top positions (the positions that are counted for ) as few times as possible. This way the number of voters who do not rank is maximized—and these are the voters who can contribute additional score (apart from ) to . That is, if is fixed, then is higher when is ranked high by fewer voters rather than when it is ranked lower but by more voters. Consequently, we can lower bound by:
[TABLE]
Thus, the approximation ratio in this case is at least
[TABLE]
We next turn to the case when the algorithm picks the candidate with maximum score. The average best score of a candidate is given by
[TABLE]
Clearly, . With a fixed the score of is minimized when is ranked by as few voters as possible (the reasoning is similar as in the previous case). If is ranked first by voters, then its score would be . By solving:
[TABLE]
we get that . Thus, gets the score of at least . Consequently, we can lower-bound the approximation ratio by:
[TABLE]
It is easy to verify that for all with , , and it holds that . Substituting into this , , and , we get that
[TABLE]
We will only show that and as everything else is trivial. Observe that
[TABLE]
Since the algorithm always picks the value that results in a higher ratio, we get the thesis. ∎
Theorem 4 gives a very general result that applies to any positional scoring rule. For instance, for -approval we get the approximation ratio of .
Corollary 5.
The algorithm from Section 4.1 for -approval with -truncated elections, , gives the approximation guarantee of .
Proof.
We instantiate the expression from Theorem 4 for -approval:
[TABLE]
∎
For the Borda rule, we get the approximation of which on the plot looks similar to (see the left-hand side plot in Figure 3).
Corollary 6.
The algorithm from Section 4.1 for Borda with -truncated elections, gives the approximation guarantee of .
Proof.
The approximation ratio follows from Theorem 4:
[TABLE]
∎
We conclude by providing an intuitive explanation of instances that match the bound from Theorem 4. In these instances all candidates get roughly the same score and there are two candidates that also get average score but only appear as few times as possible in the first positions (in the first or second positions only). If candidate is declared the winner by any rule, then she gets $0$ points from all voters that did not rank her in the first two positions and gets points from these voters. Otherwise the winning candidate gets $0$ points from all voters that did not rank her in the first positions and gets points whenever she is not ranked first or second. Notice that no rule can distinguish between the different instances we just constructed and therefore building the instance after the rule picked a winner is permitted. In either case, the candidate that is declared winner by any rule gets points equal to the average score and the “true winner” or gets points. Observe that this gives exactly the same conditions as for the computation of in the proof of Theorem 4 and hence we have a matching upper bound.
4.3 Minimax Rule
Let us now move to the analysis of the Minimax rule. We start by showing that no deterministic algorithm for Minimax can guarantee a better approximation ratio than .
Theorem 7.
There exists no rule for -truncated elections that is a -approximation of the Minimax rule for any .
Proof.
Consider the following -truncated instance of election: Let be the set of candidates and let be the set of voters. Let the truncated preference list of voter be
[TABLE]
Due to symmetry, any candidate can be declared winner by each algorithm. For the sake of simplicity, let us assume that candidate is declared winner. We can then complement this instance by inserting all candidates that were not ranked by some voter in the order suggested by the subscript, that is,
[TABLE]
On the one hand, only voter prefers over and all other voters prefer over . Thus, the Minimax score of is 1. On the other hand, the Minimax score of is . Indeed, the strongest contender to is and voters prefer over (for ) while voters prefer over . Thus, no (deterministic) algorithm can achieve a better approximation ratio than . ∎
Note that in the construction in Theorem 7 one can increase the number of voters to be much larger than the number of candidates by simply copying all voters a sufficient number of times.
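The Minimax scores compared in the proof above can be computed directly from complete rankings. The following is a minimal sketch, assuming the convention (consistent with the proof of Theorem 7) that the Minimax score of a candidate is the smallest number of voters who prefer her to any single opponent. The function name `minimax_scores` and the toy profile are illustrative and do not come from the paper.

```python
def minimax_scores(profile, candidates):
    """Return the Minimax score of every candidate.

    profile: list of complete rankings, each a list with the most
    preferred candidate first. Assumed convention: the score of a
    candidate is the minimum, over all opponents, of the number of
    voters who rank the candidate above that opponent.
    """
    positions = [{c: i for i, c in enumerate(ranking)} for ranking in profile]
    scores = {}
    for a in candidates:
        scores[a] = min(
            sum(1 for pos in positions if pos[a] < pos[b])
            for b in candidates
            if b != a
        )
    return scores


# Toy profile with three voters (hypothetical data).
toy_profile = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
toy_scores = minimax_scores(toy_profile, ["a", "b", "c"])
```

The Minimax winner is a candidate with the highest score; the approximation ratio of a rule on an instance is then the score of the candidate it selects divided by that highest score.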
Theorem 7 already shows that with -truncated ballots Minimax cannot be well approximated. In particular, the bound for the Minimax rule is much worse than for scoring-based rules. We do not know whether the bound from Theorem 7 is tight. Yet, we can show that a simplified variant of the algorithm from Section 4.1, which computes the maximum score of each candidate and declares the one with the highest score the winner, achieves an approximation ratio of . This approximation ratio is lower-bounded by , which means that for reasonably small it (almost) matches the upper bound from Theorem 7 (see the right-hand side plot in Figure 3 for the comparison of these two bounds).
Theorem 8.
The algorithm from Section 4.1 approximates the Minimax rule in -truncated elections within a factor of
[TABLE]
Proof.
We will prove our claim by showing that if all candidates have a score of at most , then all candidates have a maximum score of at most .
Assume towards a contradiction that all candidates have a score of at most and there exists a candidate with . We say that for voter and candidates and if and only if is preferred over by voter in the -truncated instance, that is, either and are both among the most preferred candidates of and is preferred over or only is among the first candidates. Since
[TABLE]
and , it follows that for all we have
[TABLE]
Hence for all it holds that
[TABLE]
We now analyze . Let be the number of times a candidate occurs in the truncated instance, that is, the number of voters that rank candidate among the first positions. First, observe that for each candidate it holds that
[TABLE]
as is preferred over any at least that often. Hence,
[TABLE]
Since, by assumption, , it holds that
[TABLE]
or, equivalently,
[TABLE]
Second, notice that if is ranked among the top candidates by at most voters, then there are voters that do not rank among the first positions and by the pigeonhole principle there is a candidate with . As discussed above, from Equation 9 it follows that
[TABLE]
which is equivalent to
[TABLE]
Plugging Equation 10 into this inequality, we get that
[TABLE]
Notice that, on the other hand, as by the pigeonhole principle there is a candidate that is ranked first at least times and hence has a score of at least . Thus, we have reached a contradiction, completing the first part of the proof. We finish the proof by proving
[TABLE]
Observe that
[TABLE]
5 Experimental Evaluation
In Sections 3 and 4 we have assessed the worst-case guarantees of our approximation algorithms. In this section we investigate how these guarantees depend on particular distributions of the voters’ preferences. We tested the following distributions over preference rankings:
Impartial Culture (IC).
Under the Impartial Culture model each ranking over the candidates is equally probable.
One-dimensional Euclidean Model (1D).
First, we associate each voter and each candidate with a point from the interval —these points are sampled independently and uniformly at random. Then, each voter ranks the candidates according to her distance, preferring the ones which are closer to those which are farther.
Mixture of Mallows’ Models (MMM).
In the Mallows’ model [Mal57] we are given a reference ranking and a real value ; the probability of sampling a ranking is proportional to , where is the number of swaps of adjacent candidates that are required to turn into . We used a mixture of three Mallows’ models: for each of the three models we drew the reference ranking and the real value uniformly at random. Next, we sampled the parameters that sum up to one. To generate a ranking, we first pick one of the three models (the -th model with probability ) and then generate the ranking according to the Mallows’ model we picked.
Single Peaked Impartial Culture (SPIC).
In order to generate a profile we first randomly select a reference ranking. Then, we generate rankings that are single-peaked with respect to the reference ranking. Each such single peaked ranking is equally probable. For a definition and discussion on single-peaked preferences we refer the reader to the book chapter by Elkind et al. [ELP17].
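The four preference distributions above can be sampled with a few lines each. The sketch below is illustrative only and rests on several assumptions that the text does not fix: the 1D model uses the unit interval, the Mallows' sampler uses the standard form in which a ranking has weight `phi ** d` (with `d` the number of adjacent swaps from the reference ranking) and is implemented with the repeated insertion method, and single-peaked rankings are drawn uniformly by repeatedly choosing one of the two extremes of the axis as the next least preferred candidate. All function names are hypothetical.

```python
import random


def sample_ic(candidates, rng):
    """Impartial Culture: every ranking is equally probable."""
    return rng.sample(candidates, len(candidates))


def sample_euclidean_1d(candidates, num_voters, rng):
    """1D Euclidean profile; points drawn from [0, 1] (assumed interval)."""
    cand_pos = {c: rng.random() for c in candidates}
    profile = []
    for _ in range(num_voters):
        voter_pos = rng.random()
        # Closer candidates are preferred to farther ones.
        profile.append(sorted(candidates, key=lambda c: abs(cand_pos[c] - voter_pos)))
    return profile


def sample_mallows(reference, phi, rng):
    """Mallows' model via repeated insertion; weight phi ** d (assumed form)."""
    ranking = []
    for i, c in enumerate(reference):
        # Inserting at index j of a list of length i creates i - j inversions.
        weights = [phi ** (i - j) for j in range(i + 1)]
        j = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(j, c)
    return ranking


def sample_mallows_mixture(components, weights, rng):
    """Pick a (reference, phi) component by weight, then sample from it."""
    reference, phi = rng.choices(components, weights=weights)[0]
    return sample_mallows(reference, phi, rng)


def sample_single_peaked(axis, rng):
    """Uniform single-peaked ranking with respect to the given axis."""
    left, right = 0, len(axis) - 1
    worst_first = []
    while left < right:
        # The least preferred remaining candidate is an extreme of the axis.
        if rng.random() < 0.5:
            worst_first.append(axis[left])
            left += 1
        else:
            worst_first.append(axis[right])
            right -= 1
    worst_first.append(axis[left])
    return worst_first[::-1]
```

With `phi = 0` the Mallows' sampler always returns the reference ranking, and with `phi = 1` it coincides with Impartial Culture, which is a convenient sanity check.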
For each distribution over preferences and for each approximation algorithm we ran computer simulations as follows: We set the number of candidates to and tested for . We ran simulations for the number of voters ranging from to in steps of . For each combination of values of we ran 500 independent experiments, each time computing the ratio of the score of the candidate returned by the algorithm to the score of the optimal candidate. The averages of these ratios (averaged over the aforementioned 500 simulations) and the corresponding standard deviations for the Borda and the Minimax rules are depicted in Figure 4 and Figure 5, respectively.
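The simulation protocol described above can be sketched as a small harness that repeats independent experiments and reports the mean and standard deviation of the approximation ratio. This is an illustrative sketch for the Borda rule only, not the authors' code: the names `run_trials`, `sample_ic_profile`, `borda_winner`, and `plurality_winner` are hypothetical, and the plurality rule merely stands in for an algorithm that sees only each voter's top choice.

```python
import random
import statistics


def borda_scores(profile, candidates):
    """Borda: position i (0-based) among m candidates earns m - 1 - i points."""
    m = len(candidates)
    scores = {c: 0 for c in candidates}
    for ranking in profile:
        for i, c in enumerate(ranking):
            scores[c] += m - 1 - i
    return scores


def sample_ic_profile(candidates, num_voters, rng):
    """Impartial Culture profile: independent uniformly random rankings."""
    return [rng.sample(candidates, len(candidates)) for _ in range(num_voters)]


def borda_winner(profile, candidates):
    """Exact Borda winner (uses the full rankings)."""
    scores = borda_scores(profile, candidates)
    return max(candidates, key=lambda c: scores[c])


def plurality_winner(profile, candidates):
    """Stand-in approximation that only looks at each voter's top choice."""
    counts = {c: 0 for c in candidates}
    for ranking in profile:
        counts[ranking[0]] += 1
    return max(candidates, key=lambda c: counts[c])


def run_trials(sample_profile, pick_winner, candidates, num_voters, trials, seed=0):
    """Return (mean, standard deviation) of the Borda approximation ratio."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        profile = sample_profile(candidates, num_voters, rng)
        true_scores = borda_scores(profile, candidates)
        winner = pick_winner(profile, candidates)
        ratios.append(true_scores[winner] / max(true_scores.values()))
    return statistics.mean(ratios), statistics.pstdev(ratios)
```

Passing a different sampler or winner-selection function reuses the same loop for other distributions and algorithms; the number of trials, voters, and candidates are left as parameters because the concrete values are not recoverable from the text above.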
5.1 Approximation Algorithms for the Borda Rule
We empirically tested how well the two algorithms that we analyzed theoretically in the previous sections approximate the Borda rule. Specifically, we implemented Algorithm 1—which we will refer to as Randomized, and the algorithm described in Section 4.1. We also checked two other deterministic heuristics, that appear simple and intuitive:
1. The variant of the deterministic algorithm from Section 4.1 that always picks the candidate with the highest score.
2. An algorithm we call Deter-avg that, for each voter and candidate assigns to the score
   (a) if ,
   (b) the average score of the unranked positions , otherwise.
Then, the algorithm picks the candidate with the highest total score.
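A minimal sketch of Deter-avg for the Borda rule follows. It assumes the usual Borda scores (position i, counted from 0, earns m - 1 - i points among m candidates), so a candidate outside a voter's top `ell` positions receives the average of the scores of the remaining positions, (m - 1 - ell) / 2. The function names and the parameter name `ell` are illustrative, not taken from the paper.

```python
def borda_scores(profile, candidates):
    """True Borda scores computed from complete rankings."""
    m = len(candidates)
    scores = {c: 0 for c in candidates}
    for ranking in profile:
        for i, c in enumerate(ranking):
            scores[c] += m - 1 - i
    return scores


def deter_avg_scores(profile, candidates, ell):
    """Deter-avg estimate that reads only the top `ell` positions of each vote."""
    m = len(candidates)
    # Average Borda score of the unranked positions ell, ..., m - 1.
    unranked_avg = (m - 1 - ell) / 2
    scores = {c: 0.0 for c in candidates}
    for ranking in profile:
        top = ranking[:ell]
        for i, c in enumerate(top):
            scores[c] += m - 1 - i
        for c in candidates:
            if c not in top:
                scores[c] += unranked_avg
    return scores


def deter_avg_ratio(profile, candidates, ell):
    """True Borda score of the Deter-avg winner over the best true Borda score."""
    estimate = deter_avg_scores(profile, candidates, ell)
    winner = max(candidates, key=lambda c: estimate[c])
    true_scores = borda_scores(profile, candidates)
    return true_scores[winner] / max(true_scores.values())
```

When `ell` equals the number of candidates, no candidate is unranked and the estimate coincides with the true Borda scores.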
The three deterministic algorithms were almost indistinguishable in our simulations—Deter-avg was slightly better than the other two. Thus, for readability we present the results only for Deter-avg and Randomized and omit the description of the results for the other two deterministic algorithms. We found the following:
1. For preferences with little or no structure, such as those generated by IC and SPIC, the deterministic algorithm gives better results. For preferences with more structure, e.g., those obtained from the 1D and MMM models, the randomized algorithm significantly outperforms the deterministic ones.
2. For each preference distribution that we tested, the randomized algorithm gives high-quality approximations unless the number of voters is very small. Our results suggest asking each voter to rank a random subset of alternatives when the goal is to approximate the Borda rule with limited information from each voter and the number of voters exceeds a few hundred.
5.2 Approximation Algorithms for the Minimax Rule
Similarly to Section 5.1, we empirically tested how well the randomized algorithm (Algorithm 2) and the deterministic algorithm from Section 4.1 approximate the Minimax rule. We refer to the two algorithms as Randomized and Deterministic, respectively. We also tested two other natural heuristics. For each two candidates and , let denote the number of voters who (i) rank and among their most preferred candidates and prefer over or (ii) who rank but not among their top positions. Then:
1. In our first heuristic algorithm, for each pair of candidates, and , we use a method similar to Minimax, but we replace by . Then, as in the case of the original Minimax rule, we compute for each candidate the score and pick the candidate with the maximal score.
2. In the second heuristic, we replace by
[TABLE]
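The pairwise counts on truncated ballots and the first heuristic can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: a truncated ballot is a list of the voter's top candidates in order, the count for a pair follows conditions (i) and (ii) above, and the names `truncated_pairwise` and `first_heuristic_winner` are hypothetical. The second heuristic is not covered because its replacement formula is not available in the text above.

```python
def truncated_pairwise(truncated_profile, candidates):
    """n[a][b] = number of voters who rank a and either rank b lower or omit b."""
    n = {a: {b: 0 for b in candidates if b != a} for a in candidates}
    for ballot in truncated_profile:
        pos = {c: i for i, c in enumerate(ballot)}
        for a in ballot:
            for b in candidates:
                if b == a:
                    continue
                # (i) both ranked and a above b, or (ii) a ranked but b not.
                if b not in pos or pos[a] < pos[b]:
                    n[a][b] += 1
    return n


def first_heuristic_winner(truncated_profile, candidates):
    """Minimax-style rule that uses truncated counts in place of full ones."""
    n = truncated_pairwise(truncated_profile, candidates)
    scores = {a: min(n[a].values()) for a in candidates}
    winner = max(candidates, key=lambda c: scores[c])
    return winner, scores


# Toy 2-truncated profile over four candidates (hypothetical data).
toy_ballots = [["a", "b"], ["a", "c"], ["b", "a"]]
toy_winner, toy_scores = first_heuristic_winner(toy_ballots, ["a", "b", "c", "d"])
```

A candidate that no voter ranks among her top positions receives a count of zero against every opponent, so her score under this heuristic is zero.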
In our simulations, Deterministic outperformed the two heuristic algorithms mentioned above; hence, we present our results only for Deterministic and Randomized. We observed the following:
1. The randomized algorithm for the Minimax rule needs to ask each voter to compare more candidates than in the case of Borda to achieve a good approximation. For candidates, asking each voter to compare of them already gave good results for sufficiently many voters.
2. The deterministic algorithm usually performs better than the randomized one, yet there are distributions (e.g., the one-dimensional Euclidean model) where the quality of winners returned by the deterministic algorithm is much worse than that of the winners returned by the randomized algorithm. On the other hand, for each distribution that we tested, the randomized algorithm consistently gave good results when the number of voters and the number of candidates each voter is asked to rank were sufficiently large.
6 Conclusion
In this paper we theoretically and experimentally analyzed how well certain election rules can be approximated when we are given only parts of voters’ preferences. We compared two methods of eliciting voters’ preferences: (i) the randomized method, where each voter is asked to compare a randomly selected subset of alternatives, and (ii) the deterministic method, where we ask each voter to provide a ranking of her most preferred candidates. We investigated how well one can approximate positional scoring rules and the Minimax method through one of these two elicitation methods, providing both upper bounds on the approximation ratio (impossibility results) and algorithms matching these bounds.
We conclude that the randomized approach is usually superior; the exceptions include preference distributions with little or no structure, which rarely appear in practice. For the Borda rule, with hundreds of voters it is usually sufficient to ask each voter to compare two random candidates to achieve a high approximation guarantee. Approximating the Minimax rule is harder: one typically needs more voters and to ask them to compare more candidates; e.g., for candidates, we obtained high approximation guarantees for the Minimax rule only when we set the number of voters to around a thousand and .
Acknowledgments
Piotr Skowron was supported by a postdoctoral fellowship of the Alexander von Humboldt Foundation, Germany, and by the Foundation for Polish Science within the Homing programme (Project title: “Normative Comparison of Multiwinner Election Rules”).
References
- [ABE+18] E. Anshelevich, O. Bhardwaj, E. Elkind, J. Postl, and P. Skowron. Approximating optimal social choice under metric preferences. Artificial Intelligence, 264:27–51, 2018.
- [AD11] A. Auger and B. Doerr. Theory of Randomized Search Heuristics: Foundations and Recent Developments. World Scientific Publishing, 2011.
- [AP17] E. Anshelevich and J. Postl. Randomized social choice functions under metric preferences. Journal of Artificial Intelligence Research, 58:797–827, 2017.
- [BCH+15] C. Boutilier, I. Caragiannis, S. Haber, T. Lu, A. D. Procaccia, and O. Sheffet. Optimal social choice functions: A utilitarian view. Artificial Intelligence, 227:190–213, 2015.
- [BNPS17] G. Benade, S. Nath, A. Procaccia, and N. Shah. Preference elicitation for participatory budgeting. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, pages 376–382, 2017.
- [BPQ19] G. Benadé, A. Procaccia, and M. Qiao. Low-distortion social welfare functions. 2019. To appear.
- [BR15] C. Boutilier and J. Rosenschein. Incomplete information and communication in voting. In F. Brandt, V. Conitzer, U. Endriss, J. Lang, and A. D. Procaccia, editors, Handbook of Computational Social Choice, chapter 10. Cambridge University Press, 2015.
- [CP11] I. Caragiannis and A. D. Procaccia. Voting almost maximizes social welfare despite limited communication. Artificial Intelligence, 175(9–10):1655–1671, 2011.
