Comparing Election Methods Where Each Voter Ranks Only Few Candidates

Matthias Bentert; Piotr Skowron

arXiv:1901.10848·cs.DS·January 31, 2019


TL;DR

This paper investigates how well election rules like positional scoring and Minimax can be approximated using partial preferences collected through randomized or deterministic methods, providing theoretical bounds and simulation results.

Contribution

It introduces bounds on approximation ratios for election rules based on partial preferences and compares randomized versus deterministic collection methods.

Findings

1. Randomized approach generally yields better approximations.

2. Borda rule approximations are effective with just two candidates per voter.

3. Minimax rule requires larger candidate subsets for good approximation.


Equations

$$\mathrm{sc}_{\mathrm{MM}}(c) = \min_{c' \neq c} \mathrm{sc}_{\mathrm{MM}}(c, c').$$

$$\frac{\mathrm{score}_{\mathcal{R}}(\mathcal{A}(E))}{\max_{w \in \mathcal{R}(E)} \mathrm{score}_{\mathcal{R}}(w)} \geq \alpha,$$

$$\frac{\mathrm{score}_{\mathcal{R}}(\mathcal{A}(\mathrm{trunc}(E,\ell)))}{\max_{w \in \mathcal{R}(E)} \mathrm{score}_{\mathcal{R}}(w)} \geq \alpha.$$

$$\lambda_{\alpha}(p) = \frac{1}{\binom{m-1}{\ell-1}} \cdot \sum_{i=1}^{\ell} \alpha_{i} \binom{p-1}{i-1} \cdot \binom{m-p}{\ell-i}.$$

$$p_{\epsilon} = \mathrm{P}\Big(\left|X_{c} - \mathrm{E}(X_{c})\right| \geq \epsilon\,\mathrm{E}(X_{c})\Big) \leq 2\exp\left(-\frac{\epsilon^{2}\ell\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{6m\alpha_{1}}\right).$$

$$X_{c} = \frac{n}{x} \cdot \sum_{v \in V} \sum_{i=1}^{\ell} \alpha_{i} X_{v,c,i}.$$

$$\begin{aligned}
\mathrm{E}(X_{c} \mid N_{c} = x)
&= \frac{n}{x} \cdot \sum_{v \in V} \sum_{i=1}^{\ell} \alpha_{i} \sum_{\substack{V' \subseteq V,\ |V'| = x \\ v \in V'}} \mathrm{P}(A_{V',c} = 1 \mid N_{c} = x)\, \mathrm{E}(X_{v,c,i} \mid A_{V',c} = 1) \\
&= \frac{n}{x} \cdot \sum_{i=1}^{\ell} \alpha_{i} \sum_{v \in V} \sum_{\substack{V' \subseteq V \\ |V'| = x}} \frac{1}{\binom{n}{x}} [v \in V']\, \mathrm{E}(X_{v,c,i} \mid A_{V',c} = 1) \\
&= \frac{n}{x} \cdot \sum_{i=1}^{\ell} \alpha_{i} \sum_{v \in V} \sum_{\substack{V' \subseteq V \\ |V'| = x}} \frac{1}{\binom{n}{x}} [v \in V']\, \mathrm{P}(A_{v,c} = 1 \mid A_{V',c} = 1)\, \mathrm{E}(X_{v,c,i} \mid A_{v,c} = 1) \\
&= \frac{n}{x} \cdot \sum_{i=1}^{\ell} \alpha_{i} \sum_{v \in V} \frac{1}{\binom{n}{x}} \cdot \binom{n-1}{x-1}\, \mathrm{E}(X_{v,c,i} \mid A_{v,c} = 1) \\
&= \sum_{i=1}^{\ell} \alpha_{i} \sum_{v \in V} \mathrm{E}(X_{v,c,i} \mid A_{v,c} = 1) \\
&= \sum_{i=1}^{\ell} \alpha_{i} \sum_{v \in V} \frac{1}{\binom{m-1}{\ell-1}} \sum_{\substack{S \subseteq C \\ |S| = \ell}} [c \in S] \cdot \mathrm{ind}_{v}(S,c,i) \\
&= \frac{1}{\binom{m-1}{\ell-1}} \sum_{v \in V} \sum_{i=1}^{\ell} \alpha_{i} \binom{\mathrm{pos}_{v}(c)-1}{i-1} \cdot \binom{m-\mathrm{pos}_{v}(c)}{\ell-i} \\
&= \sum_{v \in V} \lambda_{\alpha}(\mathrm{pos}_{v}(c)) = \mathrm{sc}_{\lambda_{\alpha}}(c).
\end{aligned}$$

$$\mathrm{P}(X \leq (1-\delta)\,\mathrm{E}(X)) \leq \exp\left(-\delta^{2}\,\mathrm{E}(X)/2\right) \quad\text{and}\quad \mathrm{P}(X \geq (1+\delta)\,\mathrm{E}(X)) \leq \exp\left(-\delta^{2}\,\mathrm{E}(X)/3\right).$$

$$\mathrm{E}\left(\sum_{v \in V} \sum_{i=1}^{\ell} \frac{X_{v,c,i}\,\alpha_{i}}{\alpha_{1}} \;\Big|\; N_{c} = x\right) = \frac{x}{n\alpha_{1}}\,\mathrm{sc}_{\lambda_{\alpha}}(c).$$

$$\begin{aligned}
&\mathrm{P}\Big(\left|X_{c} - \mathrm{E}(X_{c})\right| \geq \epsilon\,\mathrm{E}(X_{c}) \;\Big|\; N_{c} = x\Big) \\
&\qquad= \mathrm{P}\Big(\left|\frac{x}{n\alpha_{1}}X_{c} - \mathrm{E}\left(\frac{x}{n\alpha_{1}}X_{c}\right)\right| \geq \epsilon\,\mathrm{E}\left(\frac{x}{n\alpha_{1}}X_{c}\right) \;\Big|\; N_{c} = x\Big) \leq 2\exp\left(-\frac{\epsilon^{2}x\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{3n\alpha_{1}}\right).
\end{aligned}$$

$$\begin{aligned}
\mathrm{P}\Big(\left|X_{c} - \mathrm{E}(X_{c})\right| \geq \epsilon\,\mathrm{E}(X_{c})\Big)
&\leq \sum_{x=0}^{n} \binom{n}{x} \cdot \frac{\binom{m-1}{\ell-1}^{x} \left(\binom{m}{\ell} - \binom{m-1}{\ell-1}\right)^{n-x}}{\binom{m}{\ell}^{n}} \cdot 2\exp\left(-\frac{\epsilon^{2}x\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{3n\alpha_{1}}\right) \\
&= 2\exp\left(-\frac{\epsilon^{2}\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{3n\alpha_{1}}\right) \cdot \sum_{x=0}^{n} \binom{n}{x} \cdot \left(\frac{\ell}{m}\right)^{x} \left(1 - \frac{\ell}{m}\right)^{n-x} \cdot e^{-x} \\
&= 2\exp\left(-\frac{\epsilon^{2}\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{3n\alpha_{1}}\right) \cdot \left(1 - \frac{\ell}{m} + \frac{\ell}{em}\right)^{n} \\
&\leq 2\exp\left(-\frac{\epsilon^{2}\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{3n\alpha_{1}}\right) \cdot \left(1 - \frac{\ell}{2m}\right)^{n} \\
&\leq 2\exp\left(-\frac{\epsilon^{2}\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{3n\alpha_{1}}\right) \cdot e^{-\frac{\ell n}{2m}} = 2\exp\left(-\frac{\epsilon^{2}\ell\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{6m\alpha_{1}}\right).
\end{aligned}$$

$$m\exp\left(-\frac{\epsilon^{2}\ell^{2}\,\mathrm{sc}_{\mathrm{MM}}(c, c_{\min})}{6m^{2}}\right)$$

$$X_{c,c'} = \frac{n}{x} \cdot \sum_{v \in V} X_{v,c,c'}.$$

$$\begin{aligned}
\mathrm{E}(X_{c,c'} \mid N_{c,c'} = x)
&= \frac{n}{x} \cdot \sum_{v \in V} \sum_{V' \subseteq V \colon |V'| = x} \frac{1}{\binom{n}{x}} [v \in V'] \cdot [c \succ_{v} c'] \\
&= \frac{n}{x} \cdot \sum_{v \in V} \frac{\binom{n-1}{x-1}}{\binom{n}{x}} [c \succ_{v} c'] \\
&= \sum_{v \in V} [c \succ_{v} c'] = \mathrm{sc}_{\mathrm{MM}}(c, c').
\end{aligned}$$

$$\mathrm{P}(|X - \mathrm{E}(X)| \geq \delta\,\mathrm{E}(X)) \leq 2\exp\left(-\delta^{2}\,\mathrm{E}(X)/3\right).$$

$$\mathrm{P}\Big(\left|X_{c,c'} - \mathrm{E}(X_{c,c'})\right| \geq \epsilon\,\mathrm{E}(X_{c,c'}) \;\Big|\; N_{c,c'} = x\Big) \leq 2\exp\left(-\frac{x\epsilon^{2}\,\mathrm{sc}_{\mathrm{MM}}(c,c')}{3n}\right)$$

$$\begin{aligned}
\mathrm{P}\Big(\left|X_{c,c'} - \mathrm{E}(X_{c,c'})\right| \geq \epsilon\,\mathrm{E}(X_{c,c'})\Big)
&= \sum_{x=0}^{n} \mathrm{P}(N_{c,c'} = x) \cdot \mathrm{P}\Big(\left|X_{c,c'} - \mathrm{E}(X_{c,c'})\right| \geq \epsilon\,\mathrm{E}(X_{c,c'}) \;\Big|\; N_{c,c'} = x\Big) \\
&\leq \sum_{x=1}^{n} \mathrm{P}(N_{c,c'} = x) \cdot 2\exp\left(-\frac{x\epsilon^{2}\,\mathrm{sc}_{\mathrm{MM}}(c,c')}{3n}\right).
\end{aligned}$$

$$\mathrm{P}(N_{c,c'} = x) = \binom{n}{x} \cdot \frac{\binom{m-2}{\ell-2}^{x} \left(\binom{m}{\ell} - \binom{m-2}{\ell-2}\right)^{n-x}}{\binom{m}{\ell}^{n}}$$

and thus

$$\begin{aligned}
&\sum_{x=1}^{n} \mathrm{P}(N_{c,c'} = x) \cdot 2\exp\left(-\frac{x\epsilon^{2}\,\mathrm{sc}_{\mathrm{MM}}(c,c')}{3n}\right) \\
&\qquad= \sum_{x=1}^{n} \binom{n}{x} \cdot \frac{\binom{m-2}{\ell-2}^{x} \left(\binom{m}{\ell} - \binom{m-2}{\ell-2}\right)^{n-x}}{\binom{m}{\ell}^{n}} \cdot 2\exp\left(-\frac{x\epsilon^{2}\,\mathrm{sc}_{\mathrm{MM}}(c,c')}{3n}\right) \\
&\qquad= 2\exp\left(-\frac{\epsilon^{2}\,\mathrm{sc}_{\mathrm{MM}}(c,c')}{3n}\right) \cdot \sum_{x=1}^{n} \binom{n}{x} \cdot \left(\frac{\ell(\ell-1)}{m(m-1)}\right)^{x} \cdot \left(1 - \frac{\ell(\ell-1)}{m(m-1)}\right)^{n-x} \cdot \left(\frac{1}{e}\right)^{x} \\
&\qquad= 2\exp\left(-\frac{\epsilon^{2}\,\mathrm{sc}_{\mathrm{MM}}(c,c')}{3n}\right) \cdot \left(1 - \frac{\ell(\ell-1)}{m(m-1)} + \frac{\ell(\ell-1)}{em(m-1)}\right)^{n} \\
&\qquad= 2\exp\left(-\frac{\epsilon^{2}\,\mathrm{sc}_{\mathrm{MM}}(c,c')}{3n}\right) \cdot \left(1 - \frac{\ell(\ell-1)(e-1)}{m(m-1)e}\right)^{n}
\end{aligned}$$


Full text

Comparing Election Methods Where Each Voter Ranks Only Few Candidates

Matthias Bentert

TU Berlin

Berlin, Germany

Piotr Skowron

University of Warsaw

Warsaw, Poland

Abstract

Election rules are formal processes that aggregate voters' preferences, typically to select a single candidate, called the winner. Most of the election rules studied in the literature require the voters to rank the candidates from the most to the least preferred one. This method of eliciting preferences is impractical when the number of candidates to be ranked is large. We ask how well certain election rules (focusing on positional scoring rules and the Minimax rule) can be approximated from partial preferences collected through one of the following procedures: (i) randomized: we ask each voter to rank a random subset of $\ell$ candidates, and (ii) deterministic: we ask each voter to provide a ranking of her $\ell$ most preferred candidates (the $\ell$-truncated ballot). We establish theoretical bounds on the approximation ratios, and we complement our theoretical analysis with computer simulations. We find that mostly (apart from the cases when the preferences have no or very little structure) it is better to use the randomized approach. While we obtain fairly good approximation guarantees for the Borda rule already for $\ell=2$, for approximating the Minimax rule one needs to ask each voter to compare a larger set of candidates in order to obtain good guarantees.

1 Introduction

An election rule is a function that takes as input a collection of voters' preferences over a given set of $m$ candidates and returns a single candidate, called the winner. There is a large variety of election rules known in the literature (we refer the reader to the survey by Zwicker [Zwi15] for an overview); most of them require the voters to provide strict linear orders over the candidates. Yet, it is often hard, or even infeasible, for a voter to provide such a preference ranking, especially when the set of candidates is large. Indeed, it is often believed that a voter can rank at most five to nine candidates [Mil56].

In this paper we ask how the quality of decisions made through voting depends on the amount of information available. Specifically, our goal is to assess the quality of outcomes of elections when each voter can be asked to rank at most $\ell < m$ candidates. We compare two ways of eliciting preferences. In the first approach, which we call randomized, we ask each voter to rank a random subset of $\ell$ candidates. In the second approach, which we call deterministic, we ask each voter to provide the ranking of her $\ell$ most preferred candidates (the so-called $\ell$-truncated ballot). For a number of rules (we analyze positional scoring rules and the Minimax method), we investigate how well they can be approximated by algorithms that use one of the two elicitation methods.

Our Contribution

Our contribution is the following:

1. In Section 3.1 we identify a class $\mathit{Sep}_{\ell}$ of positional scoring rules that, for a given $\ell$, can be well approximated using the randomized approach. $\mathit{Sep}_{2}$ consists of a single rule, namely the Borda count; the number of rules in $\mathit{Sep}_{\ell}$ grows exponentially with $\ell$. We prove approximation guarantees for the rules from $\mathit{Sep}_{\ell}$. These guarantees are more likely to be accurate when the number of voters is large, and we show analytically how, in the worst case, they depend on the number of voters. In Section 3.2 we provide an analogous analysis for the Minimax rule.

2. In Section 4 we prove upper bounds on the approximation ratios of an algorithm that uses $\ell$-truncated ballots; we prove these bounds both for positional scoring rules and for the Minimax rule. In both cases, we show that the algorithm that minimizes the maximal regret of Lu and Boutilier [LB11] (we recall this algorithm in Section 4.1) matches our upper bounds (for Minimax our analysis is tight up to a small constant factor).

3. We ran computer simulations in order to verify how the approximation ratio depends on the particular distribution of voters' preferences (Section 5). Our experiments confirm that in most cases (with the exception of very unstructured preferences) the randomized approach is superior. We also show that usually only a couple of hundred voters are required to achieve a reasonably good approximation.

Related Work

Our work contributes to the broad literature on handling incomplete information in voting; for a survey on this topic, we refer the reader to the book chapter by Boutilier and Rosenschein [BR15]. Specifically, our research is closely related to the idea of minimizing the maximal regret [LB11]. Therein, for a partial preference profile $P$, the goal is to select a candidate $c$ such that the score of $c$ in the worst possible completion of $P$ is maximized. In particular, algorithms minimizing the maximal regret yield the best possible approximation ratio. Our paper complements this literature by (1) providing an accurate analysis of these approximation ratios for various methods (which allows one to better judge the suitability of different methods for handling incomplete information), and (2) providing the analysis for two natural methods of preference elicitation (which also allows one to assess which of the two methods is better).

Algorithms for minimizing the maximal regret interpret the missing information in the most pessimistic way: they assume the worst-possible completion of partial preferences. Other approaches include assuming the missing pairwise preferences to be distributed uniformly (e.g. Xia and Conitzer [XC11]) and machine-learning techniques (Doucette [Dou14, Dou15]) to “reconstruct” missing information (assuming that the missing pairwise comparisons are distributed similarly as in observed partial rankings).

Our work is also closely related to the literature on distortion [PR06, CP11, BCH+15]. There, an underlying utility model is assumed, and the goal is to estimate how well various voting rules that only have access to ordinal preferences approximate optimal winners, i.e., candidates that maximize the total utility of the voters. The concept of distortion has recently received a lot of attention in the literature. The definition of distortion has, for example, been adapted to social welfare functions (where the goal is to output a ranking of candidates rather than a single winner) [BPQ19] and to participatory budgeting [BNPS17]. Some works also study distortion assuming a certain structure of the underlying utility model (e.g., that it can be represented as a metric space) [ABE+18, AP17, FFG16, GKM17, GAX17].

Finally, we mention that our randomized algorithms are similar to the one proposed by Hansen [Han16]. The main difference is that the rule proposed by Hansen asks each voter to compare a certain number of pairs of candidates, while in our approach we ask each voter to rank a certain fixed-size subset of them. Hansen views his algorithm as a fully-fledged standalone rule (and compares it with other election systems, mostly focusing on assessing the probability of selecting the Condorcet winner), while our primary goal is to investigate how well our rules approximate their original counterparts.

2 Preliminaries

An election is a pair $E=(V,C)$ , where $V=\{v_{1},v_{2},\ldots,v_{n}\}$ and $C=\{c_{1},c_{2},\ldots,c_{m}\}$ denote the sets of $n$ voters and $m$ candidates, respectively. Each voter $v_{i}$ is endowed with a preference ranking over the candidates, which is a total ordering of the candidates and which we denote by $\succ_{i}$ . For each candidate $c\in C$ by ${{{\mathrm{pos}}}}_{i}(c)$ we denote the position of $c$ in $v_{i}$ ’s preference ranking. The position of the most preferred candidate is one, of the second most preferred candidate is two, etc. For example, for a voter $v_{i}$ with the preference $c_{2}\succ_{i}c_{3}\succ_{i}c_{1}$ , we have ${{{\mathrm{pos}}}}_{i}(c_{1})=3$ , ${{{\mathrm{pos}}}}_{i}(c_{2})=1$ , and ${{{\mathrm{pos}}}}_{i}(c_{3})=2$ .

For an integer $t$ we use $[t]$ to denote the set $\{1,2,\ldots,t\}$ and we use the Iverson bracket notation: for a logical expression $P$ the term $[P]$ means $1$ if $P$ is true and $0$ otherwise.

A voting rule is a function that, for a given election $E$ , returns a subset of candidates, which we call tied winning candidates. Below we describe several (classes of) voting rules that we will focus on in this paper.

A positional scoring function is a mapping $\lambda\colon[m]\to{{\mathbb{R}}}$ that assigns to each position a real value: intuitively, $\lambda(p)$ is a score that a voter assigns to a candidate that she ranks as her $p$ -th most preferred one. For each positional scoring function $\lambda$ we define the $\lambda$ -score of a candidate $c$ as ${{{\mathrm{sc}}}}_{\lambda}(c)=\sum_{v_{i}\in V}\lambda({{{\mathrm{pos}}}}_{i}(c))$ , and the corresponding election rule selects the candidate(s) with the highest $\lambda$ -score. Examples of common positional scoring rules include:

Borda rule:

Based on a linear decreasing positional scoring function, the Borda rule is formally defined by $\beta(p)=m-p$ for $p\in[m]$ .

Plurality rule:

Being equivalent to the $1$ -approval rule, the positional scoring function for the Plurality rule assigns a score of one to the first position and zero to all others.
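The definitions above can be made concrete with a small sketch. The code below is an illustration only: the function names (`positional_scores`, `borda`, `plurality`, `winners`) and the three-voter toy profile are assumptions introduced here, not part of the paper.

```python
from typing import Callable, Dict, List, Sequence, Set

Ranking = Sequence[str]  # one voter's ranking, most preferred candidate first


def positional_scores(profile: List[Ranking],
                      scoring: Callable[[int], float]) -> Dict[str, float]:
    """Return sc_lambda(c), the sum over voters of lambda(pos_i(c)); positions start at 1."""
    scores: Dict[str, float] = {c: 0.0 for c in profile[0]}
    for ranking in profile:
        for position, candidate in enumerate(ranking, start=1):
            scores[candidate] += scoring(position)
    return scores


def borda(m: int) -> Callable[[int], float]:
    """Borda scoring function beta(p) = m - p for m candidates."""
    return lambda p: float(m - p)


def plurality(p: int) -> float:
    """Plurality: one point for the first position, zero for all others."""
    return 1.0 if p == 1 else 0.0


def winners(scores: Dict[str, float]) -> Set[str]:
    """All candidates tied for the highest score."""
    best = max(scores.values())
    return {c for c, s in scores.items() if s == best}


# Toy profile (made up for illustration): three voters, three candidates.
profile = [("c2", "c3", "c1"), ("c1", "c2", "c3"), ("c2", "c1", "c3")]
borda_scores = positional_scores(profile, borda(3))
plurality_scores = positional_scores(profile, plurality)
print(borda_scores, winners(borda_scores))
print(plurality_scores, winners(plurality_scores))
```

The first voter in the toy profile is the one from the example in the preliminaries, with $c_{2}\succ c_{3}\succ c_{1}$.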

Another important class of voting rules originates from the Condorcet criterion. It says that if there exists a candidate $c$ that is preferred to any other candidate by a majority of voters, then the voting rule should select $c$. We focus on one particular rule satisfying the Condorcet criterion (we chose a rule picking the candidates that maximize a certain scoring function so that we could apply to the rule the standard definition of approximation):

Minimax rule.

For an election $E=(V,C)$ and two candidates $c,c^{\prime}\in C$ , we define ${{{\mathrm{sc}}}}_{{{{\mathrm{MM}}}}}(c,c^{\prime})=|\{v_{i}\in V\mid c\succ_{i}c^{\prime}\}|$ as the number of voters who prefer $c$ to $c^{\prime}$ and we set

$$\mathrm{sc}_{\mathrm{MM}}(c) = \min_{c' \neq c} \mathrm{sc}_{\mathrm{MM}}(c, c').$$

The rule then selects the candidates with the highest ${{{\mathrm{sc}}}}_{{{{\mathrm{MM}}}}}$ score.
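As a minimal sketch of the Minimax score with full rankings, the following code counts pairwise support and takes the minimum over all opponents. The helper names and the toy profile are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence

Ranking = Sequence[str]  # most preferred candidate first


def pairwise_support(profile: List[Ranking], c: str, d: str) -> int:
    """sc_MM(c, d): the number of voters who rank c above d."""
    return sum(1 for ranking in profile if ranking.index(c) < ranking.index(d))


def minimax_scores(profile: List[Ranking]) -> Dict[str, int]:
    """sc_MM(c): the minimum of sc_MM(c, d) over all opponents d != c."""
    candidates = list(profile[0])
    return {
        c: min(pairwise_support(profile, c, d) for d in candidates if d != c)
        for c in candidates
    }


# Toy profile (made up for illustration): three voters, three candidates.
profile = [("c2", "c3", "c1"), ("c1", "c2", "c3"), ("c2", "c1", "c3")]
mm_scores = minimax_scores(profile)
mm_winner = max(mm_scores, key=mm_scores.get)
print(mm_scores, mm_winner)
```

Here `c2` beats each opponent in at least two of the three votes, so it has the highest Minimax score and is also the Condorcet winner of this toy profile.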

Since all rules described above select the candidates with the maximal scores (with particular rules differing in how the score should be calculated), a natural definition of approximation applies.

Definition 1.

We say that $\mathcal{A}$ is an $\alpha$ -approximation algorithm for a rule $\mathcal{R}$ if for each election instance $E$ it holds that:

$$\frac{\mathrm{score}_{\mathcal{R}}(\mathcal{A}(E))}{\max_{w \in \mathcal{R}(E)} \mathrm{score}_{\mathcal{R}}(w)} \geq \alpha,$$

where ${{\mathrm{score}}}_{\mathcal{R}}$ is a function representing the score $\mathcal{R}$ awards each candidate, $\mathcal{R}(E)$ is the set of winners returned by $\mathcal{R}$, and $\mathcal{A}(E)$ is the candidate returned by $\mathcal{A}$.

Later on, we will consider algorithms that have access only to certain parts of the input instances. In such cases the above definition still applies. For example, let ${{{\mathrm{trunc}}}}(E,\ell)$ denote the $\ell$-truncated instance obtained from $E$, i.e., a partial election which for each voter contains her preference ranking from $E$, truncated to the top $\ell$ positions. Then we say that $\mathcal{A}$ is an $\alpha$-approximation algorithm for $\mathcal{R}$ for $\ell$-truncated instances, when for each election instance $E$ it holds that:

$$\frac{\mathrm{score}_{\mathcal{R}}(\mathcal{A}(\mathrm{trunc}(E,\ell)))}{\max_{w \in \mathcal{R}(E)} \mathrm{score}_{\mathcal{R}}(w)} \geq \alpha.$$
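To illustrate the ratio in this definition on a single instance, the sketch below truncates a profile, lets a deliberately naive stand-in algorithm pick a candidate from the truncated ballots, and evaluates the pick under the full-information Borda scores. The stand-in `top_choice_algorithm` is an assumption made here for illustration; it is not an algorithm from the paper. The definition itself requires the inequality on every instance, so one instance only gives an upper bound on the achievable $\alpha$.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Ranking = Sequence[str]  # most preferred candidate first


def truncate(profile: List[Ranking], ell: int) -> List[Tuple[str, ...]]:
    """trunc(E, ell): keep only the top ell positions of every ranking."""
    return [tuple(ranking[:ell]) for ranking in profile]


def borda_scores(profile: List[Ranking]) -> Dict[str, int]:
    """Full-information Borda scores with beta(p) = m - p."""
    m = len(profile[0])
    scores = {c: 0 for c in profile[0]}
    for ranking in profile:
        for position, candidate in enumerate(ranking, start=1):
            scores[candidate] += m - position
    return scores


def top_choice_algorithm(truncated: List[Tuple[str, ...]]) -> str:
    """Hypothetical stand-in: return the candidate named first most often."""
    counts = Counter(ballot[0] for ballot in truncated)
    return counts.most_common(1)[0][0]


def approximation_ratio(true_scores: Dict[str, int], chosen: str) -> float:
    """Score of the chosen candidate divided by the best achievable score."""
    return true_scores[chosen] / max(true_scores.values())


# Toy profile (made up): 'a' has the most first places but 'b' wins under Borda.
profile = [("a", "b", "c")] * 3 + [("b", "c", "a")] * 2 + [("c", "b", "a")] * 2
chosen = top_choice_algorithm(truncate(profile, 1))
ratio = approximation_ratio(borda_scores(profile), chosen)
print(chosen, ratio)
```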

3 Randomized Approach

In this section we explore a randomized approach, where each voter can be asked to rank a random subset of candidates.

3.1 Scoring Rules

We start our analysis by looking at the class of positional scoring rules. For the sake of simplicity we will assume throughout this section that $n$ is divisible by $m$ (see Footnote 1). We first present an algorithm that estimates the score of each candidate and picks the candidate with the highest score. The algorithm is parameterized with a natural number $\ell\leq m$ and a vector of $\ell$ reals $\alpha=(\alpha_{1},\ldots,\alpha_{\ell})$; for a fixed vector $\alpha$ we will call the algorithm $\alpha$-PSF-ALG. This algorithm asks each voter to rank a random set of $\ell$ candidates. We say that a candidate $c$ is ranked by a voter $v$ if $c$ belongs to the set of $\ell$ candidates that $v$ was asked to rank. If $c$ is the $i$-th most preferred among the candidates ranked by a voter, then $c$ receives the score of $\alpha_{i}$ from the voter. Such scores are summed up for each candidate, normalized by the number of voters who ranked the respective candidate, and the candidate with the highest total score is declared the winner. Pseudocode of the algorithm is given in Algorithm 1.

Footnote 1: We will always implicitly assume that $n$ is much larger than $m$, and we will use randomized algorithms only. Thus, if $m$ does not divide $n$, then in our algorithms we can add a preliminary step that randomly selects a set of $n^{\prime}=\left\lfloor\frac{n}{m}\right\rfloor\cdot m$ voters, and ignores the remaining $n-n^{\prime}$ ones. We mention that other authors also suggested giving multiple randomized ballots to each voter.
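The prose description of $\alpha$-PSF-ALG can be sketched as follows. This is a reading of the description above, not a transcription of Algorithm 1: the function name, the independent uniform sampling of each voter's subset, and the choice to leave out candidates that nobody was asked to rank are assumptions made for this illustration.

```python
import random
from typing import Dict, List, Sequence

Ranking = Sequence[str]  # a voter's full (hidden) ranking, most preferred first


def alpha_psf_estimates(profile: List[Ranking],
                        alpha: Sequence[float],
                        rng: random.Random) -> Dict[str, float]:
    """Estimate scores by asking every voter to rank len(alpha) random candidates."""
    n = len(profile)
    candidates = list(profile[0])
    ell = len(alpha)
    totals = {c: 0.0 for c in candidates}
    times_ranked = {c: 0 for c in candidates}
    for ranking in profile:
        # Assumption: each voter gets an independent, uniformly random subset.
        asked = set(rng.sample(candidates, ell))
        # The voter answers with the sampled candidates in her true order.
        answer = [c for c in ranking if c in asked]
        for i, c in enumerate(answer):
            totals[c] += alpha[i]  # the i-th ranked candidate receives alpha_i
            times_ranked[c] += 1
    # Normalize by the number of voters who ranked the candidate (factor n / x).
    return {c: n * totals[c] / times_ranked[c]
            for c in candidates if times_ranked[c] > 0}


# Toy run (made up): 2000 voters who all share the ranking a > b > c > d.
rng = random.Random(0)
profile = [("a", "b", "c", "d")] * 2000
estimates = alpha_psf_estimates(profile, (1.0, 0.0), rng)
winner = max(estimates, key=estimates.get)
print(estimates, winner)
```

With $\alpha=(1,0)$ and a unanimous profile, the top candidate wins every comparison it takes part in, so its normalized estimate equals $n$ regardless of how often it was sampled.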

Below, we will show that for some positional scoring rules, by choosing the vector $\alpha$ carefully, we can find good approximations of winning candidates with high probability. First, through Theorem 1 we establish a relation between positional scoring functions $\lambda$ and vectors $\alpha$ that should be used to assess $\lambda$ ; the formula is not intuitive, and we will discuss it later on. In particular, we will explain which positional scoring functions can be well approximated using this approach, that is, we will discuss the structure of the class of positional scoring functions which are covered by the following theorem.

Theorem 1.

Fix a non-increasing sequence of $\ell$ reals $\alpha=(\alpha_{1},\ldots,\alpha_{\ell})$ and consider the positional scoring function $\lambda_{\alpha}$ defined by

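$$\lambda_{\alpha}(p) = \frac{1}{\binom{m-1}{\ell-1}} \cdot \sum_{i=1}^{\ell} \alpha_{i} \binom{p-1}{i-1} \cdot \binom{m-p}{\ell-i}.$$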

For a candidate $c\in C$ that is ranked by at least one voter, we denote by $X_{c}$ the random variable describing the total normalized score that $c$ was assigned by $\alpha$ -PSF-ALG. Then, the expected value ${{{\mathrm{E}}}}(X_{c})$ is equal to the $\lambda_{\alpha}$ -score of $c$ , and the probability that the score computed by $\alpha$ -PSF-ALG for $c$ differs from its expected value by a multiplicative factor of $1\pm\epsilon$ is upper-bounded by $2\exp\left(-\frac{\epsilon^{2}{{{\mathrm{E}}}}(X_{c})}{3}\right)$ , i.e.,

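$$p_{\epsilon} = \mathrm{P}\Big(\left|X_{c} - \mathrm{E}(X_{c})\right| \geq \epsilon\,\mathrm{E}(X_{c})\Big) \leq 2\exp\left(-\frac{\epsilon^{2}\ell\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{6m\alpha_{1}}\right).$$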

Proof.

Let us fix a candidate $c\in C$ who is ranked by at least one voter. The process of computing the score of $c$ according to Algorithm 1 can be equivalently described as follows. We first decide on the number $x$ of voters we ask to rank $c$ . Second, we pick uniformly at random a set $V^{\prime}$ of $x$ voters such that all voters in $V^{\prime}$ are asked to rank $c$ and all voters in $V\setminus V^{\prime}$ are not asked to rank $c$ . Finally, we ask each voter from $V^{\prime}$ to rank $c$ and a randomly selected set of $\ell-1$ candidates. Let $N_{c}$ be a random variable describing the number of voters who rank $c$ . Further, for each voter $v$ , let $X_{v,c,i}$ denote the random variable equal 1 if $c$ is the $i$ -th candidate among those ranked by voter $v$ and zero otherwise. In particular, $X_{v,c,i}$ is zero when $v$ is not asked to rank $c$ . Observe that if $N_{c}=x$ , then the value of $X_{c}$ can be expressed as

$$X_{c} = \frac{n}{x} \cdot \sum_{v \in V} \sum_{i=1}^{\ell} \alpha_{i} X_{v,c,i}.$$

Further, let $A_{V^{\prime},c}$ be 1 if each voter from $V^{\prime}$ ranks $c$ and 0 otherwise. Similarly, let $A_{v,c}$ be 1 if $v$ ranks $c$ and 0 otherwise. Let ${{{\mathrm{ind}}}}_{v}(S,c,i)$ be equal to 1 if $c$ is ranked as the $i$ -th most preferred candidate among $S$ by $v$ and 0 otherwise. We next compute the conditional expected value ${{{\mathrm{E}}}}(X_{c}|N_{c}=x)$ . We first give the formal equalities and give reasoning for the more complicated ones afterwards.

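\begin{align}
\mathrm{E}(X_{c} \mid N_{c} = x)
&= \frac{n}{x} \cdot \sum_{v \in V} \sum_{i=1}^{\ell} \alpha_{i} \sum_{\substack{V' \subseteq V,\ |V'| = x \\ v \in V'}} \mathrm{P}(A_{V',c} = 1 \mid N_{c} = x)\, \mathrm{E}(X_{v,c,i} \mid A_{V',c} = 1) \tag{1} \\
&= \frac{n}{x} \cdot \sum_{i=1}^{\ell} \alpha_{i} \sum_{v \in V} \sum_{\substack{V' \subseteq V \\ |V'| = x}} \frac{1}{\binom{n}{x}} [v \in V']\, \mathrm{E}(X_{v,c,i} \mid A_{V',c} = 1) \notag \\
&= \frac{n}{x} \cdot \sum_{i=1}^{\ell} \alpha_{i} \sum_{v \in V} \sum_{\substack{V' \subseteq V \\ |V'| = x}} \frac{1}{\binom{n}{x}} [v \in V']\, \mathrm{P}(A_{v,c} = 1 \mid A_{V',c} = 1)\, \mathrm{E}(X_{v,c,i} \mid A_{v,c} = 1) \tag{2} \\
&= \frac{n}{x} \cdot \sum_{i=1}^{\ell} \alpha_{i} \sum_{v \in V} \frac{1}{\binom{n}{x}} \cdot \binom{n-1}{x-1}\, \mathrm{E}(X_{v,c,i} \mid A_{v,c} = 1) \tag{3} \\
&= \sum_{i=1}^{\ell} \alpha_{i} \sum_{v \in V} \mathrm{E}(X_{v,c,i} \mid A_{v,c} = 1) \tag{4} \\
&= \sum_{i=1}^{\ell} \alpha_{i} \sum_{v \in V} \frac{1}{\binom{m-1}{\ell-1}} \sum_{\substack{S \subseteq C \\ |S| = \ell}} [c \in S] \cdot \mathrm{ind}_{v}(S,c,i) \tag{5} \\
&= \frac{1}{\binom{m-1}{\ell-1}} \sum_{v \in V} \sum_{i=1}^{\ell} \alpha_{i} \binom{\mathrm{pos}_{v}(c)-1}{i-1} \cdot \binom{m-\mathrm{pos}_{v}(c)}{\ell-i} \tag{6} \\
&= \sum_{v \in V} \lambda_{\alpha}(\mathrm{pos}_{v}(c)) = \mathrm{sc}_{\lambda_{\alpha}}(c). \tag{7}
\end{align}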

We will now explain some of the equalities in the above sequence. (3) is an effect of regrouping the summands; each summand ${{{\mathrm{E}}}}(X_{v,c,i}|A_{v,c}=1)$ in the previous line is added for each set $V^{\prime}\subseteq V$ of size $x$ which includes $v$ —there are ${n-1\choose x-1}$ such sets and $[v\in V^{\prime}]\cdot{{{\mathrm{P}}}}(A_{v,c}=1|A_{V^{\prime},c}=1)$ is the same as $[v\in V^{\prime}]$ . (5) holds for the following reason: A voter who ranked $c$ was asked to rank some set of $\ell$ candidates including $c$ . Each possible set has the same probability of being selected, thus this probability is $\nicefrac{{1}}{{{m-1\choose\ell-1}}}$ . (6) is true as we will show that $\sum_{S\subseteq C}[c\in S][|S|=\ell]{{{\mathrm{ind}}}}_{v}(S,c,i)={{{{\mathrm{pos}}}}_{v}(c)-1\choose i-1}\cdot{m-{{{\mathrm{pos}}}}_{v}(c)\choose\ell-i}$ . Consider a fixed voter $v$ , a fixed candidate $c$ , and a set $S\subseteq C$ such that

(i) $c\in S$ ,

(ii) $|S|=\ell$ , and

(iii) $v$ considers $c$ to be her $i$ -th most preferred candidate from $S$ .

Each such set must consist of $i-1$ candidates that are ranked before $c$ by $v$ and $\ell-i$ candidates that are ranked after $c$. Thus, there are ${{{{\mathrm{pos}}}}_{v}(c)-1\choose i-1}\cdot{m-{{{\mathrm{pos}}}}_{v}(c)\choose\ell-i}$ such sets. We refer to Fig. 1 for an illustration.
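The counting argument can be checked by brute force for small parameters. The sketch below enumerates all $\ell$-element sets that contain the candidate at position `pos` and compares the count with the product of binomial coefficients; the function names are hypothetical and the check covers only $m\leq 7$.

```python
from itertools import combinations
from math import comb


def count_sets(m: int, ell: int, pos: int, i: int) -> int:
    """Count ell-subsets of positions 1..m that contain pos as their i-th smallest element."""
    others = [q for q in range(1, m + 1) if q != pos]
    count = 0
    for rest in combinations(others, ell - 1):
        ranked_before = sum(1 for q in rest if q < pos)
        if ranked_before == i - 1:
            count += 1
    return count


def closed_form(m: int, ell: int, pos: int, i: int) -> int:
    """The product C(pos - 1, i - 1) * C(m - pos, ell - i) from the proof."""
    return comb(pos - 1, i - 1) * comb(m - pos, ell - i)


all_match = all(
    count_sets(m, ell, pos, i) == closed_form(m, ell, pos, i)
    for m in range(1, 8)
    for ell in range(1, m + 1)
    for pos in range(1, m + 1)
    for i in range(1, ell + 1)
)
print(all_match)
```

Summing the closed form over $i$ gives ${m-1\choose\ell-1}$, the total number of $\ell$-element sets containing $c$, which is the normalization constant in $\lambda_{\alpha}$.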

Next, we will use Chernoff's inequality to assess the probability that the computed score of a candidate $c$ does not differ from its true score by more than a factor of $\epsilon$. We will first assess the conditional probability $\mathrm{P}\Big(\left|X_{c}-\mathrm{E}(X_{c})\right|\geq\epsilon\,\mathrm{E}(X_{c}) \mid N_{c}=x\Big)$. Observe that the conditional variables $\{X_{v,c,i}|N_{c}\}_{v\in V,i\in[\ell]}$ are not independent. For instance, if $X_{v,c,i}=1$, then $X_{v,c,j}=0$ for each $j\neq i$. However, they are all negatively correlated, intuitively meaning that if a variable becomes 1 (resp., 0), then the other variables are less likely to become 1 (resp., 0). Thus, we can still apply the Chernoff bound [AD11, Theorem 1.16, Corollary 1.10], which states that for any negatively correlated random variables $X_{1},\ldots,X_{n}\in[0,1]$ such that $X=\sum_{i=1}^{n}X_{i}$ and any $\delta\in[0,1]$ it holds that

$$\mathrm{P}(X \leq (1-\delta)\,\mathrm{E}(X)) \leq \exp\left(-\delta^{2}\,\mathrm{E}(X)/2\right) \quad\text{and}\quad \mathrm{P}(X \geq (1+\delta)\,\mathrm{E}(X)) \leq \exp\left(-\delta^{2}\,\mathrm{E}(X)/3\right).$$

It follows immediately that ${{{\mathrm{P}}}}(|X-{{{\mathrm{E}}}}(X)|\geq\delta{{{\mathrm{E}}}}(X))\leq 2\exp\left(-\delta^{2}{{{\mathrm{E}}}}(X)/3\right)$ .

Now, consider the variables $\frac{X_{v,c,i}\alpha_{i}}{\alpha_{1}}$ . These variables are from $[0,1]$ and from Eqs. 1, 2, 3, 4, 5, 6 and 7 we get that

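$$\mathrm{E}\left(\sum_{v \in V} \sum_{i=1}^{\ell} \frac{X_{v,c,i}\,\alpha_{i}}{\alpha_{1}} \;\Big|\; N_{c} = x\right) = \frac{x}{n\alpha_{1}}\,\mathrm{sc}_{\lambda_{\alpha}}(c).$$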

This yields

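$$\begin{aligned}
&\mathrm{P}\Big(\left|X_{c} - \mathrm{E}(X_{c})\right| \geq \epsilon\,\mathrm{E}(X_{c}) \;\Big|\; N_{c} = x\Big) \\
&\qquad= \mathrm{P}\Big(\left|\frac{x}{n\alpha_{1}}X_{c} - \mathrm{E}\left(\frac{x}{n\alpha_{1}}X_{c}\right)\right| \geq \epsilon\,\mathrm{E}\left(\frac{x}{n\alpha_{1}}X_{c}\right) \;\Big|\; N_{c} = x\Big) \leq 2\exp\left(-\frac{\epsilon^{2}x\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{3n\alpha_{1}}\right).
\end{aligned}$$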

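Finally, using the binomial identity $(x+y)^{n}=\sum_{k=0}^{n}{n\choose k}x^{k}y^{n-k}$ we get

$$\begin{aligned}
\mathrm{P}\Big(\left|X_{c} - \mathrm{E}(X_{c})\right| \geq \epsilon\,\mathrm{E}(X_{c})\Big)
&\leq \sum_{x=0}^{n} \binom{n}{x} \cdot \frac{\binom{m-1}{\ell-1}^{x} \left(\binom{m}{\ell} - \binom{m-1}{\ell-1}\right)^{n-x}}{\binom{m}{\ell}^{n}} \cdot 2\exp\left(-\frac{\epsilon^{2}x\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{3n\alpha_{1}}\right) \\
&= 2\exp\left(-\frac{\epsilon^{2}\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{3n\alpha_{1}}\right) \cdot \sum_{x=0}^{n} \binom{n}{x} \cdot \left(\frac{\ell}{m}\right)^{x} \left(1 - \frac{\ell}{m}\right)^{n-x} \cdot e^{-x} \\
&= 2\exp\left(-\frac{\epsilon^{2}\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{3n\alpha_{1}}\right) \cdot \left(1 - \frac{\ell}{m} + \frac{\ell}{em}\right)^{n} \\
&\leq 2\exp\left(-\frac{\epsilon^{2}\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{3n\alpha_{1}}\right) \cdot \left(1 - \frac{\ell}{2m}\right)^{n} \\
&\leq 2\exp\left(-\frac{\epsilon^{2}\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{3n\alpha_{1}}\right) \cdot e^{-\frac{\ell n}{2m}} = 2\exp\left(-\frac{\epsilon^{2}\ell\,\mathrm{sc}_{\lambda_{\alpha}}(c)}{6m\alpha_{1}}\right).
\end{aligned}$$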

This concludes the proof. ∎

Now, let us discuss the form of positional scoring functions $\lambda_{\alpha}(p)$ used in the statement of Theorem 1. First, observe that for $\ell=2$, if we set $\alpha_{2}=0$ and $\alpha_{1}=1$ we have that $(m-1)\cdot\lambda_{\alpha}(p)={p-1\choose 0}\cdot{m-p\choose 2-1}=m-p=\beta(p)$, where $m-1={m-1\choose\ell-1}$ is the normalization constant from Theorem 1. Multiplying all scores by the same positive constant changes neither the winners nor the approximation ratios, so by asking each voter to rank only two candidates, we can correctly (in expectation) assess the Borda scores of the candidates.

Corollary 2.

For a candidate $c$ the expected value of the score computed by Algorithm $(1,0)$-PSF-ALG for $c$ is the Borda score of $c$ (scaled by the constant factor $1/(m-1)$).

Unfortunately, not every positional scoring function can be efficiently assessed while asking each voter to rank only few candidates. For example, we can generalize Corollary 2 and show that for any vector of two elements $\alpha=(\alpha_{1},\alpha_{2})$, the algorithm $\alpha$-PSF-ALG can only compute scores that are affine transformations of the Borda scores (thus, for $\ell=2$ the algorithm can only be used to approximate the Borda rule).
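The affine relationship for $\ell=2$ can be verified directly from the formula for $\lambda_{\alpha}$ in Theorem 1, including its normalization by ${m-1\choose\ell-1}$. The sketch below uses exact rational arithmetic; the function name `lambda_alpha` and the sample vector $(3,1)$ are illustrative assumptions. Expanding the formula gives $\lambda_{\alpha}(p)=\frac{\alpha_{1}-\alpha_{2}}{m-1}\cdot\beta(p)+\alpha_{2}$, which is what the checks compare against.

```python
from fractions import Fraction
from math import comb
from typing import Sequence


def lambda_alpha(alpha: Sequence[int], m: int, p: int) -> Fraction:
    """Evaluate lambda_alpha(p) from Theorem 1 exactly, with ell = len(alpha)."""
    ell = len(alpha)
    total = sum(
        Fraction(a) * comb(p - 1, i - 1) * comb(m - p, ell - i)
        for i, a in enumerate(alpha, start=1)
    )
    return total / comb(m - 1, ell - 1)


m = 6
# alpha = (1, 0): the Borda function beta(p) = m - p scaled by 1 / (m - 1).
borda_like = [lambda_alpha((1, 0), m, p) for p in range(1, m + 1)]
# alpha = (3, 1): an affine transformation of the Borda function.
affine = [lambda_alpha((3, 1), m, p) for p in range(1, m + 1)]
print(borda_like)
print(affine)
```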

We will now describe the class of all positional scoring functions which can be computed correctly in expectation by our algorithm for any fixed $\ell$ . Since each positional scoring function is based on some $m$ -dimensional vector $\beta=(\beta_{1},\beta_{2},\ldots,\beta_{m})$ which can be expressed as $\sum_{i=1}^{m}\alpha_{i}\cdot\beta_{i}$ , where $\alpha_{1}=(1,0,\ldots)$ , $\alpha_{2}=(0,1,0,\ldots)$ and so on, these $\alpha$ -vectors form a basis of the linear space of positional scoring functions.

Let $\mathit{Sep}_{\ell}=\{\lambda_{\alpha}\colon\alpha\in{{\mathbb{R}}}^{\ell}\}$ be the set of all positional scoring functions that can be computed (correctly in expectation) by our algorithm for a fixed $\ell$ . Since it holds for each two $\ell$ -element vectors $\alpha,\alpha^{\prime}\in{{\mathbb{R}}}^{\ell}$ that $\lambda_{\alpha+\alpha^{\prime}}(p)=\sum_{i=1}^{\ell}(\alpha_{i}+\alpha^{\prime}_{i}){p-1\choose i-1}{m-p\choose\ell-i}=\sum_{i=1}^{\ell}\alpha_{i}{p-1\choose i-1}{m-p\choose\ell-i}+\sum_{i=1}^{\ell}\alpha^{\prime}_{i}{p-1\choose i-1}\cdot{m-p\choose\ell-i}=\lambda_{\alpha}(p)+\lambda_{\alpha^{\prime}}(p),$ we have that $\mathit{Sep}_{\ell}$ is a linear space too.

Thus, $\mathit{Sep}_{\ell}$ is an $\ell$ -dimensional linear subspace of the $m$ -dimensional space of all positional scoring functions, and so we can compactly describe it by providing $\ell$ scoring functions forming a basis of $\mathit{Sep}_{\ell}$ . Figure 2 visually illustrates the scoring functions forming a basis for $\ell\in\{2,4,8\}$ . In other words, for a given value of $\ell$ , we can use Theorem 1 to correctly compute (in expectation) all scoring functions which can be obtained as linear combinations of the scoring functions depicted in Figure 2.

Finally, let us give some intuition regarding the probabilities assessed in Theorem 1. For example, for $m=21$ candidates and $n$ voters the Borda score of a winning candidate is at least $10n$. Assume that we want to ask each voter to compare only two candidates, and set $\epsilon=0.01$. When assessing the score of a winning candidate, to get $p_{\epsilon}<0.001$ we need about 72 thousand voters. For one million voters, this probability drops below $\nicefrac{{4}}{{10^{42}}}$.

Note also that Theorem 1 applies to any candidate, not only to election winners. This makes the result slightly more general, since it also applies to, e.g., social welfare functions, where the goal is to output a ranking of the candidates instead of a single winner.

3.2 Minimax Rule

We will now investigate whether the Minimax rule can be well approximated when each voter is only asked to rank a few candidates. We will use an algorithm similar to Algorithm 1: each voter $v$ ranks a subset of candidates $S_{v}$ and whenever two candidates $c,c^{\prime}\in S_{v}$ are ranked by a voter $v$ , we use her preference list to estimate ${{{\mathrm{sc}}}}_{{{{\mathrm{MM}}}}}(c,c^{\prime})$ . Notably, we scale the values ${{{\mathrm{sc}}}}_{{{{\mathrm{MM}}}}}(c,c^{\prime})$ for each two candidates $c,c^{\prime}\in C$ by the number of times they were compared and use these normalized values to compute the Minimax winner. This algorithm is formalized in Algorithm 2.
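The procedure described in this paragraph can be sketched as follows. This is an illustrative reading of the description, not a transcription of Algorithm 2: the function name, the data layout (full rankings as lists) and the tie value used for pairs that no voter compared are all assumptions.

```python
import random
from itertools import combinations

def randomized_minimax(profile, ell, rng=random):
    """profile: list of full rankings (most preferred first).
    Each voter reveals her ranking of a uniformly random ell-subset S_v."""
    candidates = sorted(profile[0])
    n = len(profile)
    S = {(c, d): 0 for c in candidates for d in candidates if c != d}
    for ranking in profile:
        asked = set(rng.sample(candidates, ell))
        revealed = [c for c in ranking if c in asked]   # ranking restricted to S_v
        for c, d in combinations(revealed, 2):          # c is ranked above d
            S[(c, d)] += 1

    def normalized(c, d):
        total = S[(c, d)] + S[(d, c)]
        # Assumption: a pair that was never compared is treated as a tie.
        return n * S[(c, d)] / total if total else n / 2

    scores = {c: min(normalized(c, d) for d in candidates if d != c)
              for c in candidates}
    return max(candidates, key=scores.get), scores
```

With $\ell=m$ every pair is compared by every voter, so the normalized values coincide with the true pairwise scores.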

Theorem 3.

For each candidate $c\in C$ the probability that the total normalized score computed by Algorithm 2 for $c$ differs from the true Minimax score of $c$ by a multiplicative factor of at least $1\pm\epsilon$ is upper-bounded by:

[TABLE]

Proof.

First, let us fix a pair of candidates $c,c^{\prime}\in C$ and let $X_{c,c^{\prime}}$ be the random variable describing the value $n\cdot\frac{\mathrm{S}[c,c^{\prime}]}{\mathrm{S}[c,c^{\prime}]+\mathrm{S}[c^{\prime},c]}$ as computed by Algorithm 2. As in the proof of Theorem 1, we can express $X_{c,c^{\prime}}$ as a sum of negatively correlated random variables.

Specifically, computing $\mathrm{S}[c,c^{\prime}]$ according to Algorithm 2 can be equivalently described as follows: First, we decide how many voters will be asked to compare $c$ and $c^{\prime}$. Let $N_{c,c^{\prime}}$ be the random variable describing this number of voters. Second, assuming $N_{c,c^{\prime}}=x$, we pick uniformly at random a set $V^{\prime}$ of $x$ voters, and we ask them to compare $c$ and $c^{\prime}$. For each voter $v$, let $X_{v,c,c^{\prime}}$ denote the random variable equal to 1 if voter $v$ said that she prefers $c$ to $c^{\prime}$, and 0 otherwise. In particular, $X_{v,c,c^{\prime}}$ is zero when $v$ is not asked to compare $c$ and $c^{\prime}$. Observe that if $N_{c,c^{\prime}}=x$, then $\mathrm{S}[c,c^{\prime}]+\mathrm{S}[c^{\prime},c]=x$, and so the value of $X_{c,c^{\prime}}$ can be expressed as

[TABLE]

We next compute the conditional expected value ${{{\mathrm{E}}}}(X_{c,c^{\prime}}|N_{c,c^{\prime}}=x)$ :

[TABLE]

Next, we will use Chernoff’s inequality to upper-bound the probability that the value of the random variable $X_{c,c^{\prime}}$ differs from its expected value by a factor of at least $\epsilon$. We first look at the conditional probability ${{{\mathrm{P}}}}\Big{(}\left|X_{c,c^{\prime}}-{{{\mathrm{E}}}}(X_{c,c^{\prime}})\right|\geq\epsilon{{{\mathrm{E}}}}(X_{c,c^{\prime}})|N_{c,c^{\prime}}=x\Big{)}$. As in the proof of Theorem 1, we note that the conditional variables $\{X_{v,c,c^{\prime}}|N_{c,c^{\prime}}=x\}_{v\in V}$ are not independent, yet they are all negatively correlated: the fact that one variable becomes 1 (resp., 0) can only decrease the probabilities that some other becomes 1 (resp., 0). Thus, we can still apply Chernoff’s bound [AD11, Theorem 1.16, Corollary 1.10], which states that for any negatively correlated random variables $X_{1},\ldots,X_{n}\in[0,1]$ such that $X=\sum_{i=1}^{n}X_{i}$ and any $\delta\in[0,1]$ it holds that

[TABLE]

Since, for each $v\in V$ , we have $X_{v,c,c^{\prime}}\in\{0,1\}$ , and ${{{\mathrm{E}}}}\left(\sum_{v\in V}X_{v,c,c^{\prime}}|N_{c,c^{\prime}}=x\right)=\frac{x}{n}{{{\mathrm{sc}}}}_{{{{\mathrm{MM}}}}}(c,c^{\prime})$ , we get that:

[TABLE]

Next, we get that:

[TABLE]

Notice that ${{{\mathrm{P}}}}(N_{c,c^{\prime}}=x)$ can be computed as follows. First we choose $x$ out of the $n$ voters who will rank both $c$ and $c^{\prime}$. We then ask each of these $x$ voters to rank $\ell-2$ of the $m-2$ remaining candidates, and we ask each of the other $n-x$ voters to rank a subset that does not contain both $c$ and $c^{\prime}$. The number of such subsets is ${m\choose\ell}-{m-2\choose\ell-2}$, since $m\choose\ell$ is the total number of possible sets to ask a voter to rank and, as discussed before, there are $m-2\choose\ell-2$ sets that contain both $c$ and $c^{\prime}$. Hence

[TABLE]
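The counting argument above amounts to a binomial distribution. The following sketch computes it numerically, assuming (as in the randomized procedure) that each voter's subset is drawn independently and uniformly at random; the function names are hypothetical.

```python
from math import comb

def pair_probability(m, ell):
    """Probability that a uniformly random ell-subset of m candidates
    contains two fixed candidates c and c'."""
    return comb(m - 2, ell - 2) / comb(m, ell)

def prob_num_comparisons(n, m, ell, x):
    """P(N_{c,c'} = x): exactly x of the n voters are asked to compare c and c'."""
    q = pair_probability(m, ell)
    return comb(n, x) * q**x * (1 - q)**(n - x)
```

For instance, `pair_probability(50, 8)` equals $\frac{8\cdot 7}{50\cdot 49}$, so with $m=50$ and $\ell=8$ only a small fraction of the voters contributes to any fixed pairwise comparison.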

Again using the binomial identity $(x+y)^{n}=\sum_{k=0}^{n}{n\choose k}x^{k}y^{n-k}$ , we get

[TABLE]

Finally, let $c_{\min}=\operatorname*{arg\,min}_{c^{\prime}\neq c}{{{\mathrm{sc}}}}_{{{{\mathrm{MM}}}}}(c,c^{\prime})$ . The probability that for candidate $c$ the score computed by Algorithm 2 differs from its true Minimax score by a multiplicative factor of at least $1\pm\epsilon$ is upper-bounded by

[TABLE]

Clearly, we have:

[TABLE]

Further, since by definition for each $c^{\prime}$ we have ${{{\mathrm{sc}}}}_{{{{\mathrm{MM}}}}}(c,c^{\prime})\geq{{{\mathrm{sc}}}}_{{{{\mathrm{MM}}}}}(c,c_{\min})$ , it holds that:

[TABLE]

Thus we get that

[TABLE]

4 Deterministic Approach ( $\ell$ -Truncated Elections)

When not asking each voter about each candidate, one always has to decide whether each voter is asked about random candidates or about specific ones. On the one hand, asking about specific positions in preference rankings allows one to focus on the top ones, which seem to contain more relevant information, especially when the goal is to select the winner, who is intuitively more likely to appear in top positions. On the other hand, asking voters about random candidates might be more advantageous as the input may contain dependencies between candidates that are not known a priori.

In this section we investigate the case when each voter is asked about her $\ell$ most preferred candidates. We first describe an algorithm that is guaranteed to approximate the true score at least as well as any other algorithm and analyze its performance for Borda and for Minimax. We then show a general lower bound on the approximation ratio, that is, we show that no algorithm can approximate the true score arbitrarily well in the worst case, and we see that this bound matches the bound for Borda and almost matches the bound we computed for Minimax.

4.1 The Best Approximation Algorithm for $\ell$ -Truncated Elections

Let us start by describing the algorithm that for each $\ell$ -truncated instance gives the best possible approximation guarantee, that is, the best approximation of the true winner in the worst-case full preference profile that induces the given $\ell$ -truncated instance. We mention that the idea of this algorithm is very similar to the one behind the algorithms for minimizing the maximal regret [LB11], yet the analysis of the approximation ratio of the algorithm is new to this paper.

Consider an election $E$ and let $E_{\ell}$ be the $\ell$ -truncated instance obtained from $E$ . Observe that when given $E_{\ell}$ and choosing a winner, the worst case occurs if the picked winner is ranked at the very last position by all voters that did not rank this candidate in $E$ among the first $\ell$ positions, and the true winner (that our algorithm did not pick) is ranked at position $\ell+1$ by each voter who did not rank this candidate.

For each candidate $c$ we compute two scores: The worst possible score that $c$ is guaranteed to get (denoted by $\operatorname*{worst}(c)$ )—this score is obtained when $c$ is ranked last whenever it is not among the top- $\ell$ positions—and the best possible score that $c$ can get (denoted by $\operatorname*{best}(c)$ )—the score obtained by ranking $c$ at position $\ell+1$ whenever it is not ranked among the first $\ell$ positions. Let $a,b_{1},$ and $b_{2}$ be the candidates with the highest $\operatorname*{worst}$ , the highest $\operatorname*{best}$ , and the second highest $\operatorname*{best}$ score, respectively. If any candidate $c\neq b_{1}$ is declared winner, then we can guarantee an approximation ratio of $\nicefrac{{\operatorname*{worst}(c)}}{{\operatorname*{best}(b_{1})}}$ , which is clearly maximized by $c=a$ .

If candidate $b_{1}$ is declared winner, then we can guarantee an approximation ratio of $\nicefrac{{\operatorname*{worst}(b_{1})}}{{\operatorname*{best}(b_{2})}}$ . Thus, an optimal approximation ratio is achieved by an algorithm that computes all the possible scores and then checks whether $a$ or $b_{1}$ guarantees a better result. If $\nicefrac{{\operatorname*{worst}(a)}}{{\operatorname*{best}(b_{1})}}\geq\nicefrac{{\operatorname*{worst}(b_{1})}}{{\operatorname*{best}(b_{2})}}$ it declares $a$ the winner and otherwise it picks $b_{1}$ .
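The selection rule described above can be sketched for positional scoring rules as follows. The sketch evaluates the guarantee $\operatorname*{worst}(c)/\max_{c^{\prime}\neq c}\operatorname*{best}(c^{\prime})$ for every candidate and returns the maximizer, which by the argument above is always $a$ or $b_{1}$. The function names and the ballot encoding (lists of the top $\ell$ candidates, with $\ell<m$) are assumptions.

```python
def worst_best_scores(truncated, candidates, alpha):
    """truncated: list of top-ell ballots; alpha[p-1] is the score of position p."""
    ell = len(truncated[0])
    worst = {c: 0 for c in candidates}
    best = {c: 0 for c in candidates}
    for ballot in truncated:
        for c in candidates:
            if c in ballot:
                s = alpha[ballot.index(c)]
                worst[c] += s
                best[c] += s
            else:
                worst[c] += alpha[-1]   # unranked candidate placed last
                best[c] += alpha[ell]   # unranked candidate placed at position ell + 1
    return worst, best

def best_guarantee_winner(truncated, candidates, alpha):
    worst, best = worst_best_scores(truncated, candidates, alpha)

    def guarantee(c):
        strongest_rival = max(best[d] for d in candidates if d != c)
        # If no rival can score anything, c is trivially optimal.
        return worst[c] / strongest_rival if strongest_rival > 0 else 1.0

    return max(candidates, key=guarantee)
```

Scanning all candidates costs little more than comparing only $a$ and $b_{1}$ and avoids a separate case for $a=b_{1}$.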

We first show that these two different cases (sometimes choosing $a$ and other times choosing $b_{1}$ guaranteeing a better approximation ratio) can both occur. Afterwards we analyze the guarantees given by each of the two cases and conclude with a family of instances that prove that the obtained results are tight.

Example 1.

Consider the following two instances, each with $5$ voters $\{v_{1},v_{2},v_{3},v_{4},v_{5}\}$ , $6$ candidates $\{a,b,c,d,e,f\}$ , and $\ell=3$ :

[TABLE]

Assume our goal is to find the candidate that would best approximate the Borda winner. The scores are presented in the two tables below:

[TABLE]

In both instances $a$ is the candidate with the highest $\operatorname*{worst}$ score and $b$ is (among) the candidates with highest $\operatorname*{best}$ score. In instance $I_{1}$ , it is best to declare $b$ winner as it guarantees an approximation ratio of $\nicefrac{{15}}{{16}}>\nicefrac{{16}}{{19}}$ . In instance $I_{2}$ on the other hand, it is best to declare $a$ winner since it guarantees an approximation ratio of $\nicefrac{{16}}{{17}}>\nicefrac{{13}}{{16}}$ . Notice that the second term in each inequality is the approximation ratio guaranteed by choosing the respective other candidate.

Interestingly, while our algorithm provides the best possible approximation, it can select a candidate that is not a possible winner, i.e., that is not a winner in any profile consistent with the truncated ballot at hand.

Example 2.

Consider the following instance with $5$ candidates $a,b,c,d,e$ , four voters $v_{1},v_{2},\ldots,v_{4}$ , and $\lambda=(3,1,1,1,0)$ .

[TABLE]

It holds that $\operatorname*{worst}(a)=4$ , $\operatorname*{worst}(b)=\operatorname*{worst}(c)=\operatorname*{worst}(d)=\operatorname*{worst}(e)=3$ , $\operatorname*{best}(a)=4$ and $\operatorname*{best}(b)=\operatorname*{best}(c)=\operatorname*{best}(d)=\operatorname*{best}(e)=6$ and therefore declaring $a$ winning achieves the best approximation ratio. However, in any election that is consistent with the given truncated election, at least two candidates in $\{b,c,d,e\}$ get at least 5 points while $a$ always gets 4 points. Thus, $a$ is not a possible winner.
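The claim that $a$ is not a possible winner can be checked by brute force. Since the ballot table is not reproduced here, the instance below is a hypothetical reconstruction with $\ell=2$ that is consistent with the stated $\operatorname*{worst}$ and $\operatorname*{best}$ scores (every voter ranks $a$ second, and each of $b,c,d,e$ is ranked first once); it should not be read as the original table.

```python
from itertools import permutations, product

candidates = "abcde"
alpha = (3, 1, 1, 1, 0)
# Hypothetical truncated ballots (ell = 2), chosen to match the stated scores.
truncated = [("b", "a"), ("c", "a"), ("d", "a"), ("e", "a")]

def completions(ballot):
    """All full rankings that extend a truncated ballot."""
    rest = [c for c in candidates if c not in ballot]
    return [ballot + tail for tail in permutations(rest)]

def scores(profile):
    total = dict.fromkeys(candidates, 0)
    for ranking in profile:
        for position, c in enumerate(ranking):
            total[c] += alpha[position]
    return total

def possible_winner(target):
    """True if target has a maximal score in some consistent full profile."""
    for profile in product(*(completions(b) for b in truncated)):
        s = scores(profile)
        if s[target] == max(s.values()):
            return True
    return False
```

The search covers $6^{4}=1296$ consistent profiles, so it runs instantly for this instance.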

4.2 Positional Scoring Rules: Approximation Guarantees for $\ell$ -Truncated Elections

In this section we continue our analysis of the algorithm from Section 4.1, focusing on how well it approximates positional scoring rules when it only has access to $\ell$-truncated elections. We will now prove the guarantees provided by each of the two rules (choosing the candidate with the highest $\operatorname*{worst}$ score and choosing the candidate with the highest $\operatorname*{best}$ score).

Theorem 4.

Let $\mathcal{R}$ be a positional scoring rule defined by the scoring function $\lambda(i)=\alpha_{i}$ . The algorithm from Section 4.1 for $\ell$ -truncated elections gives an approximation guarantee of

[TABLE]

Proof.

Let us first assume that the algorithm picks the candidate $a$ with the highest $\operatorname*{worst}$ score as a winner. The average $\operatorname*{worst}$ score each candidate gets is $\operatorname*{avg}_{w}=\nicefrac{{n}}{{m}}\cdot\sum_{i=1}^{\ell}\alpha_{i}$. Thus, $\operatorname*{worst}(a)\geq\nicefrac{{n}}{{m}}\cdot\sum_{i=1}^{\ell}\alpha_{i}$. Let $b$ be the candidate, different from $a$, that has maximal $\operatorname*{best}$ score. Clearly, $\operatorname*{worst}(b)\leq\operatorname*{worst}(a)$. Further, if we fix $\operatorname*{worst}(b)$, then $\operatorname*{best}(b)$ is maximized when candidate $b$ is ranked among the top $\ell$ positions (the positions that are counted for $\operatorname*{worst}(b)$) as few times as possible. This way the number of voters who do not rank $b$ is maximized, and these are the voters who can contribute additional score (apart from $\operatorname*{worst}(b)$) to $\operatorname*{best}(b)$. That is, if $\operatorname*{worst}(b)$ is fixed, then $\operatorname*{best}(b)$ is higher when $b$ is ranked high by fewer voters than when it is ranked lower but by more voters. Consequently, we can upper-bound $\operatorname*{best}(b)$ by:

[TABLE]

Thus, the approximation ratio in this case is at least

[TABLE]

We next turn to the case when the algorithm picks the candidate $b$ with maximum $\operatorname*{best}$ score. The average best score of a candidate is given by

[TABLE]

Clearly, $\operatorname*{best}(b)\geq\operatorname*{avg}_{b}$. With a fixed $\operatorname*{best}(b)$, the $\operatorname*{worst}$ score of $b$ is minimized when $b$ is ranked by as few voters as possible (the reasoning is similar to that in the previous case). If $b$ is ranked first by $x$ voters, then its $\operatorname*{best}$ score would be $\alpha_{1}x+(n-x)\alpha_{\ell+1}$. By solving:

[TABLE]

we get that $x=\frac{\operatorname*{best}(b)-n\alpha_{\ell+1}}{\alpha_{1}-\alpha_{\ell+1}}$ . Thus, $b$ gets the $\operatorname*{worst}$ score of at least $\alpha_{1}\frac{\operatorname*{best}(b)-n\alpha_{\ell+1}}{\alpha_{1}-\alpha_{\ell+1}}$ . Consequently, we can lower-bound the approximation ratio by:

[TABLE]

It is easy to verify that for all $a,b,c,d$ with $0\leq a\leq c\leq d$ , $b\geq 0$ , and $\frac{(d-c)a}{c}\geq b$ it holds that $\frac{c}{d}\geq\frac{c-a}{d-a-b}$ . Substituting into this $a=\ell\alpha_{\ell+1}$ , $b=(m-\ell)\frac{\alpha_{\ell+1}^{2}}{\alpha_{1}}$ , $c=\sum_{i=1}^{\ell}\alpha_{i}$ and $d=m\alpha_{\ell+1}+\frac{\alpha_{1}-\alpha_{\ell+1}}{\alpha_{1}}\sum_{i=1}^{\ell}\alpha_{i}$ , we get that

[TABLE]

We will only show that $c\leq d$ and $\frac{(d-c)a}{c}\geq b$ as everything else is trivial. Observe that

[TABLE]

Since the algorithm always picks the value that results in a higher ratio, we get the thesis. ∎

Theorem 4 gives a very general result that applies to any positional scoring rule. For instance, for $k$ -approval we get the approximation ratio of $\nicefrac{{\ell}}{{m}}$ .

Corollary 5.

The algorithm from Section 4.1 for $k$ -approval with $\ell$ -truncated elections, $k>\ell$ , gives the approximation guarantee of $\nicefrac{{\ell}}{{m}}$ .

Proof.

We instantiate the expression from Theorem 4 for $k$ -approval:

[TABLE]

∎

For the Borda rule, we get the approximation ratio of $\frac{\ell}{m+\frac{\ell}{m-1}\cdot\ell}$, which on the plot looks similar to $\frac{\ell}{m}$ (see the left-hand side plot in Figure 3).

Corollary 6.

The algorithm from Section 4.1 for Borda with $\ell$ -truncated elections, gives the approximation guarantee of $\frac{\ell}{m+\frac{\ell}{m-1}\cdot\ell}$ .

Proof.

The approximation ratio follows from Theorem 4:

[TABLE]

∎
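To see how close the Borda guarantee of Corollary 6 is to $\nicefrac{{\ell}}{{m}}$, one can simply evaluate both expressions. The sketch below does so for $m=50$ and $\ell\in\{2,5,8\}$, the parameters used in the experiments of Section 5; the function name is hypothetical.

```python
def borda_guarantee(m, ell):
    """Approximation guarantee ell / (m + ell^2 / (m - 1)) from Corollary 6."""
    return ell / (m + ell * ell / (m - 1))

m = 50
for ell in (2, 5, 8):
    # Print the Corollary 6 guarantee next to the simpler ratio ell / m.
    print(ell, round(borda_guarantee(m, ell), 4), round(ell / m, 4))
```

The gap between the two values is governed by the term $\nicefrac{{\ell^{2}}}{{(m-1)}}$ in the denominator, which is small whenever $\ell$ is small compared to $\sqrt{m}$.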

We conclude by providing an intuitive explanation of instances that match the bound from Theorem 4. In these instances all candidates get roughly the same $\operatorname*{worst}$ score and there are two candidates $a,b$ that also get an average $\operatorname*{worst}$ score but appear as few times as possible in the first $\ell$ positions (in the first or second positions only). If candidate $a$ is declared the winner by some rule, then she gets [math] points from all voters that did not rank her in the first two positions and $b$ gets $m-\ell-1$ points from these voters. Otherwise the winning candidate gets [math] points from all voters that did not rank her in the first $\ell$ positions and $a$ gets $m-\ell-1$ points whenever she is not ranked first or second. Notice that no rule can distinguish between the different instances we just constructed, and therefore building the instance after the rule has picked a winner is permitted. In either case, the candidate that is declared winner by the rule gets points equal to the average $\operatorname*{worst}$ score and the “true winner” $a$ or $b$ gets $\operatorname*{avg}_{\operatorname*{worst}}+(n-\frac{\operatorname*{avg}_{\operatorname*{worst}}}{m-1})(m-\ell-1)$ points. Observe that this gives exactly the same conditions as for the computation of $\mathrm{approx}_{1}$ in the proof of Theorem 4, and hence we have a matching upper bound.

4.3 Minimax Rule

Let us now move to the analysis of the Minimax rule. We start by showing that no deterministic algorithm for Minimax can guarantee a better approximation ratio than $\frac{1}{m-\ell}$ .

Theorem 7.

There exists no rule for $\ell$ -truncated elections $\mathcal{F}$ that is a $\frac{1}{m-\ell-\varepsilon}$ -approximation of the Minimax rule for any $\varepsilon>0$ .

Proof.

Consider the following $\ell$-truncated election instance: Let $C=\{c_{0},c_{1},\ldots,c_{m-1}\}$ be the set of candidates and let $V=\{v_{0},v_{1},\ldots,v_{m-1}\}$ be the set of voters. Let the truncated preference list of voter $v_{i}$ be

[TABLE]

Due to symmetry, any candidate can be declared winner by each algorithm. For the sake of simplicity, let us assume that candidate $c_{2}$ is declared winner. We can then complement this instance by inserting all candidates that were not ranked by some voter $v_{i}$ in the order suggested by the subscript, that is,

[TABLE]

On the one hand, only voter $v_{2}$ prefers $c_{2}$ over $c_{1}$ and all other $m-1$ voters prefer $c_{1}$ over $c_{2}$. Thus, the Minimax score of $c_{2}$ is 1. On the other hand, the Minimax score of $c_{1}$ is $m-\ell$. Indeed, the strongest contender to $c_{1}$ is $c_{m}$ and $\ell$ voters prefer $c_{m}$ over $c_{1}$ (for $\ell<m$) while $m-\ell$ voters prefer $c_{1}$ over $c_{m}$. Thus, no (deterministic) algorithm can achieve a better approximation ratio than $\nicefrac{{{{{\mathrm{sc}}}}_{{{{\mathrm{MM}}}}}(c_{2})}}{{{{{\mathrm{sc}}}}_{{{{\mathrm{MM}}}}}(c_{1})}}=\nicefrac{{1}}{{m-\ell}}$. ∎
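The counts in this proof can be checked numerically on a small instance. Because the two preference tables are not reproduced here, the sketch below rests on two assumptions that are consistent with the counts stated in the proof: voter $v_{i}$'s truncated list is $c_{i},c_{i+1},\ldots,c_{i+\ell-1}$ with indices taken modulo $m$, and the unranked candidates are appended in the order $c_{1},c_{2},\ldots,c_{m-1},c_{0}$, so that $c_{0}$ plays the role of the last candidate. Candidates are encoded by their indices.

```python
def minimax_scores(profile, candidates):
    """Minimax score of c: the minimum over c' of the number of voters preferring c to c'."""
    def sc(c, d):
        return sum(r.index(c) < r.index(d) for r in profile)
    return {c: min(sc(c, d) for d in candidates if d != c) for c in candidates}

def cyclic_instance(m, ell):
    """Assumed truncated lists: voter i ranks i, i+1, ..., i+ell-1 (mod m)."""
    return [[(i + j) % m for j in range(ell)] for i in range(m)]

def adversarial_completion(truncated, m):
    """Assumed completion: append unranked candidates in the order 1, 2, ..., m-1, 0."""
    order = list(range(1, m)) + [0]
    return [top + [c for c in order if c not in top] for top in truncated]

m, ell = 7, 3
profile = adversarial_completion(cyclic_instance(m, ell), m)
scores = minimax_scores(profile, range(m))
ratio = scores[2] / scores[1]   # score of the declared winner c_2 over that of c_1
```

Under these assumptions `ratio` equals $\nicefrac{{1}}{{(m-\ell)}}$ for the chosen parameters.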

Note that in the construction in Theorem 7 one can increase the number of voters to be much larger than the number of candidates by simply copying all voters a sufficient number of times.

Theorem 7 already shows that with $\ell$-truncated ballots Minimax cannot be well approximated. In particular, the bound for the Minimax rule is much worse than for scoring-based rules. We do not know whether the bound from Theorem 7 is tight. Yet, we can show that a simplified variant of the algorithm from Section 4.1 that computes the maximum $\operatorname*{worst}$ score of each candidate and declares the one with the highest score winner achieves an approximation ratio of $\frac{1}{(m-\ell)\cdot\left(1+\frac{\ell^{2}}{m^{2}-\ell^{2}-m+\ell}\right)}$. This approximation ratio is lower-bounded by $\frac{1}{m-\nicefrac{{\ell}}{{2}}}$, which means that for reasonably small $\ell$ it (almost) matches the upper bound from Theorem 7 (see the right-hand side plot in Figure 3 for the comparison of these two bounds).
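The lower bound $\frac{1}{m-\nicefrac{{\ell}}{{2}}}$ can be verified with a short calculation, sketched here for $1\leq\ell\leq m-1$. Using the factorization $m^{2}-\ell^{2}-m+\ell=(m-\ell)(m+\ell-1)$, we obtain

$$(m-\ell)\left(1+\frac{\ell^{2}}{m^{2}-\ell^{2}-m+\ell}\right)=(m-\ell)+\frac{\ell^{2}}{m+\ell-1}\leq(m-\ell)+\frac{\ell}{2}=m-\frac{\ell}{2},$$

where the inequality holds because $\ell\leq m-1$ implies $2\ell\leq m+\ell-1$. Taking reciprocals gives the stated bound.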

Theorem 8.

The algorithm from Section 4.1 approximates the Minimax rule in $\ell$ -truncated elections within a factor of

[TABLE]

Proof.

We will prove our claim by showing that if all candidates have a $\operatorname*{worst}$ score of at most $x$ , then all candidates have a maximum $\operatorname*{best}$ score of at most $x\cdot(m-\ell)\cdot\left(1+\frac{\ell^{2}}{m^{2}-\ell^{2}-m+\ell}\right)$ .

Assume towards a contradiction that all candidates have a $\operatorname*{worst}$ score of at most $x$ and there exists a candidate $f$ with $\operatorname*{best}(f)>x\cdot(m-\ell)\cdot\left(1+\frac{\ell^{2}}{m^{2}-\ell^{2}-m+\ell}\right)$ . We say that $c\succ_{v}^{t}d$ for voter $v$ and candidates $c$ and $d$ if and only if $c$ is preferred over $d$ by voter $v$ in the $\ell$ -truncated instance, that is, either $c$ and $d$ are both among the $\ell$ most preferred candidates of $v$ and $c$ is preferred over $d$ or only $c$ is among the first $\ell$ candidates. Since

[TABLE]

and $\operatorname*{best}(f)>x\cdot(m-\ell)\cdot\left(1+\frac{\ell^{2}}{m^{2}-\ell^{2}-m+\ell}\right)$ , it follows that for all $c\in C\setminus\{f\}$ we have

[TABLE]

Hence for all $c\neq f$ it holds that

[TABLE]

We now analyze $\operatorname*{worst}(f)$ . Let $\operatorname{occ}(d)$ be the number of times a candidate $d$ occurs in the truncated instance, that is, the number of voters that rank candidate $d$ among the first $\ell$ positions. First, observe that for each candidate $c$ it holds that

[TABLE]

as $c$ is preferred over any $c^{\prime}$ at least that often. Hence,

[TABLE]

Since, by assumption, $\operatorname*{worst}(f)\leq x$ , it holds that

[TABLE]

or, equivalently,

[TABLE]

Second, notice that if $f$ is ranked among the top $\ell$ candidates by at most $\operatorname{occ}(f)$ voters, then there are $n-\operatorname{occ}(f)$ voters that do not rank $f$ among the first $\ell$ positions and, by the pigeonhole principle, there is a candidate $c\in C\setminus\{f\}$ with $|\{v\mid c\succ_{v}^{t}f\}|\geq(n-\operatorname{occ}(f))\nicefrac{{\ell}}{{m-1}}$. As discussed above, from Equation 9 it follows that

[TABLE]

which is equivalent to

[TABLE]

Plugging in Equation 10 into this inequality, we get that

[TABLE]

Notice that, on the other hand, $x\geq\nicefrac{{n}}{{m}}$, as by the pigeonhole principle there is a candidate that is ranked first at least $\nicefrac{{n}}{{m}}$ times and hence has a $\operatorname*{worst}$ score of at least $\nicefrac{{n}}{{m}}$. Thus, we have reached a contradiction, completing the first part of the proof. We finish the proof by proving

[TABLE]

Observe that

[TABLE]

5 Experimental Evaluation

In Sections 3 and 4 we have assessed the worst-case guarantees of our approximation algorithms. In this section we investigate how these guarantees depend on particular distributions of the voters’ preferences. We tested the following distributions over preference rankings:

Impartial Culture (IC).

Under the Impartial Culture model each ranking over the candidates is equally probable.

One-dimensional Euclidean Model (1D).

First, we associate each voter and each candidate with a point from the interval $[0,1]$ —these points are sampled independently and uniformly at random. Then, each voter ranks the candidates according to her distance, preferring the ones which are closer to those which are farther.
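The first two distributions are straightforward to sample. The following minimal sketch shows one way to do it with the standard library; the function names and the list-based profile encoding are assumptions, not the authors' implementation.

```python
import random

def sample_ic(candidates, n, rng=random):
    """Impartial Culture: every ranking of the candidates is equally probable."""
    return [rng.sample(candidates, len(candidates)) for _ in range(n)]

def sample_1d(candidates, n, rng=random):
    """One-dimensional Euclidean model on the interval [0, 1]."""
    position = {c: rng.random() for c in candidates}
    profile = []
    for _ in range(n):
        voter = rng.random()
        # Closer candidates are preferred to those farther away.
        profile.append(sorted(candidates, key=lambda c: abs(position[c] - voter)))
    return profile
```

Both functions return a list of $n$ full rankings, most preferred candidate first.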

Mixture of Mallows’ Models (MMM).

In the Mallows’ model [Mal57] we are given a reference ranking $\pi$ and a real value $\phi\in[0,1]$; the probability of sampling a ranking $\tau$ is proportional to $\phi^{d_{K}(\pi,\tau)}$, where $d_{K}(\pi,\tau)$ is the number of swaps of adjacent candidates that are required to turn $\pi$ into $\tau$. We used a mixture of three Mallows’ models: for each of the three models we drew the reference ranking $\pi$ and the real value $\phi$ uniformly at random. Next, we sampled the parameters $\lambda_{1},\lambda_{2},\lambda_{3}$ that sum up to one; to generate a ranking we first pick one of the three models (the $i$-th model with probability $\lambda_{i}$) and then generate the ranking according to the Mallows’ model we picked.
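One standard way to sample from a Mallows’ model is the repeated insertion method: the $i$-th candidate of the reference ranking is inserted at position $j\leq i$ with probability proportional to $\phi^{i-j}$, which creates exactly $i-j$ new inversions. The sketch below uses this technique for the mixture; it is an assumption that this matches how the rankings were generated in the experiments, and the function names are hypothetical.

```python
import random

def sample_mallows(reference, phi, rng=random):
    """Draw one ranking with probability proportional to phi ** d_K(reference, ranking)."""
    ranking = []
    for i, c in enumerate(reference, start=1):
        # Inserting at position j puts c ahead of i - j earlier candidates.
        weights = [phi ** (i - j) for j in range(1, i + 1)]
        j = rng.choices(range(1, i + 1), weights=weights)[0]
        ranking.insert(j - 1, c)
    return ranking

def sample_mixture(references, phis, lambdas, rng=random):
    """Pick the k-th Mallows' model with probability lambdas[k], then sample from it."""
    k = rng.choices(range(len(references)), weights=lambdas)[0]
    return sample_mallows(references[k], phis[k], rng)
```

With $\phi=0$ the sampler always returns the reference ranking, and with $\phi=1$ it is uniform over all rankings.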

Single Peaked Impartial Culture (SPIC).

In order to generate a profile we first randomly select a reference ranking. Then, we generate rankings that are single-peaked with respect to the reference ranking. Each such single peaked ranking is equally probable. For a definition and discussion on single-peaked preferences we refer the reader to the book chapter by Elkind et al. [ELP17].

For each distribution $\mathcal{D}$ over preferences and for each approximation algorithm $\mathcal{A}$ we ran computer simulations as follows: We set the number of candidates to $m=50$ and tested for $\ell\in\{2,5,8\}$. We ran simulations for the number of voters $n$ ranging from $10$ to $1000$ in steps of $25$. For each combination of values of $(\ell,n)$ we ran 500 independent experiments, each time computing the ratio $r(\mathcal{A},\mathcal{D})$ between the score of the candidate returned by algorithm $\mathcal{A}$ and the score of the optimal candidate. The averages of these ratios (averaged over the aforementioned 500 simulations) and the corresponding standard deviations for the Borda and the Minimax rules are depicted in Figure 4 and Figure 5, respectively.

5.1 Approximation Algorithms for the Borda Rule

We empirically tested how well the two algorithms that we analyzed theoretically in the previous sections approximate the Borda rule. Specifically, we implemented Algorithm 1 (which we will refer to as Randomized) and the algorithm described in Section 4.1. We also checked two other deterministic heuristics that appear simple and intuitive:

1. The variant of the deterministic algorithm from Section 4.1 that always picks the candidate with the highest $\operatorname*{worst}$ score.

2. An algorithm we call Deter-avg that, for each voter $v_{i}$ and candidate $c_{j}$, assigns to $c_{j}$ the score

   (a) $\beta({{{\mathrm{pos}}}}_{i}(c_{j}))$ if ${{{\mathrm{pos}}}}_{i}(c_{j})\leq\ell$,

   (b) the average score of the unranked positions, $\sum_{p=\ell+1}^{m}\beta(p)/(m-\ell)$, otherwise.

   Then, the algorithm picks the candidate with the highest total score.
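The Deter-avg heuristic is short enough to state in code. The following is a minimal sketch under the assumption that ballots are given as lists of each voter's top $\ell$ candidates and that `beta[p-1]` holds the score of position $p$; the function name is hypothetical.

```python
def deter_avg_winner(truncated, candidates, beta):
    """Score ranked candidates by position and unranked ones by the
    average score of positions ell+1, ..., m."""
    m = len(beta)
    ell = len(truncated[0])
    unranked_avg = sum(beta[ell:]) / (m - ell)
    total = {c: 0.0 for c in candidates}
    for ballot in truncated:
        for c in candidates:
            total[c] += beta[ballot.index(c)] if c in ballot else unranked_avg
    return max(candidates, key=total.get), total
```

For Borda with $m=4$ and $\ell=2$, for example, every unranked candidate receives $(1+0)/2=0.5$ points from the voter in question.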

The three deterministic algorithms were almost indistinguishable in our simulations—Deter-avg was slightly better than the other two. Thus, for readability we present the results only for Deter-avg and Randomized and omit the description of the results for the other two deterministic algorithms. We found the following:

1. For preferences with little or no structure, such as those generated by IC and SPIC, the deterministic algorithm gives better results. For preferences with more structure, e.g., those obtained from the 1D and MMM models, the randomized algorithm significantly outperforms the deterministic ones.

2. For each preference distribution that we tested, the randomized algorithm gives high-quality approximations unless the number of voters is very small. Our results suggest asking each voter to rank a random subset of alternatives when the goal is to approximate the Borda rule with limited information from each voter and the number of voters exceeds a couple of hundred.

5.2 Approximation Algorithms for the Minimax Rule

Similarly to Section 5.1, we empirically tested how well the randomized algorithm (Algorithm 2) and the deterministic algorithm from Section 4.1 approximate the Minimax rule. We refer to the two algorithms as Randomized and Deterministic, respectively. We also tested two other natural heuristics. For each two candidates $c$ and $c^{\prime}$ , let $n(c,c^{\prime})$ denote the number of voters who (i) rank $c$ and $c^{\prime}$ among their $\ell$ most preferred candidates and prefer $c$ over $c^{\prime}$ or (ii) who rank $c$ but not $c^{\prime}$ among their top $\ell$ positions. Then:

1. In our first heuristic algorithm, for each pair of candidates $c$ and $c^{\prime}$, we use a method similar to Minimax, but we replace ${{{\mathrm{sc}}}}_{{{{\mathrm{MM}}}}}(c,c^{\prime})$ by $n(c,c^{\prime})$. Then, as in the case of the original Minimax rule, we compute for each candidate $c$ the score $\min_{c^{\prime}\neq c}n(c,c^{\prime})$ and pick the candidate $w$ with the maximal score.

2. In the second heuristic, we replace ${{{\mathrm{sc}}}}_{{{{\mathrm{MM}}}}}(c,c^{\prime})$ by

[TABLE]
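The first heuristic can be sketched directly from the definition of $n(c,c^{\prime})$ given above. The second heuristic is not covered because its formula is not reproduced here. Function names and the ballot encoding (lists of each voter's top $\ell$ candidates) are assumptions.

```python
def truncated_pairwise(truncated, candidates):
    """n(c, c'): voters who rank c above c' within their top ell,
    or who rank c but not c' among their top ell."""
    n = {(c, d): 0 for c in candidates for d in candidates if c != d}
    for ballot in truncated:
        for i, c in enumerate(ballot):
            for d in candidates:
                if d != c and (d not in ballot or ballot.index(d) > i):
                    n[(c, d)] += 1
    return n

def first_heuristic_winner(truncated, candidates):
    """Minimax computed on n(c, c') instead of the true pairwise scores."""
    n = truncated_pairwise(truncated, candidates)
    score = {c: min(n[(c, d)] for d in candidates if d != c) for c in candidates}
    return max(candidates, key=score.get), score
```

Pairs of candidates that a voter leaves unranked contribute to neither $n(c,c^{\prime})$ nor $n(c^{\prime},c)$.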

In our simulation Deterministic outperformed the two heuristic algorithms we mentioned above, hence we present our results only for Deterministic and Randomized. We observed the following:

1. The randomized algorithm for the Minimax rule needs to ask each voter to compare more candidates than in the case of Borda to achieve a good approximation. For $m=50$ candidates, asking each voter to compare $\ell=8$ of them already gave good results for sufficiently many voters.

2. The deterministic algorithm usually performs better than the randomized one, yet there are distributions (e.g., the one-dimensional Euclidean model) where the quality of winners returned by the deterministic algorithm is much worse than that of the winners returned by the randomized algorithm. On the other hand, for each distribution that we tested, the randomized algorithm consistently gave good results when the number of voters and the number of candidates each voter is asked to rank were sufficiently large.

6 Conclusion

In this paper we theoretically and experimentally analyzed how well certain election rules can be approximated when we are given only parts of voters’ preferences. We compared two methods of eliciting voters’ preferences: (i) the randomized method, where each voter is asked to compare a randomly selected subset of $\ell$ alternatives, and (ii) the deterministic method, where we ask each voter to provide a ranking of her $\ell$ most preferred candidates. We investigated how well one can approximate positional scoring rules and the Minimax method through one of these two elicitation methods, providing both upper bounds on the approximation ratio (impossibility results) and algorithms matching these bounds.

We conclude that the randomized approach is usually superior; the exceptions include preference distributions with little or no structure, which rarely appear in practice. For the Borda rule, with hundreds of voters it is usually sufficient to ask each voter to compare two random candidates to achieve a high approximation guarantee. Approximating the Minimax rule is harder: one typically needs more voters and needs to ask them to compare more candidates. For example, for $m=50$ candidates, we obtained high approximation guarantees for the Minimax rule only when we set the number of voters to around one thousand and $\ell=8$.

Acknowledgments

Piotr Skowron was supported by a postdoctoral fellowship of the Alexander von Humboldt Foundation, Germany, and by the Foundation for Polish Science within the Homing programme (Project title: “Normative Comparison of Multiwinner Election Rules”).

Bibliography

[ABE+18] E. Anshelevich, O. Bhardwaj, E. Elkind, J. Postl, and P. Skowron. Approximating optimal social choice under metric preferences. Artificial Intelligence, 264:27–51, 2018.
[AD11] A. Auger and B. Doerr. Theory of Randomized Search Heuristics: Foundations and Recent Developments. World Scientific Publishing, 2011.
[AP17] E. Anshelevich and J. Postl. Randomized social choice functions under metric preferences. Journal of Artificial Intelligence Research, 58:797–827, 2017.
[BCH+15] C. Boutilier, I. Caragiannis, S. Haber, T. Lu, A. D. Procaccia, and O. Sheffet. Optimal social choice functions: A utilitarian view. Artificial Intelligence, 227:190–213, 2015.
[BNPS17] G. Benade, S. Nath, A. Procaccia, and N. Shah. Preference elicitation for participatory budgeting. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, pages 376–382, 2017.
[BPQ19] G. Benadé, A. Procaccia, and M. Qiao. Low-distortion social welfare functions. 2019. To appear.
[BR15] C. Boutilier and J. Rosenschein. Incomplete information and communication in voting. In F. Brandt, V. Conitzer, U. Endriss, J. Lang, and A. D. Procaccia, editors, Handbook of Computational Social Choice, chapter 10. Cambridge University Press, 2015.
[CP11] I. Caragiannis and A. D. Procaccia. Voting almost maximizes social welfare despite limited communication. Artificial Intelligence, 175(9–10):1655–1671, 2011.
