# How did Donald Trump Surprisingly Win the 2016 United States Presidential Election? An Information-Theoretic Perspective (Clean Sensing for Big Data Analytics: Optimal Strategies, Estimation Error Bounds Tighter than the Cramér-Rao Bound)

**Authors:** Weiyu Xu, Lifeng Lai, Amin Khajehnejad

arXiv: 1812.11891 · 2019-01-01

## TL;DR

This paper uses information theory to analyze opinion poll inaccuracies in the 2016 US election and proposes an optimal sensing strategy that balances data quality and quantity under cost constraints.

## Contribution

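It introduces a general framework, called clean sensing, for optimal parameter estimation from heterogeneous, potentially distorted data sources under sensing cost constraints, and derives new lower bounds on mean-squared error that can be tighter than the classical Cramér-Rao and Chapman-Robbins bounds.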

## Key findings

- Larger sample size does not guarantee better polling accuracy.
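- The optimal strategy allocates sensing resources across heterogeneous data sources according to factors such as data quality and, within each source, balances sample quality against sample quantity.
- New lower bounds on the mean-squared error of unbiased and biased estimators can be tighter than the Cramér-Rao and Chapman-Robbins bounds.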
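
The first finding can be illustrated with a toy calculation. The sketch below is not the paper's model: it assumes a hypothetical setup in which every sample from a source shares a fixed systematic bias, so the mean-squared error of the sample mean is `bias**2 + variance / n`. The budget, costs, bias, and variance values are made up for illustration only.

```python
def mse_of_sample_mean(n_samples, per_sample_variance, bias):
    """MSE of the mean of n i.i.d. samples that share a common systematic bias.

    Toy assumption: MSE = bias^2 + variance / n. The bias term does not
    shrink as n grows, which is why more samples cannot fix distorted data.
    """
    if n_samples <= 0:
        raise ValueError("need at least one sample")
    return bias ** 2 + per_sample_variance / n_samples


def affordable_samples(budget, cost_per_sample):
    """Number of whole samples that fit in the budget."""
    return int(budget // cost_per_sample)


# Hypothetical numbers, chosen only to illustrate the trade-off.
BUDGET = 1000.0
VARIANCE = 0.25  # variance of a single yes/no response near 50/50

# Cheap but distorted source: many samples, each with a 4-point bias.
cheap_n = affordable_samples(BUDGET, cost_per_sample=1.0)
cheap_mse = mse_of_sample_mean(cheap_n, VARIANCE, bias=0.04)

# Expensive but clean source: five times fewer samples, no bias.
clean_n = affordable_samples(BUDGET, cost_per_sample=5.0)
clean_mse = mse_of_sample_mean(clean_n, VARIANCE, bias=0.0)

print(f"cheap source: n={cheap_n}, MSE={cheap_mse:.5f}")
print(f"clean source: n={clean_n}, MSE={clean_mse:.5f}")
```

Under these assumed numbers the clean source achieves a lower error with one fifth of the samples, because the squared bias of the cheap source dominates its error no matter how many responses are collected.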

## Abstract

Donald Trump was lagging behind in nearly all opinion polls leading up to the 2016 US presidential election, but he surprisingly won the election. This raises two important questions: 1) why were most opinion polls inaccurate in 2016, and 2) how can the accuracy of opinion polls be improved? In this paper, we study the inaccuracies of opinion polls in the 2016 election through the lens of information theory. We first propose a general framework of parameter estimation, called clean sensing (polling), which performs optimal parameter estimation under sensing cost constraints from heterogeneous and potentially distorted data sources. We then cast opinion polling as a problem of parameter estimation from potentially distorted heterogeneous data sources, and derive the optimal polling strategy using heterogeneous and possibly distorted data under cost constraints. Our results show that a larger number of data samples does not necessarily lead to better polling accuracy, which gives a possible explanation of the inaccuracies of opinion polls in 2016. The optimal sensing strategy should instead optimally allocate sensing resources over heterogeneous data sources according to several factors, including data quality. Moreover, for a particular data source, it should strike an optimal balance between the quality and the quantity of data samples. As a byproduct of this research, in a general setting, we derive a group of new lower bounds on the mean-squared errors of general unbiased and biased parameter estimators. These new lower bounds can be tighter than the classical Cramér-Rao bound (CRB) and Chapman-Robbins bound. Our derivations proceed by studying the Lagrange dual problems of certain convex programs. The classical Cramér-Rao bound and Chapman-Robbins bound follow naturally from our results as special cases of these convex programs.
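
For reference, the two classical bounds named in the abstract can be stated in their standard textbook scalar forms. These are not the paper's new bounds, and the notation is an assumption introduced here: $p(x;\theta)$ is the data density, $\hat{\theta}$ an unbiased estimator of $\theta$, and $\Delta$ a perturbation of the parameter.

```latex
% Cramér-Rao bound for an unbiased estimator of a scalar parameter
\operatorname{Var}_\theta\bigl(\hat{\theta}\bigr) \;\ge\; \frac{1}{I(\theta)},
\qquad
I(\theta) = \mathbb{E}_\theta\!\left[\left(\frac{\partial}{\partial\theta}\log p(x;\theta)\right)^{2}\right]

% Chapman-Robbins bound for the same setting
\operatorname{Var}_\theta\bigl(\hat{\theta}\bigr) \;\ge\;
\sup_{\Delta \neq 0}\;
\frac{\Delta^{2}}{\mathbb{E}_\theta\!\left[\left(\dfrac{p(x;\theta+\Delta)}{p(x;\theta)}\right)^{2}\right] - 1}
```

Under the usual regularity conditions, letting $\Delta \to 0$ in the Chapman-Robbins expression recovers the Cramér-Rao bound, so the Chapman-Robbins bound is at least as tight.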

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.11891/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1812.11891/full.md

## References

11 references — full list in the complete paper: https://tomesphere.com/paper/1812.11891/full.md

---
Source: https://tomesphere.com/paper/1812.11891