Measuring the Impact of Missingness in Traffic Stop Data
Saatvik Kher, Johanna Hardin

TL;DR
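This paper analyzes missing data in traffic stop records from the Stanford Open Policing Project. It shows that the data are not missing completely at random, proposes ways to quantify and visualize missingness trends, and demonstrates that bias calculations can shift fundamentally depending on the assumptions made about stops where race was not recorded.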
Contribution
It introduces a sensitivity analysis framework for missing data in traffic stop datasets, extending prior work on the outcome test and on sharp bounds on the average treatment effect.
Findings
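The data are not missing completely at random.
Bias calculations can shift fundamentally depending on the assumptions made about stops where race was not recorded.
The proposed quantification and visualization methods reveal missingness trends for different variables across the datasets.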
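One simple way to probe the first finding is to compare how often a variable is missing across the levels of another, fully observed variable: if data were missing completely at random, those rates should be roughly equal. The sketch below is only an illustration under assumptions. The function name `missing_rate_by_group`, the field names `race` and `search_conducted`, and the toy records are all hypothetical, and this is not necessarily the metric the authors use.

```python
from collections import defaultdict


def missing_rate_by_group(records, target, by):
    """Return the share of records with `target` missing (None) per level of `by`.

    Hypothetical helper for illustration; `records` is a list of dicts.
    """
    totals = defaultdict(int)
    missing = defaultdict(int)
    for row in records:
        group = row.get(by)
        totals[group] += 1
        if row.get(target) is None:
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


# Toy records invented for illustration; these are not real stop data.
stops = [
    {"race": "A", "search_conducted": True},
    {"race": None, "search_conducted": True},
    {"race": None, "search_conducted": True},
    {"race": "B", "search_conducted": False},
    {"race": "A", "search_conducted": False},
    {"race": None, "search_conducted": False},
    {"race": "B", "search_conducted": False},
    {"race": "A", "search_conducted": False},
]

rates = missing_rate_by_group(stops, target="race", by="search_conducted")
for group, rate in rates.items():
    print(f"search_conducted={group}: race missing in {rate:.0%} of stops")
```

A large gap between the groups' rates, as in this toy data, would be evidence against missingness being completely at random; a formal check would add a significance test on the underlying counts.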
Abstract
In this article we explore the data available through the Stanford Open Policing Project. The data consist of information on millions of traffic stops across close to 100 different cities and highway patrols. Using a variety of metrics, we identify that the data are not missing completely at random. Furthermore, we develop ways of quantifying and visualizing missingness trends for different variables across the datasets. We follow up by performing a sensitivity analysis to extend work done on the outcome test as well as to extend work done on sharp bounds on the average treatment effect. We demonstrate that bias calculations can fundamentally shift depending on the assumptions made about the observations for which the race variable has not been recorded. We suggest ways that our missingness sensitivity analysis can be extended to myriad different contexts.
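To make the idea of a missingness sensitivity analysis concrete, the sketch below bounds a group's hit rate (the share of searches that find contraband, the quantity the outcome test compares across groups) when some searches have no recorded race. It is a worst-case illustration under stated assumptions, not the authors' procedure: the function `hit_rate_bounds`, its parameters, and the counts in the example are hypothetical. The lower bound assigns every unsuccessful missing-race search to the group, and the upper bound assigns every successful one.

```python
def hit_rate_bounds(hits, searches, missing_hits, missing_searches):
    """Worst-case bounds on a group's hit rate when race is missing for some searches.

    hits, searches: counts for searches recorded as belonging to the group.
    missing_hits, missing_searches: counts for searches with no recorded race.
    Hypothetical helper for illustration only.
    """
    if searches <= 0:
        raise ValueError("searches must be positive")
    if not 0 <= missing_hits <= missing_searches:
        raise ValueError("missing_hits must lie between 0 and missing_searches")

    observed = hits / searches
    missing_nonhits = missing_searches - missing_hits
    # Lowest rate: all missing-race searches that found nothing belong to the group.
    lower = hits / (searches + missing_nonhits)
    # Highest rate: all missing-race searches that found contraband belong to the group.
    upper = (hits + missing_hits) / (searches + missing_hits)
    return lower, observed, upper


# Invented counts for illustration; these are not real stop data.
lower, observed, upper = hit_rate_bounds(
    hits=30, searches=100, missing_hits=5, missing_searches=20
)
print(f"hit rate: observed {observed:.3f}, bounds [{lower:.3f}, {upper:.3f}]")
```

If the bounds for two groups overlap even though their observed hit rates differ, the apparent disparity depends on what is assumed about the unrecorded observations, which is the kind of shift the abstract describes.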
Taxonomy
Topics: Traffic Prediction and Management Techniques · Human Mobility and Location-Based Analysis
