A Cautionary Tail: A Framework and Case Study for Testing Predictive Model Validity

Peter C. Casey; Kevin H. Wilson; David Yokum

arXiv:1807.03860·stat.AP·July 18, 2018

Open Access

TL;DR

This paper introduces a framework for testing the validity of predictive models, highlighting the importance of field assessments to identify biases and ensure models perform well in real-world scenarios.

Contribution

It presents a novel field assessment framework for validating predictive models and demonstrates its application through a case study on rat infestation prediction in Washington, D.C.

Findings

01. Model performs well on new 311 data

02. Model fails to predict inspection outcomes accurately

03. Field assessments reveal biases not detected by traditional testing

Abstract

Data scientists frequently train predictive models on administrative data. However, the process that generates this data can bias predictive models, making it important to test models against their intended use. We provide a field assessment framework that we use to validate a model predicting rat infestations in Washington, D.C. The model was developed with data from the city's 311 service request system. Although the model performs well against new 311 data, we find that it does not perform well when predicting the outcomes of inspections in our field assessment. We recommend that data scientists expand the use of field assessments to test their models.
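
The gap the abstract describes, between performance on new 311 data and performance in the field, can be illustrated with a minimal sketch. It scores the same model rankings against two different label sets: whether a 311 request was later filed, and whether an inspector found an infestation. The metric (precision among the top-k ranked locations), the function name `precision_at_k`, and every number below are assumptions invented for illustration; they are not the paper's metric, data, or results.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of the k highest-scored locations whose label is positive (1)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of locations")
    # Rank location indices from highest to lowest model score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in ranked[:k]) / k


# Hypothetical toy values, invented for illustration only.
model_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]  # predicted infestation risk
reported_311 = [1, 1, 1, 0, 0, 0]              # 1 = a 311 request was later filed
found_in_field = [1, 0, 0, 1, 0, 1]            # 1 = an inspector found rats

# Same predictions, two different definitions of "correct".
admin_precision = precision_at_k(model_scores, reported_311, k=3)
field_precision = precision_at_k(model_scores, found_in_field, k=3)

print(f"precision@3 against 311 reports:       {admin_precision:.2f}")
print(f"precision@3 against field inspections: {field_precision:.2f}")
```

In this toy setup the model looks perfect when judged by 311 reports and much weaker when judged by inspections, because the reports reflect who files requests rather than where rats actually are. A held-out split of the administrative data inherits that reporting process, so only an independent field sample can expose the difference.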

Taxonomy

Topics: Crime Patterns and Interventions · Data-Driven Disease Surveillance · Electoral Systems and Political Participation