Predicting Census Survey Response Rates With Parsimonious Additive Models and Structured Interactions
Shibal Ibrahim, Peter Radchenko, Emanuel Ben-David, Rahul Mazumder

TL;DR
This paper develops interpretable nonparametric additive models with structured interactions to predict census survey response rates, achieving accuracy comparable to black-box methods while maintaining interpretability.
Contribution
It introduces a novel sparse additive modeling approach with structured interactions, extending computational methods for practical, interpretable survey response prediction.
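As a rough illustration of what a sparse additive model with structured interactions can look like, the sketch below fits main effects and pairwise interactions under a penalty that counts the selected components. Everything in it is an assumption made for illustration rather than the authors' method: the polynomial bases stand in for nonparametric smoothers, "structured" is read as strong hierarchy (an interaction may enter only if both of its main effects are selected), the search is a brute-force enumeration that only works for a handful of features, and the names `main_basis`, `pair_basis`, and `fit_l0_hierarchical` are hypothetical.

```python
import itertools
import numpy as np

def main_basis(x, degree=3):
    """Centered polynomial basis for one feature (a stand-in for a smoother)."""
    B = np.column_stack([x**d for d in range(1, degree + 1)])
    return B - B.mean(axis=0)

def pair_basis(Bj, Bk):
    """Centered tensor-product basis built from two main-effect bases."""
    T = np.einsum("ia,ib->iab", Bj, Bk).reshape(len(Bj), -1)
    return T - T.mean(axis=0)

def fit_l0_hierarchical(X, y, lam, max_main=2, degree=3):
    """Exhaustive search minimising mean squared error + lam * (#components).

    Assumption: interactions are only considered between selected main
    effects (strong hierarchy). This is a toy search, not a scalable solver.
    """
    n, p = X.shape
    mains = [main_basis(X[:, j], degree) for j in range(p)]
    yc = y - y.mean()
    best_obj, best_model = np.mean(yc**2), ((), ())  # start from the empty model
    for k in range(1, max_main + 1):
        for S in itertools.combinations(range(p), k):
            pairs = list(itertools.combinations(S, 2))
            for m in range(len(pairs) + 1):
                for I in itertools.combinations(pairs, m):
                    blocks = [mains[j] for j in S]
                    blocks += [pair_basis(mains[a], mains[b]) for a, b in I]
                    D = np.hstack(blocks)
                    coef, *_ = np.linalg.lstsq(D, yc, rcond=None)
                    mse = np.mean((yc - D @ coef) ** 2)
                    obj = mse + lam * (len(S) + len(I))
                    if obj < best_obj:
                        best_obj, best_model = obj, (S, I)
    return best_model, best_obj

# Synthetic data (not census data): two main effects and one interaction.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 5))
y = (np.sin(3 * X[:, 0]) + X[:, 2] ** 2 + X[:, 0] * X[:, 2]
     + 0.1 * rng.standard_normal(400))

model, objective = fit_l0_hierarchical(X, y, lam=0.02)
print("main effects:", model[0], "interactions:", model[1])
```

The counting penalty is what keeps the fitted model small enough to read: each selected component is a one- or two-variable function that can be plotted on its own.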
Findings
Models achieve prediction accuracy comparable to gradient boosting and neural networks.
Proposed methods enhance interpretability without sacrificing performance.
Open-source algorithms extend existing sparse additive model capabilities.
Abstract
In this paper, we consider the problem of predicting survey response rates using a family of flexible and interpretable nonparametric models. The study is motivated by the US Census Bureau's well-known ROAM application, which uses a linear regression model trained on the US Census Planning Database data to identify hard-to-survey areas. A crowdsourcing competition (Erdman and Bates, 2016) organized more than ten years ago revealed that machine learning methods based on ensembles of regression trees led to the best performance in predicting survey response rates; however, the corresponding models could not be adopted for the intended application due to their black-box nature. We consider nonparametric additive models with a small number of main and pairwise interaction effects using ℓ0-based penalization. From a methodological viewpoint, we study our estimator's computational and…
Taxonomy
Topics: Survey Methodology and Nonresponse · Data-Driven Disease Surveillance · Census and Population Estimation
