Total Variation Floodgate for Variable Importance Inference in Classification
Wenshuo Wang, Lucas Janson, Lihua Lei, Aaditya Ramdas

TL;DR
This paper introduces a model-agnostic measure called expected total variation (ETV) for assessing variable importance in classification, along with algorithms for statistical inference that provide confidence bounds, demonstrated through simulations and a case study.
Contribution
It proposes the ETV measure for variable importance in classification and develops algorithms for inference that do not depend on specific models.
Findings
Algorithms produce asymptotic lower confidence bounds for ETV.
Simulations validate the effectiveness of the proposed methods.
A case study in conjoint analysis on the US general election demonstrates practical applicability.
Abstract
Inferring variable importance is the key problem of many scientific studies, where researchers seek to learn the effect of a feature on the outcome in the presence of confounding variables. Focusing on classification problems, we define the expected total variation (ETV), which is an intuitive and deterministic measure of variable importance that does not rely on any model context. We then introduce algorithms for statistical inference on the ETV under design-based/model-X assumptions. These algorithms build on the floodgate notion for regression problems (Zhang and Janson 2020). The algorithms we introduce can leverage any user-specified regression function and produce asymptotic lower confidence bounds for the ETV. We show the effectiveness of our algorithms with simulations and a case study in conjoint analysis on the US general election.
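To make the ETV concrete, here is a minimal Python sketch that approximates it by Monte Carlo in a toy setting. It is an illustration, not the paper's inference procedure, and it produces no confidence bound. Everything specific in it is an assumption: the abstract does not state the formula, so the sketch takes the ETV to be the expected total variation distance between the law of Y given (X, Z) and the law of Y given Z alone, which for a binary outcome reduces to E|P(Y=1 | X, Z) - P(Y=1 | Z)|. The logistic model, the independent standard normal X and Z, and the function name `etv_monte_carlo` are all hypothetical.

```python
import numpy as np


def sigmoid(t):
    """Logistic function, used here as a toy P(Y=1 | X, Z)."""
    return 1.0 / (1.0 + np.exp(-t))


def etv_monte_carlo(beta, n=2000, k=200, seed=0):
    """Approximate E|P(Y=1|X,Z) - P(Y=1|Z)| in a toy logistic model.

    Assumed toy model (not from the paper):
        X, Z independent standard normal,
        P(Y=1 | X, Z) = sigmoid(beta * X + Z).
    Because the distribution of X given Z is known here (the model-X
    setting), P(Y=1 | Z) can be approximated by averaging over k fresh
    draws of X for each Z.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)            # confounder draws
    x = rng.standard_normal(n)            # observed feature draws
    x_null = rng.standard_normal((n, k))  # resampled copies of X given Z

    p_full = sigmoid(beta * x + z)                               # P(Y=1 | X, Z)
    p_reduced = sigmoid(beta * x_null + z[:, None]).mean(axis=1)  # approx. P(Y=1 | Z)

    # For Bernoulli(p) vs Bernoulli(q), total variation distance is |p - q|.
    return float(np.mean(np.abs(p_full - p_reduced)))


if __name__ == "__main__":
    for beta in (0.0, 0.5, 2.0):
        print(f"beta={beta}: approximate ETV = {etv_monte_carlo(beta):.4f}")
```

When `beta` is 0 the feature carries no information about Y beyond Z, and the sketch returns 0 up to floating-point error. Larger `beta` gives a larger value, and the result always lies in [0, 1]. The inner average over `k` draws is itself noisy, so for finite `k` the approximation is slightly biased upward.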
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Methods and Inference · Advanced Statistical Process Monitoring
