Marginal and Conditional Importance Measures from Machine Learning Models and Their Relationship with Conditional Average Treatment Effect

Mohammad Kaviul Anam Khan; Olli Saarela; Rafal Kustra

arXiv:2501.16988·stat.ML·January 30, 2025

TL;DR

This paper introduces model-agnostic importance measures, MVIM and CVIM, for interpreting machine learning models and explores their relationship with the conditional average treatment effect, addressing bias issues under predictor correlation.

Contribution

The paper reintroduces and develops MVIM and CVIM as importance metrics for black-box models, analyzing their bias and their relationship with the conditional average treatment effect (CATE).

Findings

1. MVIM is biased with highly correlated predictors.
2. CVIM reduces bias in importance measurement.
3. Both metrics relate quadratically to CATE.

Abstract

Interpreting black-box machine learning models is challenging due to their strong dependence on data and inherently non-parametric nature. This paper reintroduces the concept of importance through the "Marginal Variable Importance Metric" (MVIM), a model-agnostic measure of predictor importance based on the true conditional expectation function. MVIM evaluates predictors' influence on continuous or discrete outcomes. A permutation-based estimation approach, inspired by Breiman (2001) and Fisher et al. (2019), is proposed to estimate MVIM. The MVIM estimator is biased when predictors are highly correlated, as black-box models struggle to extrapolate in low-probability regions. To address this, we investigate the bias-variance decomposition of MVIM to understand the source and pattern of the bias under high correlation. A Conditional Variable Importance Metric (CVIM), adapted from…
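To make the permutation-based idea in the abstract concrete, here is a minimal sketch of a generic permutation importance estimate in Python. It is not the paper's MVIM estimator, whose exact definition is not given on this page. The function name `permutation_importance_ratio`, the squared-error loss, the ratio form (permuted loss over baseline loss), the synthetic data, and the least-squares stand-in for a black-box model are all assumptions made for illustration.

```python
import numpy as np


def permutation_importance_ratio(predict, X, y, j, n_repeats=20, seed=0):
    """Hypothetical helper: ratio of the mean squared error after permuting
    column j to the baseline mean squared error. Values near 1 suggest the
    fitted model barely relies on predictor j."""
    rng = np.random.default_rng(seed)
    base_loss = np.mean((y - predict(X)) ** 2)
    permuted_losses = []
    for _ in range(n_repeats):
        X_perm = X.copy()
        # Shuffling column j breaks its link with y and with other columns.
        X_perm[:, j] = rng.permutation(X[:, j])
        permuted_losses.append(np.mean((y - predict(X_perm)) ** 2))
    return float(np.mean(permuted_losses) / base_loss)


# Synthetic data (assumed): y depends on column 0 only; columns are independent.
rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 2))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=n)

# Ordinary least squares stands in for a black-box model here.
design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)


def predict(Z):
    return np.column_stack([np.ones(len(Z)), Z]) @ coef


imp_0 = permutation_importance_ratio(predict, X, y, j=0)
imp_1 = permutation_importance_ratio(predict, X, y, j=1)
print(f"importance ratio, column 0: {imp_0:.2f}")
print(f"importance ratio, column 1: {imp_1:.2f}")
```

In this toy setup the predictors are independent, so shuffling a column does not create implausible covariate combinations. When predictors are highly correlated, the shuffled rows fall in low-probability regions where the model must extrapolate, which is the source of bias the abstract describes.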

Taxonomy

Topics: Advanced Causal Inference Techniques