Effects of sampling skewness of the importance-weighted risk estimator on model selection

Wouter M. Kouw; Marco Loog

arXiv:1804.07344 · stat.ML · March 12, 2019

TL;DR

This paper investigates how skewness in the sampling distribution of the importance-weighted risk estimator affects model selection. Although the estimator is unbiased, for small samples it tends to overestimate the risk on most datasets and strongly underestimate it on a few, which can lead to suboptimal regularization parameter choices.

Contribution

It empirically demonstrates that the sampling distribution of an importance-weighted risk estimator can be skewed, and shows how this skew affects model selection under sample selection bias and covariate shift.

Findings

01

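The sampling distribution of an importance-weighted risk estimator can be skewed: for small samples under sample selection bias, the estimator overestimates the risk for datasets in the body of the distribution, i.e. the majority of cases.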

02

Large underestimates occur in the tail of the sampling distribution.

03

These over- and underestimates of the risk lead to suboptimal regularization parameters when the estimator is used for importance-weighted validation.
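
As a rough illustration of what importance-weighted validation involves, the sketch below selects a regularization parameter by minimizing the weighted squared error on held-out source data. It is not the paper's experimental setup: ridge regression, the cubic polynomial features, the zero-mean Gaussian source and target distributions, and all function names (`fit_ridge`, `iw_validation_risk`, `select_lambda`) are assumptions made for this example.

```python
import numpy as np

def gaussian_pdf(x, sigma):
    """Density of a zero-mean Gaussian with standard deviation sigma."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def fit_ridge(X, y, lam):
    """Closed-form ridge regression: solve (X'X + lam * I) theta = X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def iw_validation_risk(theta, X_val, y_val, weights):
    """Importance-weighted mean squared error on held-out source data."""
    residuals = X_val @ theta - y_val
    return np.mean(weights * residuals ** 2)

def select_lambda(X_train, y_train, X_val, y_val, weights, lambdas):
    """Return the candidate with the lowest weighted validation risk, plus all risks."""
    risks = [
        iw_validation_risk(fit_ridge(X_train, y_train, lam), X_val, y_val, weights)
        for lam in lambdas
    ]
    return lambdas[int(np.argmin(risks))], risks

# Toy data (assumed): the source is narrower than the target.
rng = np.random.default_rng(0)
sigma_source, sigma_target = 0.8, 1.0
x = rng.normal(0.0, sigma_source, size=40)
y = np.sin(2.0 * x) + 0.1 * rng.normal(size=40)
X = np.column_stack([np.ones_like(x), x, x ** 2, x ** 3])

# First half for training, second half for validation.
X_train, y_train = X[:20], y[:20]
X_val, y_val = X[20:], y[20:]

# Weights are the target density divided by the source density at each validation point.
weights = gaussian_pdf(x[20:], sigma_target) / gaussian_pdf(x[20:], sigma_source)

lambdas = [0.01, 0.1, 1.0, 10.0]
best_lam, risks = select_lambda(X_train, y_train, X_val, y_val, weights, lambdas)
print("selected lambda:", best_lam)
print("weighted validation risks:", risks)
```

Because the selected parameter depends on a single small validation set, any over- or underestimate of the risk on that set feeds directly into the choice.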

Abstract

Importance-weighting is a popular and well-researched technique for dealing with sample selection bias and covariate shift. It has desirable characteristics such as unbiasedness, consistency and low computational complexity. However, weighting can have a detrimental effect on an estimator as well. In this work, we empirically show that the sampling distribution of an importance-weighted estimator can be skewed. For sample selection bias settings, and for small sample sizes, the importance-weighted risk estimator produces overestimates for datasets in the body of the sampling distribution, i.e. the majority of cases, and large underestimates for datasets in the tail of the sampling distribution. These over- and underestimates of the risk lead to suboptimal regularization parameters when used for importance-weighted validation.
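
To make the idea of a skewed sampling distribution concrete, here is a minimal simulation sketch. It repeatedly draws small source samples, computes the importance-weighted risk estimate (the sample mean of weight times loss) for each, and summarizes the resulting distribution of estimates. Everything specific is an assumption for illustration: zero-mean Gaussian source and target with standard deviations 0.8 and 1.0, a fixed loss `x ** 2` whose true target risk is 1, and the helper names `iw_risk_estimates` and `sample_skewness`.

```python
import numpy as np

def gaussian_pdf(x, sigma):
    """Density of a zero-mean Gaussian with standard deviation sigma."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def iw_risk_estimates(n_samples=20, n_repeats=5000,
                      sigma_source=0.8, sigma_target=1.0, seed=0):
    """Return one importance-weighted risk estimate per simulated source dataset."""
    rng = np.random.default_rng(seed)
    # Each row is one small dataset drawn from the source distribution.
    x = rng.normal(0.0, sigma_source, size=(n_repeats, n_samples))
    # Importance weights: target density divided by source density.
    weights = gaussian_pdf(x, sigma_target) / gaussian_pdf(x, sigma_source)
    # Assumed fixed loss; its expectation under the target is sigma_target ** 2.
    loss = x ** 2
    return np.mean(weights * loss, axis=1)

def sample_skewness(values):
    """Third standardized moment of a one-dimensional sample."""
    centered = values - values.mean()
    return np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5

estimates = iw_risk_estimates()
true_risk = 1.0  # E[x ** 2] under the assumed target N(0, 1)

print("mean of estimates:  ", estimates.mean())
print("median of estimates:", np.median(estimates))
print("sample skewness:    ", sample_skewness(estimates))
print("fraction below true risk:", np.mean(estimates < true_risk))
```

The mean of the estimates should land near the true risk, reflecting unbiasedness, while the gap between mean and median shows the asymmetry. The direction and size of the skew depend on the chosen distributions and loss, so this toy setup need not reproduce the pattern of over- and underestimates reported in the paper.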

Code & Models

Repositories

wmkouw/covshift-skewness