Targeted Cross-Validation

Jiawei Zhang; Jie Ding; Yuhong Yang

arXiv:2109.06949·stat.ML·February 21, 2022

Open Access

TL;DR

This paper introduces Targeted Cross-Validation (TCV), a model selection method based on a weighted $L_2$ loss that concentrates on a region of interest in the predictor space. The authors show that TCV is consistent and report advantages over global CV and local-data-only modeling in dynamic, high-dimensional settings.

Contribution

The paper proposes TCV, a novel model selection approach using weighted $L_2$ loss, with theoretical guarantees and applicability to complex, changing data environments.
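
For orientation, a weighted $L_2$ loss of this kind is commonly written as follows. This is a generic form under assumed notation, not necessarily the paper's exact definition: $\hat{f}$ is a fitted candidate, $f$ the true regression function, $P_X$ the distribution of the predictors, and $w \ge 0$ a weight function that is large on the region of interest.

$$
\| \hat{f} - f \|_{w}^{2} = \int \bigl( \hat{f}(x) - f(x) \bigr)^{2} \, w(x) \, P_X(dx)
$$

Setting $w \equiv 1$ recovers the usual global $L_2$ loss, while taking $w$ to be the indicator of a region restricts the assessment to that region.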

Findings

01. TCV is consistent in selecting the best-performing candidate under the weighted $L_2$ loss.

02. Experimental results suggest that TCV can outperform both global CV and the approach of modeling with only local data.

03. The method is applicable to high-dimensional, dynamic data scenarios.

Abstract

In many applications, we have access to the complete dataset but are only interested in the prediction of a particular region of predictor variables. A standard approach is to find the globally best modeling method from a set of candidate methods. However, it is perhaps rare in reality that one candidate method is uniformly better than the others. A natural approach for this scenario is to apply a weighted $L_{2}$ loss in performance assessment to reflect the region-specific interest. We propose a targeted cross-validation (TCV) to select models or procedures based on a general weighted $L_{2}$ loss. We show that the TCV is consistent in selecting the best performing candidate under the weighted $L_{2}$ loss. Experimental studies are used to demonstrate the use of TCV and its potential advantage over the global CV or the approach of using only local data for modeling a local region.…
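
The idea of assessing candidates with a region-weighted loss can be illustrated with a minimal sketch. It is not the authors' exact TCV procedure: it averages a weighted squared validation error over repeated random splits and picks the candidate with the smallest average. The function names (`weighted_cv_select`, `fit_poly`), the two polynomial candidates, the simulated data, and the indicator weight on $x \ge 0.6$ are all hypothetical choices made for illustration.

```python
import numpy as np

def weighted_cv_select(X, y, candidates, weight_fn, n_splits=20, train_frac=0.5, seed=0):
    """Select the candidate with the lowest average weighted squared validation error.

    candidates: dict mapping a name to fit(X_train, y_train) -> predict(X_new).
    weight_fn:  maps validation inputs to nonnegative weights (the target region).
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(train_frac * n)
    losses = {name: [] for name in candidates}
    for _ in range(n_splits):
        perm = rng.permutation(n)
        tr, va = perm[:n_train], perm[n_train:]
        w = weight_fn(X[va])
        if w.sum() == 0:
            continue  # no validation point fell in the target region for this split
        for name, fit in candidates.items():
            predict = fit(X[tr], y[tr])
            sq_err = (y[va] - predict(X[va])) ** 2
            losses[name].append(np.sum(w * sq_err) / np.sum(w))
    if any(len(v) == 0 for v in losses.values()):
        raise ValueError("no split placed validation points in the target region")
    mean_loss = {name: float(np.mean(v)) for name, v in losses.items()}
    best = min(mean_loss, key=mean_loss.get)
    return best, mean_loss

def fit_poly(degree):
    """Hypothetical candidate: least-squares polynomial of the given degree (1-D input)."""
    def fit(X, y):
        coef = np.polyfit(X, y, degree)
        return lambda X_new: np.polyval(coef, X_new)
    return fit

# Simulated data (assumption): cubic signal with small noise on [-1, 1].
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=300)
y = X ** 3 + 0.05 * rng.standard_normal(300)

candidates = {"linear": fit_poly(1), "cubic": fit_poly(3)}
target_weight = lambda x: (x >= 0.6).astype(float)  # only the region x >= 0.6 matters

best, mean_loss = weighted_cv_select(X, y, candidates, target_weight)
print(best, mean_loss)
```

Replacing `target_weight` with a function that returns all ones turns this into an ordinary global CV comparison, which makes it easy to check whether the global and region-specific rankings of the candidates disagree.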

Taxonomy

Topics: Statistical Methods and Inference · Advanced Statistical Methods and Models · Machine Learning and Data Classification