An Empirical Comparison of V-fold Penalisation and Cross Validation for Model Selection in Distribution-Free Regression
Charanpal Dhanjal (LIP6), Nicolas Baskiotis (LIP6), Stéphan Clémençon (LTCI), Nicolas Usunier (LIP6)

TL;DR
This paper empirically compares V-fold penalisation and cross-validation for model selection in distribution-free regression, highlighting their strengths and weaknesses across various datasets and proposing a modified penalisation technique.
Contribution
It provides an extensive empirical evaluation of recent penalisation methods versus traditional cross-validation for model tuning in regression tasks.
Findings
V-fold penalisation can outperform V-fold cross-validation (VFCV) in certain scenarios.
VFCV may provide suboptimal risk estimates asymptotically.
A modified penalisation technique reduces estimation error.
Abstract
Model selection is a crucial issue in machine learning, and a wide variety of penalisation methods (with possibly data-dependent complexity penalties) have recently been introduced for this purpose. However, their empirical performance is generally not well documented in the literature. The goal of this paper is to investigate to what extent such recent techniques can be successfully used for tuning both the regularisation and kernel parameters in support vector regression (SVR) and the complexity measure in regression trees (CART). This task is traditionally solved via V-fold cross-validation (VFCV), which gives efficient results for a reasonable computational cost. A disadvantage of VFCV, however, is that the procedure is known to provide an asymptotically suboptimal risk estimate as the number of examples tends to infinity. Recently, a penalisation procedure called V-fold…
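As a rough illustration of the baseline the abstract describes, the sketch below tunes the regularisation parameter and the RBF kernel width of an SVR by V-fold cross-validation using scikit-learn. It is not the authors' code: the helper name `select_svr_by_vfcv`, the parameter grid, the choice of V = 5, and the synthetic sine data are all assumptions made for the example, and it covers only VFCV, not V-fold penalisation.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR


def select_svr_by_vfcv(X, y, V=5, seed=0):
    """Pick C and gamma for an RBF SVR by V-fold cross-validation.

    Hypothetical helper for illustration; the grid values are assumptions.
    """
    param_grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]}
    # Split the sample into V folds; each fold serves once as validation set.
    folds = KFold(n_splits=V, shuffle=True, random_state=seed)
    search = GridSearchCV(
        SVR(kernel="rbf"),
        param_grid,
        cv=folds,
        scoring="neg_mean_squared_error",
    )
    search.fit(X, y)
    # best_score_ is the negated mean squared error averaged over folds.
    return search.best_params_, -search.best_score_


# Synthetic regression problem (assumed): noisy sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(120, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=120)

best_params, cv_mse = select_svr_by_vfcv(X, y)
print(best_params, cv_mse)
```

The returned `cv_mse` is the VFCV risk estimate for the selected parameters, that is, the quantity the abstract describes as asymptotically suboptimal.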
Taxonomy
Topics: Statistical Methods and Inference · Advanced Statistical Methods and Models · Statistical Methods and Bayesian Inference
