Don't stop me now: Rethinking Validation Criteria for Model Parameter Selection
Andrea Apicella, Francesco Isgrò, Andrea Pollastro, Roberto Prevete

TL;DR
This paper systematically investigates how different validation criteria, especially early stopping based on accuracy versus loss, impact the test performance of neural classifiers, revealing that loss-based criteria generally outperform accuracy-based early stopping.
Contribution
It provides a comprehensive empirical and statistical analysis showing that loss-based validation criteria are more reliable than accuracy-based early stopping for model selection.
Findings
Early stopping with validation accuracy performs worse than loss-based methods.
Loss-based validation criteria offer more stable and comparable test accuracy.
A single validation rule often selects a checkpoint whose test performance falls short of the best-performing epoch.
Abstract
Despite the extensive literature on training loss functions, the evaluation of generalization on the validation set remains underexplored. In this work, we conduct a systematic empirical and statistical study of how the validation criterion used for model selection affects test performance in neural classifiers, with attention to early stopping. Using fully connected networks on standard benchmarks under k-fold evaluation, we compare: (i) early stopping with patience and (ii) post-hoc selection over all epochs (i.e. no early stopping). Models are trained with cross-entropy, C-Loss, or PolyLoss; the model parameter selection on the validation set is made using accuracy or one of the three loss functions, each considered independently. Three main findings emerge. (1) Early stopping based on validation accuracy performs worst, consistently selecting checkpoints with lower test accuracy…
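The two selection schemes compared in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the validation curves are made-up toy numbers rather than results from the paper, and the patience rule (stop once `patience` epochs have passed without a strict improvement) is an assumed convention, since the abstract does not spell out the exact rule.

```python
from typing import Sequence


def early_stopping_epoch(values: Sequence[float], patience: int,
                         higher_is_better: bool) -> int:
    """Index of the checkpoint kept by early stopping with patience.

    Assumed rule (not taken from the paper): training halts once
    `patience` epochs have passed without a strict improvement of the
    monitored validation value, and the best checkpoint so far is kept.
    """
    if not values:
        raise ValueError("values must be non-empty")
    sign = 1.0 if higher_is_better else -1.0  # turn losses into "higher is better"
    best_epoch = 0
    best_value = sign * values[0]
    for epoch in range(1, len(values)):
        current = sign * values[epoch]
        if current > best_value:
            best_value = current
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            break  # patience exhausted: later epochs are never seen
    return best_epoch


def post_hoc_epoch(values: Sequence[float], higher_is_better: bool) -> int:
    """Index of the best checkpoint over all epochs (no early stopping)."""
    if not values:
        raise ValueError("values must be non-empty")
    sign = 1.0 if higher_is_better else -1.0
    # max() returns the first maximal index, so ties go to the earlier epoch
    return max(range(len(values)), key=lambda e: sign * values[e])


# Toy validation curves for one hypothetical run (illustrative values only).
val_acc = [0.60, 0.70, 0.70, 0.69, 0.70, 0.75]
val_loss = [0.90, 0.80, 0.75, 0.72, 0.71, 0.70]

acc_early = early_stopping_epoch(val_acc, patience=2, higher_is_better=True)
acc_post = post_hoc_epoch(val_acc, higher_is_better=True)
loss_early = early_stopping_epoch(val_loss, patience=2, higher_is_better=False)
loss_post = post_hoc_epoch(val_loss, higher_is_better=False)
```

In this toy run the accuracy curve plateaus for two epochs, so accuracy-based early stopping halts at an early checkpoint, while the loss curve keeps decreasing and loss-based early stopping reaches the same epoch as post-hoc selection. This only illustrates how the criteria can diverge; it says nothing about how often they do in practice.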
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques
