Using Pre-Training Can Improve Model Robustness and Uncertainty

Dan Hendrycks; Kimin Lee; Mantas Mazeika

arXiv:1901.09960 · cs.LG · October 22, 2019 · 429 cites

TL;DR

Pre-training improves model robustness and uncertainty estimates across adversarial examples, label corruption, class imbalance, out-of-distribution detection, and confidence calibration, even though it may not improve traditional classification metrics. In some cases the gains appear without any task-specific methods.

Contribution

The paper shows that pre-training improves robustness and uncertainty estimates, introduces adversarial pre-training, and argues that the value of pre-training extends beyond traditional classification metrics.

Findings

01 Pre-training yields large gains in adversarial robustness.

02 Pre-training improves uncertainty estimation and calibration.

03 Pre-training alone can surpass the state of the art in some robustness tasks.

Abstract

He et al. (2018) have called into question the utility of pre-training by showing that training from scratch can often yield similar performance to pre-training. We show that although pre-training may not improve performance on traditional classification metrics, it improves model robustness and uncertainty estimates. Through extensive experiments on adversarial examples, label corruption, class imbalance, out-of-distribution detection, and confidence calibration, we demonstrate large gains from pre-training and complementary effects with task-specific methods. We introduce adversarial pre-training and show approximately a 10% absolute improvement over the previous state-of-the-art in adversarial robustness. In some cases, using pre-training without task-specific methods also surpasses the state-of-the-art, highlighting the need for pre-training when evaluating future methods on…
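Confidence calibration, one of the settings listed in the abstract, asks whether a model's stated confidence matches how often it is actually right. The following is a minimal sketch of one common way to measure this, the expected calibration error (ECE) with equal-width confidence bins. It is an illustration only: this page does not say which calibration metric the paper reports, and the function name `expected_calibration_error` and the `n_bins` parameter are assumptions made for this example.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Return the ECE of a set of predictions (hypothetical helper).

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct:     booleans, True where the prediction matched the label.
    n_bins:      number of equal-width confidence bins (assumed default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    conf_sum = [0.0] * n_bins   # summed confidence per bin
    hit_sum = [0.0] * n_bins    # number of correct predictions per bin
    count = [0] * n_bins        # number of predictions per bin

    for c, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += 1.0 if ok else 0.0
        count[idx] += 1

    ece = 0.0
    for s_conf, s_hit, cnt in zip(conf_sum, hit_sum, count):
        if cnt == 0:
            continue
        avg_conf = s_conf / cnt
        accuracy = s_hit / cnt
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (cnt / n) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    # Four predictions at 90% confidence, three of them correct.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                     [True, True, True, False]))
```

A lower value means confidence tracks accuracy more closely. Under this metric, a comparison between a pre-trained and a from-scratch model would compute the score for each on the same held-out set.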

Code & Models

Repositories

hendrycks/pre-training (PyTorch, official)

Taxonomy

Topics: Fault Detection and Control Systems