Exploring the Design of Adaptation Protocols for Improved Generalization and Machine Learning Safety

Puja Trivedi; Danai Koutra; Jayaraman J. Thiagarajan

arXiv:2207.12615 · cs.LG · July 27, 2022

TL;DR

This paper investigates how different adaptation protocols for large-scale pretrained models affect out-of-distribution generalization and safety metrics, revealing trade-offs and proposing strategies to mitigate them through data augmentation.

Contribution

It systematically evaluates adaptation protocols across various distribution shifts and safety metrics, highlighting their trade-offs and proposing augmentation-based methods for improvement.
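Calibration is one of the safety metrics the abstract names. As a rough illustration of how such a metric can be computed, the sketch below implements a common equal-width-bin version of expected calibration error (ECE). The function name, the bin count, and the binning convention are assumptions made for this example; the page does not state which calibration measure or settings the paper uses.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - mean confidence| per bin.

    Hypothetical helper for illustration only.
    confidences: predicted-class probabilities in [0, 1]
    correct:     booleans, True where the prediction was right
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    conf_sum = [0.0] * n_bins  # summed confidence per bin
    hit_sum = [0.0] * n_bins   # number of correct predictions per bin
    count = [0] * n_bins       # number of predictions per bin

    for c, ok in zip(confidences, correct):
        # Bin k covers [k/n_bins, (k+1)/n_bins); a confidence of 1.0
        # is folded into the last bin.
        k = min(int(c * n_bins), n_bins - 1)
        conf_sum[k] += c
        hit_sum[k] += 1.0 if ok else 0.0
        count[k] += 1

    ece = 0.0
    for k in range(n_bins):
        if count[k] == 0:
            continue  # empty bins contribute nothing
        accuracy = hit_sum[k] / count[k]
        mean_conf = conf_sum[k] / count[k]
        ece += (count[k] / n) * abs(accuracy - mean_conf)
    return ece
```

A lower value means the model's stated confidence tracks its actual accuracy more closely; a model that is always 90% confident but only 75% accurate would score about 0.15 under this definition.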

Findings

01

Protocols induce different trade-offs in generalization and safety.

02

Pairing data augmentation with protocols can reduce trade-offs.

03

Hardness-promoting augmentations during adaptation improve robustness.

Abstract

While directly fine-tuning (FT) large-scale, pretrained models on task-specific data is well known to induce strong in-distribution task performance, recent works have demonstrated that different adaptation protocols, such as linear probing (LP) prior to FT, can improve out-of-distribution generalization. However, the design space of such adaptation protocols remains under-explored and the evaluation of such protocols has primarily focused on distribution shifts. Therefore, in this work, we evaluate common adaptation protocols across distribution shifts and machine learning safety metrics (e.g., anomaly detection, calibration, robustness to corruptions). We find that protocols induce disparate trade-offs that were not apparent from prior evaluation. Further, we demonstrate that appropriate pairing of data augmentation and protocol can substantially mitigate this trade-off. Finally, we…
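To make the difference between the protocols concrete, here is a deliberately tiny sketch of FT versus LP followed by FT. It is not the paper's setup: the "model" is a single pretrained feature weight `w` and a single head weight `v`, trained with plain gradient descent on a squared error, and every name, step count, and learning rate is a hypothetical choice for illustration.

```python
def train(w, v, data, steps, lr, update_features):
    """Gradient descent on mean squared error for the toy model y_hat = v * (w * x).

    w stands in for the pretrained feature extractor, v for the task head.
    With update_features=False only the head is trained (linear probing);
    with update_features=True both are trained (fine-tuning).
    """
    n = len(data)
    for _ in range(steps):
        grad_w = 0.0
        grad_v = 0.0
        for x, y in data:
            err = v * w * x - y
            grad_v += 2.0 * err * w * x / n
            grad_w += 2.0 * err * v * x / n
        v -= lr * grad_v
        if update_features:
            w -= lr * grad_w
    return w, v


def fine_tune(w_pretrained, data, steps=200, lr=0.05):
    """FT only: start from an untrained (zero) head and update everything."""
    return train(w_pretrained, 0.0, data, steps, lr, update_features=True)


def lp_then_ft(w_pretrained, data, lp_steps=200, ft_steps=200, lr=0.05):
    """LP prior to FT: fit the head on frozen features, then update everything."""
    w, v = train(w_pretrained, 0.0, data, lp_steps, lr, update_features=False)
    return train(w, v, data, ft_steps, lr, update_features=True)


# Toy task (assumed): the target is 3 * x, the "pretrained" feature weight is 1.5.
toy_data = [(x, 3.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]
w_ft, v_ft = fine_tune(1.5, toy_data)
w_lpft, v_lpft = lp_then_ft(1.5, toy_data)
```

Both runs fit the toy task, but in this example the feature weight drifts noticeably from its pretrained value under plain FT, while under LP followed by FT it stays close to it because the head is already fitted when the features are unfrozen. This only illustrates the mechanics of the two protocols and is not a result from the paper.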

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)