Robustness, Evaluation and Adaptation of Machine Learning Models in the Wild

Vihari Piratla

arXiv:2303.02781·cs.LG·March 7, 2023·1 cites

Open Access

TL;DR

This paper addresses the challenge of deploying reliable machine learning models in real-world scenarios with distribution shifts by proposing algorithms for robustness, evaluation, and adaptation, including domain generalization and label-efficient performance forecasting.

Contribution

It introduces new training algorithms to enhance domain robustness, methods for estimating accuracy under distribution shifts, and lightweight adaptation techniques using unlabeled data.

Findings

01. Improved robustness over standard training in certain settings
02. Proposed accuracy estimation method for distribution shifts
03. Explored lightweight adaptation with unlabeled data in language tasks

Abstract

Our goal is to improve the reliability of Machine Learning (ML) systems deployed in the wild. ML models perform exceedingly well when test examples are similar to training examples. However, real-world applications are required to perform on any distribution of test examples, and current ML systems can fail silently on test examples with distribution shifts. To improve the reliability of ML models under covariate or domain shift, we propose algorithms that enable models to: (a) generalize to a larger family of test distributions, (b) evaluate accuracy under distribution shifts, and (c) adapt to a target distribution. We study causes of impaired robustness to domain shifts and present algorithms for training domain-robust models. A key source of model brittleness is domain overfitting, which our new training algorithms suppress, encouraging domain-general hypotheses instead. While we…

Taxonomy

Topics: Machine Learning and Data Classification · Software Reliability and Analysis Research · Software Engineering Research

Methods: fail · Test