Understanding Self-Training for Gradual Domain Adaptation

Ananya Kumar; Tengyu Ma; Percy Liang

arXiv:2002.11361 · cs.LG · February 27, 2020 · 19 cites

TL;DR

This paper gives a theoretical analysis of self-training for gradual domain adaptation, draws practical insights from it, and demonstrates improved accuracy on datasets whose distributions evolve over time.

Contribution

The paper proves the first non-vacuous upper bound on the error of self-training under gradual shifts and shows that regularization and label sharpening are essential.

Findings

01 Self-training works particularly well for shifts with small Wasserstein-infinity distance.

02 Regularization and label sharpening are essential even with infinite data.

03 Leveraging the gradual shift structure yields higher accuracy on the rotating MNIST and Portraits datasets.
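
To make the first finding concrete, here is a minimal sketch of the Wasserstein-infinity distance in the simplest case. It assumes two one-dimensional empirical samples of equal size, where matching sorted points is an optimal coupling, so the distance is the largest displacement of any single point. The function name `w_inf_1d` and the toy arrays are illustrative and do not come from the paper, whose exact definition of the shift may differ.

```python
import numpy as np

def w_inf_1d(a, b):
    """Wasserstein-infinity distance between two equal-size 1D samples.

    Assumption: in one dimension the sorted (monotone) matching is optimal,
    so the distance is the largest gap between matched points.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch requires equal sample sizes")
    return float(np.max(np.abs(a - b)))

source = np.array([0.0, 1.0, 2.0, 3.0])
shifted = source + 0.25                      # every point moves a little
outlier = np.array([0.0, 1.0, 2.0, 10.0])    # one point moves a lot

small_shift = w_inf_1d(source, shifted)
large_shift = w_inf_1d(source, outlier)
print(small_shift, large_shift)
```

Unlike an average-case distance, this quantity is driven by the single largest displacement, so a shift counts as small only if no point moves far.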

Abstract

Machine learning systems must adapt to data distributions that evolve over time, in applications ranging from sensor networks and self-driving car perception modules to brain-machine interfaces. We consider gradual domain adaptation, where the goal is to adapt an initial classifier trained on a source domain given only unlabeled data that shifts gradually in distribution towards a target domain. We prove the first non-vacuous upper bound on the error of self-training with gradual shifts, under settings where directly adapting to the target domain can result in unbounded error. The theoretical analysis leads to algorithmic insights, highlighting that regularization and label sharpening are essential even when we have infinite data, and suggesting that self-training works particularly well for shifts with small Wasserstein-infinity distance. Leveraging the gradual shift structure leads to…
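
The setting in the abstract can be sketched on toy data. The example below is not the authors' code: it assumes two Gaussian classes whose means rotate by 10 degrees per domain, a linear classifier through the origin trained with L2-regularized logistic loss, and hard pseudo-labels as the form of label sharpening. All function names and hyperparameters (`make_domain`, `fit_regularized`, `lam`, and so on) are hypothetical.

```python
import numpy as np

def make_domain(rng, angle_deg, n=500):
    """Two Gaussian classes at +/- mean, with the mean rotated by angle_deg."""
    theta = np.deg2rad(angle_deg)
    mean = 2.0 * np.array([np.cos(theta), np.sin(theta)])
    y = rng.choice([-1.0, 1.0], size=n)
    X = y[:, None] * mean + 0.5 * rng.standard_normal((n, 2))
    return X, y

def fit_regularized(X, y, lam=0.1, lr=0.5, steps=300):
    """Gradient descent on mean logistic loss plus (lam / 2) * ||w||^2."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        margins = y * (X @ w)
        weights = 0.5 * (1.0 - np.tanh(margins / 2.0))  # equals sigmoid(-margins)
        grad = -(X * (y * weights)[:, None]).mean(axis=0) + lam * w
        w -= lr * grad
    return w

def sharpen(w, X):
    """Label sharpening: replace soft predictions with hard +/-1 pseudo-labels."""
    return np.where(X @ w >= 0, 1.0, -1.0)

def self_train(w, X):
    """One self-training step: pseudo-label X with w, then refit with regularization."""
    return fit_regularized(X, sharpen(w, X))

def accuracy(w, X, y):
    return float(np.mean(sharpen(w, X) == y))

rng = np.random.default_rng(0)
X_src, y_src = make_domain(rng, 0)
# Unlabeled domains at 10, 20, ..., 150 degrees; the last one is the target.
domains = [make_domain(rng, a) for a in np.arange(10, 151, 10)]
X_tgt, y_tgt = domains[-1]  # target labels are used for evaluation only

w_src = fit_regularized(X_src, y_src)

# Direct adaptation: self-train on the target straight from the source model.
w_direct = self_train(w_src, X_tgt)

# Gradual self-training: adapt through each intermediate domain in order.
w_gradual = w_src
for X_t, _ in domains:
    w_gradual = self_train(w_gradual, X_t)

acc_direct = accuracy(w_direct, X_tgt, y_tgt)
acc_gradual = accuracy(w_gradual, X_tgt, y_tgt)
print(f"direct: {acc_direct:.3f}  gradual: {acc_gradual:.3f}")
```

In this toy setup the source classifier mislabels most target points, so direct self-training reinforces the wrong labels, while each 10-degree step is small enough for the pseudo-labels to stay accurate.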

Code & Models

Videos

Understanding Self-Training for Gradual Domain Adaptation · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM · Stochastic Gradient Optimization Techniques

Methods: Gradual Self-Training