Domain Adaptation from Scratch
Eyal Ben-David, Yftah Ziser, Roi Reichart

TL;DR
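This paper introduces 'domain adaptation from scratch', an NLP setup in which data from source domains is annotated efficiently so that the trained model performs well on a sensitive target domain whose data is unavailable for annotation. It compares data selection, domain adaptation, and active learning approaches for this setup.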
Contribution
It proposes a novel learning setup for privacy-preserving domain adaptation and evaluates multiple approaches to address the domain gap in NLP tasks.
Findings
Data selection and domain adaptation methods reduce the domain gap.
Combining approaches further improves NLP task performance.
Approaches are effective for sentiment analysis and NER.
Abstract
Natural language processing (NLP) algorithms are rapidly improving but often struggle when applied to out-of-distribution examples. A prominent approach to mitigate the domain gap is domain adaptation, where a model trained on a source domain is adapted to a new target domain. We present a new learning setup, "domain adaptation from scratch", which we believe to be crucial for extending the reach of NLP to sensitive domains in a privacy-preserving manner. In this setup, we aim to efficiently annotate data from a set of source domains such that the trained model performs well on a sensitive target domain from which data is unavailable for annotation. Our study compares several approaches for this challenging setup, ranging from data selection and domain adaptation algorithms to active learning paradigms, on two NLP tasks: sentiment analysis and Named Entity Recognition. Our results…
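As a rough illustration of the active learning family mentioned in the abstract, the sketch below ranks unlabeled source-domain examples by predictive entropy and picks the most uncertain ones for annotation under a fixed budget. This is generic uncertainty sampling, not necessarily the procedure used in the paper; the function names, the `budget` parameter, and the toy probabilities are all assumptions made for the example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(pool_probs, budget):
    """Return indices of the `budget` most uncertain pool examples.

    pool_probs: per-example class probabilities that some current model
    assigns to unlabeled source-domain examples (hypothetical input).
    budget: number of examples we can afford to annotate (assumed name).
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:budget]


# Toy binary-sentiment pool: made-up probabilities, not data from the paper.
pool = [
    [0.98, 0.02],  # confident prediction
    [0.55, 0.45],  # uncertain
    [0.70, 0.30],  # moderately uncertain
    [0.50, 0.50],  # maximally uncertain
]
chosen = select_for_annotation(pool, budget=2)
print(chosen)
```

In the setup described above, the pool would hold only source-domain examples, since target-domain data is unavailable for annotation; how the selected examples are then used to train or adapt the model is outside the scope of this sketch.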
Taxonomy
Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Machine Learning in Healthcare
