TADA: Efficient Task-Agnostic Domain Adaptation for Transformers

Chia-Chien Hung; Lukas Lange; Jannik Strötgen

arXiv:2305.12717 · cs.CL · May 23, 2023 · 1 citation

TL;DR

TADA is a modular, parameter-efficient domain adaptation method for transformers that retrains embeddings and tokenizers, enabling effective multi-domain adaptation without additional parameters or complex training.

Contribution

Introduces TADA, a novel task-agnostic domain adaptation approach that retrains embeddings and tokenizers while freezing other parameters, improving efficiency and effectiveness.

Findings

01

TADA performs well across 14 domains and 4 downstream tasks.

02

TADA is more efficient than full domain-adaptive pre-training and adapter-based approaches.

03

TADA requires no additional parameters or complex training steps.

Abstract

Intermediate training of pre-trained transformer-based language models on domain-specific data leads to substantial gains for downstream tasks. To increase efficiency and alleviate the catastrophic forgetting caused by full domain-adaptive pre-training, approaches such as adapters have been developed. However, these require additional parameters for each layer and are criticized for their limited expressiveness. In this work, we introduce TADA, a novel task-agnostic domain adaptation method which is modular, parameter-efficient, and thus, data-efficient. Within TADA, we retrain the embeddings to learn domain-aware input representations and tokenizers for the transformer encoder, while freezing all other parameters of the model. Then, task-specific fine-tuning is performed. We further conduct experiments with meta-embeddings and newly introduced meta-tokenizers, resulting in one model…
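
To make the "retrain the embeddings, freeze everything else" idea concrete, the following is a minimal sketch that counts how many parameters would stay trainable under such a scheme. It is not the authors' implementation. The model sizes are assumed BERT-base-like values (the abstract names no specific model), and the parameter names and the helpers `build_param_shapes`, `trainable_mask`, and `count` are hypothetical. Since the method also retrains tokenizers, the real vocabulary size may differ from the one assumed here.

```python
from math import prod

# Assumed BERT-base-like sizes; illustrative only, not taken from the paper.
VOCAB, HIDDEN, MAX_POS, TYPES, FFN, LAYERS = 30522, 768, 512, 2, 3072, 12


def build_param_shapes():
    """Return {parameter_name: shape} for an encoder with the assumed sizes."""
    shapes = {
        "embeddings.word": (VOCAB, HIDDEN),
        "embeddings.position": (MAX_POS, HIDDEN),
        "embeddings.token_type": (TYPES, HIDDEN),
        "embeddings.layer_norm": (2, HIDDEN),  # scale and bias
    }
    for i in range(LAYERS):
        p = f"encoder.layer{i}."
        for name in ("query", "key", "value", "attn_out"):
            shapes[p + name + ".weight"] = (HIDDEN, HIDDEN)
            shapes[p + name + ".bias"] = (HIDDEN,)
        shapes[p + "attn_layer_norm"] = (2, HIDDEN)
        shapes[p + "ffn_in.weight"] = (HIDDEN, FFN)
        shapes[p + "ffn_in.bias"] = (FFN,)
        shapes[p + "ffn_out.weight"] = (FFN, HIDDEN)
        shapes[p + "ffn_out.bias"] = (HIDDEN,)
        shapes[p + "ffn_layer_norm"] = (2, HIDDEN)
    return shapes


def trainable_mask(shapes, trainable_prefix="embeddings."):
    """Mark a parameter as trainable only if it belongs to the embedding block."""
    return {name: name.startswith(trainable_prefix) for name in shapes}


def count(shapes, mask, trainable):
    """Sum the number of scalar weights whose trainable flag matches."""
    return sum(prod(s) for n, s in shapes.items() if mask[n] == trainable)


shapes = build_param_shapes()
mask = trainable_mask(shapes)
n_train = count(shapes, mask, True)
n_frozen = count(shapes, mask, False)
share = n_train / (n_train + n_frozen)
print(f"trainable (embeddings): {n_train:,}")
print(f"frozen (encoder layers): {n_frozen:,}")
print(f"trainable share: {share:.1%}")
```

In a real framework the same mask would typically be applied by disabling gradient updates for every parameter outside the embedding block before running the domain-specific training objective.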

Code & Models

Repositories

boschresearch/tada
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning