Doge Tickets: Uncovering Domain-general Language Models by Playing Lottery Tickets

Yi Yang; Chen Zhang; Benyou Wang; Dawei Song

arXiv:2207.09638·cs.CL·September 20, 2022

TL;DR

This paper introduces a method called doge tickets to identify domain-general parameters in language models, improving out-of-domain generalization by leveraging the lottery ticket hypothesis.

Contribution

It is the first to propose using lottery tickets to uncover domain-general parameters in pretrained language models for better domain transfer.

Findings

01 Doge tickets improve out-of-domain generalization.

02 Existence of domain-general parameters is supported.

03 Method outperforms several baselines.

Abstract

Over-parameterized models, typically pretrained language models (LMs), have shown an appealing expressive power due to their small learning bias. However, the huge learning capacity of LMs can also lead to large learning variance. In a pilot study, we find that, when faced with multiple domains, a critical portion of parameters behave unexpectedly in a domain-specific manner while others behave in a domain-general one. Motivated by this phenomenon, we for the first time posit that domain-general parameters can underpin a domain-general LM that can be derived from the original LM. To uncover the domain-general LM, we propose to identify domain-general parameters by playing lottery tickets (dubbed doge tickets). In order to intervene in the lottery, we propose a domain-general score, which depicts how domain-invariant a parameter is by associating it with the variance. Comprehensive…
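
The abstract states only that the domain-general score ties a parameter's domain invariance to its variance; it does not give the formula here. As a rough illustration, the sketch below assumes one plausible form: the mean of a per-domain importance value minus a weighted variance across domains, followed by keeping the top-scoring parameters as the "ticket". The function names, the weight `lam`, the domain names, and the importance values are all hypothetical and are not taken from the paper or its repository.

```python
from statistics import mean, pvariance


def domain_general_scores(importance_by_domain, lam=1.0):
    """Score each parameter as mean importance minus lam * variance across domains.

    importance_by_domain maps a domain name to a list holding one importance
    value per parameter. This scoring form is an assumption for illustration.
    """
    per_domain = list(importance_by_domain.values())
    n_params = len(per_domain[0])
    scores = []
    for i in range(n_params):
        values = [domain_values[i] for domain_values in per_domain]
        scores.append(mean(values) - lam * pvariance(values))
    return scores


def keep_mask(scores, keep_ratio):
    """Return a boolean mask that keeps the top keep_ratio fraction of parameters."""
    k = max(1, round(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:k])
    return [i in kept for i in range(len(scores))]


# Hypothetical importance values for three parameters on three domains.
importance = {
    "books": [0.9, 0.5, 0.1],
    "electronics": [0.1, 0.5, 0.1],
    "kitchen": [0.5, 0.5, 0.1],
}

scores = domain_general_scores(importance, lam=5.0)
mask = keep_mask(scores, keep_ratio=2 / 3)
print(mask)  # [False, True, True]
```

In this toy input, the first two parameters have the same mean importance, but the first varies strongly across domains, so it is pruned while the stable second parameter is kept. A plain magnitude- or mean-based lottery would not make that distinction.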

Code & Models

Repositories

ylily1015/dogetickets
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis