General-to-Specific Transfer Labeling for Domain Adaptable Keyphrase Generation
Rui Meng, Tong Wang, Xingdi Yuan, Yingbo Zhou, Daqing He

TL;DR
This paper introduces a three-stage transfer learning pipeline for keyphrase generation that adapts models to new domains with limited annotated data, improving keyphrase quality and cross-domain transferability.
Contribution
It proposes a novel three-stage domain adaptation process combining domain-general pre-training, transfer labeling, and fine-tuning, enhancing cross-domain keyphrase generation.
Findings
The method produces high-quality keyphrases in new domains.
It achieves consistent improvements with limited in-domain data.
The approach effectively mitigates domain shift issues.
Abstract
Training keyphrase generation (KPG) models requires a large amount of annotated data, which can be prohibitively expensive and is often limited to specific domains. In this study, we first demonstrate that large distribution shifts among different domains severely hinder the transferability of KPG models. We then propose a three-stage pipeline, which gradually guides KPG models' learning focus from general syntactical features to domain-related semantics, in a data-efficient manner. With Domain-general Phrase pre-training, we pre-train Sequence-to-Sequence models with generic phrase annotations that are widely available on the web, which enables the models to generate phrases in a wide range of domains. The resulting model is then applied in the Transfer Labeling stage to produce domain-specific pseudo keyphrases, which help adapt models to a new domain. Finally, we fine-tune the model with…
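The order of the three stages can be illustrated with a small control-flow sketch. This is a toy outline under stated assumptions, not the authors' implementation: `ToyPhraseModel`, `run_pipeline`, the `weight` parameter, and the example data are all hypothetical, and a phrase counter stands in for the Sequence-to-Sequence model. The third stage is assumed to use a small set of in-domain annotations, based on the summary's mention of limited in-domain data; the abstract is truncated at that point.

```python
from collections import Counter
from typing import List, Tuple

Example = Tuple[str, List[str]]  # (document text, target phrases)


class ToyPhraseModel:
    """Hypothetical stand-in for a Sequence-to-Sequence KPG model.

    It only accumulates a weight per phrase seen in training and
    "generates" the known phrases that literally occur in a document.
    """

    def __init__(self) -> None:
        self.phrase_weights: Counter = Counter()

    def train(self, examples: List[Example], weight: float = 1.0) -> None:
        # Each target phrase gains `weight` every time it appears as a label.
        for _, phrases in examples:
            for phrase in phrases:
                self.phrase_weights[phrase.lower()] += weight

    def generate(self, document: str, top_k: int = 3) -> List[str]:
        # Return known phrases found in the document, heaviest first
        # (ties broken alphabetically so the output is deterministic).
        text = document.lower()
        found = [(p, w) for p, w in self.phrase_weights.items() if p in text]
        found.sort(key=lambda pw: (-pw[1], pw[0]))
        return [p for p, _ in found[:top_k]]


def run_pipeline(
    general_data: List[Example],
    unlabeled_domain_docs: List[str],
    in_domain_data: List[Example],
) -> Tuple[ToyPhraseModel, List[Example]]:
    model = ToyPhraseModel()

    # Stage 1: domain-general phrase pre-training on generic annotations.
    model.train(general_data)

    # Stage 2: transfer labeling. The pre-trained model labels unlabeled
    # target-domain documents, and the pseudo labels are used for training.
    pseudo_labeled = [(doc, model.generate(doc)) for doc in unlabeled_domain_docs]
    model.train(pseudo_labeled)

    # Stage 3: fine-tuning on a small amount of annotated in-domain data.
    # The larger weight is an illustrative assumption, not from the paper.
    model.train(in_domain_data, weight=5.0)

    return model, pseudo_labeled


if __name__ == "__main__":
    general = [
        ("Neural networks learn representations.", ["neural networks"]),
        ("Gene expression varies across tissue.", ["gene expression"]),
    ]
    unlabeled = ["We measure gene expression in tumor tissue."]
    annotated = [("Tumor tissue shows altered gene expression.", ["tumor tissue"])]

    model, pseudo = run_pipeline(general, unlabeled, annotated)
    print(pseudo)
    print(model.generate("Gene expression in tumor tissue predicts outcome."))
```

The sketch keeps one model object across all three calls to `train`, mirroring the abstract's description of a single model whose focus is moved gradually from general phrases to domain-specific ones.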
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Text and Document Classification Technologies
