On Leveraging Encoder-only Pre-trained Language Models for Effective Keyphrase Generation
Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang

TL;DR
This paper studies encoder-only pre-trained language models (PLMs) for keyphrase generation (KPG). It shows that they perform competitively and use training data efficiently, and it identifies the architectural choices that work best.
Contribution
The paper investigates encoder-only PLMs for KPG, compares architectures for using them, and identifies prefix-LM fine-tuning as a data-efficient approach that outperforms traditional seq2seq models.
Findings
With encoder-only PLMs, the KPG formulation yields a broader range of keyphrase predictions than keyphrase extraction (KPE) with Conditional Random Fields.
Prefix-LM fine-tuning outperforms seq2seq PLMs.
Deeper models are more effective than wider ones.
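To make the prefix-LM idea concrete, here is a minimal sketch of the attention mask usually associated with prefix-LM fine-tuning: source (document) positions attend to each other bidirectionally, while target (keyphrase) positions attend to the whole source and only to earlier target positions. The function name `prefix_lm_mask` and the 0/1 list-of-lists layout are illustrative assumptions, not the paper's implementation.

```python
def prefix_lm_mask(src_len, tgt_len):
    """Build a prefix-LM attention mask (illustrative helper, not from the paper).

    mask[i][j] == 1 means position i may attend to position j.
    Positions 0..src_len-1 are the source; the rest are the target.
    """
    n = src_len + tgt_len
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j < src_len:
                # Every position (source or target) sees the full source.
                mask[i][j] = 1
            elif i >= src_len and j <= i:
                # Target positions see themselves and earlier target positions only.
                mask[i][j] = 1
    return mask


if __name__ == "__main__":
    for row in prefix_lm_mask(src_len=2, tgt_len=2):
        print(row)
```

With a mask of this shape, a single encoder-only Transformer can act as both the bidirectional encoder and the autoregressive decoder, which is what lets an encoder-only PLM be fine-tuned for generation.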
Abstract
This study addresses the application of encoder-only Pre-trained Language Models (PLMs) in keyphrase generation (KPG) amidst the broader availability of domain-tailored encoder-only models compared to encoder-decoder models. We investigate three core inquiries: (1) the efficacy of encoder-only PLMs in KPG, (2) optimal architectural decisions for employing encoder-only PLMs in KPG, and (3) a performance comparison between in-domain encoder-only and encoder-decoder PLMs across varied resource settings. Our findings, derived from extensive experimentation in two domains, reveal that with encoder-only PLMs, although keyphrase extraction (KPE) with Conditional Random Fields slightly excels in identifying present keyphrases, the KPG formulation renders a broader spectrum of keyphrase predictions. Additionally, prefix-LM fine-tuning of encoder-only PLMs emerges as a strong and data-efficient strategy for KPG,…
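The distinction between present and absent keyphrases can be illustrated with a small sketch: a keyphrase counts as present when its tokens appear contiguously in the document, and absent otherwise. The helper `split_present_absent` is hypothetical, and lowercased whitespace tokenization is a simplifying assumption; evaluation setups often apply further normalization such as stemming before matching.

```python
def split_present_absent(document, keyphrases):
    """Split keyphrases into those found verbatim in the document and the rest.

    Illustrative helper: matching is on lowercased whitespace tokens only.
    """
    doc_tokens = document.lower().split()
    present, absent = [], []
    for phrase in keyphrases:
        kp_tokens = phrase.lower().split()
        n = len(kp_tokens)
        # A phrase is present if its token sequence occurs contiguously.
        found = n > 0 and any(
            doc_tokens[i:i + n] == kp_tokens
            for i in range(len(doc_tokens) - n + 1)
        )
        (present if found else absent).append(phrase)
    return present, absent


if __name__ == "__main__":
    doc = "we study keyphrase generation with pre-trained language models"
    phrases = ["keyphrase generation", "language models", "text summarization"]
    print(split_present_absent(doc, phrases))
```

An extraction model such as a CRF tagger can only label spans of the input, so it is limited to the present group; a generation model can in principle produce phrases from both groups.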
Taxonomy
Topics: Advanced Text Analysis Techniques
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Keypoint Pose Encoding · Sequence to Sequence
