Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models

Di Wu; Wasi Uddin Ahmad; Kai-Wei Chang

arXiv:2310.06374 · cs.CL · October 24, 2023 · 1 citation

TL;DR

This paper systematically analyzes how model selection and decoding strategies impact keyphrase generation with pre-trained seq2seq models, proposing a new decode-select algorithm that enhances performance.

Contribution

The paper provides a comprehensive analysis of model and decoding choices in PLM-based KPG and introduces DeSel, a likelihood-based decode-select algorithm that improves F1 scores over greedy search.
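The decode-select idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `decode_select`, the acceptance rule (a sampled phrase must be at least as likely as the least likely greedy phrase), and the toy log-likelihoods are all assumptions made for illustration. In practice the scores would come from a seq2seq PLM.

```python
from typing import Dict, List


def decode_select(greedy: Dict[str, float],
                  sampled: List[Dict[str, float]]) -> List[str]:
    """Hypothetical decode-select sketch, not the paper's exact procedure.

    greedy:  phrase -> log-likelihood from a single greedy decode.
    sampled: one phrase -> log-likelihood mapping per sampled sequence.
    """
    if not greedy:
        return []

    # Assumed acceptance rule: a sampled phrase must be at least as likely
    # as the least likely phrase that greedy search produced.
    threshold = min(greedy.values())

    # Keep the best log-likelihood seen for each phrase greedy search missed.
    best: Dict[str, float] = {}
    for sample in sampled:
        for phrase, score in sample.items():
            if phrase not in greedy:
                best[phrase] = max(score, best.get(phrase, float("-inf")))

    # Greedy phrases come first, then accepted candidates by descending score.
    selected = sorted((p for p, s in best.items() if s >= threshold),
                      key=lambda p: best[p], reverse=True)
    return list(greedy) + selected


# Toy log-likelihoods, invented for illustration only.
greedy_out = {"keyphrase generation": -0.2, "pre-trained language models": -0.9}
sampled_out = [
    {"keyphrase generation": -0.3, "decoding strategies": -0.7},
    {"model selection": -1.5, "decoding strategies": -0.6},
]
print(decode_select(greedy_out, sampled_out))
```

With these toy values, "decoding strategies" clears the threshold and is appended to the greedy output, while "model selection" is rejected as too unlikely.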

Findings

01

Greedy search achieves strong F1 scores but lags behind sampling-based methods in recall.

02

Merely increasing model size or performing task-specific adaptation is not parameter-efficient.

03

DeSel improves on greedy search by an average of 4.7% in semantic F1 across five datasets.
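The tension between F1 and recall in the findings above can be made concrete with a small worked example. The sketch below assumes exact-match, set-based scoring after lowercasing, which may differ from the paper's evaluation protocol. The helper name `prf1` and all phrases are invented for illustration and are not results from the paper.

```python
from typing import Iterable, Tuple


def prf1(predicted: Iterable[str], gold: Iterable[str]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over sets of keyphrases."""
    pred = {p.strip().lower() for p in predicted}
    ref = {g.strip().lower() for g in gold}
    correct = len(pred & ref)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(ref) if ref else 0.0
    # F1 is the harmonic mean; define it as 0 when nothing matches.
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Toy data, invented for illustration only.
gold = ["keyphrase generation", "decoding", "model selection",
        "pre-trained language models"]
short_output = ["keyphrase generation", "decoding"]
long_output = ["keyphrase generation", "decoding", "model selection",
               "topic modeling", "lstm", "attention"]

for name, output in [("short", short_output), ("long", long_output)]:
    p, r, f = prf1(output, gold)
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this toy case the short, precise output scores the higher F1 (about 0.67 against 0.60), while the longer output recovers more gold phrases (recall 0.75 against 0.50). That is the kind of trade-off a method combining both decoding styles tries to balance.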

Abstract

Keyphrase Generation (KPG) is a longstanding task in NLP with widespread applications. The advent of sequence-to-sequence (seq2seq) pre-trained language models (PLMs) has ushered in a transformative era for KPG, yielding promising performance improvements. However, many design decisions remain unexplored and are often made arbitrarily. This paper undertakes a systematic analysis of the influence of model selection and decoding strategies on PLM-based KPG. We begin by elucidating why seq2seq PLMs are apt for KPG, anchored by an attention-driven hypothesis. We then establish that conventional wisdom for selecting seq2seq PLMs lacks depth: (1) merely increasing model size or performing task-specific adaptation is not parameter-efficient; (2) although combining in-domain pre-training with task adaptation benefits KPG, it does partially hinder generalization. Regarding decoding, we…

Code & Models

Repositories

uclanlp/deepkpg (PyTorch, official)

Taxonomy

Topics: Advanced Text Analysis Techniques · Topic Modeling · Natural Language Processing Techniques

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence