One2set + Large Language Model: Best Partners for Keyphrase Generation

Liangying Shao; Liang Zhang; Minlong Peng; Guoqi Ma; Hao Yue; Mingming Sun; Jinsong Su

arXiv:2410.03421·cs.CL·October 22, 2024

TL;DR

This paper proposes a generate-then-select framework that combines a one2set model with a large language model for keyphrase generation, significantly improving performance, especially on absent keyphrases.

Contribution

It introduces a generate-then-select approach that pairs optimal transport assignment on the generator side with sequence labeling on the selector side to improve keyphrase generation accuracy.
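
The optimal transport assignment named above can be illustrated with a generic entropic-OT (Sinkhorn) routine. This is a minimal sketch under assumptions, not the paper's formulation: the cost matrix, the uniform masses, `epsilon`, and the function name `sinkhorn_plan` are all hypothetical. It only shows how a transport plan can spread each ground-truth keyphrase across several generator slots, unlike a strict one-to-one matching.

```python
import math

def sinkhorn_plan(cost, row_mass, col_mass, epsilon=0.5, n_iters=200):
    """Entropic optimal transport via Sinkhorn scaling (pure Python).

    cost[i][j]: hypothetical cost of assigning target i to slot j.
    row_mass / col_mass: how much mass each target / slot carries.
    Returns a plan whose entry [i][j] is the mass moved from i to j.
    """
    n, m = len(cost), len(cost[0])
    # Gibbs kernel: low cost -> large kernel value.
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)]
              for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy setup (made-up numbers): 2 target keyphrases, 4 generator slots.
cost = [
    [0.1, 0.9, 0.2, 0.8],  # target 0 fits slots 0 and 2 best
    [0.9, 0.1, 0.8, 0.3],  # target 1 fits slots 1 and 3 best
]
row_mass = [0.5, 0.5]
col_mass = [0.25, 0.25, 0.25, 0.25]
plan = sinkhorn_plan(cost, row_mass, col_mass)
```

In this toy setup each slot receives most of its mass from the target it is cheapest to match, so every target supervises more than one slot. How the paper defines costs and masses is not stated in this summary.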

Findings

01 Outperforms state-of-the-art models on multiple benchmarks.

02 Significantly improves absent keyphrase prediction.

03 Addresses improper assignment of supervision signals in one2set training and redundant selections by LLM-based selectors.

Abstract

Keyphrase generation (KPG) aims to automatically generate a collection of phrases representing the core concepts of a given document. The dominant paradigms in KPG include one2seq and one2set. Recently, there has been increasing interest in applying large language models (LLMs) to KPG. Our preliminary experiments reveal that it is challenging for a single model to excel in both recall and precision. Further analysis shows that: 1) the one2set paradigm has the advantage of high recall, but suffers from improper assignments of supervision signals during training; 2) LLMs are powerful in keyphrase selection, but existing selection methods often make redundant selections. Given these observations, we introduce a generate-then-select framework decomposing KPG into two steps, where we adopt a one2set-based model as the generator to produce candidates and then use an LLM as the selector to select…
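
The two-step decomposition described in the abstract can be sketched as a small pipeline. This is an illustrative sketch only, not the authors' code: `generate_then_select`, `toy_generator`, and `toy_labeler` are hypothetical names, and the toy labeler is a word-overlap stand-in for an LLM selector. It labels all candidates in a single pass, one keep/drop decision per candidate, in the spirit of a sequence labeling formulation.

```python
import re
from typing import Callable, List

def generate_then_select(
    document: str,
    generator: Callable[[str], List[str]],
    labeler: Callable[[str, List[str]], List[bool]],
) -> List[str]:
    """Step 1: over-generate candidates. Step 2: label each keep/drop."""
    candidates = generator(document)
    # Drop case-insensitive duplicates while preserving order.
    seen, unique = set(), []
    for cand in candidates:
        key = cand.strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(cand.strip())
    labels = labeler(document, unique)
    if len(labels) != len(unique):
        raise ValueError("labeler must return one label per candidate")
    return [cand for cand, keep in zip(unique, labels) if keep]

def toy_generator(document: str) -> List[str]:
    # Stand-in for a high-recall generator: a fixed, over-generated list.
    return ["keyphrase generation", "Keyphrase Generation",
            "large language models", "image segmentation"]

def toy_labeler(document: str, candidates: List[str]) -> List[bool]:
    # Stand-in for an LLM selector: keep a candidate if any of its
    # words occurs in the document (purely illustrative heuristic).
    doc_words = set(re.findall(r"[a-z]+", document.lower()))
    return [any(w in doc_words for w in cand.lower().split())
            for cand in candidates]

document = "We study keyphrase generation with large language models."
selected = generate_then_select(document, toy_generator, toy_labeler)
print(selected)
```

Passing the generator and labeler as callables keeps the recall-oriented and precision-oriented steps separate, so either stand-in could be swapped for a real model without changing the pipeline.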

Code & Models

Repositories

deeplearnxmu/kpg-setllm
PyTorch · Official

Videos

One2Set + Large Language Model: Best Partners for Keyphrase Generation · Underline

Taxonomy

Topics: Advanced Text Analysis Techniques