Exploring Fine-tuned Generative Models for Keyphrase Selection: A Case Study for Russian
Anna Glazkova, Dmitry Morozov

TL;DR
This study investigates the application of fine-tuned generative transformer models for keyphrase selection in Russian scientific texts, demonstrating notable in-domain improvements and potential for cross-domain adaptation.
Contribution
It applies four fine-tuned generative models (ruT5, ruGPT, mT5, and mBART) to keyphrase selection in Russian scientific texts and reports in-domain gains over three keyphrase extraction baselines.
Findings
mBART achieved in-domain gains of up to 4.9% in BERTScore, 9.0% in ROUGE-1, and 12.2% in F1-score over three keyphrase extraction baselines for Russian
In-domain performance improved with fine-tuned generative models
Cross-domain results showed potential despite lower performance
Abstract
Keyphrase selection plays a pivotal role within the domain of scholarly texts, facilitating efficient information retrieval, summarization, and indexing. In this work, we explored how to apply fine-tuned generative transformer-based models to the specific task of keyphrase selection within Russian scientific texts. We experimented with four distinct generative models (ruT5, ruGPT, mT5, and mBART) and evaluated their performance in both in-domain and cross-domain settings. The experiments were conducted on the texts of Russian scientific abstracts from four domains: mathematics & computer science, history, medicine, and linguistics. The use of generative models, namely mBART, led to gains in in-domain performance (up to 4.9% in BERTScore, 9.0% in ROUGE-1, and 12.2% in F1-score) over three keyphrase extraction baselines for the Russian language. Although the results for…
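The abstract lists F1-score among its evaluation metrics. As a rough illustration of how a keyphrase-level F1 can be computed, the Python sketch below compares a predicted keyphrase list with a reference list using case-insensitive exact matching. The function name `keyphrase_f1` and the matching rule are assumptions made for illustration; the paper's own evaluation protocol (for example, any lemmatization or cutoff on the number of keyphrases) is not described in this summary.

```python
def keyphrase_f1(predicted, reference):
    """Exact-match F1 between two keyphrase lists (illustrative sketch).

    Assumption: phrases match if they are equal after stripping
    whitespace and lowercasing. Real evaluations often also
    lemmatize or stem, which this sketch does not do.
    """
    pred = {p.strip().lower() for p in predicted if p.strip()}
    ref = {r.strip().lower() for r in reference if r.strip()}

    # With an empty side or no overlap, F1 is defined here as 0.
    if not pred or not ref:
        return 0.0
    matched = len(pred & ref)
    if matched == 0:
        return 0.0

    precision = matched / len(pred)  # share of predictions that are correct
    recall = matched / len(ref)      # share of reference phrases recovered
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical example data, not taken from the paper.
    gold = ["машинное обучение", "ключевые слова", "трансформеры"]
    generated = ["Машинное обучение", "трансформеры", "нейронные сети"]
    print(round(keyphrase_f1(generated, gold), 3))
```

Exact matching is the strictest choice and tends to penalize generative models that produce valid paraphrases, which is one reason semantic metrics such as BERTScore are reported alongside F1.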
Taxonomy
Methods: Attention Is All You Need · Byte Pair Encoding · Gated Linear Unit · SentencePiece · Softmax · Layer Normalization · Adafactor · Inverse Square Root Schedule · Dropout
