Transformers for Headline Selection for Russian News Clusters

Pavel Voropaev; Olga Sopilnyak

arXiv:2106.10487·cs.CL·June 22, 2021

TL;DR

This paper investigates multilingual and Russian pre-trained transformer models for selecting headlines in Russian news clusters. Combining the two kinds of models outperforms either kind alone, reaching 87.28% accuracy on the public test set and 86.60% on the private test set.

Contribution

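The paper combines multilingual and Russian pre-trained transformer models for headline selection in Russian news clusters, and analyzes several ways to obtain sentence embeddings and to learn a ranking model on top of them. The combined approach outperforms each individual model.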

Findings

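01. The combined approach outperforms individual multilingual and monolingual models.

02. The system achieves 87.28% accuracy on the public test set and 86.60% on the private test set.

03. The paper analyzes several ways to obtain sentence embeddings and to learn a ranking model on top of them.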

Abstract

In this paper, we explore various multilingual and Russian pre-trained transformer-based models for the Dialogue Evaluation 2021 shared task on headline selection. Our experiments show that the combined approach is superior to individual multilingual and monolingual models. We present an analysis of a number of ways to obtain sentence embeddings and learn a ranking model on top of them. We achieve the result of 87.28% and 86.60% accuracy for the public and private test sets respectively.
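
To make the idea of "a ranking model on top of sentence embeddings" concrete, here is a minimal sketch, not the authors' implementation. It assumes each headline has already been turned into a fixed-size embedding (here, hand-written two-dimensional toy vectors standing in for transformer outputs) and that training data comes as (better, worse) headline pairs. The linear scorer, the pairwise logistic loss, and all names (`score`, `train_pairwise`, `pairwise_accuracy`, `select_headline`, `TOY_PAIRS`) are illustrative assumptions; the paper's actual models and loss may differ.

```python
import math

# Hypothetical toy data: each pair is (embedding of the preferred headline,
# embedding of the other headline). Real embeddings would come from a
# pre-trained transformer and have hundreds of dimensions.
TOY_PAIRS = [
    ([1.0, 0.2], [0.1, 0.9]),
    ([0.8, 0.1], [0.3, 0.7]),
    ([0.9, 0.4], [0.2, 0.8]),
]


def score(w, x):
    """Linear score of one headline embedding x under weights w."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise(pairs, dim, epochs=200, lr=0.1):
    """Fit w so that score(better) > score(worse), using a pairwise logistic loss."""
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = score(w, diff)
            # Derivative of log(1 + exp(-margin)) with respect to margin.
            g = -1.0 / (1.0 + math.exp(margin))
            # Gradient descent step on w.
            w = [wi - lr * g * di for wi, di in zip(w, diff)]
    return w


def pairwise_accuracy(w, pairs):
    """Fraction of pairs in which the preferred headline gets the higher score."""
    correct = sum(score(w, b) > score(w, c) for b, c in pairs)
    return correct / len(pairs)


def select_headline(w, embeddings):
    """Index of the highest-scoring headline within one news cluster."""
    return max(range(len(embeddings)), key=lambda i: score(w, embeddings[i]))
```

With the toy pairs, `train_pairwise(TOY_PAIRS, dim=2)` yields weights that rank every preferred headline first, and `select_headline` then picks the top-scoring headline from a cluster's embeddings. Accuracy figures such as those in the abstract would be computed in the same spirit as `pairwise_accuracy`, on held-out pairs rather than on the training pairs.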

Code & Models

Repositories

sopilnyak/headline-selection (tf, official)

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems