Scalable Detection of Salient Entities in News Articles

Eliyar Asgarieh; Kapil Thadani; Neil O'Hare

arXiv:2405.20461 · cs.CL · June 3, 2024

TL;DR

This paper presents methods for detecting salient entities in news articles by fine-tuning pretrained transformer models. The models substantially outperform prior work, and knowledge distillation reduces their computational cost without affecting accuracy.

Contribution

The paper introduces straightforward techniques for fine-tuning pretrained transformer models with classification heads that use either entity tags or contextualized entity representations. These techniques outperform prior methods for salient entity detection.
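To make the entity-tag idea concrete, here is a minimal sketch of one way to mark the mentions of a candidate entity in the input text before it is passed to a classifier. The marker strings `[ENT]` and `[/ENT]`, the function name `tag_entity_mentions`, and the example sentence are assumptions for illustration only; the summary above does not specify the tag vocabulary or preprocessing the authors use.

```python
import re
from typing import List, Tuple

# Hypothetical marker strings; the paper's actual tag vocabulary is not given here.
OPEN_TAG = "[ENT]"
CLOSE_TAG = "[/ENT]"


def tag_entity_mentions(text: str, spans: List[Tuple[int, int]]) -> str:
    """Wrap each mention span of the candidate entity in marker tags.

    Spans are half-open [start, end) character offsets into `text`
    and must not overlap.
    """
    pieces = []
    cursor = 0
    for start, end in sorted(spans):
        if start < cursor or end > len(text) or start >= end:
            raise ValueError(f"invalid or overlapping span: {(start, end)}")
        pieces.append(text[cursor:start])  # untouched text before the mention
        pieces.append(f"{OPEN_TAG} {text[start:end]} {CLOSE_TAG}")
        cursor = end
    pieces.append(text[cursor:])  # untouched text after the last mention
    return "".join(pieces)


article = "Acme said on Monday that Acme Mail will change."
# Locate every mention of the candidate entity "Acme" by exact string match.
mention_spans = [m.span() for m in re.finditer(r"Acme", article)]
tagged = tag_entity_mentions(article, mention_spans)
print(tagged)
```

In a setup like this, the tagged string would be tokenized and fed to the transformer once per candidate entity, so the classification head can tell which entity's salience is being predicted.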

Findings

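01. Fine-tuned transformer models outperform previous approaches across datasets with varying sizes and salience definitions.

02. Knowledge distillation reduces the computational cost of these models without affecting their accuracy.

03. Extensive analyses and ablation experiments characterize the behavior of the proposed models.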

Abstract

News articles typically mention numerous entities, a large fraction of which are tangential to the story. Detecting the salience of entities in articles is thus important to applications such as news search, analysis and summarization. In this work, we explore new approaches for efficient and effective salient entity detection by fine-tuning pretrained transformer models with classification heads that use entity tags or contextualized entity representations directly. Experiments show that these straightforward techniques dramatically outperform prior work across datasets with varying sizes and salience definitions. We also study knowledge distillation techniques to effectively reduce the computational cost of these models without affecting their accuracy. Finally, we conduct extensive analyses and ablation experiments to characterize the behavior of the proposed models.
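The knowledge distillation step mentioned in the abstract can be illustrated with a generic teacher-student loss for binary salience labels. This is a sketch of a common distillation formulation, not the authors' training objective: the function names, the `temperature` and `alpha` parameters, and their default values are all assumptions, and the temperature-squared scaling used in some formulations is omitted for brevity.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def binary_cross_entropy(target: float, prob: float, eps: float = 1e-12) -> float:
    """Cross-entropy between a (possibly soft) target and a predicted probability."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    labels: Sequence[int],
    temperature: float = 2.0,  # assumed value, softens both distributions
    alpha: float = 0.5,        # assumed weight on the teacher-matching term
) -> float:
    """Average of a soft (teacher-matching) and hard (gold-label) loss per entity."""
    if not labels:
        raise ValueError("need at least one example")
    total = 0.0
    for s, t, y in zip(student_logits, teacher_logits, labels):
        soft_target = sigmoid(t / temperature)
        soft_loss = binary_cross_entropy(soft_target, sigmoid(s / temperature))
        hard_loss = binary_cross_entropy(float(y), sigmoid(s))
        total += alpha * soft_loss + (1.0 - alpha) * hard_loss
    return total / len(labels)


# Two candidate entities: the first is salient (label 1), the second is not.
teacher = [3.0, -3.0]
agreeing_student = [3.0, -3.0]
disagreeing_student = [-3.0, 3.0]
print(distillation_loss(agreeing_student, teacher, [1, 0]))
print(distillation_loss(disagreeing_student, teacher, [1, 0]))
```

A student whose logits agree with the teacher receives a lower loss than one that disagrees, which is the signal that lets a smaller model approximate a larger one.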

Taxonomy

Topics: Advanced Text Analysis Techniques · Topic Modeling · Sentiment Analysis and Opinion Mining

Methods: Knowledge Distillation