GateNLP-UShef at SemEval-2022 Task 8: Entity-Enriched Siamese Transformer for Multilingual News Article Similarity
Iknoor Singh, Yue Li, Melissa Thong, Carolina Scarton

TL;DR
This paper presents a multilingual news article similarity system using an entity-enriched Siamese Transformer that captures narrative, entities, location, and time to assess how different outlets report the same events.
Contribution
The paper introduces an entity-enriched Siamese Transformer architecture that combines document-level narrative representations with auxiliary entity-based features to score the similarity of news articles.
Findings
Achieved second place on the SemEval-2022 Task 8 leaderboard.
Demonstrated the effectiveness of combining narrative and entity-based features.
Validated the approach through detailed ablation studies.
Abstract
This paper describes the second-placed system on the leaderboard of SemEval-2022 Task 8: Multilingual News Article Similarity. We propose an entity-enriched Siamese Transformer which computes news article similarity based on different sub-dimensions, such as the shared narrative, entities, location and time of the event discussed in the news article. Our system exploits a Siamese network architecture using a Transformer encoder to learn document-level representations for the purpose of capturing the narrative together with the auxiliary entity-based features extracted from the news articles. The intuition behind using all these features together is to capture the similarity between news articles at different granularity levels and to assess the extent to which different news outlets write about "the same events". Our experimental results and detailed ablation study demonstrate the…
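The overall shape described in the abstract (one shared encoder applied to both articles, plus auxiliary entity-based features) can be illustrated with a toy sketch. This is not the authors' implementation: the hashed bag-of-words `encode` function is a stand-in for a Transformer encoder, the Jaccard overlap features, the dictionary keys (`text`, `entities`, `locations`, `dates`), and the uniform weights are all assumptions made for illustration, and a real system would learn how to combine the features.

```python
import math
import re
import zlib


def encode(text, dim=64):
    """Stand-in document encoder (assumption): hashed bag-of-words, L2-normalised.

    In the system described above this role is played by a Transformer encoder.
    """
    vec = [0.0] * dim
    for tok in re.findall(r"\w+", text.lower()):
        # crc32 gives a hash that is stable across Python processes.
        vec[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(u, v):
    """Cosine similarity for vectors that are already unit-length."""
    return sum(a * b for a, b in zip(u, v))


def jaccard(a, b):
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_features(art1, art2):
    """Per-dimension similarity for two articles (hypothetical dict layout)."""
    # The same encoder is applied to both inputs: this is the Siamese part.
    e1, e2 = encode(art1["text"]), encode(art2["text"])
    return {
        "narrative": cosine(e1, e2),
        "entities": jaccard(art1["entities"], art2["entities"]),
        "location": jaccard(art1["locations"], art2["locations"]),
        "time": jaccard(art1["dates"], art2["dates"]),
    }


def overall_score(features, weights=None):
    """Weighted sum of the features; uniform weights are an assumption."""
    weights = weights or {k: 1.0 / len(features) for k in features}
    return sum(weights[k] * features[k] for k in features)
```

Calling `similarity_features(a, b)` on two article dictionaries returns one score per sub-dimension, and `overall_score` collapses them into a single number. Keeping the sub-dimensions separate until the last step makes it easy to drop one of them and observe the effect, which is the idea behind an ablation study.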
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Softmax · Dense Connections · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Multi-Head Attention · Absolute Position Encodings · Dropout
