HYBRINFOX at CheckThat! 2024 - Task 1: Enhancing Language Models with Structured Information for Check-Worthiness Estimation
Géraud Faye, Morgane Casanova, Benjamin Icard, Julien Chanson, Guillaume Gadek, Guillaume Gravier, Paul Égré

TL;DR
This paper presents a method that enhances language models with structured triple embeddings to improve check-worthiness estimation, showing promising results especially in English, and discusses future adaptations to larger models.
Contribution
The paper introduces a novel approach combining language models with structured triple embeddings to improve check-worthiness detection performance.
Findings
Best F1 score of 71.1 on the English evaluation data, ranking 12th out of 27 candidates
Improved performance over language models alone on the development data
More mixed results in Dutch and Arabic
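For readers unfamiliar with the metric behind these findings, the following is a minimal sketch of how an F1 score is computed from prediction counts. It assumes F1 is measured on the positive (check-worthy) class; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 for the positive class from true positives, false positives, false negatives.

    Uses the identity F1 = 2*TP / (2*TP + FP + FN), which equals the
    harmonic mean of precision and recall.
    """
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        # No positive labels and no positive predictions: define F1 as 0.0.
        return 0.0
    return 2 * tp / denominator


# Hypothetical counts, for illustration only (not results from the paper).
example = f1_score(tp=8, fp=2, fn=2)
print(round(example * 100, 1))
```

A score reported as 71.1 corresponds to a value of 0.711 on this 0 to 1 scale, multiplied by 100.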
Abstract
This paper summarizes the experiments and results of the HYBRINFOX team for the CheckThat! 2024 - Task 1 competition. We propose an approach enriching Language Models such as RoBERTa with embeddings produced by triples (subject ; predicate ; object) extracted from the text sentences. Our analysis of the developmental data shows that this method improves the performance of Language Models alone. On the evaluation data, its best performance was in English, where it achieved an F1 score of 71.1 and ranked 12th out of 27 candidates. On the other languages (Dutch and Arabic), it obtained more mixed results. Future research tracks are identified toward adapting this processing pipeline to more recent Large Language Models.
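The general idea of enriching a sentence representation with triple embeddings can be sketched as follows. This is an illustrative sketch only: the abstract does not specify how the two representations are combined, so mean pooling over triples, concatenation, and a logistic output layer are all assumptions, and every name and number below is hypothetical. In practice the sentence vector would come from a language model such as RoBERTa and the triple vectors from a separate triple encoder.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Vector], dim: int) -> Vector:
    """Average a variable number of triple embeddings into one vector.

    Returns a zero vector when no triple was extracted from the sentence.
    """
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def fuse(sentence_vec: Vector, triple_vecs: Sequence[Vector], triple_dim: int) -> Vector:
    """Concatenate the sentence embedding with the pooled triple embedding (assumed fusion)."""
    return list(sentence_vec) + mean_pool(triple_vecs, triple_dim)


def predict_proba(features: Vector, weights: Vector, bias: float) -> float:
    """Logistic classifier over the fused features: probability of 'check-worthy'."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy vectors standing in for a language-model sentence embedding and
# embeddings of two (subject ; predicate ; object) triples.
sentence_vec = [0.2, -0.1, 0.4]
triple_vecs = [[1.0, 0.0], [0.0, 1.0]]

fused = fuse(sentence_vec, triple_vecs, triple_dim=2)
probability = predict_proba(fused, weights=[0.0] * len(fused), bias=0.0)
print(fused, probability)
```

Concatenation keeps the language-model features unchanged, so a classifier trained on the fused vector can fall back to the text-only signal when no triple is extracted.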
Taxonomy
Topics: Topic Modeling · Software Engineering Research · Software Reliability and Analysis Research
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Softmax · WordPiece · Residual Connection · Layer Normalization · Attention Dropout · Linear Warmup With Linear Decay
