ICPR 2024 Competition on Multilingual Claim-Span Identification
Soham Poddar, Biswajit Paul, Moumita Basu, Saptarshi Ghosh

TL;DR
This paper introduces a competition focused on identifying claim spans within social media posts in multiple languages, utilizing a new dataset and showcasing various participant solutions to advance claim detection methods.
Contribution
It presents a new multilingual dataset for claim span identification and provides an overview of innovative solutions developed by participants in the competition.
Findings
Effective models achieved high claim span detection accuracy.
Multilingual approaches improved claim identification across languages.
The dataset facilitated benchmarking of state-of-the-art methods.
Abstract
A lot of claims are made in social media posts, which may contain misinformation or fake news. Hence, it is crucial to identify claims as a first step towards claim verification. Given the huge number of social media posts, the task of identifying claims needs to be automated. This competition deals with the task of 'Claim Span Identification' in which, given a text, parts / spans that correspond to claims are to be identified. This task is more challenging than the traditional binary classification of text into claim or not-claim, and requires state-of-the-art methods in Pattern Recognition, Natural Language Processing and Machine Learning. For this competition, we used a newly developed dataset called HECSI containing about 8K posts in English and about 8K posts in Hindi with claim-spans marked by human annotators. This paper gives an overview of the competition, and the solutions…
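To make the task statement concrete, here is a minimal sketch of how claim spans could be represented and scored. It rests on several assumptions the abstract does not confirm: whitespace tokenization, binary per-token labels (1 = inside a claim span), and token-level F1 as the score. The example sentence and the function names are hypothetical and are not taken from the HECSI dataset or from the competition's evaluation code.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def tokens_with_offsets(text: str) -> List[Tuple[str, int, int]]:
    """Whitespace tokenization (an assumption) that keeps character offsets."""
    tokens = []
    pos = 0
    for tok in text.split():
        start = text.index(tok, pos)
        end = start + len(tok)
        tokens.append((tok, start, end))
        pos = end
    return tokens


def spans_to_labels(text: str, spans: List[Span]) -> List[int]:
    """Label a token 1 if it overlaps any claim span, else 0."""
    labels = []
    for _, start, end in tokens_with_offsets(text):
        inside = any(start < s_end and end > s_start for s_start, s_end in spans)
        labels.append(1 if inside else 0)
    return labels


def token_f1(gold: List[int], pred: List[int]) -> float:
    """Token-level F1 over the claim class (illustrative metric only)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def find_span(text: str, phrase: str) -> Span:
    """Helper for the example: character span of a phrase within the text."""
    start = text.index(phrase)
    return (start, start + len(phrase))


# Hypothetical post; only part of it is a claim.
text = "Wow! Drinking hot water cures the flu, says my uncle."
gold_labels = spans_to_labels(text, [find_span(text, "Drinking hot water cures the flu")])
pred_labels = spans_to_labels(text, [find_span(text, "hot water cures the flu, says")])
score = token_f1(gold_labels, pred_labels)
```

A binary post-level classifier would only mark this whole post as "claim". The span formulation also has to locate where the claim starts and ends, so a prediction that is off by a token at either boundary is penalized.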
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
