Claim Detection in Biomedical Twitter Posts

Amelie Wührl; Roman Klinger

arXiv:2104.11639·cs.CL·May 4, 2021

TL;DR

This paper introduces a new annotated corpus of biomedical tweets for automatic claim detection, reveals a high claim density (45% of tweets), and shows how challenging it is for baseline models to identify both explicit and implicit claims.

Contribution

It provides the first annotated dataset of biomedical social media claims and evaluates baseline models for claim detection in this domain.

Findings

01

45% of biomedical tweets contain claims

02

Explicit claim detection is more accurate than implicit

03

Baseline models show moderate performance on claim detection

Abstract

Social media contains unfiltered and unique information, which is potentially of great value but, in the case of misinformation, can also do great harm. With regard to biomedical topics, false information can be particularly dangerous. Methods of automatic fact-checking and fake news detection address this problem, but have not yet been applied to the biomedical domain in social media. We aim to fill this research gap and annotate a corpus of 1200 tweets for implicit and explicit biomedical claims (the latter also with span annotations for the claim phrase). With this corpus, which we sample to be related to COVID-19, measles, cystic fibrosis, and depression, we develop baseline models that automatically detect tweets containing a claim. Our analyses reveal that biomedical tweets are densely populated with claims (45% in a corpus sampled to contain 1200 tweets focused on the…
