CoCoLoFa: A Dataset of News Comments with Common Logical Fallacies Written by LLM-Assisted Crowds
Min-Hsuan Yeh, Ruyuan Wan, and Ting-Hao 'Kenneth' Huang

TL;DR
This paper presents CoCoLoFa, a large dataset of news comments annotated for logical fallacies, created through crowdsourcing aided by LLMs, and demonstrates its effectiveness for training fallacy detection models.
Contribution
Introduces CoCoLoFa, the largest known logical fallacy dataset (7,706 comments on 648 news articles), and shows that combining crowdsourcing with LLM assistance improves dataset quality for a complex writing and annotation task.
Findings
BERT-based models fine-tuned on CoCoLoFa achieved F1 scores of 0.86 for fallacy detection and 0.87 for fallacy classification.
Experts rated the writing quality and labeling validity of the dataset as high and reliable.
Combining crowdsourcing with LLMs enhances dataset construction for complex tasks.
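For readers unfamiliar with the metric reported above, F1 is the harmonic mean of precision and recall. The following is a minimal sketch of a binary F1 computation in Python. The function name `f1_score` and the label lists are illustrative assumptions, not data or code from the paper, and this summary does not state which averaging scheme the authors used.

```python
def f1_score(gold, predicted, positive=1):
    """Binary F1 for the `positive` class; returns 0.0 when there are no true positives."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of gold positives that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = comment contains a fallacy, 0 = it does not.
gold = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 1, 0, 0, 1, 1, 0, 1]

# 4 true positives, 1 false positive, 1 false negative:
# precision = recall = 4/5, so F1 is about 0.8.
score = f1_score(gold, predicted)
print(round(score, 2))
```

Because F1 balances precision and recall, it penalizes both a model that flags every comment as fallacious and one that flags almost none.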
Abstract
Detecting logical fallacies in texts can help users spot argument flaws, but automating this detection is not easy. Manually annotating fallacies in large-scale, real-world text data to create datasets for developing and validating detection models is costly. This paper introduces CoCoLoFa, the largest known logical fallacy dataset, containing 7,706 comments for 648 news articles, with each comment labeled for fallacy presence and type. We recruited 143 crowd workers to write comments embodying specific fallacy types (e.g., slippery slope) in response to news articles. Recognizing the complexity of this writing task, we built an LLM-powered assistant into the workers' interface to aid in drafting and refining their comments. Experts rated the writing quality and labeling validity of CoCoLoFa as high and reliable. BERT-based models fine-tuned using CoCoLoFa achieved the highest fallacy…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
