TL;DR
This paper investigates fake news detection tailored to South African news websites by curating local datasets, training models, and comparing them with international datasets to improve detection accuracy in local contexts.
Contribution
It introduces a curated South African fake news dataset and analyzes the effectiveness of detection models trained on local versus international data.
Findings
Local datasets improve detection accuracy for South African news.
Combining datasets increases diversity and model robustness.
Interpretable ML reveals writing differences across nations.
Abstract
Disinformation in the form of fake news is an ongoing problem in our society, and it spreads easily through social media. The most cost- and time-effective way to filter such large volumes of data is to combine human and technical interventions to identify it. From a technical perspective, Natural Language Processing (NLP) is widely used to detect fake news. Social media companies use NLP techniques to identify fake news and warn their users, but fake news may still slip through undetected. This is especially a problem in more localised contexts (outside the United States of America). How do we adjust fake news detection systems to work better in local contexts such as South Africa? In this work we investigate fake news detection on South African websites. We curate a dataset of South African fake news and then train detection models. We contrast this with using…
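The local-versus-international comparison described in the abstract can be illustrated with a small cross-dataset evaluation harness. This is only a sketch under stated assumptions, not the paper's method: the classifier is a bag-of-words multinomial Naive Bayes swapped in for illustration, the function names (`train_nb`, `predict`, `cross_dataset_report`) are hypothetical, and the example sentences are made-up placeholders rather than items from any real dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Deliberately simple: lowercase and split on whitespace.
    return text.lower().split()


def train_nb(examples):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"word_counts": word_counts, "label_counts": label_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    total_docs = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["label_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for tok in tokenize(text):
            if tok in model["vocab"]:  # tokens unseen in training are skipped
                score += math.log((counts[tok] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    correct = sum(predict(model, text) == label for text, label in examples)
    return correct / len(examples)


def cross_dataset_report(splits):
    """splits maps a dataset name to (train_examples, test_examples).

    Returns {(train_name, test_name): accuracy} for every pairing, so
    in-domain and cross-domain performance can be read side by side.
    """
    report = {}
    for train_name, (train_set, _) in splits.items():
        model = train_nb(train_set)
        for test_name, (_, test_set) in splits.items():
            report[(train_name, test_name)] = accuracy(model, test_set)
    return report


if __name__ == "__main__":
    # Placeholder sentences invented for this sketch; not real data.
    splits = {
        "local": (
            [("shocking miracle cure revealed", "fake"),
             ("council approves annual budget", "real")],
            [("miracle cure", "fake"), ("annual budget", "real")],
        ),
        "international": (
            [("celebrity secret hoax exposed", "fake"),
             ("senate confirms trade agreement", "real")],
            [("secret hoax", "fake"), ("trade agreement", "real")],
        ),
    }
    for pair, acc in cross_dataset_report(splits).items():
        print(pair, acc)
```

A combined-data condition could be added by including a further entry whose training list concatenates the local and international training lists. With real corpora, the test splits should be held out from training, and a stronger model would normally replace the Naive Bayes stand-in.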
