Studying Disinformation Narratives on Social Media with LLMs and Semantic Similarity
Chaytan Inman

TL;DR
This thesis introduces tools leveraging large language models and semantic similarity to detect, trace, and analyze disinformation narratives on social media, validated through case studies on political and social issues.
Contribution
It develops a novel continuous similarity measurement tool and a narrative clustering method, integrated into a dashboard for nuanced disinformation analysis.
Findings
Semantic similarity effectively detects nuanced disinformation.
The tracing tool is validated on the GLUE STS-B benchmark, and both tools are further validated on real-world case-study datasets.
Case studies reveal detailed disinformation narrative structures.
Abstract
This thesis develops a continuous-scale measure of similarity to disinformation narratives that can serve both to detect disinformation and to capture the nuanced, partial truths that are characteristic of it. To do so, two tools are developed and their methodologies are documented. The tracing tool takes tweets and a target narrative, rates the similarity of each tweet to the target narrative, and graphs the ratings as a timeline. The second tool, for narrative synthesis, clusters tweets above a similarity threshold and generates the dominant narratives within each cluster. These tools are combined into a Tweet Narrative Analysis Dashboard. The tracing tool is validated on the GLUE STS-B benchmark, and then the two tools are used to analyze two case studies for further empirical validation. The first case study uses the target narrative "The 2020 election was stolen" and analyzes a dataset of Donald Trump's…
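The tracing step described in the abstract (rate each tweet against a target narrative, then plot the ratings over time) can be illustrated with a minimal sketch. This is not the thesis's implementation: the `embed` function below is a toy bag-of-words stand-in for whatever LLM-based embedding the thesis uses, the names `trace` and `above_threshold` are hypothetical, the example tweets are invented for illustration, and the threshold of 0.5 is an arbitrary assumption. The clustering and narrative-generation steps of the synthesis tool are omitted; they would consume the filtered list produced at the end.

```python
import math
import re
from collections import Counter, defaultdict
from datetime import date


def embed(text):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector.

    Assumption: a real system would call an embedding model here instead.
    """
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def trace(tweets, target):
    """Score each (day, text) tweet against the target; average scores per day."""
    target_vec = embed(target)
    scored = [(day, text, cosine(embed(text), target_vec)) for day, text in tweets]
    by_day = defaultdict(list)
    for day, _, score in scored:
        by_day[day].append(score)
    # The timeline maps each day to its mean similarity, in chronological order.
    timeline = {day: sum(s) / len(s) for day, s in sorted(by_day.items())}
    return scored, timeline


def above_threshold(scored, threshold):
    """Keep tweet texts whose similarity meets the threshold (input to clustering)."""
    return [text for _, text, score in scored if score >= threshold]


# Invented example data, for illustration only.
tweets = [
    (date(2020, 11, 4), "the election was stolen"),
    (date(2020, 11, 4), "great weather for golf today"),
    (date(2020, 11, 5), "they stole the 2020 election"),
]
scored, timeline = trace(tweets, "The 2020 election was stolen")
candidates = above_threshold(scored, 0.5)
```

Because the score is continuous rather than a binary label, a partially overlapping tweet receives an intermediate value instead of being forced into "disinformation" or "not disinformation". Note that the toy word-overlap vector cannot recognize paraphrases with no shared vocabulary, which is the gap a semantic embedding is meant to close.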
Taxonomy
Topics: Misinformation and Its Impacts · Computational and Text Analysis Methods · Media Influence and Politics
