PolitiSky24: U.S. Political Bluesky Dataset with User Stance Labels
Peyman Rostami, Vahid Rahimzadeh, Ali Adibi, Azadeh Shakery

TL;DR
PolitiSky24 is a novel, large-scale dataset for U.S. political stance detection on Bluesky, combining user histories, engagement data, and LLM-generated labels to facilitate holistic political analysis.
Contribution
The paper introduces the first user-level stance detection dataset for the 2024 U.S. election, created with an innovative pipeline using LLMs for transparent labeling.
Findings
Achieves 81% labeling accuracy with LLMs.
Provides comprehensive user posting histories and engagement metadata.
Enables holistic political stance analysis on emerging social platforms.
Abstract
Stance detection identifies the viewpoint expressed in text toward a specific target, such as a political figure. While previous datasets have focused primarily on tweet-level stances from established platforms, user-level stance resources, especially on emerging platforms like Bluesky, remain scarce. User-level stance detection provides a more holistic view by considering a user's complete posting history rather than isolated posts. We present the first stance detection dataset for the 2024 U.S. presidential election, collected from Bluesky and centered on Kamala Harris and Donald Trump. The dataset comprises 16,044 user-target stance pairs enriched with engagement metadata, interaction graphs, and user posting histories. PolitiSky24 was created using a carefully evaluated pipeline combining advanced information retrieval and large language models, which generates stance labels with…
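To make the notion of a user-target stance pair concrete, the following is a minimal Python sketch of how one such record might be represented and tallied per target. The class name, field names, and the label set ("favor", "against", "neutral") are assumptions made for illustration; they are not taken from the released dataset and may not match its actual schema.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


# Hypothetical record layout: field names and labels are illustrative
# assumptions, not the dataset's published schema.
@dataclass
class UserTargetStance:
    user_id: str
    target: str   # e.g. "Kamala Harris" or "Donald Trump"
    stance: str   # assumed label set: "favor", "against", "neutral"
    posts: List[str] = field(default_factory=list)  # the user's posting history


def stance_counts(records: List[UserTargetStance], target: str) -> Counter:
    """Count stance labels among the records that refer to one target."""
    return Counter(r.stance for r in records if r.target == target)


# Toy records, invented purely for the example.
records = [
    UserTargetStance("u1", "Kamala Harris", "favor", ["post a", "post b"]),
    UserTargetStance("u1", "Donald Trump", "against", ["post a", "post b"]),
    UserTargetStance("u2", "Donald Trump", "favor", ["post c"]),
]

trump_counts = stance_counts(records, "Donald Trump")
```

Note that the same user can appear once per target, which is why the unit of labeling is the user-target pair rather than the user or the individual post.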
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts · Computational and Text Analysis Methods
