BTPD: A Multilingual Hand-curated Dataset of Bengali Transnational Political Discourse Across Online Communities
Dipto Das, Syed Ishtiaque Ahmed, Shion Guha

TL;DR
This paper introduces BTPD, a hand-curated multilingual dataset of Bengali transnational political discourse collected from three online platforms, addressing the lack of datasets for political analysis in under-resourced languages.
Contribution
The paper presents a hand-curated dataset of Bengali political discourse collected from three online platforms, describes the collection methodology, and gives an overview of the content.
Findings
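The dataset spans three online platforms, each representing distinct community structures and interaction dynamics
The paper provides a general overview of the topics and multilingual content in the dataset
The dataset addresses the unavailability of data for analyzing political discourse in Bengali, a major yet under-resourced language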
Abstract
Understanding political discourse in online spaces is crucial for analyzing public opinion and ideological polarization. While social computing and computational linguistics have explored such discussions in English, such research efforts are significantly limited in major yet under-resourced languages like Bengali due to the unavailability of datasets. In this paper, we present a multilingual dataset of Bengali transnational political discourse (BTPD) collected from three online platforms, each representing distinct community structures and interaction dynamics. Besides describing how we hand-curated the dataset through community-informed keyword-based retrieval, this paper also provides a general overview of its topics and multilingual content.
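The keyword-based retrieval step mentioned in the abstract could look roughly like the sketch below. This is an illustration only, not the authors' pipeline: the keyword list, the function names (`normalize`, `matches_keywords`, `retrieve`), and the sample posts are all hypothetical, and the actual community-informed keywords are not given here. The sketch assumes plain substring matching after Unicode normalization, with matched posts passed on for manual curation.

```python
import unicodedata

# Hypothetical keywords for illustration; the paper's community-informed
# keyword list is not reproduced here.
KEYWORDS = ["নির্বাচন", "রাজনীতি", "election"]


def normalize(text):
    """Apply NFC normalization and case-folding so equivalent strings compare equal."""
    return unicodedata.normalize("NFC", text).casefold()


def matches_keywords(post, keywords=KEYWORDS):
    """Return the keywords that occur as substrings of the post."""
    norm_post = normalize(post)
    return [kw for kw in keywords if normalize(kw) in norm_post]


def retrieve(posts, keywords=KEYWORDS):
    """Keep posts with at least one keyword hit, recording which keywords matched."""
    results = []
    for post in posts:
        hits = matches_keywords(post, keywords)
        if hits:
            results.append({"text": post, "matched": hits})
    return results


if __name__ == "__main__":
    sample_posts = [
        "The Election results are out",
        "আজ নির্বাচন নিয়ে আলোচনা",
        "Nice weather today",
    ]
    for item in retrieve(sample_posts):
        print(item["matched"], item["text"])
```

Substring matching is deliberately permissive: it over-retrieves, which suits a workflow where candidates are reviewed by hand afterwards. Normalizing both sides matters for Bengali text, where the same visible string can be encoded with different code point sequences.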
Taxonomy
Topics: Computational and Text Analysis Methods · Complex Network Analysis Techniques · ICT in Developing Communities
