Voice for the Voiceless: Active Sampling to Detect Comments Supporting the Rohingyas
Shriphani Palakodety, Ashiqur R. KhudaBukhsh, Jaime G. Carbonell

TL;DR
This paper develops an AI-based active sampling method to identify supportive comments for the Rohingya community on YouTube, aiming to amplify marginalized voices and improve online safety.
Contribution
It introduces a novel combination of multiple active learning strategies and a nearest-neighbor-based active sampling method, operating in the comment-embedding space, for detecting supportive comments in social media data.
Findings
Constructed a corpus of 263,482 YouTube comments from 113,250 users across 5,153 videos related to the Rohingyas.
Built a classifier that detects comments defending the Rohingyas among larger numbers of disparaging and neutral ones.
Highlighted the potential for AI to support marginalized communities online.
Abstract
The Rohingya refugee crisis is one of the biggest humanitarian crises of modern times with more than 600,000 Rohingyas rendered homeless according to the United Nations High Commissioner for Refugees. While it has received sustained press attention globally, no comprehensive research has been performed on social media pertaining to this large evolving crisis. In this work, we construct a substantial corpus of YouTube video comments (263,482 comments from 113,250 users in 5,153 relevant videos) with an aim to analyze the possible role of AI in helping a marginalized community. Using a novel combination of multiple Active Learning strategies and a novel active sampling strategy based on nearest-neighbors in the comment-embedding space, we construct a classifier that can detect comments defending the Rohingyas among larger numbers of disparaging and neutral ones. We advocate that beyond…
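The nearest-neighbor sampling idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `nearest_neighbor_sample`, the use of cosine similarity, and the toy 2-D vectors are all assumptions made for illustration. Given a few comments already labeled as supportive (the seeds), it picks the unlabeled comments that lie closest to any seed in embedding space as candidates for annotation.

```python
import numpy as np

def nearest_neighbor_sample(embeddings, seed_idx, k=2):
    """Return indices of the k non-seed comments closest (by cosine) to any seed.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Normalize rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sims = unit @ unit[seed_idx].T      # shape: (n_comments, n_seeds)
    best = sims.max(axis=1)             # similarity to the closest seed
    best[seed_idx] = -np.inf            # never re-sample the seeds themselves
    order = np.argsort(-best)           # most similar first
    return order[:k].tolist()

# Toy 2-D "embeddings"; real ones would come from a comment encoder (assumed).
toy = np.array([
    [1.0, 0.0],    # 0: seed comment, already labeled as supportive
    [0.9, 0.1],    # 1
    [0.0, 1.0],    # 2
    [0.8, 0.3],    # 3
    [-1.0, 0.0],   # 4
])
candidates = nearest_neighbor_sample(toy, seed_idx=[0], k=2)
print(candidates)
```

In an active learning loop, the returned candidates would be sent to an annotator and any confirmed positives added to the seed set before the next round. This helps when the target class is rare, as supportive comments are here.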
