ControBench: An Interaction-Aware Benchmark for Controversial Discourse Analysis on Social Networks

Ta Thanh Thuy; Jiaqi Zhu; Xuan Liu; Lin Shang; Reihaneh Rabbany; Guillaume Rabusseau; Lihui Chen; Zheng Yilun; Sitao Luan

arXiv:2605.00513 · cs.CL · May 4, 2026

TL;DR

ControBench is a benchmark that combines social interaction graphs with rich text from Reddit to analyze controversial online discourse across ideological divides.

Contribution

It introduces a novel dataset integrating interaction structure with semantic content and ideological labels, enabling advanced analysis of online debates.

Findings

01 Graph neural networks and language models show varied performance across topics.

02 The dataset exhibits low or negative homophily, indicating cross-cutting ideological interactions.

03 Models perform differently when ideological boundaries are ambiguous.
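
To make the homophily finding concrete, here is a minimal sketch of how such a score could be computed on a labeled interaction graph. This summary does not say which homophily measure the authors use. Plain edge homophily lies in [0, 1], so a negative value suggests a chance-corrected variant. The sketch assumes adjusted homophily as one such variant. The node names, the labels "A" and "B", and the toy edges are hypothetical and are not taken from ControBench.

```python
from collections import Counter


def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label."""
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def adjusted_homophily(edges, labels):
    """Edge homophily corrected for the level expected by chance.

    Assumed definition: (h_edge - e) / (1 - e), where e is the sum over
    classes of (total degree of the class / 2|E|) squared. The value can
    be negative when same-label edges are rarer than chance.
    """
    degree_by_class = Counter()
    for u, v in edges:
        degree_by_class[labels[u]] += 1
        degree_by_class[labels[v]] += 1
    total_degree = 2 * len(edges)
    expected = sum((d / total_degree) ** 2 for d in degree_by_class.values())
    return (edge_homophily(edges, labels) - expected) / (1 - expected)


# Hypothetical toy graph: four users, two labels, mostly cross-label replies.
toy_labels = {"u1": "A", "u2": "B", "u3": "A", "u4": "B"}
toy_edges = [("u1", "u2"), ("u3", "u4"), ("u1", "u4"), ("u1", "u3")]

h_edge = edge_homophily(toy_edges, toy_labels)
h_adj = adjusted_homophily(toy_edges, toy_labels)
```

On this toy graph only one of four edges links same-label users, so the edge homophily is low and the adjusted score falls below zero, which is the pattern that "cross-cutting" interactions would produce.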

Abstract

Understanding how people argue across ideological divides online is important for studying political polarization, misinformation, and content moderation. Existing datasets capture only part of this problem: some preserve text but ignore interaction structure, some model structure without rich semantics, and others represent conversations without stable user-level ideological identity. We introduce ControBench, a benchmark for controversial discourse analysis that combines heterogeneous social interaction graphs with rich textual semantics. Built from Reddit discussions on three topics, Trump, abortion, and religion, ControBench contains 7,370 users, 1,783 posts, and 26,525 interactions. The graph contains user and post nodes connected by semantically enriched edges; in particular, user-comment-user edges encode both a reply and the parent comment that it responds to, preserving local…
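
The graph described in the abstract could be represented roughly as follows. This is a hedged sketch of one possible in-memory layout, not the benchmark's actual schema or loader: the class names, field names, and the `cross_label_replies` helper are all hypothetical. It models only what the abstract states, namely user nodes, post nodes, and user-comment-user edges that keep both the reply and the parent comment it responds to.

```python
from dataclasses import dataclass, field


@dataclass
class ReplyEdge:
    """A user-comment-user edge (hypothetical field names)."""
    src_user: str     # user who wrote the reply
    dst_user: str     # user who wrote the parent comment
    reply_text: str   # text of the reply
    parent_text: str  # text of the parent comment being answered


@dataclass
class DiscourseGraph:
    """Toy heterogeneous graph with user and post nodes."""
    users: dict[str, str] = field(default_factory=dict)  # user id -> label
    posts: dict[str, str] = field(default_factory=dict)  # post id -> text
    reply_edges: list[ReplyEdge] = field(default_factory=list)

    def add_reply(self, src, dst, reply_text, parent_text):
        # Both endpoints must already exist as user nodes.
        if src not in self.users or dst not in self.users:
            raise KeyError("both endpoints must be registered users")
        self.reply_edges.append(ReplyEdge(src, dst, reply_text, parent_text))

    def cross_label_replies(self):
        # Replies whose author and target carry different labels.
        return [
            e for e in self.reply_edges
            if self.users[e.src_user] != self.users[e.dst_user]
        ]


# Hypothetical usage with made-up users, labels, and text.
graph = DiscourseGraph()
graph.users.update({"u1": "A", "u2": "B", "u3": "A"})
graph.posts["p1"] = "example post text"
graph.add_reply("u1", "u2", "I disagree because ...", "Here is my view ...")
graph.add_reply("u3", "u1", "Agreed.", "I disagree because ...")
cross = graph.cross_label_replies()
```

Storing the parent text on the edge keeps each reply interpretable on its own, which is the local context the abstract says these edges preserve.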
