Psychologically Potent, Computationally Invisible: LLMs Generate Social-Comparison Triggers They Fail to Detect
Hua Zhao, Jiapei Gu, Michelle Mingyue Gu

TL;DR
This paper presents XHS-SCoRE, a benchmark for detecting social comparison triggers in Xiaohongshu posts. It shows that prompt-based detection is unreliable even though the signal is textually learnable in-domain, and it introduces a diagnostic framework.
Contribution
It introduces a novel benchmark and a diagnostic framework for assessing how well LLMs detect social comparison cues in social media text.
Findings
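The comparison signal is textually learnable in-domain, but prompt-based LLM classification does not access it robustly.
Prompted LLM classifiers show stable, interpretable failure modes, especially neutralization of comparison-triggering posts and model-specific directional skew.
LLM-generated Xiaohongshu-style posts can influence perceived social standing even though detection remains fragile.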
Abstract
We introduce Xiaohongshu Social Comparison Reader Elicitation (XHS-SCoRE), a reader-grounded benchmark for detecting whether a text-only Xiaohongshu (RedNote) post elicits UPWARD, DOWNWARD, or NEUTRAL/no clear social comparison from a first-person reader perspective. The task targets a socially meaningful relational signal that is behaviorally real yet not reducible to sentiment. Across prompted LLM classifiers and supervised Chinese encoder baselines, we find a consistent mismatch between generation fluency and reliable detection ability: the signal is textually learnable in-domain, but not robustly accessible to prompt-based classification. Prompted LLM classifiers exhibit stable, interpretable failure modes, especially neutralization of comparison-triggering posts and model-specific directional skew. A controlled pilot further shows that LLM-generated Xiaohongshu-style posts can shift…
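The two failure modes named in the abstract, neutralization and directional skew, can be made concrete with a small sketch. The definitions below are illustrative assumptions, not the paper's metrics: "neutralization rate" is taken here as the share of gold UPWARD or DOWNWARD posts that a classifier labels NEUTRAL, and "directional skew" as the difference between the predicted and gold UPWARD-minus-DOWNWARD balance, normalized by the number of posts. The function name `failure_mode_stats` and the example labels are hypothetical.

```python
from collections import Counter

# Label set taken from the abstract; everything else is an illustrative assumption.
LABELS = ("UPWARD", "DOWNWARD", "NEUTRAL")


def failure_mode_stats(gold, pred):
    """Summarize two assumed failure-mode measures for paired label lists."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    unknown = (set(gold) | set(pred)) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")

    # Neutralization: gold says the post triggers comparison, model says NEUTRAL.
    triggering = [(g, p) for g, p in zip(gold, pred) if g != "NEUTRAL"]
    neutralized = sum(1 for _, p in triggering if p == "NEUTRAL")
    neutralization_rate = neutralized / len(triggering) if triggering else 0.0

    # Directional skew: positive means the model leans UPWARD relative to gold,
    # negative means it leans DOWNWARD.
    gold_counts, pred_counts = Counter(gold), Counter(pred)
    n = len(gold)
    pred_balance = pred_counts["UPWARD"] - pred_counts["DOWNWARD"]
    gold_balance = gold_counts["UPWARD"] - gold_counts["DOWNWARD"]
    directional_skew = (pred_balance - gold_balance) / n if n else 0.0

    return {
        "neutralization_rate": neutralization_rate,
        "directional_skew": directional_skew,
    }


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the benchmark.
    gold = ["UPWARD", "UPWARD", "DOWNWARD", "DOWNWARD", "NEUTRAL", "NEUTRAL"]
    pred = ["UPWARD", "NEUTRAL", "NEUTRAL", "UPWARD", "NEUTRAL", "NEUTRAL"]
    print(failure_mode_stats(gold, pred))
```

In this toy case two of the four comparison-triggering posts are predicted NEUTRAL, and the model never predicts DOWNWARD, so both measures are nonzero. Reporting them per model alongside accuracy would be one way to separate "misses the trigger" from "gets the direction wrong".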
