Bot Meets Shortcut: How Can LLMs Aid in Handling Unknown Invariance OOD Scenarios?

Shiyan Zheng; Herun Wan; Minnan Luo; Junhang Huang

arXiv:2511.08455 · cs.CL · March 24, 2026

TL;DR

This paper investigates how shortcut learning, in which models rely on superficial cues rather than task-relevant features, undermines social bot detectors, and proposes mitigation strategies based on large language models and counterfactual data augmentation to improve robustness.

Contribution

The paper provides an in-depth analysis of shortcut learning in social bot detection and introduces mitigation strategies leveraging large language models and counterfactual data augmentation.

Findings

01

Baseline models show an average relative accuracy drop of 32% under shortcut scenarios.

02

Mitigation strategies improve performance by 56% on average.

03

The proposed methods address both data-level and model-level vulnerabilities.
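The abstract describes the 32% figure as an average relative accuracy drop. The summary does not spell out the formula, so the sketch below assumes the common definition (accuracy lost under the shifted distribution, divided by the original accuracy). The function name and the example accuracies are hypothetical and are not taken from the paper.

```python
def relative_accuracy_drop(acc_original: float, acc_shifted: float) -> float:
    """Fraction of the original accuracy lost under the shifted distribution.

    Assumed definition: (acc_original - acc_shifted) / acc_original.
    """
    if acc_original <= 0:
        raise ValueError("acc_original must be positive")
    return (acc_original - acc_shifted) / acc_original


# Hypothetical numbers, for illustration only: a detector that falls from
# 0.80 to 0.60 accuracy loses one quarter of its original accuracy.
example_drop = relative_accuracy_drop(0.80, 0.60)
```

Under this definition, a relative drop is larger than the corresponding drop in absolute accuracy points whenever the original accuracy is below 100%.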

Abstract

While existing social bot detectors perform well on benchmarks, their robustness across diverse real-world scenarios remains limited due to unclear ground truth and varied misleading cues. In particular, the impact of shortcut learning, where models rely on spurious correlations instead of capturing causal task-relevant features, has received limited attention. To address this gap, we conduct an in-depth study to assess how detectors are influenced by potential shortcuts based on textual features, which are most susceptible to manipulation by social bots. We design a series of shortcut scenarios by constructing spurious associations between user labels and superficial textual cues to evaluate model robustness. Results show that shifts in irrelevant feature distributions significantly degrade social bot detector performance, with an average relative accuracy drop of 32% in the baseline…
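One way to picture a shortcut scenario of the kind the abstract describes is to attach a superficial textual cue to users at label-dependent rates, then reverse those rates at test time. The sketch below is a minimal illustration under stated assumptions, not the authors' construction: the cue `#promo`, the function names, the injection probabilities, and the toy users are all hypothetical.

```python
import random


def inject_cue(text: str, cue: str) -> str:
    """Append a superficial cue to a user's text (hypothetical cue format)."""
    return f"{text} {cue}"


def build_shortcut_split(users, cue, p_bot, p_human, seed=0):
    """Attach `cue` to each user's text with a label-dependent probability.

    users: list of (text, label) pairs, with label 1 = bot and 0 = human.
    p_bot / p_human: probability that the cue is added for each class.
    """
    rng = random.Random(seed)
    result = []
    for text, label in users:
        p = p_bot if label == 1 else p_human
        if rng.random() < p:
            text = inject_cue(text, cue)
        result.append((text, label))
    return result


# Toy data, for illustration only.
users = [("check out this link", 1), ("good morning everyone", 0)] * 50

# Training split: the cue co-occurs only with bots, so it is a tempting shortcut.
train = build_shortcut_split(users, "#promo", p_bot=1.0, p_human=0.0)

# Test split: the association is reversed, so a model that learned the cue fails.
test = build_shortcut_split(users, "#promo", p_bot=0.0, p_human=1.0)
```

A detector trained on `train` and evaluated on `test` would expose how much it leans on the cue rather than on features that actually distinguish bots from humans. Intermediate probabilities (for example 0.9 and 0.1) would give a weaker, noisier association.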
