Model-Agnostic Sentiment Distribution Stability Analysis for Robust LLM-Generated Texts Detection

Siyuan Li; Xi Lin; Guangyan Li; Zehao Liu; Aodu Wulianghai; Li Ding; Jun Wu; Jianhua Li

arXiv:2508.06913·cs.CL·August 12, 2025

TL;DR

This paper introduces SentiDetect, a model-agnostic framework that detects LLM-generated text by analyzing the stability of its sentiment distribution. It outperforms existing methods, especially under adversarial and paraphrased conditions.

Contribution

The paper presents a novel approach to detecting LLM-generated text based on sentiment distribution stability, demonstrating improved robustness and generalizability over prior lexical and classifier-based methods.

Findings

01 SentiDetect outperforms state-of-the-art baselines in F1 score.

02 It is more robust to paraphrasing and adversarial attacks.

03 It is effective across diverse datasets and multiple LLMs.

Abstract

The rapid advancement of large language models (LLMs) has resulted in increasingly sophisticated AI-generated content, posing significant challenges in distinguishing LLM-generated text from human-written language. Existing detection methods, primarily based on lexical heuristics or fine-tuned classifiers, often suffer from limited generalizability and are vulnerable to paraphrasing, adversarial perturbations, and cross-domain shifts. In this work, we propose SentiDetect, a model-agnostic framework for detecting LLM-generated text by analyzing the divergence in sentiment distribution stability. Our method is motivated by the empirical observation that LLM outputs tend to exhibit emotionally consistent patterns, whereas human-written texts display greater emotional variability. To capture this phenomenon, we define two complementary metrics: sentiment distribution consistency and…
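The intuition in the abstract, that LLM outputs tend to be emotionally consistent while human writing varies more, can be illustrated with a toy sketch. This is not the paper's metric: the lexicon `TOY_LEXICON`, the helper names, and the use of a population standard deviation over per-sentence scores are all assumptions made purely for illustration. A real system would presumably use a trained sentiment model rather than a word list.

```python
from statistics import pstdev

# Hypothetical toy lexicon (an assumption for illustration only);
# not taken from the paper.
TOY_LEXICON = {
    "love": 1.0, "great": 0.8, "good": 0.5, "fine": 0.2,
    "bad": -0.5, "awful": -0.9, "hate": -1.0,
}


def sentence_sentiment(sentence: str) -> float:
    """Average lexicon score of the known words in a sentence (0.0 if none)."""
    words = [w.strip(".,!?;:").lower() for w in sentence.split()]
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def sentiment_variability(text: str) -> float:
    """Population standard deviation of per-sentence sentiment scores.

    A low value means the text is emotionally flat from sentence to sentence;
    a higher value means the sentiment swings more.
    """
    normalized = text.replace("!", ".").replace("?", ".")
    sentences = [s.strip() for s in normalized.split(".") if s.strip()]
    scores = [sentence_sentiment(s) for s in sentences]
    return pstdev(scores) if len(scores) > 1 else 0.0


flat_text = "This is good. That is good. It is good."
varied_text = "I love this. I hate that. It is fine."

print(sentiment_variability(flat_text))
print(sentiment_variability(varied_text))
```

In this sketch the flat text scores lower than the varied one. Any decision threshold on such a score would have to be chosen empirically, and the abstract excerpt here does not specify one.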

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Misinformation and Its Impacts · Topic Modeling