Assessing Deanonymization Risks with Stylometry-Assisted LLM Agent

Boyang Zhang; Yang Zhang

arXiv:2602.23079·cs.CL·February 27, 2026

Open Access

TL;DR

This paper presents SALA, a stylometry-assisted LLM analysis framework that assesses and mitigates deanonymization risks in textual data by combining stylometric features with LLM reasoning. It reports high authorship-inference accuracy on news datasets and proposes a rewriting strategy that reduces authorship identifiability.

Contribution

Introduces SALA, an LLM-based method that integrates quantitative stylometric features with LLM reasoning for authorship attribution and privacy protection in textual data.

Findings

01. SALA achieves high inference accuracy on large-scale news datasets.

02. Augmenting SALA with a database improves robustness.

03. Recomposition strategies effectively reduce authorship identifiability.

Abstract

The rapid advancement of large language models (LLMs) has enabled powerful authorship inference capabilities, raising growing concerns about unintended deanonymization risks in textual data such as news articles. In this work, we introduce an LLM agent designed to evaluate and mitigate such risks through a structured, interpretable pipeline. Central to our framework is the proposed SALA (Stylometry-Assisted LLM Analysis) method, which integrates quantitative stylometric features with LLM reasoning for robust and transparent authorship attribution. Experiments on large-scale news datasets demonstrate that SALA, particularly when augmented with a database module, achieves high inference accuracy in various scenarios. Finally, we propose a guided recomposition strategy that leverages the agent's reasoning trace to generate rewriting prompts, effectively reducing…
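To make "quantitative stylometric features" concrete, the following is a minimal sketch of the kind of measurements that could be computed from a text and passed to an LLM as structured evidence. It is an illustration under stated assumptions, not the paper's feature set: the function names `stylometric_features` and `format_for_prompt`, the `FUNCTION_WORDS` list, and the choice of features are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately small function-word list; the paper's actual
# feature set is not given in the abstract.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "but", "however")


def stylometric_features(text: str) -> dict[str, float]:
    """Compute a few simple, common stylometric measurements for a text."""
    # Naive sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased alphabetic tokens (apostrophes kept for contractions).
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        return {}

    counts = Counter(words)
    features = {
        "avg_sentence_len": len(words) / len(sentences),  # words per sentence
        "avg_word_len": sum(len(w) for w in words) / len(words),
        "type_token_ratio": len(counts) / len(words),  # vocabulary richness
        "comma_rate": text.count(",") / len(words),  # commas per word
    }
    # Relative frequency of each function word.
    for fw in FUNCTION_WORDS:
        features[f"fw_{fw}"] = counts[fw] / len(words)
    return features


def format_for_prompt(features: dict[str, float]) -> str:
    """Render features as plain 'name: value' lines for inclusion in a prompt."""
    return "\n".join(
        f"{name}: {value:.3f}" for name, value in sorted(features.items())
    )


if __name__ == "__main__":
    sample = "The cat sat. The dog ran, quickly!"
    print(format_for_prompt(stylometric_features(sample)))
```

In a pipeline of this shape, the rendered feature lines for a query text and for candidate authors' known writing would be placed side by side in the prompt, so the model's attribution reasoning can cite measurable differences rather than impressions alone. How SALA actually combines features with reasoning is described in the paper itself, not here.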

Taxonomy

Topics: Authorship Attribution and Profiling · Topic Modeling · Hate Speech and Cyberbullying Detection