Large Language Models and Forensic Linguistics: Navigating Opportunities and Threats in the Age of Generative AI
George Mikros

TL;DR
Large language models offer new forensic tools but also pose challenges by mimicking styles and generating synthetic texts, requiring forensic linguistics to adapt methods for legal and scientific credibility.
Contribution
The paper analyzes the dual impact of LLMs on forensic linguistics, highlighting limitations of current detection methods and proposing methodological adaptations.
Findings
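LLMs can approximate surface stylistic features but still differ detectably from human writers.
Current AI-text detection methods suffer from high false positive rates, particularly for non-native English writers, and are vulnerable to adversarial strategies such as homoglyph substitution.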
Forensic linguistics must adopt hybrid workflows and explainable detection to remain credible.
Abstract
Large language models (LLMs) present a dual challenge for forensic linguistics. They serve as powerful analytical tools enabling scalable corpus analysis and embedding-based authorship attribution, while simultaneously destabilising foundational assumptions about idiolect through style mimicry, authorship obfuscation, and the proliferation of synthetic texts. Recent stylometric research indicates that LLMs can approximate surface stylistic features yet exhibit detectable differences from human writers, a tension with significant forensic implications. However, current AI-text detection techniques, whether classifier-based, stylometric, or watermarking approaches, face substantial limitations: high false positive rates for non-native English writers and vulnerability to adversarial strategies such as homoglyph substitution. These uncertainties raise concerns under legal admissibility…
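As an illustration of the homoglyph substitution mentioned above, the following minimal Python sketch swaps a few Latin letters for visually similar Cyrillic ones and then flags words that mix scripts. The mapping table and the function names `substitute_homoglyphs` and `mixed_script_words` are assumptions made for this example and do not come from the paper; the script check is a rough heuristic based on Unicode character names, not a method the paper proposes.

```python
import unicodedata

# Illustrative mapping only (an assumption, not from the paper):
# a few Latin letters and their look-alike Cyrillic code points.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
}


def substitute_homoglyphs(text):
    """Replace mapped Latin letters with their Cyrillic look-alikes."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def mixed_script_words(text):
    """Return whitespace-separated words whose letters span more than one script.

    The script is approximated by the first token of the Unicode
    character name (e.g. "LATIN", "CYRILLIC").
    """
    flagged = []
    for word in text.split():
        scripts = {
            unicodedata.name(ch, "UNKNOWN").split()[0]
            for ch in word
            if ch.isalpha()
        }
        if len(scripts) > 1:
            flagged.append(word)
    return flagged


original = "the report was copied"
attacked = substitute_homoglyphs(original)
print(attacked == original)          # the strings differ at the code-point level
print(mixed_script_words(attacked))  # words mixing Latin and Cyrillic letters
```

The altered text looks the same to a reader but differs at the code-point level, which is why token-based detectors can be misled by it. This heuristic misses any word whose letters are all replaced (for example "ace" under this mapping), so it is only a first-pass screen.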
Taxonomy
Topics: Authorship Attribution and Profiling · Hate Speech and Cyberbullying Detection · Computational and Text Analysis Methods
