TOPFORMER: Topology-Aware Authorship Attribution of Deepfake Texts with Diverse Writing Styles
Adaku Uchendu, Thai Le, Dongwon Lee

TL;DR
TopFormer is a topology-aware Transformer model that integrates topological data analysis into authorship attribution of deepfake texts, improving Macro F1 over baseline models by up to 7%.
Contribution
The paper introduces TopFormer, a Transformer model with a topological data analysis (TDA) layer that captures linguistic structure, for attributing deepfake texts to the model that generated them.
Findings
TopFormer outperforms baseline models with up to 7% higher Macro F1 score.
TDA features improve performance on imbalanced and multi-style datasets.
Incorporating TDA enhances the model's ability to detect diverse deepfake texts.
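For readers unfamiliar with the metric named in the findings, the following is a minimal sketch of how Macro F1 is computed in a multi-class setting. It is not taken from the paper: the function name `macro_f1` and the example labels are hypothetical, chosen only to show why Macro F1 is more informative than accuracy when classes are imbalanced.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (every class counts equally)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, imbalanced toy labels (not data from the paper).
y_true = ["human", "human", "human", "human", "gpt2", "ctrl"]
y_pred = ["human", "human", "human", "human", "ctrl", "ctrl"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}, macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this toy example, five of six predictions are correct, yet one minority class is never predicted, so its F1 of 0 pulls the macro average well below the accuracy.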
Abstract
Recent advances in Large Language Models (LLMs) have enabled the generation of open-ended, high-quality texts that are non-trivial to distinguish from human-written texts. We refer to such LLM-generated texts as deepfake texts. There are currently over 72K text generation models in the Hugging Face model repository. As such, users with malicious intent can easily use these open-source LLMs to generate harmful texts and dis/misinformation at scale. To mitigate this problem, a computational method is needed to determine whether a given text is a deepfake text, i.e., a Turing Test (TT). In this work, we investigate the more general version of the problem, known as Authorship Attribution (AA), in a multi-class setting: not only determining whether a given text is a deepfake text, but also pinpointing which LLM is the author. We propose TopFormer to improve…
Taxonomy
Topics: Authorship Attribution and Profiling
Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · WordPiece · Dropout · Dense Connections · Linear Layer · Softmax · Linear Warmup With Linear Decay
