Whodunit? Learning to Contrast for Authorship Attribution
Bo Ai, Yuchen Wang, Yugin Tan, Samson Tan

TL;DR
This paper introduces Contra-X, a contrastive learning approach that fine-tunes pre-trained language models to create highly separable author-specific text representations, significantly improving authorship attribution accuracy.
Contribution
It is the first to combine contrastive learning with pre-trained language models for authorship attribution, achieving state-of-the-art results across multiple benchmarks.
Findings
Achieves up to a 6.8% improvement over cross-entropy fine-tuning.
Learns highly separable clusters for different authors.
Improves overall accuracy but may reduce performance for some individual authors.
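The last finding is easiest to see by breaking accuracy down per author. The sketch below is a minimal, hypothetical illustration of that bookkeeping, not code from the paper: the function names and the toy labels are assumptions made up for this example, and the numbers are not results reported by the authors.

```python
from collections import defaultdict


def overall_accuracy(true_authors, predicted_authors):
    """Fraction of texts attributed to the correct author."""
    hits = sum(t == p for t, p in zip(true_authors, predicted_authors))
    return hits / len(true_authors)


def per_author_accuracy(true_authors, predicted_authors):
    """Accuracy computed separately for each true author."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(true_authors, predicted_authors):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    return {author: correct[author] / total[author] for author in total}


def authors_that_regressed(true_authors, baseline_pred, new_pred):
    """Authors whose accuracy is lower under the new model than the baseline."""
    base = per_author_accuracy(true_authors, baseline_pred)
    new = per_author_accuracy(true_authors, new_pred)
    return sorted(author for author in base if new[author] < base[author])


# Toy labels (hypothetical, for illustration only).
truth = ["A"] * 4 + ["B"] * 4 + ["C"] * 2
baseline = ["A", "A", "B", "B", "B", "B", "A", "A", "C", "C"]
improved = ["A"] * 4 + ["B"] * 4 + ["C", "A"]

print(overall_accuracy(truth, baseline), overall_accuracy(truth, improved))
print(authors_that_regressed(truth, baseline, improved))
```

In this toy case the second set of predictions has higher overall accuracy, yet author "C" is attributed less accurately than before, which is the kind of trade-off the finding describes.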
Abstract
Authorship attribution is the task of identifying the author of a given text. The key is finding representations that can differentiate between authors. Existing approaches typically use manually designed features that capture a dataset's content and style, but these approaches are dataset-dependent and yield inconsistent performance across corpora. In this work, we propose learning author-specific representations by fine-tuning pre-trained generic language representations with a contrastive objective (Contra-X). We show that Contra-X learns representations that form highly separable clusters for different authors. It advances the state-of-the-art on multiple human and machine authorship attribution benchmarks, enabling improvements of up to 6.8% over cross-entropy fine-tuning. However, we find that Contra-X improves overall accuracy at the cost of sacrificing performance for…
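The abstract does not spell out the contrastive objective, so the following is only a sketch of one common choice: a supervised contrastive loss in which texts by the same author in a batch act as positives and all other texts act as negatives. The function name, the `temperature` default, and the toy embeddings are assumptions for illustration; the paper's actual loss and how it is combined with cross-entropy may differ.

```python
import numpy as np


def supervised_contrastive_loss(embeddings, author_ids, temperature=0.1):
    """Supervised contrastive loss over one batch (illustrative sketch).

    embeddings: (n, d) array of text representations, e.g. encoder outputs.
    author_ids: length-n sequence of author labels.
    """
    z = np.asarray(embeddings, dtype=float)
    y = np.asarray(author_ids)
    n = len(y)

    not_self = ~np.eye(n, dtype=bool)
    # Positives for each anchor: other texts by the same author.
    positives = (y[:, None] == y[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    has_pos = n_pos > 0
    if not has_pos.any():
        return 0.0  # no same-author pairs in the batch, nothing to contrast

    # Cosine similarity scaled by the temperature.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    sim = np.where(not_self, sim, -np.inf)  # exclude each anchor from its own softmax

    # Numerically stable log-softmax over all other texts in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max + np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denominator

    # Average log-probability of the positives, per anchor that has any.
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[has_pos] / n_pos[has_pos]).mean())


# Hypothetical 2-D embeddings: two tight groups of two texts each.
toy_embeddings = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]]
clustered = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
scattered = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
print(clustered < scattered)
```

The loss is low when same-author texts sit close together and far from other authors, which is what pushes the representations toward separable clusters. In practice a term like this is typically added to the usual cross-entropy classification loss with a weighting factor; that weighting is also an assumption here.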
Taxonomy
Topics: Authorship Attribution and Profiling
Methods: Contrastive Learning
