Where Does Authorship Signal Emerge in Encoder-Based Language Models?
Francis Kulumba, Guillaume Vimont, Laurent Romary, Florian Cafiero

TL;DR
This paper investigates where authorship signals emerge in encoder-based language models, revealing that the scorer's design influences the layer at which authorship information is consolidated.
Contribution
It demonstrates that the scoring mechanism, not representation quality, determines the emergence of authorship signals in models.
Findings
Authorship signal availability is consistent across layers and models.
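Causal intervention shows that the scorer determines the layer at which authorship signal is consolidated: early to mid layers for mean pooling, later layers for late interaction.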
Different scorers exhibit distinct gradient structures and learning trajectories.
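As a rough illustration of why two scorers can have different gradient structures, the sketch below writes out gradients for textbook versions of each. It is not the paper's derivation. It assumes late interaction means a ColBERT-style MaxSim sum over unnormalized dot products and ignores ties in the max. The symbols h_i (token vectors of one document), g_j (token vectors of the other), n (token count), and s (score) are notation introduced here.

```latex
% Mean pooling: one pooled vector per document (assumed textbook form)
\bar{h} = \frac{1}{n}\sum_{i=1}^{n} h_i, \qquad
\frac{\partial s}{\partial h_i} = \frac{1}{n}\,\frac{\partial s}{\partial \bar{h}}
\quad \text{for every token } i

% Late interaction, assumed to be MaxSim over token pairs
s = \sum_{i=1}^{n} \max_{j}\, h_i^{\top} g_j, \qquad
\frac{\partial s}{\partial h_i} = g_{j^{*}(i)}, \qquad
j^{*}(i) = \arg\max_{j}\, h_i^{\top} g_j
```

Under these assumptions, mean pooling sends every token the same gradient, scaled by 1/n, while MaxSim sends each token a gradient that depends only on its own best match.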
Abstract
Authorship attribution models fine-tuned with the same pretrained encoder, data, and loss can differ four-fold in performance depending only on their scoring mechanism. We use mechanistic interpretability tools to explain this gap. Stylistic features such as word length, punctuation density, and function-word frequency are equally available at every layer in every model, including in an off-the-shelf control encoder, so the gap does not come from representation quality. Instead, causal intervention shows that the scorer determines where the encoder consolidates authorship signal. Mean pooling forces consolidation by early to mid layers, while late interaction defers it to later layers. We further derive this difference from the gradient structure of each scorer, and training dynamics reveal distinct learning trajectories that follow from that difference.
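To make the two scoring mechanisms named in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The function names are hypothetical, cosine similarity is an assumed choice, and late interaction is assumed to be a ColBERT-style MaxSim, averaged over tokens here rather than summed. The token vectors are toy values, not encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pool_score(tokens_a, tokens_b):
    """Average each document's token vectors, then compare the two averages."""
    def mean(tokens):
        n = len(tokens)
        return [sum(column) / n for column in zip(*tokens)]
    return cosine(mean(tokens_a), mean(tokens_b))


def late_interaction_score(tokens_a, tokens_b):
    """Assumed MaxSim: each token of A keeps its best match in B, then average."""
    best = [max(cosine(a, b) for b in tokens_b) for a in tokens_a]
    return sum(best) / len(best)


# Toy token embeddings (hypothetical values): two tokens per document, 2-d each.
doc_a = [[1.0, 0.0], [0.0, 1.0]]
doc_c = [[1.0, 0.0], [1.0, 0.0]]

print(mean_pool_score(doc_a, doc_c))         # about 0.707
print(late_interaction_score(doc_a, doc_c))  # 0.5
```

The two scorers disagree on the same pair of documents. Mean pooling compares one summary vector per document, while the MaxSim variant scores each token of the first document against its closest token in the second.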
Code & Models
- 🤗 Madjakul/deep-stylometry-modernbert-mean (model)
- 🤗 Madjakul/deep-stylometry-modernbert-late-interaction (model)
- 🤗 Madjakul/deep-stylometry-modernbert-layerwise (model)
- 🤗 Madjakul/deep-stylometry-modernbert-pli-wholeword (model)
- 🤗 Madjakul/deep-stylometry-modernbert-pli-ngram2 (model)
- 🤗 Madjakul/deep-stylometry-modernbert-pli-ngram3 (model)
- 🤗 Madjakul/deep-stylometry-modernbert-pli-ngram4 (model)
- 🤗 Madjakul/deep-stylometry-modernbert-pli-ngram5 (model)
