XAM: Interactive Explainability for Authorship Attribution Models

Milad Alshomary; Anisha Bhatnagar; Peter Zeng; Smaranda Muresan; Owen Rambow; Kathleen McKeown

arXiv:2512.06924 · cs.CL · December 9, 2025

Open Access

TL;DR

IXAM is an interactive framework that lets users explore the embedding space of an authorship attribution model and explain its predictions as sets of writing style features at different levels of granularity. A user evaluation compares it against predefined stylistic explanations.

Contribution

The paper presents IXAM, a novel interactive explainability tool for embedding-based authorship attribution models, enabling dynamic exploration of stylistic features.

Findings

01. A user evaluation shows that IXAM improves interpretability.

02. IXAM provides more detailed explanations than static methods.

03. The framework enhances users' understanding of model predictions.

Abstract

We present IXAM, an Interactive eXplainability framework for Authorship Attribution Models. Given an authorship attribution (AA) task and an embedding-based AA model, our tool enables users to interactively explore the model's embedding space and construct an explanation of the model's prediction as a set of writing style features at different levels of granularity. Through a user evaluation, we demonstrate the value of our framework compared to predefined stylistic explanations.
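For readers unfamiliar with the setting, the following is a minimal sketch of what a prediction from an embedding-based AA model commonly looks like: a query document and each candidate author are represented as vectors, and the nearest candidate is chosen. The function names, the toy vectors, and the cosine-similarity decision rule are all assumptions made for illustration; the abstract does not state how the model underlying IXAM scores candidates.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def attribute_author(query_embedding, candidate_embeddings):
    """Hypothetical helper: score every candidate author against the query
    and return the best-scoring author together with all scores."""
    scores = {
        author: cosine_similarity(query_embedding, embedding)
        for author, embedding in candidate_embeddings.items()
    }
    best_author = max(scores, key=scores.get)
    return best_author, scores


# Toy 3-dimensional "style embeddings" (illustrative values, not real data).
candidates = {
    "author_a": [1.0, 0.0, 0.0],
    "author_b": [0.0, 1.0, 0.0],
    "author_c": [0.0, 0.0, 1.0],
}
query = [0.1, 0.9, 0.2]

predicted, scores = attribute_author(query, candidates)
print(predicted)
```

An explanation tool in this setting would then have to account for why the query lies closest to the predicted author in the embedding space, which is the question the abstract says users explore through writing style features.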

Taxonomy

Topics: Authorship Attribution and Profiling · Topic Modeling · Hate Speech and Cyberbullying Detection