CAVE: Controllable Authorship Verification Explanations
Sahana Ramnath, Kartik Pandey, Elizabeth Boschee, Xiang Ren

TL;DR
CAVE is an offline authorship verification model that provides controllable, linguistically grounded explanations, improving interpretability and maintaining competitive accuracy in privacy-sensitive applications.
Contribution
The paper introduces CAVE, a novel offline AV model that generates controllable, linguistically grounded explanations and improves interpretability over existing methods.
Findings
CAVE produces high-quality, human-evaluable explanations.
CAVE achieves competitive accuracy on AV datasets.
The approach enhances interpretability in privacy-sensitive domains.
Abstract
Authorship Verification (AV) (do two documents have the same author?) is essential in many real-life applications. AV is often used in privacy-sensitive domains that require an offline proprietary model that is deployed on premises, making publicly served online models (APIs) a suboptimal choice. Current offline AV models, however, have lower downstream utility due to limited accuracy (e.g., traditional stylometry AV systems) and lack of accessible post-hoc explanations. In this work, we address the above challenges by developing a trained, offline model CAVE (Controllable Authorship Verification Explanations). CAVE generates free-text AV explanations that are controlled to be (1) accessible (uniform structure that can be decomposed into sub-explanations grounded to relevant linguistic features), and (2) easily verified for explanation-label consistency. We generate silver-standard training…
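The two control properties named in the abstract (a uniform structure of per-feature sub-explanations, and a checkable agreement between explanation and label) can be illustrated with a minimal Python sketch. The abstract does not say how consistency is actually verified, so the majority-vote rule below is an assumption, and every name here (`SubExplanation`, `majority_label`, `is_consistent`, the `YES`/`NO`/`MAYBE` labels, the feature names) is hypothetical rather than taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SubExplanation:
    """One sub-explanation tied to a single linguistic feature (hypothetical schema)."""
    feature: str    # e.g. "punctuation", "tone"
    rationale: str  # free-text reasoning for this feature
    label: str      # assumed intermediate label: "YES", "NO", or "MAYBE"


def majority_label(subs):
    """Return the majority of decisive ("YES"/"NO") intermediate labels, or None on a tie."""
    counts = Counter(s.label for s in subs if s.label in ("YES", "NO"))
    if counts["YES"] > counts["NO"]:
        return "YES"
    if counts["NO"] > counts["YES"]:
        return "NO"
    return None


def is_consistent(subs, final_label):
    """Assumed rule: the final label must match the majority of intermediate labels."""
    majority = majority_label(subs)
    return majority is not None and majority == final_label


explanation = [
    SubExplanation("punctuation", "Both texts favor ellipses.", "YES"),
    SubExplanation("tone", "Both texts are informal.", "YES"),
    SubExplanation("sentence structure", "One text uses much longer sentences.", "NO"),
    SubExplanation("acronyms", "Too little evidence either way.", "MAYBE"),
]

print(is_consistent(explanation, "YES"))  # final label agrees with the majority
print(is_consistent(explanation, "NO"))   # final label contradicts the majority
```

Because each sub-explanation carries its own label, a check of this kind needs no second model: it only compares the stated final label against the labels already present in the structured explanation.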
Taxonomy
Topics: Authorship Attribution and Profiling
