Learning Interpretable Style Embeddings via Prompting LLMs

Ajay Patel; Delip Rao; Ansh Kothary; Kathleen McKeown; Chris Callison-Burch

arXiv:2305.12696 · cs.CL · October 11, 2023 · 1 citation

TL;DR

This paper introduces LISA embeddings, a novel approach using prompting to generate interpretable style representations from text, enabling more transparent stylometry analysis and authorship attribution.

Contribution

The paper presents a new prompting-based method to create large synthetic stylometry datasets and train human-interpretable style embeddings called LISA.

Findings

1. Generated a large synthetic stylometry dataset.
2. Created interpretable style embeddings that improve transparency.
3. Released resources for further research.

Abstract

Style representation learning builds content-independent representations of author style in text. Stylometry, the analysis of style in text, is often performed by expert forensic linguists, and no large dataset of stylometric annotations exists for training. Current style representation learning uses neural methods to disentangle style from content to create style vectors. However, these approaches result in uninterpretable representations, complicating their usage in downstream applications like authorship attribution, where auditing and explainability are critical. In this work, we use prompting to perform stylometry on a large number of texts to create a synthetic dataset and train human-interpretable style representations we call LISA embeddings. We release our synthetic stylometry dataset and our interpretable style models as resources.
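
To make the idea of a human-interpretable style vector concrete, here is a minimal sketch, not the authors' implementation. It assumes each embedding dimension is tied to a named style attribute and holds a score between 0 and 1. The attribute names, the scores, and the `explain_similarity` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical style attributes, one per embedding dimension.
# These names are illustrative assumptions, not taken from the paper.
ATTRIBUTES = [
    "uses contractions",
    "formal tone",
    "frequent exclamation marks",
    "long sentences",
]


def explain_similarity(vec_a, vec_b, attributes, top_k=2):
    """Return the top_k (attribute, contribution) pairs of a dot product.

    Because every dimension has a name, the similarity score can be
    broken down into per-attribute contributions a person can audit.
    """
    contributions = [
        (name, a * b) for name, a, b in zip(attributes, vec_a, vec_b)
    ]
    contributions.sort(key=lambda pair: pair[1], reverse=True)
    return contributions[:top_k]


# Made-up attribute scores for two texts (assumed to lie in [0, 1]).
text_a = [0.9, 0.1, 0.8, 0.2]
text_b = [0.8, 0.2, 0.7, 0.1]

# Overall similarity is the plain dot product of the two vectors.
score = sum(a * b for a, b in zip(text_a, text_b))

# The attributes that contribute most to that similarity.
top = explain_similarity(text_a, text_b, ATTRIBUTES)
for name, contribution in top:
    print(f"{name}: {contribution:.2f}")
```

With an opaque neural style vector, only the overall score would be available. With named dimensions, an authorship attribution decision can be traced back to the specific style attributes that drove it.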

Taxonomy

Topics: Authorship Attribution and Profiling · Text Readability and Simplification · Natural Language Processing Techniques