Neurobiber: Fast and Interpretable Stylistic Feature Extraction
Kenan Alkiek, Anna Wegmann, Jian Zhu, David Jurgens

TL;DR
Neurobiber is a transformer-based system that extracts interpretable stylistic features from text quickly, enabling large-scale stylistic analysis and authorship verification with competitive accuracy.
Contribution
The paper introduces Neurobiber, a fast, interpretable transformer-based tool for stylistic feature extraction grounded in Biber's Multidimensional Analysis (MDA). It is released as open source and is intended to plug into downstream NLP pipelines.
Findings
Up to 56 times faster than existing open-source systems
Replicates classic MDA insights on the CORE corpus
Achieves competitive performance on PAN 2020 authorship verification
Abstract
Linguistic style is pivotal for understanding how texts convey meaning and fulfill communicative purposes, yet extracting detailed stylistic features at scale remains challenging. We present Neurobiber, a transformer-based system for fast, interpretable style profiling built on Biber's Multidimensional Analysis (MDA). Neurobiber predicts 96 Biber-style features from our open-source BiberPlus library (a Python toolkit that computes stylistic features and provides integrated analytics, e.g., PCA and factor analysis). Despite being up to 56 times faster than existing open source systems, Neurobiber replicates classic MDA insights on the CORE corpus and achieves competitive performance on the PAN 2020 authorship verification task without extensive retraining. Its efficient and interpretable representations readily integrate into downstream NLP pipelines, facilitating large-scale stylometric…
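To make the idea of comparing stylistic feature vectors concrete, here is a minimal sketch of how two 96-dimensional feature vectors could be compared for an authorship verification decision. It is not the paper's method and does not call Neurobiber or BiberPlus: the vectors are synthetic, and the names `cosine_similarity`, `same_author`, and the `threshold` value are assumptions chosen for illustration only.

```python
import math
import random

NUM_FEATURES = 96  # number of Biber-style features stated in the abstract


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as having no measurable similarity
    return dot / (norm_a * norm_b)


def same_author(a, b, threshold=0.9):
    """Hypothetical decision rule: same author if similarity reaches the threshold."""
    return cosine_similarity(a, b) >= threshold


# Synthetic stand-ins for extracted feature vectors (assumption: real vectors
# would come from a feature extractor, which is not modeled here).
rng = random.Random(0)
doc_a = [rng.random() for _ in range(NUM_FEATURES)]
doc_b = [x + rng.gauss(0, 0.05) for x in doc_a]      # stylistically close to doc_a
doc_c = [rng.random() for _ in range(NUM_FEATURES)]  # unrelated style

print(cosine_similarity(doc_a, doc_b))
print(cosine_similarity(doc_a, doc_c))
```

In practice the threshold would be tuned on held-out pairs, and a learned classifier over the feature vectors may work better than a fixed similarity cutoff.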
Taxonomy
Topics: Authorship Attribution and Profiling · Topic Modeling · Text Readability and Simplification
