Profiling German Text Simplification with Interpretable Model-Fingerprints

Lars Klöser; Mika Beele; Bodo Kraft

arXiv:2601.13050·cs.CL·January 21, 2026

TL;DR

This paper presents the Simplification Profiler, a diagnostic toolkit that creates interpretable fingerprints of text simplification models, enabling detailed analysis of model behavior and differences across configurations without relying on extensive human ratings.

Contribution

The paper introduces a novel, interpretable fingerprinting approach for analyzing LLM-based text simplification models, especially useful in data-scarce, diverse language contexts.

Findings

01. Fingerprint features can classify model configurations with up to 71.9% accuracy.

02. The profiler distinguishes between prompting strategies and prompt engineering effects.

03. The approach outperforms simple baselines by over 48 percentage points.

Abstract

While Large Language Models (LLMs) produce highly nuanced text simplifications, developers currently lack tools for a holistic, efficient, and reproducible diagnosis of their behavior. This paper introduces the Simplification Profiler, a diagnostic toolkit that generates a multidimensional, interpretable fingerprint of simplified texts. Aggregating the fingerprints of multiple simplifications produced by one model yields that model's fingerprint. This novel evaluation paradigm is particularly vital for languages where the data scarcity problem is magnified when creating flexible models for diverse target groups rather than a single, fixed simplification style. We propose that measuring a model's unique behavioral signature is, in this context, a more relevant alternative to correlating metrics with human preferences. We operationalize this with a practical meta-evaluation of our fingerprints' descriptive power,…
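The aggregation step described in the abstract (per-text fingerprints combined into one model-level fingerprint) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three surface features, the function names `text_features` and `model_fingerprint`, and the use of a plain mean for aggregation are hypothetical and are not taken from the Simplification Profiler itself.

```python
import re
from statistics import mean


def text_features(text: str) -> dict[str, float]:
    """Hypothetical per-text features; NOT the paper's actual feature set."""
    # Naive sentence split on terminal punctuation (an assumption, not a real tokenizer).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    if not words:
        raise ValueError("text must contain at least one word")
    return {
        "words_per_sentence": len(words) / len(sentences),
        "chars_per_word": mean(len(w) for w in words),
        "type_token_ratio": len(set(words)) / len(words),
    }


def model_fingerprint(simplifications: list[str]) -> dict[str, float]:
    """Aggregate per-text features into one vector per model (mean is an assumption)."""
    per_text = [text_features(t) for t in simplifications]
    return {name: mean(f[name] for f in per_text) for name in per_text[0]}


# Toy outputs standing in for one model's simplifications.
outputs = ["Das ist ein Satz. Er ist kurz.", "Der Hund schläft."]
fingerprint = model_fingerprint(outputs)
print(fingerprint)
```

Fingerprints built this way for several model configurations could then be fed to any standard classifier to test how well they separate the configurations, which is the kind of meta-evaluation the findings above report.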

Taxonomy

Topics: Text Readability and Simplification · Authorship Attribution and Profiling · Topic Modeling