Teaching LLMs Human-Like Editing of Inappropriate Argumentation via Reinforcement Learning

Timon Ziegenbein; Maja Stahl; Henning Wachsmuth

arXiv:2604.12770 · cs.CL · April 15, 2026

TL;DR

This paper introduces a reinforcement learning method to teach large language models to perform human-like, self-contained edits on argumentative text, improving appropriateness while preserving meaning.

Contribution

The paper presents a novel RL approach that trains LLMs to generate independent, meaning-preserving edits aligned with human editing strategies for argumentation.

Findings

01. Outperforms existing methods in human-like editing quality

02. Achieves argument appropriateness close to full rewriting in multi-round edits

03. Produces self-contained, independent sentence-level edit suggestions
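
To make the idea of independent sentence-level edit suggestions concrete, the following is a minimal sketch and not the paper's implementation. The `SentenceEdit` type, the `apply_edits` helper, and the example sentences are all hypothetical names and data introduced here for illustration. The sketch assumes each suggestion targets exactly one sentence, so that accepting or rejecting one suggestion never affects another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentenceEdit:
    """Hypothetical container for one self-contained edit suggestion."""
    index: int        # position of the target sentence in the argument
    replacement: str  # proposed rewrite of that sentence


def apply_edits(sentences, edits, accepted):
    """Return a new sentence list with only the accepted edits applied.

    `accepted` is a set of sentence indices whose suggestions the user kept.
    Rejected suggestions leave their sentence unchanged.
    """
    result = list(sentences)  # copy so the original argument is untouched
    for edit in edits:
        if edit.index in accepted:
            result[edit.index] = edit.replacement
    return result


# Illustrative data only (not taken from the paper).
argument = [
    "Anyone who disagrees is an idiot.",
    "The policy reduces costs.",
]
suggestions = [
    SentenceEdit(0, "I think the opposing view overlooks key evidence."),
    SentenceEdit(1, "The policy is likely to reduce costs."),
]

# Accept the first suggestion and reject the second.
revised = apply_edits(argument, suggestions, accepted={0})
```

Because each suggestion carries its own target index and full replacement, any subset of suggestions can be applied in any order, which is the property the third finding describes.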

Abstract

Editing human-written text has become a standard use case of large language models (LLMs), for example, to make one's arguments more appropriate for a discussion. Comparing human to LLM-generated edits, however, we observe a mismatch in editing strategies: While LLMs often perform multiple scattered edits and tend to change meaning notably, humans rather encapsulate dependent changes in self-contained, meaning-preserving edits. In this paper, we present a reinforcement learning approach that teaches LLMs human-like editing to improve the appropriateness of arguments. Our approach produces self-contained sentence-level edit suggestions that can be accepted or rejected independently. We train the approach using group relative policy optimization with a multi-component reward function that jointly optimizes edit-level semantic similarity, fluency, and pattern conformity as well as…
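
As a rough illustration of the training signal named in the abstract, the sketch below shows the core of group relative policy optimization: rewards for several sampled outputs of the same input are standardized within the group to obtain advantages. Everything beyond that is an assumption. The abstract does not say how the reward components are combined, so the weighted average in `combined_reward`, the equal weights, the component key names, and the use of the population standard deviation are placeholders chosen here. The truncated component list is deliberately left incomplete.

```python
from statistics import mean, pstdev

# Hypothetical equal weights for the components named in the abstract.
# The source does not state the actual weighting or combination rule.
WEIGHTS = {
    "semantic_similarity": 1.0,
    "fluency": 1.0,
    "pattern_conformity": 1.0,
}


def combined_reward(scores, weights=WEIGHTS):
    """Weighted average of per-component scores (assumed to lie in [0, 1])."""
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of samples for the same input.

    Each advantage is (reward - group mean) / (group std + eps), so samples
    that beat their group average get a positive advantage.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative scores for three sampled edits of one argument (made-up values).
group_scores = [
    {"semantic_similarity": 0.9, "fluency": 0.8, "pattern_conformity": 0.7},
    {"semantic_similarity": 0.6, "fluency": 0.9, "pattern_conformity": 0.3},
    {"semantic_similarity": 0.4, "fluency": 0.5, "pattern_conformity": 0.3},
]
rewards = [combined_reward(s) for s in group_scores]
advantages = group_relative_advantages(rewards)
```

Normalizing within the group removes the need for a separate learned value function as a baseline, which is the main design difference between this family of methods and standard actor-critic policy optimization.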
