Teaching LLMs Human-Like Editing of Inappropriate Argumentation via Reinforcement Learning
Timon Ziegenbein, Maja Stahl, Henning Wachsmuth

TL;DR
This paper introduces a reinforcement learning method to teach large language models to perform human-like, self-contained edits on argumentative text, improving appropriateness while preserving meaning.
Contribution
The paper presents a novel RL approach that trains LLMs to generate independent, meaning-preserving edits aligned with human editing strategies for argumentation.
Findings
Outperforms existing methods in human-like editing quality
Over multiple editing rounds, achieves argument appropriateness close to that of full rewriting
Produces self-contained, independent sentence-level edit suggestions
Abstract
Editing human-written text has become a standard use case of large language models (LLMs), for example, to make one's arguments more appropriate for a discussion. Comparing human to LLM-generated edits, however, we observe a mismatch in editing strategies: While LLMs often perform multiple scattered edits and tend to change meaning notably, humans rather encapsulate dependent changes in self-contained, meaning-preserving edits. In this paper, we present a reinforcement learning approach that teaches LLMs human-like editing to improve the appropriateness of arguments. Our approach produces self-contained sentence-level edit suggestions that can be accepted or rejected independently. We train the approach using group relative policy optimization with a multi-component reward function that jointly optimizes edit-level semantic similarity, fluency, and pattern conformity as well as…
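As a rough illustration of the training signal described in the abstract, the sketch below combines per-edit reward components into one scalar and then computes group-relative advantages in the style of group relative policy optimization (each reward is normalized against the mean and standard deviation of its sampled group). The component names, the equal weights, the example scores, and the helper names `combined_reward` and `group_relative_advantages` are assumptions made for illustration. The abstract is truncated after "as well as", so any further reward components are left out here, and this is not the authors' implementation.

```python
from statistics import mean, pstdev

# Hypothetical equal weights; the actual weighting is not given in this excerpt.
WEIGHTS = {"similarity": 1.0, "fluency": 1.0, "pattern": 1.0}


def combined_reward(components, weights=WEIGHTS):
    """Weighted sum of per-edit reward components (each assumed to lie in [0, 1])."""
    return sum(weights[name] * components[name] for name in weights)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up scores for three candidate edits sampled for the same input sentence.
group = [
    {"similarity": 0.9, "fluency": 0.8, "pattern": 1.0},
    {"similarity": 0.4, "fluency": 0.9, "pattern": 0.0},
    {"similarity": 0.7, "fluency": 0.6, "pattern": 1.0},
]
rewards = [combined_reward(c) for c in group]
advantages = group_relative_advantages(rewards)
```

Candidates that score above their group's average receive a positive advantage and are reinforced, while below-average candidates are discouraged. No separate value model is needed for this normalization step.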
