Reasoning-Enhanced Self-Training for Long-Form Personalized Text Generation

Alireza Salemi; Cheng Li; Mingyang Zhang; Qiaozhu Mei; Weize Kong; Tao Chen; Zhuowan Li; Michael Bendersky; Hamed Zamani

arXiv:2501.04167·cs.CL·January 9, 2025

TL;DR

This paper introduces REST-PG, a framework that improves personalized long-form text generation by training large language models to reason over user data, yielding a 14.5% average performance gain on the LongLaMP benchmark.

Contribution

The paper proposes REST-PG, a reasoning-enhanced self-training method that improves LLMs' ability to generate personalized text by combining reasoning over user data with iterative self-training.

Findings

01

Achieves a 14.5% average performance gain on the LongLaMP benchmark.

02

Demonstrates effectiveness across four diverse personalized text tasks.

03

Outperforms state-of-the-art baselines in personalized text generation.

Abstract

Personalized text generation requires a unique ability of large language models (LLMs) to learn from context that they often do not encounter during their standard training. One way to encourage LLMs to better use personalized context for generating outputs that better align with the user's expectations is to instruct them to reason over the user's past preferences, background knowledge, or writing style. To achieve this, we propose Reasoning-Enhanced Self-Training for Personalized Text Generation (REST-PG), a framework that trains LLMs to reason over personal data during response generation. REST-PG first generates reasoning paths to train the LLM's reasoning abilities and then employs Expectation-Maximization Reinforced Self-Training to iteratively train the LLM based on its own high-reward outputs. We evaluate REST-PG on the LongLaMP benchmark, consisting of four diverse personalized…
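The iterative step described in the abstract, training a model on its own high-reward outputs, can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: `ToyModel`, `overlap_reward`, `self_train`, the threshold, and the sample counts are all hypothetical stand-ins. A real setup would sample reasoning plus a response from an LLM, score it with a proper reward model or metric, and run supervised fine-tuning on the surviving samples.

```python
import random
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # (prompt, sampled output)


class ToyModel:
    """Hypothetical stand-in for an LLM: samples from a fixed candidate pool."""

    def __init__(self, candidates: List[str], seed: int = 0) -> None:
        self.weights: Dict[str, float] = {c: 1.0 for c in candidates}
        self.rng = random.Random(seed)

    def generate(self, prompt: str) -> str:
        # The prompt is ignored in this toy; a real model would condition on it.
        options = list(self.weights)
        probs = [self.weights[o] for o in options]
        return self.rng.choices(options, weights=probs)[0]

    def fine_tune(self, examples: List[Example]) -> None:
        # Toy "training": upweight every output that survived filtering.
        for _prompt, output in examples:
            self.weights[output] += 1.0


def overlap_reward(output: str, reference: str) -> float:
    """Toy reward (assumption): fraction of reference words found in the output."""
    ref_words = set(reference.lower().split())
    if not ref_words:
        return 0.0
    out_words = set(output.lower().split())
    return len(ref_words & out_words) / len(ref_words)


def self_train(
    model: ToyModel,
    data: List[Tuple[str, str]],  # (prompt, reference output) pairs
    iterations: int = 3,
    samples_per_prompt: int = 4,
    threshold: float = 0.5,
) -> ToyModel:
    for _round in range(iterations):
        # Expectation-like step: sample outputs and keep the high-reward ones.
        kept: List[Example] = []
        for prompt, reference in data:
            for _sample in range(samples_per_prompt):
                output = model.generate(prompt)
                if overlap_reward(output, reference) >= threshold:
                    kept.append((prompt, output))
        # Maximization-like step: train on the model's own filtered outputs.
        model.fine_tune(kept)
    return model
```

In this toy, calling `self_train` on a model whose candidate pool contains one output that overlaps the reference and one that does not shifts sampling weight toward the overlapping output, since only outputs at or above the threshold are ever used for training. The sketch simply accumulates updates on one model across rounds, which is a simplification chosen for brevity.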

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: ALIGN