Learning to Reason for Multi-Step Retrieval of Personal Context in Personalized Question Answering

Maryam Amirizaniani; Alireza Salemi; Hamed Zamani

arXiv:2602.19317·cs.CL·February 24, 2026

TL;DR

The paper introduces PR2, a reinforcement learning framework for personalized question answering that learns when to retrieve from a user's profile and what to retrieve, so that answers are better reasoned and better aligned with the user.

Contribution

PR2 is a novel reinforcement learning approach that optimizes retrieval and reasoning policies for personalized QA, improving over existing retrieval-augmented methods.

Findings

01. PR2 outperforms baselines on LaMP-QA by 8.8% to 12%.

02. Adaptive retrieval-reasoning policies improve personalization accuracy.

03. Multi-turn reasoning trajectories are optimized for user-specific preferences.

Abstract

Personalization in Question Answering (QA) requires answers that are both accurate and aligned with users' background, preferences, and historical context. Existing state-of-the-art methods primarily rely on retrieval-augmented generation (RAG) solutions that construct personal context by retrieving relevant items from the user's profile. Existing methods use the user's query directly to retrieve personal documents, and such strategies often lead to surface-level personalization. We propose PR2 (Personalized Retrieval-Augmented Reasoning), a reinforcement learning framework that integrates reasoning and retrieval from personal context for personalization. PR2 learns adaptive retrieval-reasoning policies, determining when to retrieve, what evidence to retrieve from user profiles, and how to incorporate it into intermediate reasoning steps. By optimizing multi-turn reasoning trajectories…
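
The interleaving of reasoning and retrieval described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Action`, `retrieve`, `run_episode`, and `stub_policy` are hypothetical, the retriever is a toy word-overlap ranker, and the policy is a hand-written rule standing in for the learned one. The reinforcement learning step that would train the policy is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Action:
    kind: str  # "search" (query the profile) or "answer" (stop and respond)
    text: str  # the search query, or the final answer


def retrieve(profile: List[str], query: str, k: int = 1) -> List[str]:
    """Toy lexical retriever: rank profile items by word overlap with the query."""
    q_words = set(query.lower().split())
    scored = [
        (len(q_words & set(doc.lower().split())), idx, doc)
        for idx, doc in enumerate(profile)
    ]
    scored.sort(key=lambda t: (-t[0], t[1]))  # best overlap first, stable on ties
    return [doc for score, _, doc in scored[:k] if score > 0]


def run_episode(
    question: str,
    profile: List[str],
    policy: Callable[[List[str]], Action],
    max_turns: int = 4,
) -> Tuple[str, List[str]]:
    """Alternate policy decisions and retrieval until the policy answers."""
    context = [f"question: {question}"]
    for _ in range(max_turns):
        action = policy(context)
        if action.kind == "search":
            # The policy chose WHEN to retrieve and WHAT query to issue.
            context.append(f"search: {action.text}")
            context.extend(f"evidence: {doc}" for doc in retrieve(profile, action.text))
        else:
            context.append(f"answer: {action.text}")
            return action.text, context
    return "", context  # turn budget exhausted without an answer


def stub_policy(context: List[str]) -> Action:
    """Hand-written stand-in for the learned policy (an assumption, for illustration)."""
    evidence = [c[len("evidence: "):] for c in context if c.startswith("evidence: ")]
    if not any(c.startswith("search: ") for c in context):
        # No search issued yet: query the profile with the question itself.
        return Action("search", context[0][len("question: "):])
    if evidence:
        return Action("answer", f"Personalized using: {evidence[-1]}")
    return Action("answer", "No personal evidence found; generic answer.")
```

Calling `run_episode(question, profile, stub_policy)` with a list of profile strings returns the final answer together with the full trajectory of search, evidence, and answer steps. In a learned setting, that trajectory is the kind of object a reward would be computed over.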

Taxonomy

Topics: Information Retrieval and Search Behavior · Expert finding and Q&A systems · Topic Modeling