RPM: Reasoning-Level Personalization for Black-Box Large Language Models
Jieyong Kim, Tongyoung Kim, Soojin Yoon, Jaehyung Kim, Dongha Lee

TL;DR
This paper introduces RPM, a novel framework for black-box large language models that personalizes responses by modeling user-specific reasoning structures from behavioral data, improving both accuracy and interpretability.
Contribution
RPM is the first systematic approach to discover user reasoning paths and guide inference, moving beyond response-level personalization to reasoning-level personalization.
Findings
RPM outperforms existing methods across four tasks.
It enhances personalization performance.
It improves interpretability of model responses.
Abstract
While black-box large language models are widely deployed, they produce generic outputs that overlook individual user preferences. Current personalization methods are fundamentally limited to response-level personalization; they only match final outputs, failing to model the underlying reasoning that connects user behavior to responses. To address this, this work introduces reasoning-level personalization as a new paradigm and proposes RPM, the first systematic framework that automatically discovers user-specific reasoning structures from raw behavioral data to guide the model's personalized inference. RPM constructs a structured model of user behavior, built from response-influential features and statistical factors, to create personalized reasoning paths and retrieve beneficial examples for guiding inference through a feature-based retrieval mechanism. Extensive experiments across four…
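To make the feature-based retrieval idea in the abstract concrete, here is a minimal sketch. It is not the authors' implementation: the `Example` schema, the `retrieve` function, and the use of Jaccard overlap as the similarity score are all assumptions chosen for illustration, and features are assumed to have been extracted beforehand.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One past user interaction (hypothetical schema, not taken from the paper)."""
    query: str
    response: str
    features: frozenset[str]  # response-influential features, assumed pre-extracted


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query_features: frozenset[str], history: list[Example], k: int = 2) -> list[Example]:
    """Return the k past examples whose feature sets overlap most with the query's.

    Jaccard overlap is an assumed stand-in for whatever scoring the paper uses.
    """
    ranked = sorted(
        history,
        key=lambda ex: jaccard(query_features, ex.features),
        reverse=True,
    )
    return ranked[:k]
```

In a pipeline of this shape, the retrieved examples would be placed in the prompt of the black-box model alongside the new query, so the model sees how the same user responded when similar features were present.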
Peer Reviews
Decision: ICLR 2026 Poster
(1) This study introduces structured reasoning in LLM generation (which may be quite common nowadays) to the personalization task. This attempt is original. (2) Extensive experiments show that the proposed structure benefits the personalization task more than simpler prompting baselines, RAG-based baselines, and two personalization-focused baselines. (3) The method description is clear.
(1) The novelty seems to lie in applying a common method to a specific task (i.e., personalization), so it may be limited. (2) More baselines are necessary to support the claim that the proposed structured prompting framework is better. Many studies give LLMs explicit structure for reasoning but are not compared in the current manuscript. For example, least-to-most prompting, which decomposes a hard problem into ordered subproblems and solves them sequentially. This
S1. The paper is well written and addresses an important gap in LLM personalization which is the reasoning step. S2. The evaluation seems to be comprehensive and results show the effectiveness of the technique. S3. The framework is prompting compatible and cost efficient.
W1. The set of published baselines is rather limited; it would be good to see more baselines to understand where the technique stands. W2. The paper does not evaluate generalizability across various choices of LLM, from small to large and across different architectures. W3. The human evaluation in Sec 4.4 compares RPM against baselines with CoT prompting, but there is no analysis of whether raters might confuse longer, more detailed explanations with genuinely better reasoning. W4. No length-nor
- Well-specified method with prompts provided for each step. - Includes latency and cost analyses (online and offline). - Explicit feature–factor–reasoning structure facilitates diagnosis and analysis. - The experimental evaluation is comprehensive.
- Extraction/clustering robustness: Reliance on prompt-based LLMs for feature extraction, clustering, and influence/polarity judgments may cause spurious features or misclustered factors—especially in term-heavy domains or under style drift—propagating errors to retrieval and generation. - “Personalized reasoning paths” are a pragmatic approximation of user cognition; they may appear plausible without being literally true. - The method is fundamentally prompt-centric; consequently, the technique
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Focus
