ReGraP-LLaVA: Reasoning enabled Graph-based Personalized Large Language and Vision Assistant
Yifan Xiang, Zhenxi Zhang, Bin Li, Yixuan Weng, Shoujun Zhou, Yangfan He, Keqin Li

TL;DR
ReGraP-LLaVA introduces a novel dataset and model for personalized multi-modal reasoning, enabling structured relational understanding among personalized concepts, and achieves state-of-the-art performance on diverse reasoning tasks.
Contribution
The paper presents ReGraP, a new dataset with structured reasoning pathways, and ReGraP-LLaVA, a model trained on this data to enhance personalized relational reasoning in multimodal tasks.
Findings
ReGraP-LLaVA outperforms existing models on the ReGraP benchmark.
The dataset enables training models to reason over relations among personalized concepts.
Soft and hard graph prompting methods help align the knowledge graphs with the model.
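To make the graph prompting finding concrete, here is a minimal sketch of one plausible reading of "hard" graph prompting: serializing knowledge-graph triples into plain text and prepending them to the question. This is an assumption for illustration, not the authors' implementation. The function names (`serialize_graph`, `build_prompt`), the triple format, the prompt wording, and the example concepts are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical triple format: (head concept, relation, tail concept).
Triple = Tuple[str, str, str]


def serialize_graph(triples: List[Triple]) -> str:
    """Render each knowledge-graph triple as one line of text."""
    return "\n".join(f"<{h}> --{r}--> <{t}>" for h, r, t in triples)


def build_prompt(triples: List[Triple], question: str) -> str:
    """Prepend the serialized graph to a question (assumed prompt layout)."""
    graph_text = serialize_graph(triples)
    return (
        "Known relations among personalized concepts:\n"
        f"{graph_text}\n\n"
        f"Question: {question}\n"
        "Answer step by step using the relations above."
    )


# Hypothetical personalized concepts and relations, for illustration only.
triples = [("Alice", "owns", "mug"), ("mug", "gift_from", "Bob")]
prompt = build_prompt(triples, "Who gave Alice her mug?")
print(prompt)
```

A "soft" variant would presumably replace the text serialization with learned embeddings, which this sketch does not attempt.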
Abstract
Recent advances in personalized MLLMs enable effective capture of user-specific concepts, supporting both recognition of personalized concepts and contextual captioning. However, humans typically explore and reason over relations among objects and individuals, transcending surface-level information to achieve more personalized and contextual understanding. To this end, existing methods may face three main limitations: (1) Their training data lacks multi-object sets in which relations among objects are learnable. (2) Building on the limited training data, their models overlook the relations between different personalized concepts and fail to reason over them. (3) Their experiments mainly focus on a single personalized concept, where evaluations are limited to recognition and captioning tasks. To address the limitations, we present a new dataset named ReGraP, consisting of 120 sets of personalized…
Peer Reviews
Decision · Submitted to ICLR 2026
1. The motivation for this work is clear and compelling. The authors correctly identify that existing personalized MLLMs focus primarily on concept recognition and captioning, while neglecting the relational knowledge and reasoning capabilities that humans naturally employ when understanding personalized contexts. 2. The novelty of the approach is strong. To my knowledge, this is the first work to explicitly construct knowledge graphs for personalized concepts and use them to train MLLMs with r
1. The evaluation setup and dataset descriptions are somewhat unclear throughout the paper. While the datasets are described in the main text (Section 5), the tables themselves do not clearly indicate which dataset is being evaluated. For instance, Table 2 does not specify that it evaluates on the ReGraP dataset, while Table 3 evaluates on Yo'LLaVA and MyVLM datasets with different tasks. The authors should add explicit dataset identifiers to table captions and within the tables themselves to im
1. It's a crucial problem to enable MLLMs to perform relational reasoning over multiple personalized concepts. 2. The proposed framework based on soft and/or hard graph prompting is well-designed to enhance the relational reasoning capabilities of MLLMs. 3. The paper develops a data generation pipeline for relational question answering synthesis, and also introduces a new dataset and benchmark named ReGraP, which are valuable resources for future research in this area. 4. The paper is well-writt
1. The paper extends the idea of soft/hard prompting beyond previous works (e.g., Yo'LLaVA) by integrating reasoning over knowledge graphs. However, since prompting-based personalization has been explored before, the novelty mainly lies in using structured graph representations and CoT QA data, which could be better emphasized. 2. The paper lacks comparison with several related personalization methods such as RAP-LLaVA, UniCTokens and RePIC. Including or discussing these baselines would strength
This paper identifies an evaluation gap in relational reasoning for personalized MLLM-based understanding. To address this limitation, the authors incorporate both knowledge graphs (KGs) and chain-of-thought (CoT) reasoning into multi-object personalized MLLMs. They further propose a data generation pipeline to construct a new benchmark dataset, ReGraP, supporting the evaluation of such personalized relational reasoning abilities.
1. **Limited scope and representativeness of the proposed dataset**. The diversity of concepts, relations, and scenarios covered in ReGraP remains narrow. Most scenes revolve around anime characters and personal items, resulting in a limited semantic scope. While such content may be common in personalization research, the benchmark lacks a clear definition or demonstration of “personalization.” In addition, the relational types are shallow. Most attribute or role associations, such as “who is th
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Graph Neural Networks · Topic Modeling
Methods: Sparse Evolutionary Training · Focus · ALIGN
