GMTRouter: Personalized LLM Router over Multi-turn User Interactions
Encheng Xie, Yihang Sun, Tao Feng, Jiaxuan You

TL;DR
GMTRouter is a personalized LLM routing method that models multi-turn user interactions as a heterogeneous graph, enabling effective few-shot adaptation and outperforming existing approaches in response accuracy and user preference modeling.
Contribution
The paper introduces GMTRouter, a novel graph-based framework that captures complex user-LLM interactions for personalized routing with limited data, advancing beyond prior non-personalized methods.
Findings
GMTRouter achieves 0.9 to 21.6% higher accuracy than baselines.
It effectively adapts to new users with few-shot data.
The approach outperforms existing methods across multiple datasets.
Abstract
Large Language Model (LLM) routing has demonstrated strong capability in balancing response quality with computational cost. As users exhibit diverse preferences, personalization has attracted increasing attention in LLM routing, since even identical queries may require different models to generate responses tailored to individual needs. However, existing approaches are not fully personalized and often fail to capture the complex interactions between specific users and LLMs. Moreover, user preference data is typically scarce, noisy, and inconsistent in format, which limits the effectiveness of methods that rely solely on user-specific data. To address these challenges, we propose GMTRouter, which represents multi-turn user-LLM interactions as a heterogeneous graph with four node types: user, LLM, query, and response, thereby preserving the rich relational structure of the interaction.…
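The four-node-type graph described in the abstract can be pictured with a minimal, dependency-free sketch. Only the node types (user, LLM, query, response) come from the abstract; the input schema, the relation names (`asks`, `answered_by`, `generates`, `next_turn`), and the `HeteroGraph` helper are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class HeteroGraph:
    """Toy heterogeneous graph (hypothetical helper, not the paper's code)."""
    # node type -> list of payloads; the list index is the node id
    nodes: dict = field(default_factory=lambda: defaultdict(list))
    # (src_type, relation, dst_type) -> set of (src_id, dst_id) pairs
    edges: dict = field(default_factory=lambda: defaultdict(set))
    # (node_type, key) -> node id, so a repeated user or LLM maps to one node
    index: dict = field(default_factory=dict)

    def add_node(self, node_type, key, payload=None):
        """Return the id for (node_type, key), creating the node if needed."""
        if (node_type, key) not in self.index:
            self.index[(node_type, key)] = len(self.nodes[node_type])
            self.nodes[node_type].append(key if payload is None else payload)
        return self.index[(node_type, key)]

    def add_edge(self, src_type, relation, dst_type, src_id, dst_id):
        self.edges[(src_type, relation, dst_type)].add((src_id, dst_id))


def build_graph(interactions):
    """Build the graph from records with the assumed keys
    user, session, turn, query, llm, response, rating."""
    g = HeteroGraph()
    for n, it in enumerate(interactions):
        u = g.add_node("user", it["user"])
        m = g.add_node("llm", it["llm"])
        # One query node per (user, session, turn), shared by all LLM answers.
        q_key = (it["user"], it["session"], it["turn"])
        q = g.add_node("query", q_key, it["query"])
        # Every record contributes its own response node.
        r = g.add_node("response", n,
                       {"text": it["response"], "rating": it["rating"]})
        g.add_edge("user", "asks", "query", u, q)
        g.add_edge("query", "answered_by", "response", q, r)
        g.add_edge("llm", "generates", "response", m, r)
        # Link consecutive turns of the same conversation (assumed relation).
        prev = g.index.get(("query", (it["user"], it["session"], it["turn"] - 1)))
        if prev is not None:
            g.add_edge("query", "next_turn", "query", prev, q)
    return g


interactions = [
    {"user": "alice", "session": 0, "turn": 0, "query": "Explain GNNs",
     "llm": "model-a", "response": "A long answer", "rating": 1.0},
    {"user": "alice", "session": 0, "turn": 0, "query": "Explain GNNs",
     "llm": "model-b", "response": "A short answer", "rating": 0.0},
    {"user": "alice", "session": 0, "turn": 1, "query": "Shorter, please",
     "llm": "model-a", "response": "A brief answer", "rating": 1.0},
    {"user": "bob", "session": 0, "turn": 0, "query": "Explain GNNs",
     "llm": "model-b", "response": "A short answer", "rating": 1.0},
]
graph = build_graph(interactions)
for node_type, payloads in graph.nodes.items():
    print(node_type, len(payloads))
```

In this sketch the same query text asked by two users yields two query nodes, because each query is keyed by user, session, and turn. That keeps each user's preference signal attached to that user's own conversation; whether the paper shares query nodes across users is not stated in the abstract.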
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
* The construction of the heterogeneous graph with four types of nodes (users, queries, responses, and LLMs) for personalized LLM routing is convincing.
* The proposed approach outperforms existing router baselines, even under challenging scenarios (e.g., handling new users or few data samples).
* The proposed router design is very efficient, requiring only minimal compute.
* The way the authors construct the synthetic benchmark datasets, and the quality of those datasets, is questionable. The datasets, such as MT-Bench, GSM8K, and MMLU, are not designed for personalization, and some of them are not multi-turn either. How do the authors convert them to multi-turn personalization settings? Additionally, how are users defined in those settings? More clarification is needed.
* On the other hand, the scale of the realistic dataset (Chatbot Arena)
1. The paper tests on one real-world and three synthetic benchmarks, builds multi-user labels that combine quality, cost, length, and rare-word signals, reports clear metrics, and assesses generalization.
2. Practical, resource-efficient design suitable for deployment. GMTRouter is small, with modest storage and max GPU usage (~4.3 GB), and the experiments run on a single RTX A6000.
1. Missing baselines. The paper appears to omit baselines that achieve personalization through graphs, for example, Knowledge Graph Tuning: Real-time Large Language Model Personalization based on Human Feedback.
2. The paper claims to be an LLM routing method; however, it does not compare against LLM routing baselines, for example, RouteLLM: Learning to Route LLMs with Preference Data.
3. What is the key challenge that this paper is trying to solve, all the modules seems p
1. The proposed GMTRouter models multi-turn user–LLM interactions using a heterogeneous graph, capturing complex relational dependencies for personalization.
2. The inductive graph learning framework allows GMTRouter to adapt to new users with minimal data.
3. Experimental results show that GMTRouter consistently outperforms baselines across multiple datasets.
1. The paper heavily relies on LLM routing for personalization, but this essentially requires the model itself to have inherent personalization capabilities, making the routing process somewhat secondary.
2. Memory construction is a promising approach for personalization, but the paper does not provide a comparison with existing memory-based methods.
3. The graph construction process is time-intensive; however, the paper lacks a detailed analysis of the cost involved in updating the graph post-
1. Modeling multi-turn user-LLM interactions as a heterogeneous graph with explicit node types and virtual "turn" nodes is a novel approach to personalized routing.
2. The inductive training strategy allows adaptation to new users with minimal interaction data, addressing the common cold-start problem.
3. The framework is lightweight and practical, requiring modest computational resources, which enhances its potential for real-world deployment in LLM routing systems.
1. The design of GMTRouter is largely empirical and heuristic. There is no theoretical justification for why the proposed graph structure and the HGT-based model can better extract and generalize user preferences compared to other models.
2. The main novelty lies in the data modeling design (heterogeneous graph construction), while the learning components (e.g., HGT backbone, inductive training) largely follow existing graph learning literature.
3. There are many duplicated references, e.g., (Ch
Taxonomy
Topics: Advanced Graph Neural Networks · Mobile Crowdsensing and Crowdsourcing · Advanced Neural Network Applications
