IRT-Router: Effective and Interpretable Multi-LLM Routing via Item Response Theory

Wei Song; Zhenya Huang; Cheng Cheng; Weibo Gao; Bihan Xu; GuanHao Zhao; Fei Wang; Runze Wu

arXiv:2506.01048 · cs.AI · June 24, 2025

TL;DR

IRT-Router is a novel framework that uses Item Response Theory to effectively and interpretably route user queries to the most suitable large language models, balancing performance and cost.

Contribution

The paper introduces an IRT-based routing method that explicitly models LLM capabilities and query difficulty, yielding both accurate predictions of response performance and interpretable parameters.

Findings

01. Outperforms baseline methods in effectiveness and interpretability

02. Demonstrates strong performance in cold-start scenarios

03. Enhances online generalization with semantic similarity-based warm-up

Abstract

Large language models (LLMs) have demonstrated exceptional performance across a wide range of natural language tasks. However, selecting the optimal LLM to respond to a user query often necessitates a delicate balance between performance and cost. While powerful models deliver better results, they come at a high cost, whereas smaller models are more cost-effective but less capable. To address this trade-off, we propose IRT-Router, a multi-LLM routing framework that efficiently routes user queries to the most suitable LLM. Inspired by Item Response Theory (IRT), a psychological measurement methodology, IRT-Router explicitly models the relationship between LLM capabilities and user query attributes. This not only enables accurate prediction of response performance but also provides interpretable insights, such as LLM abilities and query difficulty. Additionally, we design an online query…
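To make the routing idea concrete, here is a minimal Python sketch of IRT-style routing. It is an illustration under stated assumptions, not the authors' implementation: it uses the standard two-parameter logistic (2PL) IRT formula, whereas the abstract does not say which IRT variant the paper adopts. The names `p_correct`, `route`, `MODELS`, and `alpha`, as well as the ability and cost values, are hypothetical.

```python
import math

def p_correct(ability, difficulty, discrimination=1.0):
    """Standard 2PL IRT curve: predicted probability of a good response.

    Assumption: scalar ability and difficulty; the paper's actual
    parameterization is not given in the abstract.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))

def route(query_difficulty, models, alpha=0.8):
    """Return the model name with the best performance/cost trade-off.

    score = alpha * predicted_performance - (1 - alpha) * normalized_cost
    `alpha` is a hypothetical trade-off weight in [0, 1].
    """
    max_cost = max(m["cost"] for m in models.values())

    def score(name):
        m = models[name]
        perf = p_correct(m["ability"], query_difficulty)
        return alpha * perf - (1 - alpha) * m["cost"] / max_cost

    return max(models, key=score)

# Hypothetical candidate pool: a cheap weak model and a costly strong one.
MODELS = {
    "small": {"ability": 0.0, "cost": 1.0},
    "large": {"ability": 2.0, "cost": 10.0},
}

if __name__ == "__main__":
    print(route(-2.0, MODELS))  # easy query
    print(route(2.0, MODELS))   # hard query
```

With these made-up numbers, an easy query goes to the cheaper model because its predicted performance is already high, while a hard query goes to the stronger model because the performance gap outweighs the extra cost. In a learned router the abilities and difficulties would be fitted from observed responses rather than set by hand.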

Code & Models

Repositories

Mercidaiha/IRT-Router
PyTorch · Official

Videos

IRT-Router: Effective and Interpretable Multi-LLM Routing via Item Response Theory

Taxonomy

Topics: Network Packet Processing and Optimization · Cooperative Communication and Network Coding