Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing

Yi-Kai Zhang; De-Chuan Zhan; Han-Jia Ye

arXiv:2502.17282·cs.CL·February 25, 2025

TL;DR

This paper introduces Capability Instruction Tuning, a paradigm for dynamically routing each instruction to the LLM expected to perform best on it, with the aim of improving overall performance and model utilization.

Contribution

It proposes the Model-SAT framework, which assesses model capabilities through capability instructions and so enables real-time model routing without running inference on the candidate models.

Findings

01. Model-SAT accurately predicts model capabilities across 50 tasks.

02. State-of-the-art performance in model routing without candidate inference.

03. Effective deployment on new models with quick aptitude inference.

Abstract

Large Language Models (LLMs) have demonstrated human-like instruction-following abilities, particularly those exceeding 100 billion parameters. The combined capability of some smaller, resource-friendly LLMs can address most of the instructions that larger LLMs excel at. In this work, we explore how to route the best-performing LLM for each instruction to achieve better overall performance. We develop a new paradigm, constructing capability instructions with model capability representation, user instruction, and performance inquiry prompts to assess the performance. To learn from capability instructions, we introduce a new end-to-end framework called Model Selection with Aptitude Test (Model-SAT), which generates positive and negative samples based on what different models perform well or struggle with. Model-SAT uses a model capability encoder that extends its model representation to a…
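The routing idea described in the abstract can be illustrated with a minimal sketch. It assumes a capability instruction is simply the three parts named above (model capability representation, user instruction, performance inquiry prompt) and that some scorer returns a predicted probability of success for each candidate. Every name here (`CapabilityInstruction`, `route`, `toy_scorer`, the wording of `INQUIRY`) is hypothetical and not taken from the paper or its repository, and the keyword-overlap scorer is only a stand-in for the trained Model-SAT predictor.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical inquiry wording; the paper's actual prompt is not given here.
INQUIRY = "Will this model answer the instruction above correctly?"


@dataclass(frozen=True)
class CapabilityInstruction:
    """The three parts named in the abstract, kept as plain strings."""
    capability: str   # textual model capability representation (assumed format)
    instruction: str  # the user's instruction
    inquiry: str      # performance inquiry prompt

    def render(self) -> str:
        # Concatenate the parts into one prompt for a predictor to read.
        return f"{self.capability}\n{self.instruction}\n{self.inquiry}"


Scorer = Callable[[CapabilityInstruction], float]


def route(capabilities: Dict[str, str], instruction: str, scorer: Scorer) -> str:
    """Return the candidate with the highest predicted score.

    No candidate model is run: only the scorer sees the prompts.
    """
    scores = {
        name: scorer(CapabilityInstruction(cap, instruction, INQUIRY))
        for name, cap in capabilities.items()
    }
    return max(scores, key=scores.get)


def toy_scorer(ci: CapabilityInstruction) -> float:
    """Stand-in predictor: fraction of instruction words found in the
    capability description. A real system would use a learned model."""
    cap_words = set(ci.capability.lower().split())
    ins_words = set(ci.instruction.lower().split())
    return len(cap_words & ins_words) / max(len(ins_words), 1)


if __name__ == "__main__":
    candidates = {
        "math-7b": "strong at algebra and arithmetic",
        "code-7b": "strong at python and debugging",
    }
    print(route(candidates, "solve this algebra problem", toy_scorer))
```

The scorer is passed in as a plain callable so that the toy heuristic can be swapped for a learned predictor without changing the routing logic.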

Code & Models

Repositories

now-join-us/cit-llm-routing (official)

Taxonomy

Topics: Digital Rights Management and Security · Power Systems and Technologies