Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing
Yi-Kai Zhang, De-Chuan Zhan, Han-Jia Ye

TL;DR
This paper introduces Capability Instruction Tuning, a paradigm for dynamically routing each instruction to the LLM expected to perform best on it, with the aim of improving overall performance and model utilization.
Contribution
It proposes Model-SAT (Model Selection with Aptitude Test), a framework that assesses model capabilities through capability instructions, enabling real-time model routing without running inference on the candidate models.
Findings
Model-SAT accurately predicts model capabilities across 50 tasks.
Model-SAT achieves state-of-the-art model routing performance without candidate inference.
Model-SAT can be deployed on new models through quick aptitude inference.
Abstract
Large Language Models (LLMs) have demonstrated human-like instruction-following abilities, particularly those exceeding 100 billion parameters. The combined capability of some smaller, resource-friendly LLMs can address most of the instructions that larger LLMs excel at. In this work, we explore how to route the best-performing LLM for each instruction to achieve better overall performance. We develop a new paradigm, constructing capability instructions with model capability representation, user instruction, and performance inquiry prompts to assess the performance. To learn from capability instructions, we introduce a new end-to-end framework called Model Selection with Aptitude Test (Model-SAT), which generates positive and negative samples based on what different models perform well or struggle with. Model-SAT uses a model capability encoder that extends its model representation to a…
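The routing idea described in the abstract can be illustrated with a minimal sketch. It assumes that each capability instruction bundles the three parts named above (model capability representation, user instruction, performance inquiry prompt) and that some predictor returns a success score per candidate. The names `CapabilityInstruction`, `route`, `toy_predict`, the wording of `INQUIRY`, and the list-of-floats capability format are all hypothetical and are not taken from the paper; `toy_predict` is only a stand-in for the trained Model-SAT model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass(frozen=True)
class CapabilityInstruction:
    """One routing query: the three parts named in the abstract."""
    capability: Sequence[float]  # hypothetical per-task score vector for one candidate
    instruction: str             # the user's instruction
    inquiry: str                 # prompt asking whether the candidate will do well


# Hypothetical wording; the paper's actual inquiry prompt is not shown here.
INQUIRY = "Will this model answer the instruction correctly?"


def route(
    instruction: str,
    capabilities: Dict[str, Sequence[float]],
    predict: Callable[[CapabilityInstruction], float],
) -> str:
    """Return the candidate with the highest predicted success score.

    No candidate LLM is run; `predict` is called once per candidate.
    """
    if not capabilities:
        raise ValueError("need at least one candidate model")
    scores = {
        name: predict(CapabilityInstruction(cap, instruction, INQUIRY))
        for name, cap in capabilities.items()
    }
    return max(scores, key=scores.get)


def toy_predict(ci: CapabilityInstruction) -> float:
    """Stand-in predictor (an assumption): mean of the capability vector."""
    return sum(ci.capability) / len(ci.capability)


candidates = {"small-a": [0.4, 0.9], "small-b": [0.7, 0.5]}
best = route("Solve 12 * 7.", candidates, toy_predict)
```

In this toy setup the choice depends only on the capability vectors; a learned predictor would also condition on the instruction text, which is what makes per-instruction routing possible.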
Taxonomy
Topics: Digital Rights Management and Security · Power Systems and Technologies
