Eagle: Efficient Training-Free Router for Multi-LLM Inference

Zesen Zhao; Shuowei Jin; Z. Morley Mao

arXiv:2409.15518 · cs.LG · October 30, 2024

TL;DR

Eagle is a training-free, scalable LLM routing method that improves model selection accuracy and efficiency in high-volume online environments by combining global and local ranking modules.

Contribution

Eagle introduces a novel training-free LLM routing approach using global and local ELO ranking modules for better scalability and real-time adaptation.

Findings

01. Outperforms baseline methods, improving AUC by up to 23.52%.

02. Requires only 1/20 of the initialization time of baseline methods.

03. Offers 100-200x faster incremental updates in online scenarios.

Abstract

The proliferation of Large Language Models (LLMs) with varying capabilities and costs has created a need for efficient model selection in AI systems. LLM routers address this need by dynamically choosing the most suitable model for a given query based on task requirements and budget constraints. However, existing routers face challenges in scalability and real-time adaptation, particularly in high-volume online environments. We present Eagle, a novel LLM routing approach that combines global and local ELO ranking modules to overcome these limitations. By evaluating both general and specialized LLM abilities, Eagle provides a scalable, training-free solution that enhances model selection quality while reducing computational overhead. Our experiments across multiple datasets show Eagle consistently outperforms baseline methods, with improvements of up to 23.52 percent in Area Under Curve…
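To make the global-plus-local ranking idea concrete, here is a minimal sketch of an Elo-based router. It is not the authors' implementation: the abstract does not say how the two modules are built or combined, so the step size `K`, the base rating, the function names, and the weighted-sum blend in `route` are all assumptions for illustration. The sketch also assumes the local ratings come from pairwise comparisons recorded on past queries similar to the incoming one, and that the caller retrieves those comparisons.

```python
from collections import defaultdict

K = 32          # assumed Elo step size (hypothetical, not from the paper)
BASE = 1000.0   # assumed starting rating for every model


def expected_score(r_a, r_b):
    """Standard Elo expected score of a player rated r_a against one rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(ratings, winner, loser, k=K):
    """Apply one pairwise comparison in place; only the two models involved change."""
    e_w = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - e_w)
    ratings[winner] += delta
    ratings[loser] -= delta


def build_ratings(comparisons):
    """Fold a sequence of (winner, loser) pairs into a rating table."""
    ratings = defaultdict(lambda: BASE)
    for winner, loser in comparisons:
        elo_update(ratings, winner, loser)
    return ratings


def route(global_r, local_r, models, weight=0.5):
    """Pick the model with the best blended score.

    The linear blend of global and local ratings is an assumption of this sketch.
    """
    return max(models, key=lambda m: weight * global_r[m] + (1 - weight) * local_r[m])


# Global table: all recorded feedback. Local table: feedback from similar queries only.
global_r = build_ratings([("model_a", "model_b")])
local_r = build_ratings([("model_b", "model_a"), ("model_b", "model_a")])
choice = route(global_r, local_r, ["model_a", "model_b"])
```

In this sketch each update touches only the two models being compared, so new feedback can be folded in with a single `elo_update` call instead of a retraining pass. A budget constraint could be added by filtering `models` by cost before calling `route`.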

Taxonomy

Topics: Internet Traffic Analysis and Secure E-voting · Traffic Prediction and Management Techniques · Speech Recognition and Synthesis