RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models
Shuhao Chen, Weisen Jiang, Baijiong Lin, James T. Kwok, and Yu Zhang

TL;DR
RouterDC introduces a dual contrastive learning approach to improve query-based routing among multiple large language models, effectively assembling their strengths and outperforming existing methods on various tasks.
Contribution
The paper proposes a novel contrastive learning-based router for assembling LLMs, addressing limitations of existing routing models in multi-LLM scenarios.
Findings
Outperforms existing routing methods on in-distribution tasks (+2.76%)
Achieves better out-of-distribution performance (+1.90%)
Effectively assembles multiple LLMs for improved task performance
Abstract
Recent works show that assembling multiple off-the-shelf large language models (LLMs) can harness their complementary abilities. To achieve this, routing is a promising method, which learns a router to select the most suitable LLM for each query. However, existing routing models are ineffective when multiple LLMs perform well for a query. To address this problem, in this paper, we propose a method called query-based Router by Dual Contrastive learning (RouterDC). The RouterDC model consists of an encoder and LLM embeddings, and we propose two contrastive learning losses to train the RouterDC model. Experimental results show that RouterDC is effective in assembling LLMs and largely outperforms individual top-performing LLMs as well as existing routing methods on both in-distribution (+2.76%) and out-of-distribution (+1.90%) tasks. Source code is available at…
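The abstract describes a router built from a query encoder and one learnable embedding per LLM, trained with contrastive losses. The sketch below illustrates that general idea only. It assumes the encoder output is already available as a plain vector, that routing picks the LLM whose embedding has the highest cosine similarity to the query, and that training uses a generic InfoNCE-style loss as a stand-in; the abstract does not spell out the two losses, and the names `route` and `contrastive_loss` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def route(query_emb, llm_embs):
    """Return the index of the LLM embedding most similar to the query.

    Assumption: similarity-based selection; the abstract only says the
    router selects the most suitable LLM for each query.
    """
    sims = [cosine(query_emb, k) for k in llm_embs]
    return max(range(len(sims)), key=sims.__getitem__)

def contrastive_loss(query_emb, llm_embs, positives, negatives):
    """Generic InfoNCE-style loss (an illustrative stand-in, not the paper's).

    Pulls the query toward LLMs that answer it well (positives) and
    pushes it away from LLMs that answer it poorly (negatives).
    """
    sims = [cosine(query_emb, k) for k in llm_embs]
    pos = sum(math.exp(sims[i]) for i in positives)
    neg = sum(math.exp(sims[i]) for i in negatives)
    return -math.log(pos / (pos + neg))

# Toy example: three LLM embeddings in 2-D and one query embedding.
llm_embs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
query_emb = [0.9, 0.1]

chosen = route(query_emb, llm_embs)
loss_aligned = contrastive_loss(query_emb, llm_embs, positives=[0], negatives=[2])
loss_misaligned = contrastive_loss(query_emb, llm_embs, positives=[2], negatives=[0])
```

In this toy setup the loss is lower when the LLM nearest to the query is labeled positive than when it is labeled negative, which is the behavior a contrastive router objective is meant to reward. Allowing several positives per query is one plausible way to handle the case the abstract highlights, where multiple LLMs perform well on the same query.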
Taxonomy
Topics: Text and Document Classification Technologies · Algorithms and Data Compression · Natural Language Processing Techniques
Methods: Contrastive Learning
