Smoothie: Label Free Language Model Routing

Neel Guha, Mayee F. Chen, Trevor Chow, Ishan S. Khare, Christopher Ré

arXiv:2412.04692 · cs.AI · December 9, 2024

TL;DR

Smoothie is an unsupervised method that routes each input among multiple large language models using a latent variable graphical model, improving model selection accuracy without labeled data.

Contribution

The paper introduces an unsupervised routing approach that builds a graphical model over embeddings of LLM outputs and uses it to estimate a quality score for each LLM without labeled data.

Findings

01 Smoothie's estimated quality scores correlate well with true model quality.

02 Smoothie outperforms baselines by up to 10 accuracy points.

03 Smoothie identifies the best model on 9 of 14 tasks.

Abstract

Large language models (LLMs) are increasingly used in applications where LLM inputs may span many different tasks. Recent work has found that the choice of LLM is consequential, and different LLMs may be good for different input samples. Prior approaches have thus explored how engineers might select an LLM to use for each sample (i.e. routing). While existing routing methods mostly require training auxiliary models on human-annotated data, our work explores whether it is possible to perform unsupervised routing. We propose Smoothie, a weak supervision-inspired routing approach that requires no labeled data. Given a set of outputs from different LLMs, Smoothie constructs a latent variable graphical model over embedding representations of observable LLM outputs and unknown "true" outputs. Using this graphical model, we estimate sample-dependent quality scores for each LLM, and route each…
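The abstract describes the approach only at a high level, so the following is a minimal sketch of how label-free quality scores could be estimated from output embeddings, not the authors' implementation. It assumes a triplet-style estimator of the kind used in weak supervision: if each model's embedding error around the unknown "true" output is independent of the others, the expected squared distance between two models' outputs equals the sum of their individual expected squared errors, and each error can be solved for from pairwise distances alone. The function names `estimate_quality_scores` and `route`, and the inverse-error score, are hypothetical choices made for illustration.

```python
import numpy as np
from itertools import combinations


def estimate_quality_scores(embeddings: np.ndarray) -> np.ndarray:
    """Estimate one quality score per model without labels.

    embeddings: array of shape (n_samples, n_models, dim) holding an
    embedding of each model's output for each input sample.
    Returns an array of shape (n_models,); higher means better.

    ASSUMPTION (not confirmed by the source): model errors around the
    unknown true output are zero-mean and mutually independent, so
    E||l_i - l_j||^2 = E||l_i - y||^2 + E||l_j - y||^2.
    """
    n_samples, n_models, _ = embeddings.shape
    if n_models < 3:
        raise ValueError("the triplet estimate needs at least 3 models")

    # delta[i, j]: mean squared distance between outputs of models i and j.
    diffs = embeddings[:, :, None, :] - embeddings[:, None, :, :]
    delta = (diffs ** 2).sum(axis=-1).mean(axis=0)

    errors = np.zeros(n_models)
    for i in range(n_models):
        others = [j for j in range(n_models) if j != i]
        # Each pair (j, k) gives one estimate of model i's expected error.
        estimates = [
            0.5 * (delta[i, j] + delta[i, k] - delta[j, k])
            for j, k in combinations(others, 2)
        ]
        errors[i] = np.mean(estimates)

    # Finite-sample noise can push an estimate below zero; clip it.
    errors = np.maximum(errors, 1e-8)
    return 1.0 / errors


def route(scores: np.ndarray) -> int:
    """Pick the index of the model with the highest estimated quality."""
    return int(np.argmax(scores))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.normal(size=(1000, 1, 8))          # hidden "true" embeddings
    noise_std = np.array([0.2, 0.6, 1.2]).reshape(1, 3, 1)
    outputs = truth + noise_std * rng.normal(size=(1000, 3, 8))
    print(route(estimate_quality_scores(outputs)))
```

This sketch computes a single score per model over the whole dataset. The sample-dependent scores mentioned in the abstract would require restricting the averages to samples near each input, which is not shown here.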

Code & Models

Repositories

hazyresearch/smoothie (official)

Datasets

hazyresearch/smoothie_data
dataset · 49 downloads

Videos

Smoothie: Label Free Language Model Routing · SlidesLive

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Sparse Evolutionary Training