Fly-Swat or Cannon? Cost-Effective Language Model Choice via Meta-Modeling

Marija Šakota; Maxime Peyrard; Robert West

arXiv:2308.06077·cs.CL·December 19, 2023

TL;DR

This paper introduces FORC, a meta-modeling framework that intelligently assigns natural language prompts to different-sized language models to optimize cost and performance across various tasks.

Contribution

We propose a novel meta-modeling approach for cost-effective language model selection that adapts to input difficulty, reducing costs while maintaining high performance.

Findings

01. FORC achieves up to 63% cost reduction compared to using the largest LM.

02. It matches the performance of the largest LM across multiple datasets.

03. The framework is flexible and can be tuned for different cost-performance tradeoffs.

Abstract

Generative language models (LMs) have become omnipresent across data science. For a wide variety of tasks, inputs can be phrased as natural language prompts for an LM, from whose output the solution can then be extracted. LM performance has consistently been increasing with model size - but so has the monetary cost of querying the ever larger models. Importantly, however, not all inputs are equally hard: some require larger LMs for obtaining a satisfactory solution, whereas for others smaller LMs suffice. Based on this fact, we design a framework for cost-effective language model choice, called "Fly-swat or cannon" (FORC). Given a set of inputs and a set of candidate LMs, FORC judiciously assigns each input to an LM predicted to do well on the input according to a so-called meta-model, aiming to achieve high overall performance at low cost. The cost-performance tradeoff can be flexibly…
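The assignment idea described in the abstract can be illustrated with a minimal sketch. It assumes a meta-model has already produced a predicted score for every (input, LM) pair, and it uses one simple cost-sensitive rule: pick the LM that maximizes predicted score minus a weighted cost. The function name `assign_models`, the `tradeoff` parameter, the model names, and all numbers are hypothetical and chosen for illustration only; the rule is not necessarily the assignment strategy used in the paper.

```python
from typing import Dict, List


def assign_models(
    predicted_scores: List[Dict[str, float]],
    cost_per_query: Dict[str, float],
    tradeoff: float,
) -> List[str]:
    """For each input, pick the model maximizing (predicted score - tradeoff * cost).

    predicted_scores: one dict per input, mapping model name to the
        meta-model's predicted score for that input (hypothetical values).
    cost_per_query: cost of querying each model once (hypothetical units).
    tradeoff: weight on cost; 0 ignores cost, larger values favor cheaper models.
    """
    assignments = []
    for scores in predicted_scores:
        best = max(
            scores,
            key=lambda name: scores[name] - tradeoff * cost_per_query[name],
        )
        assignments.append(best)
    return assignments


# Hypothetical costs and meta-model predictions, for illustration only.
cost_per_query = {"small": 1.0, "large": 10.0}
predicted_scores = [
    {"small": 0.90, "large": 0.95},  # easy input: both models predicted to do well
    {"small": 0.20, "large": 0.85},  # hard input: only the large model predicted to do well
]

assignments = assign_models(predicted_scores, cost_per_query, tradeoff=0.02)
total_cost = sum(cost_per_query[model] for model in assignments)
print(assignments, total_cost)
```

With these illustrative numbers, the easy input goes to the small model and the hard input to the large one. Setting `tradeoff=0.0` ignores cost and sends both inputs to the large model, which is the kind of knob a tunable cost-performance tradeoff suggests.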
