ThriftLLM: On Cost-Effective Selection of Large Language Models for Classification Queries

Keke Huang; Yimin Shi; Dujian Ding; Yifei Li; Yang Fei; Laks Lakshmanan; Xiaokui Xiao

arXiv:2501.04901·cs.DB·January 8, 2026

Open Access

TL;DR

This paper introduces ThriftLLM, a framework for selecting an ensemble of large language models that maximizes classification accuracy under a cost budget, a problem that, according to the authors, no prior work addresses with performance guarantees.

Contribution

The paper formalizes the ensemble selection problem using a correctness probability metric and proposes an algorithm with approximation guarantees for selecting LLMs under a cost constraint.
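To make the problem statement concrete, the following is a minimal sketch of budget-constrained ensemble selection by exhaustive search. It is not the paper's algorithm, and the correctness probability used here is a simplified stand-in: it assumes binary labels, a uniform prior, independent models, and Bayes-optimal aggregation of responses. The model names, costs, and accuracies in `MODELS` are hypothetical.

```python
from itertools import combinations, product


def correctness_probability(accuracies):
    """Chance that Bayes-optimal aggregation returns the true label.

    Simplifying assumptions (not taken from the paper): two labels,
    uniform prior, independent models, accuracies[i] = P(model i is right).
    """
    total = 0.0
    # Enumerate every possible vector of model responses (labels 0 or 1).
    for responses in product([0, 1], repeat=len(accuracies)):
        like0 = like1 = 1.0  # likelihood of the responses under each label
        for r, acc in zip(responses, accuracies):
            like0 *= acc if r == 0 else 1.0 - acc
            like1 *= acc if r == 1 else 1.0 - acc
        # The aggregator picks the more likely label; the prior is 0.5 each.
        total += 0.5 * max(like0, like1)
    return total


def best_ensemble(models, budget):
    """Exhaustively search all subsets whose total cost fits the budget.

    models maps a name to (cost, accuracy). Exponential in len(models),
    so this is only a reference point for tiny model pools.
    """
    best = ((), correctness_probability([]))
    names = sorted(models)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            cost = sum(models[name][0] for name in subset)
            if cost > budget:
                continue
            score = correctness_probability([models[name][1] for name in subset])
            # Strict improvement only, so smaller ensembles win ties.
            if score > best[1] + 1e-12:
                best = (subset, score)
    return best


# Hypothetical pool: name -> (cost per query, accuracy).
MODELS = {"small": (1.0, 0.70), "medium": (2.0, 0.80), "large": (5.0, 0.90)}

if __name__ == "__main__":
    for budget in (4.0, 8.0):
        print(budget, best_ensemble(MODELS, budget))
```

Exhaustive search is exponential in the number of candidate models, which is why an algorithm with approximation guarantees is of interest for realistic model pools.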

Findings

01. ThriftLLM effectively balances cost and performance in LLM ensemble selection.

02. The correctness probability function is non-decreasing but non-submodular.

03. The proposed algorithm provides an instance-dependent approximation guarantee.
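The non-decreasing but non-submodular behavior can be illustrated on a toy instance. The sketch below does not use the paper's definition of correctness probability; it assumes binary labels, a uniform prior, independent models with identical accuracy 0.7 (a hypothetical value), and Bayes-optimal aggregation. Under those assumptions, a second model adds nothing to a single model, while a third model adds a positive gain to the pair, which contradicts the diminishing-returns property that submodularity requires.

```python
from itertools import product


def correctness_probability(accuracies):
    """P(correct) under assumed binary labels, uniform prior, independent
    models, and Bayes-optimal aggregation (a simplification for illustration)."""
    total = 0.0
    for responses in product([0, 1], repeat=len(accuracies)):
        like0 = like1 = 1.0
        for r, acc in zip(responses, accuracies):
            like0 *= acc if r == 0 else 1.0 - acc
            like1 *= acc if r == 1 else 1.0 - acc
        total += 0.5 * max(like0, like1)
    return total


p = 0.7  # hypothetical accuracy shared by models a, b, and c
f1 = correctness_probability([p])        # ensemble {a}
f2 = correctness_probability([p, p])     # ensemble {a, b}
f3 = correctness_probability([p, p, p])  # ensemble {a, b, c}

# Marginal gain of adding one more model to a smaller set versus a larger set.
gain_on_small_set = f2 - f1  # adding a second model to {a}
gain_on_large_set = f3 - f2  # adding a third model to {a, b}

if __name__ == "__main__":
    print("values:", f1, f2, f3)
    print("non-decreasing:", f1 <= f2 + 1e-12 and f2 <= f3 + 1e-12)
    # Submodularity would require gain_on_small_set >= gain_on_large_set.
    print("submodular on this instance:", gain_on_small_set >= gain_on_large_set)
```

Because marginal gains can grow as the ensemble grows, the classical greedy guarantee for monotone submodular maximization does not apply directly to this kind of objective.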

Abstract

In recent years, large language models (LLMs) have demonstrated remarkable capabilities in comprehending and generating natural language content, attracting widespread attention in both industry and academia. An increasing number of services offer LLMs for various tasks via APIs. Different LLMs demonstrate expertise in different domains of queries (e.g., text classification queries). Meanwhile, LLMs of different scales, complexities, and performance are priced differently. Driven by this, several researchers are investigating strategies for selecting an ensemble of LLMs, aiming to decrease overall usage costs while enhancing performance. However, to the best of our knowledge, no existing work addresses the problem of how to find, subject to a cost budget, an LLM ensemble that maximizes ensemble performance with guarantees. In this paper, we formalize the performance of an…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management