Rational Tuning of LLM Cascades via Probabilistic Modeling

Michael J. Zellinger; Matt Thomson

arXiv:2501.09345 · cs.LG · June 10, 2025

TL;DR

This paper introduces a probabilistic model for optimizing the performance of cascaded large language models (LLMs), improving error-cost trade-offs and sample efficiency in tuning their confidence thresholds.

Contribution

The paper presents a novel Markov-copula probabilistic model of the joint performance of the LLMs in a cascade, enabling rational tuning of confidence thresholds that outperforms threshold selection by Bayesian optimization.

Findings

01. 4.3% average improvement in error-cost trade-offs
02. 10.2% improvement with limited training data
03. Enhanced sample efficiency in cascade tuning

Abstract

Understanding the reliability of large language models (LLMs) has recently garnered significant attention. Given LLMs' propensity to hallucinate, as well as their high sensitivity to prompt design, it is already challenging to predict the performance of an individual LLM. However, the problem becomes more complex for compound LLM systems such as cascades, where in addition to each model's standalone performance, we must understand how the error rates of different models interact. In this paper, we present a probabilistic model for the joint performance distribution of a sequence of LLMs, which enables a framework for rationally tuning the confidence thresholds of an LLM cascade using continuous optimization. Compared to selecting confidence thresholds using Bayesian optimization, our parametric Markov-copula model yields more favorable error-cost trade-offs, improving the area under the…
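
To make the role of the confidence thresholds concrete, the following is a minimal sketch of how a cascade's error rate and average cost depend on them. It is not the authors' method: it evaluates fixed thresholds empirically on a tiny labelled set and does not implement the Markov-copula model or the continuous optimization described in the abstract. The function name `run_cascade`, the acceptance rule (accept a model's answer when its confidence meets the threshold, otherwise defer to the next model), and all data and cost values are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def run_cascade(
    confidences: Sequence[Sequence[float]],
    correct: Sequence[Sequence[bool]],
    costs: Sequence[float],
    thresholds: Sequence[float],
) -> Tuple[float, float]:
    """Return (error_rate, mean_cost) of a cascade on a labelled set.

    confidences[i][j]: confidence of model j on query i (hypothetical data)
    correct[i][j]:     whether model j answers query i correctly
    costs[j]:          cost of one call to model j
    thresholds[j]:     model j's answer is accepted if its confidence is
                       at least thresholds[j]; the last model always
                       answers, so there is one threshold fewer than models
    """
    n_models = len(costs)
    assert len(thresholds) == n_models - 1
    errors = 0
    total_cost = 0.0
    for conf_row, correct_row in zip(confidences, correct):
        for j in range(n_models):
            total_cost += costs[j]  # every model that is called is paid for
            is_last = j == n_models - 1
            if is_last or conf_row[j] >= thresholds[j]:
                errors += 0 if correct_row[j] else 1
                break  # answer accepted, stop deferring
    n = len(confidences)
    return errors / n, total_cost / n


# Toy data (assumed): four queries, a cheap model (index 0) and a costly one.
confidences = [[0.9, 0.95], [0.4, 0.8], [0.7, 0.9], [0.2, 0.6]]
correct = [[True, True], [False, True], [False, True], [False, False]]
costs = [1.0, 10.0]

low = run_cascade(confidences, correct, costs, thresholds=[0.5])
high = run_cascade(confidences, correct, costs, thresholds=[0.8])
print(low)   # (0.5, 6.0): fewer deferrals, cheaper, more errors
print(high)  # (0.25, 8.5): more deferrals, costlier, fewer errors
```

Raising the threshold sends more queries to the expensive model, trading cost for accuracy. Sweeping the thresholds traces out an error-cost curve of the kind the abstract refers to; the paper's contribution is to choose the thresholds from a fitted probabilistic model instead of from such direct empirical evaluation.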
