Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems
Lingjiao Chen, Jared Quincy Davis, Boris Hanin, Peter Bailis, Ion Stoica, Matei Zaharia, and James Zou

TL;DR
This paper investigates how the number of language model calls in a compound inference system affects its performance, shows that the relationship can be non-monotonic, and provides a model for choosing the number of calls that maximizes performance.
Contribution
It introduces a theoretical and empirical analysis of LM call scaling laws in compound systems, highlighting non-monotonic performance behavior and offering a method to determine optimal call counts.
Findings
The performance of Vote and Filter-Vote systems can first increase and then decrease as the number of LM calls grows.
This non-monotonic behavior is attributed to the diversity of query difficulty within a task.
A scaling model accurately predicts the optimal number of LM calls.
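The second finding can be illustrated with a toy calculation. The sketch below is not the paper's scaling model; it assumes binary answers, independent LM calls, and a task made of only two query types: "easy" queries that a single call answers correctly with probability above 0.5 and "hard" queries with probability below 0.5. The function names and the default values (`easy_fraction=0.5`, `p_easy=0.85`, `p_hard=0.4`) are hypothetical and chosen only for illustration.

```python
from math import comb


def majority_accuracy(p: float, k: int) -> float:
    """Probability that a strict majority of k independent calls is correct.

    Assumes binary answers, so the vote is correct exactly when more than
    half of the k calls are correct. Odd k avoids ties.
    """
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    return sum(
        comb(k, j) * p**j * (1 - p) ** (k - j)
        for j in range(k // 2 + 1, k + 1)
    )


def task_accuracy(k: int, easy_fraction: float = 0.5,
                  p_easy: float = 0.85, p_hard: float = 0.4) -> float:
    """Expected accuracy over a mix of easy and hard queries (illustrative values)."""
    return (easy_fraction * majority_accuracy(p_easy, k)
            + (1 - easy_fraction) * majority_accuracy(p_hard, k))


def best_k(max_k: int = 51, **mix) -> int:
    """Odd k in [1, max_k] with the highest expected task accuracy."""
    return max(range(1, max_k + 1, 2), key=lambda k: task_accuracy(k, **mix))


if __name__ == "__main__":
    for k in (1, 3, 5, 7, 15, 51):
        print(k, round(task_accuracy(k), 4))
    print("best k:", best_k())
```

Under these assumptions, voting pushes accuracy on easy queries toward 1 and accuracy on hard queries toward 0, so the task-level curve rises while the gains on easy queries dominate and falls once the losses on hard queries take over. With a single difficulty level the curve would be monotone in this toy setting.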
Abstract
Many recent state-of-the-art results in language tasks were achieved using compound systems that perform multiple Language Model (LM) calls and aggregate their responses. However, there is little understanding of how the number of LM calls - e.g., when asking the LM to answer each question multiple times and taking a majority vote - affects such a compound system's performance. In this paper, we initiate the study of scaling properties of compound inference systems. We analyze, theoretically and empirically, how the number of LM calls affects the performance of Vote and Filter-Vote, two of the simplest compound system designs, which aggregate LM responses via majority voting, optionally applying LM filters. We find, surprisingly, that across multiple language tasks, the performance of both Vote and Filter-Vote can first increase but then decrease as a function of the number of LM calls.…
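As a rough illustration of the two designs named in the abstract, the sketch below calls a model `k` times and returns the most common answer, optionally discarding answers that a filter rejects before voting. It is a minimal sketch, not the authors' implementation: `lm` and `keep` are hypothetical callables standing in for an LM call and an LM-based filter, and falling back to the unfiltered answers when the filter rejects everything is an assumption made here.

```python
from collections import Counter
from typing import Callable, Optional


def vote(query: str,
         lm: Callable[[str], str],
         k: int,
         keep: Optional[Callable[[str, str], bool]] = None) -> str:
    """Majority vote over k answers; with `keep`, a Filter-Vote style variant.

    lm(query) returns one candidate answer (hypothetical stand-in for an LM call).
    keep(query, answer) returns True if the answer passes the filter
    (hypothetical stand-in for an LM-based filter).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    answers = [lm(query) for _ in range(k)]
    if keep is not None:
        kept = [a for a in answers if keep(query, a)]
        # Assumption: if the filter rejects every answer, vote over all of them.
        answers = kept or answers
    # Ties are broken in favor of the answer that appeared first.
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    canned = iter(["A", "B", "A", "C", "A"])
    print(vote("example query", lambda q: next(canned), k=5))
```

Keeping the filter as an optional argument lets the same function express both designs, which makes it easy to sweep `k` and compare them on the same set of queries.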
Taxonomy
Topics: Artificial Intelligence in Law · Law, AI, and Intellectual Property
