A bi-objective $\epsilon$-constrained framework for quality-cost optimization in language model ensembles
Aditi Singla, Aditya Singh, Kanishk Kukreja

TL;DR
This paper introduces a bi-objective $\epsilon$-constrained framework for optimizing language model ensembles. It balances response quality against cost and outperforms existing ensembling approaches in response quality while significantly reducing cost.
Contribution
It formulates the quality-cost tradeoff in language model ensembling as a bi-objective optimization problem and, by adding a budget constraint, reduces it to a 0/1 knapsack problem.
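As a rough illustration of how such a reduction can look, the sketch below writes the budget-constrained problem in standard $\epsilon$-constraint form. The symbols are assumptions introduced here and are not taken from the paper: $x_i$ indicates whether model $i$ is included in the ensemble, $q_i$ is a per-model quality estimate, $c_i$ is its cost, and $\epsilon$ is the budget. Treating quality as additive across models is also an assumption made for illustration.

```latex
% Bi-objective form (assumed notation): maximize quality, minimize cost
\max_{x \in \{0,1\}^n} \; \sum_{i=1}^{n} q_i x_i
\qquad
\min_{x \in \{0,1\}^n} \; \sum_{i=1}^{n} c_i x_i

% Epsilon-constrained form: the cost objective becomes a budget constraint,
% which is the standard 0/1 knapsack problem
\max_{x \in \{0,1\}^n} \; \sum_{i=1}^{n} q_i x_i
\quad \text{s.t.} \quad \sum_{i=1}^{n} c_i x_i \le \epsilon
```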
Findings
Outperforms existing ensembling approaches in response quality.
Reduces costs significantly compared to prior methods.
Effectively balances quality and cost in LLM ensembles.
Abstract
We propose an ensembling framework that uses diverse open-sourced Large Language Models (LLMs) to achieve high response quality while maintaining cost efficiency. We formulate a bi-objective optimization problem to represent the quality-cost tradeoff and then introduce an additional budget constraint that reduces the problem to a straightforward 0/1 knapsack problem. We empirically demonstrate that our framework outperforms the existing ensembling approaches in response quality while significantly reducing costs.
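To make the knapsack reduction concrete, here is a minimal sketch of a textbook 0/1 knapsack solver applied to model selection. It is not the authors' implementation: the function name `select_models`, the additive per-model quality scores, and the integer cost units are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def select_models(
    quality: List[float], cost: List[int], budget: int
) -> Tuple[float, List[int]]:
    """Pick a subset of models maximizing total quality within the budget.

    Assumptions (not from the paper): quality is additive across models
    and costs are non-negative integers in some fixed unit.
    """
    n = len(quality)
    # best[i][b] = highest total quality using the first i models with budget b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        q, c = quality[i - 1], cost[i - 1]
        for b in range(budget + 1):
            # Option 1: skip model i-1
            best[i][b] = best[i - 1][b]
            # Option 2: include model i-1 if it fits and improves quality
            if c <= b and best[i - 1][b - c] + q > best[i][b]:
                best[i][b] = best[i - 1][b - c] + q

    # Backtrack to recover which models were selected
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= cost[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


if __name__ == "__main__":
    # Hypothetical scores and costs for three candidate models
    total, picked = select_models([6.0, 5.0, 4.0], [5, 3, 2], budget=5)
    print(total, picked)  # 9.0 [1, 2]
```

In this toy instance, the two cheaper models together beat the single most capable model at the same budget, which is the kind of tradeoff a budget constraint exposes. The dynamic program runs in time proportional to the number of models times the budget, so costs would need to be discretized to a reasonably coarse unit.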
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
