The Costly Dilemma: Generalization, Evaluation and Cost-Optimal Deployment of Large Language Models
Abi Aryan, Aakash Kumar Nain, Andrew McMahon, Lucas Augusto Meyer, Harpreet Singh Sahota

TL;DR
This paper discusses the importance of balancing generalization, evaluation, and cost-efficiency in deploying large language models, proposing a framework to optimize these factors for enterprise use.
Contribution
It introduces a novel framework tailored for large language models that addresses their generalization, evaluation, and cost modeling for practical deployment.
Findings
Generalization, evaluation, and cost-optimality are often relatively orthogonal, so they require careful balancing.
The proposed framework provides insights into deployment and management.
Enterprises should assess all three factors before investing in large language models.
Abstract
When deploying machine learning models in production for any product or application, three properties are commonly desired. First, the models should be generalizable, in that we can extend them to further use cases as our knowledge of the domain area develops. Second, they should be evaluable, so that there are clear metrics for performance and the calculation of those metrics in production settings is feasible. Finally, the deployment should be as cost-optimal as possible. In this paper we propose that these three objectives (i.e., generalization, evaluation, and cost-optimality) can often be relatively orthogonal and that for large language models, despite their performance over conventional NLP models, enterprises need to carefully assess all three factors before making substantial investments in this technology. We propose a framework for generalization, evaluation…
Taxonomy
Topics: Software Engineering Research · Business Process Modeling and Analysis · Data Quality and Management
