Prompt Smart, Pay Less: Cost-Aware APO for Real-World Applications
Jayesh Choudhari, Piyush Kumar Singh, Douglas McIlwraith, Snehal Nair

TL;DR
This paper evaluates and compares different Automatic Prompt Optimization methods for large language models in real-world, high-stakes classification tasks, introducing a new hybrid framework that improves cost-efficiency.
Contribution
It introduces APE-OPRO, a novel hybrid APO framework that outperforms existing methods in cost-efficiency without sacrificing accuracy in practical applications.
Findings
APE-OPRO achieves 18% better cost-efficiency than OPRO.
ProTeGi offers the best performance but at higher computational cost.
LLM prompt optimization is sensitive to label formatting.
Abstract
Prompt design is a critical factor in the effectiveness of Large Language Models (LLMs), yet it remains largely heuristic, manual, and difficult to scale. This paper presents the first comprehensive evaluation of Automatic Prompt Optimization (APO) methods for real-world, high-stakes multiclass classification in a commercial setting, addressing a critical gap in the existing literature, where most APO frameworks have been validated only on benchmark classification tasks of limited complexity. We introduce APE-OPRO, a novel hybrid framework that combines the complementary strengths of APE and OPRO, achieving notably better cost-efficiency, around 18% improvement over OPRO, without sacrificing performance. We benchmark APE-OPRO alongside both gradient-free (APE, OPRO) and gradient-based (ProTeGi) methods on a dataset of ~2,500 labeled products. Our results highlight key…
Taxonomy
Topics: Real-Time Systems Scheduling
