Prompt Smart, Pay Less: Cost-Aware APO for Real-World Applications

Jayesh Choudhari; Piyush Kumar Singh; Douglas McIlwraith; Snehal Nair

arXiv:2507.15884 · cs.LG · July 23, 2025

TL;DR

This paper evaluates and compares different Automatic Prompt Optimization methods for large language models in real-world, high-stakes classification tasks, introducing a new hybrid framework that improves cost-efficiency.

Contribution

The paper introduces APE-OPRO, a novel hybrid APO framework that combines APE and OPRO and improves cost-efficiency by around 18% over OPRO without sacrificing accuracy in practical applications.

Findings

1. APE-OPRO achieves around 18% better cost-efficiency than OPRO without sacrificing performance.

2. ProTeGi offers the best performance but at higher computational cost.

3. Prompt optimization results are sensitive to label formatting.

Abstract

Prompt design is a critical factor in the effectiveness of Large Language Models (LLMs), yet remains largely heuristic, manual, and difficult to scale. This paper presents the first comprehensive evaluation of Automatic Prompt Optimization (APO) methods for real-world, high-stakes multiclass classification in a commercial setting, addressing a critical gap in the existing literature where most of the APO frameworks have been validated only on benchmark classification tasks of limited complexity. We introduce APE-OPRO, a novel hybrid framework that combines the complementary strengths of APE and OPRO, achieving notably better cost-efficiency, around 18% improvement over OPRO, without sacrificing performance. We benchmark APE-OPRO alongside both gradient-free (APE, OPRO) and gradient-based (ProTeGi) methods on a dataset of ~2,500 labeled products. Our results highlight key…
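The abstract does not spell out how APE-OPRO combines its two parents, so the following is only a minimal sketch under one plausible assumption: candidate prompts are first sampled in the style of APE, and the scored candidates then seed an OPRO-style loop in which the optimizer sees the best prompts so far and proposes a refinement. Every name here (`ape_opro`, `toy_classify`, `toy_initial`, `toy_refine`, the keyword table, and the four example products) is hypothetical, and the "LLM" calls are deterministic stubs so that the sketch runs without any API.

```python
from typing import Callable, List, Tuple

Scored = Tuple[str, float]  # (prompt, accuracy on the labeled examples)

def accuracy(prompt: str, examples: List[Tuple[str, str]],
             classify: Callable[[str, str], str]) -> float:
    """Fraction of examples the task model labels correctly under this prompt."""
    correct = sum(classify(prompt, text) == label for text, label in examples)
    return correct / len(examples)

def ape_opro(propose_initial, propose_refined, classify, examples,
             n_seed: int = 4, n_steps: int = 3, top_k: int = 3):
    """Assumed hybrid: APE-style seeding, then OPRO-style refinement."""
    # APE-style step: sample several candidate prompts at once and score them.
    history: List[Scored] = [(p, accuracy(p, examples, classify))
                             for p in propose_initial(n_seed)]
    proposal_calls = n_seed  # crude cost proxy: one call per proposed prompt
    for _ in range(n_steps):
        # OPRO-style step: show the optimizer the best-scoring prompts so far.
        history.sort(key=lambda ps: ps[1], reverse=True)
        candidate = propose_refined(history[:top_k])
        proposal_calls += 1
        history.append((candidate, accuracy(candidate, examples, classify)))
    best_prompt, best_score = max(history, key=lambda ps: ps[1])
    return best_prompt, best_score, proposal_calls

# --- Hypothetical stubs standing in for the task LLM and the optimizer LLM ---
LABELS = ["apparel", "electronics"]
KEYWORDS = {"apparel": ("shirt", "jeans"), "electronics": ("usb", "charger")}

def toy_classify(prompt: str, text: str) -> str:
    # The stub can only emit a label that the prompt mentions by name.
    for label in LABELS:
        if label in prompt and any(word in text for word in KEYWORDS[label]):
            return label
    return "unknown"

def toy_initial(n: int) -> List[str]:
    seeds = ["Classify the product.",
             "Classify the product as apparel or other.",
             "Pick a category.",
             "Label this item."]
    return seeds[:n]

def toy_refine(top: List[Scored]) -> str:
    # Extend the current best prompt with one label it does not mention yet.
    best_prompt = top[0][0]
    missing = [label for label in LABELS if label not in best_prompt]
    return best_prompt + (f" Valid labels include {missing[0]}." if missing else "")

EXAMPLES = [("red cotton shirt", "apparel"), ("usb charger", "electronics"),
            ("denim jeans", "apparel"), ("usb cable", "electronics")]

best_prompt, best_score, proposal_calls = ape_opro(
    toy_initial, toy_refine, toy_classify, EXAMPLES)
print(best_prompt, best_score, proposal_calls)
```

Counting proposal calls is just one simple way to put a number on cost; the paper's own cost-efficiency measure may be defined differently.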
