Cost-Aware Learning
Clara Mohri, Amir Globerson, Haim Kaplan, Tomer Koren, Yishay Mansour

TL;DR
This paper introduces cost-aware learning algorithms that aim to reach a target error at minimal total cost when different samples incur different costs, with applications to reinforcement learning for large language models.
Contribution
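It proposes Cost-Aware Stochastic Gradient Descent for convex finite-sum objectives, derives its cost complexity, establishes a lower bound for the setting, provides a subset selection algorithm, and introduces Cost-Aware GRPO to reduce the cost of policy optimization with large language models.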
Findings
Cost-aware SGD achieves target error with reduced total cost.
Lower bounds for cost-aware learning are established.
Cost-Aware GRPO reduces token usage by up to 30% in large language models.
Abstract
We consider the problem of Cost-Aware Learning, where sampling different component functions of a finite-sum objective incurs different costs. The objective is to reach a target error while minimizing the total cost. First, we propose the Cost-Aware Stochastic Gradient Descent algorithm for convex functions, and derive its cost complexity to attain an error of ε. Furthermore, we establish a lower bound for this setting and provide a subset selection algorithm to further reduce the cost of training. We apply our theoretical insights to reinforcement learning with language models, where the computational cost of policy gradients varies with sequence length. To this end, we introduce Cost-Aware GRPO, an algorithm designed to reduce the cost of policy optimization while preserving performance. Empirical results on 1.5B and 8B LLMs demonstrate that our approach reduces the tokens…
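To make the setting concrete, here is a minimal sketch of SGD on a finite-sum objective in which each component has its own sampling cost. It is not the paper's algorithm: the sampling rule (probability proportional to 1/sqrt(cost)), the toy quadratic components, and every name below are illustrative assumptions. The only property it relies on is standard importance sampling: dividing the sampled gradient by n * p_i keeps the estimate unbiased for the gradient of the average objective.

```python
import random


def cost_aware_sgd(grads, costs, x0, steps, lr, seed=0):
    """Importance-sampled SGD on f(x) = (1/n) * sum_i f_i(x), tracking cost.

    grads[i](x) returns the gradient of f_i at a scalar x.
    costs[i] is the (hypothetical) price of sampling component i.
    """
    rng = random.Random(seed)
    n = len(grads)
    # Assumed sampling rule, NOT taken from the paper: p_i proportional to
    # 1/sqrt(c_i), so expensive components are drawn less often.
    weights = [c ** -0.5 for c in costs]
    total = sum(weights)
    probs = [w / total for w in weights]

    x, x_sum, spent = x0, 0.0, 0.0
    for _ in range(steps):
        i = rng.choices(range(n), weights=probs)[0]
        spent += costs[i]
        # Reweight by 1 / (n * p_i) so the estimate stays unbiased.
        g = grads[i](x) / (n * probs[i])
        x -= lr * g
        x_sum += x
    # Return the averaged iterate, the total cost paid, and the sampling law.
    return x_sum / steps, spent, probs


# Toy problem: f_i(x) = 0.5 * (x - t_i)^2, minimized at the mean of the targets.
targets = [1.0, 2.0, 3.0, 6.0]
costs = [1.0, 1.0, 4.0, 16.0]
grads = [lambda x, t=t: x - t for t in targets]

steps = 20000
x_avg, spent, probs = cost_aware_sgd(grads, costs, x0=0.0, steps=steps, lr=0.05)
uniform_cost_per_step = sum(costs) / len(costs)
print(f"averaged iterate: {x_avg:.3f}")
print(f"cost per step: {spent / steps:.3f} (uniform sampling: {uniform_cost_per_step:.3f})")
```

In this toy run the expected cost per step is the probability-weighted sum of the costs, which is lower than under uniform sampling, at the price of a higher-variance gradient estimate for the rarely sampled components. The trade-off between those two quantities is what a cost complexity bound would have to account for.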
