Programming by Rewards
Nagarajan Natarajan, Ajaykrishna Karthikeyan, Prateek Jain, Ivan Radicek, Sriram Rajamani, Sumit Gulwani, Johannes Gehrke

TL;DR
This paper introduces 'programming by rewards' (PBR), a method for synthesizing decision functions using reward signals, leveraging continuous optimization within a specific DSL, with applications in search and ranking heuristics.
Contribution
It formalizes PBR, develops a continuous-optimization synthesis approach for if-then-else programs, and demonstrates its effectiveness on real-world and synthetic benchmarks.
Findings
Synthesized decision functions perform competitively with manually tuned procedures.
The framework guarantees optimality under certain reward conditions.
Applied successfully to search and ranking heuristics in an industrial codebase.
Abstract
We formalize and study "programming by rewards" (PBR), a new approach for specifying and synthesizing subroutines for optimizing some quantitative metric such as performance, resource utilization, or correctness over a benchmark. A PBR specification consists of (1) input features, and (2) a reward function, modeled as a black-box component (which we can only run), that assigns a reward for each execution. The goal of the synthesizer is to synthesize a "decision function" which transforms the features to a decision value for the black-box component so as to maximize the expected reward for executing decisions for various values of the input features. We consider a space of decision functions in a DSL of loop-free if-then-else programs, which can branch on linear functions of the input features in a tree-structure and compute a linear function of the inputs in the…
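To make the shape of such a decision function concrete, here is a minimal Python sketch of a loop-free if-then-else program that branches on a linear function of the features and computes a linear function of the features at each leaf, together with a sample-average estimate of the expected reward. It is an illustration only, not the paper's synthesizer: the names `Leaf`, `Branch`, `decide`, `estimate_reward`, and the toy reward are all hypothetical, and the convention that a leaf returns a linear function is an assumption read from the abstract's description.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Leaf:
    # Assumed leaf form: decision = weights . x + bias
    weights: List[float]
    bias: float


@dataclass
class Branch:
    # Branch condition: weights . x + bias > 0
    weights: List[float]
    bias: float
    then_node: "Node"
    else_node: "Node"


Node = Union[Leaf, Branch]


def linear(weights: List[float], bias: float, x: List[float]) -> float:
    """Evaluate the linear function weights . x + bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def decide(node: Node, x: List[float]) -> float:
    """Walk the tree from the root to a leaf and return the leaf's value."""
    while isinstance(node, Branch):
        if linear(node.weights, node.bias, x) > 0:
            node = node.then_node
        else:
            node = node.else_node
    return linear(node.weights, node.bias, x)


def estimate_reward(
    node: Node,
    reward: Callable[[List[float], float], float],
    samples: List[List[float]],
) -> float:
    """Average the black-box reward over sampled feature vectors."""
    return sum(reward(x, decide(node, x)) for x in samples) / len(samples)


def toy_reward(x: List[float], decision: float) -> float:
    """Hypothetical stand-in for the black box; the synthesizer may only call it."""
    target = 2.0 * x[0] if x[0] > 0 else -x[0]
    return -abs(decision - target)


# if x[0] > 0 then 2 * x[0] else -x[0]
program = Branch([1.0], 0.0, Leaf([2.0], 0.0), Leaf([-1.0], 0.0))

rng = random.Random(0)
samples = [[rng.uniform(-1.0, 1.0)] for _ in range(100)]
average_reward = estimate_reward(program, toy_reward, samples)
```

A synthesizer in this setting would search over the tree's weights and biases to raise `average_reward`, using only calls to the reward function; the sketch stops at evaluation and does not attempt that search.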
Taxonomy
Topics: Formal Methods in Verification · Advanced Bandit Algorithms Research · Software Testing and Debugging Techniques
