LangProp: A code optimization framework using Large Language Models applied to driving
Shu Ishida, Gianluca Corrado, George Fedoseev, Hudson Yeo, Lloyd Russell, Jamie Shotton, João F. Henriques, Anthony Hu

TL;DR
LangProp is a framework that iteratively improves code generated by large language models through automatic evaluation and feedback. It applies to diverse domains, including autonomous driving, and aims to improve the quality and robustness of the generated code.
Contribution
The paper introduces LangProp, an iterative optimization framework for LLM-generated code that combines automatic evaluation, feedback to the LLM, and training paradigms borrowed from machine learning (imitation learning, DAgger, and reinforcement learning).
Findings
Effective in Sudoku and CartPole domains
First proof of concept for autonomous driving code optimization
Produces interpretable and verifiable policies
Abstract
We propose LangProp, a framework for iteratively optimizing code generated by large language models (LLMs), in both supervised and reinforcement learning settings. While LLMs can generate sensible coding solutions zero-shot, they are often sub-optimal. Especially for code generation tasks, it is likely that the initial code will fail on certain edge cases. LangProp automatically evaluates the code performance on a dataset of input-output pairs, catches any exceptions, and feeds the results back to the LLM in the training loop, so that the LLM can iteratively improve the code it generates. By adopting a metric- and data-driven training paradigm for this code optimization procedure, one could easily adapt findings from traditional machine learning techniques such as imitation learning, DAgger, and reinforcement learning. We show LangProp's applicability to general domains such as Sudoku…
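The evaluate-and-feed-back loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `evaluate`, `optimize`, and `rewrite` are hypothetical, the LLM is replaced by a stub callable that returns a hand-written fix, and scoring is plain accuracy on a non-empty list of input-output pairs.

```python
from typing import Any, Callable, List, Tuple

Dataset = List[Tuple[Any, Any]]  # (input, expected output) pairs


def evaluate(source: str, func_name: str, dataset: Dataset) -> Tuple[float, List[str]]:
    """Run candidate code on every pair; return (accuracy, feedback messages).

    Assumes `dataset` is non-empty. Exceptions are caught and turned into feedback.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)  # load the candidate code (trusted input only)
        func = namespace[func_name]
    except Exception as exc:
        return 0.0, [f"failed to load code: {exc!r}"]

    passed, feedback = 0, []
    for x, expected in dataset:
        try:
            got = func(x)
        except Exception as exc:
            feedback.append(f"input {x!r}: raised {exc!r}")
            continue
        if got == expected:
            passed += 1
        else:
            feedback.append(f"input {x!r}: expected {expected!r}, got {got!r}")
    return passed / len(dataset), feedback


def optimize(
    initial_source: str,
    func_name: str,
    dataset: Dataset,
    rewrite: Callable[[str, List[str]], str],  # hypothetical stand-in for an LLM call
    max_iters: int = 5,
) -> Tuple[str, float]:
    """Keep the best-scoring code, asking `rewrite` for a new version each round."""
    best_source = initial_source
    best_score, feedback = evaluate(best_source, func_name, dataset)
    for _ in range(max_iters):
        if best_score == 1.0:
            break
        candidate = rewrite(best_source, feedback)
        score, candidate_feedback = evaluate(candidate, func_name, dataset)
        if score > best_score:
            best_source, best_score, feedback = candidate, score, candidate_feedback
    return best_source, best_score


# Toy example: the initial code fails on an edge case (division by zero).
BUGGY = "def safe_div(pair):\n    a, b = pair\n    return a / b\n"
FIXED = "def safe_div(pair):\n    a, b = pair\n    return a / b if b != 0 else 0.0\n"
DATA: Dataset = [((6, 3), 2.0), ((1, 0), 0.0)]


def fake_llm(source: str, feedback: List[str]) -> str:
    """Stub rewriter: a real system would prompt an LLM with the code and feedback."""
    return FIXED


final_source, final_score = optimize(BUGGY, "safe_div", DATA, fake_llm)
```

In this toy run the initial code passes one of the two pairs and raises on the other; the caught exception becomes a feedback message, and the stubbed rewriter returns a version that passes both. A real setup would replace `fake_llm` with a prompt to an LLM and would sandbox the `exec` call, since executing generated code directly is unsafe.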
Taxonomy
Topics: Software Engineering Research
Methods: Entropy Regularization · Proximal Policy Optimization · CARLA: An Open Urban Driving Simulator
