TL;DR
This paper introduces LIMEN, a framework that discovers reinforcement learning task interfaces from raw simulator state. A large language model guides the evolution of observation and reward functions, reducing the manual effort needed to build them.
Contribution
LIMEN is a novel LLM-guided evolutionary approach that jointly synthesizes observation mappings and reward functions for RL tasks from raw state data.
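To make "interface" concrete, here is a minimal Python sketch of candidate observation mappings and reward functions written as executable programs over a raw state, then scored by how well a policy built on them solves the task. Everything in it is an assumption for illustration and none of it is LIMEN's implementation: the 5-cell corridor environment, the program pools `OBS_PROGRAMS` and `REWARD_PROGRAMS` (standing in for LLM output), and `fit_policy`, a one-step lookahead stand-in for RL training. The LLM proposal and evolutionary refinement steps are replaced by an exhaustive search over the fixed pool.

```python
from itertools import product

N = 5               # toy corridor with cells 0..4 (hypothetical, not from the paper)
ACTIONS = (-1, +1)  # move left / move right

def step(state, action):
    """Toy simulator transition on the raw state dict."""
    pos = min(N - 1, max(0, state["pos"] + action))
    return {"pos": pos, "goal": state["goal"]}

# Candidate programs as source text, standing in for LLM-generated code.
OBS_PROGRAMS = {
    "pos_only": "def observe(s): return (s['pos'],)",
    "goal_direction": (
        "def observe(s): "
        "return ((s['goal'] > s['pos']) - (s['goal'] < s['pos']),)"
    ),
}
REWARD_PROGRAMS = {
    "sparse": "def reward(s): return 1.0 if s['pos'] == s['goal'] else 0.0",
    "dense": "def reward(s): return -abs(s['goal'] - s['pos'])",
}

def compile_fn(source, name):
    """Turn a candidate program into a callable."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]

def fit_policy(observe, reward):
    """Stand-in for policy training: for each observation, pick the action
    with the highest summed one-step candidate reward over all raw states."""
    totals = {}
    for pos, goal in product(range(N), (0, N - 1)):
        s = {"pos": pos, "goal": goal}
        for a in ACTIONS:
            key = (observe(s), a)
            totals[key] = totals.get(key, 0.0) + reward(step(s, a))
    return lambda obs: max(ACTIONS, key=lambda a: totals[(obs, a)])

def success_rate(observe, policy):
    """Ground-truth task success, independent of the candidate reward."""
    starts = [(2, 0), (2, N - 1)]
    wins = 0
    for pos, goal in starts:
        s = {"pos": pos, "goal": goal}
        for _ in range(N):
            if s["pos"] == s["goal"]:
                break
            s = step(s, policy(observe(s)))
        wins += s["pos"] == s["goal"]
    return wins / len(starts)

# Score every (observation, reward) pair by downstream policy success.
scores = {}
for (o_name, o_src), (r_name, r_src) in product(
    OBS_PROGRAMS.items(), REWARD_PROGRAMS.items()
):
    observe = compile_fn(o_src, "observe")
    reward = compile_fn(r_src, "reward")
    scores[(o_name, r_name)] = success_rate(observe, fit_policy(observe, reward))

best = max(scores, key=scores.get)
```

In this toy setup the observation that hides the goal caps success at 0.5 whichever reward is paired with it, so a search restricted to rewards could not fix it. That is only a property of this hand-built example, not a result from the paper.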
Findings
Joint evolution of observations and rewards outperforms single-component optimization.
LIMEN successfully discovers effective interfaces across diverse domains.
Automatic interface construction reduces manual engineering in RL environments.
Abstract
Reinforcement learning systems rely on environment interfaces that specify observations and reward functions, yet constructing these interfaces for new tasks often requires substantial manual effort. While recent work has automated reward design using large language models (LLMs), these approaches assume fixed observations and do not address the broader challenge of synthesizing complete task interfaces. We study RL task interface discovery from raw simulator state, where both observation mappings and reward functions must be generated. We propose LIMEN (code available at https://github.com/Lossfunk/LIMEN), an LLM-guided evolutionary framework that produces candidate interfaces as executable programs and iteratively refines them using policy training feedback. Across novel discrete gridworld tasks and continuous control domains spanning locomotion and manipulation, joint evolution of…
