Discovering Reinforcement Learning Interfaces with Large Language Models

Akshat Singh Jaswal; Ashish Baghel; Paras Chopra

arXiv:2605.03408 · cs.LG · May 6, 2026

TL;DR

This paper introduces LIMEN, a framework that automatically discovers reinforcement learning interfaces from raw simulator state. It uses large language models to guide the evolution of observation and reward functions, which reduces the manual effort needed to build these interfaces.

Contribution

LIMEN is a novel LLM-guided evolutionary approach that jointly synthesizes observation mappings and reward functions for RL tasks from raw state data.
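
To make "interface" concrete, the sketch below shows one way an observation mapping and a reward function could be written as executable programs over raw simulator state. It is an illustration only, not code from the LIMEN repository: the `RawState` layout, the `Interface` container, and the gridworld task are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical raw simulator state: named entities mapped to grid coordinates.
RawState = Dict[str, Tuple[int, int]]


@dataclass(frozen=True)
class Interface:
    """A candidate task interface: two executable programs over raw state."""
    observe: Callable[[RawState], List[float]]
    reward: Callable[[RawState, RawState], float]


def manhattan(state: RawState) -> int:
    """Grid distance between the agent and the goal."""
    (ax, ay), (gx, gy) = state["agent"], state["goal"]
    return abs(gx - ax) + abs(gy - ay)


def observe(state: RawState) -> List[float]:
    # Expose the goal position relative to the agent, not absolute coordinates.
    (ax, ay), (gx, gy) = state["agent"], state["goal"]
    return [float(gx - ax), float(gy - ay)]


def reward(prev: RawState, curr: RawState) -> float:
    # Reward progress toward the goal, plus a bonus on arrival.
    progress = manhattan(prev) - manhattan(curr)
    bonus = 10.0 if manhattan(curr) == 0 else 0.0
    return float(progress) + bonus


interface = Interface(observe=observe, reward=reward)
```

Under this framing, both functions read the same raw state, so a change to what the policy observes and a change to what it is rewarded for can be proposed together.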

Findings

01 Joint evolution of observations and rewards outperforms single-component optimization.

02 LIMEN discovers effective interfaces across discrete gridworld tasks and continuous control domains spanning locomotion and manipulation.

03 Automatic interface construction reduces manual engineering in RL environments.

Abstract

Reinforcement learning systems rely on environment interfaces that specify observations and reward functions, yet constructing these interfaces for new tasks often requires substantial manual effort. While recent work has automated reward design using large language models (LLMs), these approaches assume fixed observations and do not address the broader challenge of synthesizing complete task interfaces. We study RL task interface discovery from raw simulator state, where both observation mappings and reward functions must be generated. We propose LIMEN (code available at https://github.com/Lossfunk/LIMEN), an LLM-guided evolutionary framework that produces candidate interfaces as executable programs and iteratively refines them using policy training feedback. Across novel discrete gridworld tasks and continuous control domains spanning locomotion and manipulation, joint evolution of…
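
The propose-and-refine loop described in the abstract can be sketched generically as below. This is a minimal illustration under stated assumptions, not the authors' algorithm: `evolve`, `propose`, and `evaluate` are hypothetical names, the selection scheme (keep the top candidates, add one child per parent) is a simple stand-in, and the toy functions replace what would in practice be an LLM call and a full policy training run.

```python
import random
from typing import Callable, List, Tuple, TypeVar

C = TypeVar("C")  # a candidate interface (e.g. observation + reward program)


def evolve(
    seed: List[C],
    propose: Callable[[C, float], C],   # stand-in for an LLM refinement step
    evaluate: Callable[[C], float],     # stand-in for policy training feedback
    generations: int = 5,
    population: int = 4,
) -> Tuple[C, float]:
    """Keep the best-scoring candidates and ask `propose` to refine them."""
    scored = [(evaluate(c), c) for c in seed]
    for _ in range(generations):
        scored.sort(key=lambda pair: pair[0], reverse=True)
        parents = scored[:population]
        children = [propose(c, s) for s, c in parents]
        # Parents are retained, so the best score never decreases.
        scored = parents + [(evaluate(c), c) for c in children]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Toy stand-ins (assumptions): a candidate is (obs_dim, reward_scale), and the
# score peaks at obs_dim == 3, reward_scale == 1.0.
def toy_evaluate(c: Tuple[int, float]) -> float:
    obs_dim, reward_scale = c
    return -float((obs_dim - 3) ** 2) - (reward_scale - 1.0) ** 2


rng = random.Random(0)


def toy_propose(c: Tuple[int, float], score: float) -> Tuple[int, float]:
    # Mutate both components together, mirroring joint evolution.
    obs_dim, reward_scale = c
    new_dim = max(1, obs_dim + rng.choice([-1, 0, 1]))
    return (new_dim, reward_scale + rng.uniform(-0.5, 0.5))


best, best_score = evolve([(1, 0.0)], toy_propose, toy_evaluate)
```

Because the toy proposer changes the observation and reward components in the same step, the loop searches over both at once; freezing one component in `toy_propose` would give a single-component variant for comparison.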

Code & Models

Repositories

Lossfunk/LIMEN (GitHub)