Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search

Nicola Dainese; Matteo Merler; Minttu Alakuijala; Pekka Marttinen

arXiv:2405.15383 · cs.AI · October 31, 2024 · 2 cites

TL;DR

This paper introduces GIF-MCTS, a novel strategy combining Monte Carlo Tree Search with large language models to generate accurate, reliable, and efficient code-based world models for reinforcement learning, validated on a new benchmark.

Contribution

The paper presents GIF-MCTS, a new code generation method guided by MCTS, and introduces CWMB, a benchmark for evaluating code world models in RL tasks.

Findings

01 GIF-MCTS outperforms all baselines on CWMB and other benchmarks.

02 Code world models generated with GIF-MCTS improve RL sample efficiency.

03 The approach enables fast, interpretable, and reliable model-based RL agents.

Abstract

In this work we consider Code World Models, world models generated by a Large Language Model (LLM) in the form of Python code for model-based Reinforcement Learning (RL). Calling code instead of LLMs for planning has potential to be more precise, reliable, interpretable, and extremely efficient. However, writing appropriate Code World Models requires the ability to understand complex instructions, to generate exact code with non-trivial logic and to self-debug a long program with feedback from unit tests and environment trajectories. To address these challenges, we propose Generate, Improve and Fix with Monte Carlo Tree Search (GIF-MCTS), a new code generation strategy for LLMs. To test our approach in an offline RL setting, we introduce the Code World Models Benchmark (CWMB), a suite of program synthesis and planning tasks comprised of 18 diverse RL environments paired with…
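
To make the idea of a Code World Model concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than the paper's code: the toy corridor environment, the `CorridorWorldModel` class, the `step(state, action)` signature, the `transition_accuracy` check against logged trajectories, and the exhaustive short-horizon planner are hypothetical stand-ins. It only shows why plain code can be cheap to call during planning and easy to validate against environment transitions.

```python
from itertools import product
from typing import List, Tuple

State = int
# A logged transition: (state, action, next_state, reward, done)
Transition = Tuple[State, int, State, float, bool]


class CorridorWorldModel:
    """Hypothetical code world model for a toy 1-D corridor (not from the paper)."""

    def __init__(self, length: int = 5) -> None:
        self.length = length  # the goal is the right-most cell, index length - 1

    def step(self, state: State, action: int) -> Tuple[State, float, bool]:
        # Action 1 moves right, any other action moves left; walls clamp the position.
        delta = 1 if action == 1 else -1
        next_state = min(max(state + delta, 0), self.length - 1)
        done = next_state == self.length - 1
        reward = 1.0 if done else 0.0
        return next_state, reward, done


def transition_accuracy(model: CorridorWorldModel, transitions: List[Transition]) -> float:
    """Fraction of logged transitions that the model reproduces exactly."""
    correct = sum(
        model.step(s, a) == (s_next, r, done)
        for s, a, s_next, r, done in transitions
    )
    return correct / len(transitions)


def plan_first_action(model: CorridorWorldModel, state: State,
                      horizon: int = 4, gamma: float = 0.95) -> int:
    """Enumerate all action sequences up to `horizon` and return the first
    action of the one with the highest discounted return under the model."""
    best_return, best_action = float("-inf"), 0
    for actions in product((0, 1), repeat=horizon):
        s, total = state, 0.0
        for t, a in enumerate(actions):
            s, r, done = model.step(s, a)
            total += (gamma ** t) * r
            if done:
                break
        if total > best_return:
            best_return, best_action = total, actions[0]
    return best_action


model = CorridorWorldModel(length=5)
logged = [(0, 1, 1, 0.0, False), (3, 1, 4, 1.0, True), (0, 0, 0, 0.0, False)]
accuracy = transition_accuracy(model, logged)
first_action = plan_first_action(model, state=3)
```

In this sketch the accuracy score plays the role of feedback from environment trajectories: a candidate program that mispredicts a logged transition scores below 1.0, which is a signal a search procedure could use to decide which program to refine next.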

Code & Models

Repositories

nicoladainese96/code-world-models (PyTorch, official)

Videos

Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search (SlidesLive)

Taxonomy

Topics: Natural Language Processing Techniques · Software Engineering Research · Speech and dialogue systems