CODE-SHARP: Continuous Open-ended Discovery and Evolution of Skills as Hierarchical Reward Programs

Richard Bornemann; Pierluigi Vito Amadori; Antoine Cully

arXiv:2602.10085·cs.AI·May 22, 2026

TL;DR

CODE-SHARP introduces a framework that autonomously discovers and evolves skills as hierarchical reward programs using foundation models, enabling agents to learn complex tasks from scratch without human engineering.

Contribution

It presents a novel method for open-ended skill discovery and evolution using hierarchical reward programs encoded as Python scripts, reducing reliance on human-designed rewards.

Findings

01. Agents outperform previous methods by 6x and 2.6x in median performance on Craftax-Classic and XLand, respectively.

02. Agents trained with CODE-SHARP can craft tools and mine diamonds, demonstrating advanced capabilities.

03. Zero-shot generalization to long-horizon tasks on Craftax-Extended matches the performance of agents trained on ground-truth rewards.

Abstract

A core quality of general intelligence is the ability to open-endedly expand and evolve its set of mastered skills autonomously. While recent Foundation Model (FM) driven approaches have shown promising results towards this goal, they typically rely on significant human-in-the-loop engineering, limiting their transferability to novel environments. To address this, we introduce Continuous Open-ended Discovery and Evolution of Skills as Hierarchical Reward Programs (CODE-SHARP), a framework that leverages FMs to open-endedly grow and evolve an archive of Python programs encoding skills to train a generalist agent policy entirely from scratch via reinforcement learning, directly from source code. These programs, termed Skills as Hierarchical Reward Programs (SHARPs), each encode a local success condition and a set of prerequisites delegated to previously discovered SHARPs. At runtime,…
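To make the SHARP structure described in the abstract more concrete, here is a minimal Python sketch of a skill that pairs a local success condition with prerequisites delegated to other skills. It is an illustration only, not the authors' implementation: the `Sharp` class, the `active_skill` resolution rule, the toy inventory-style `State`, and the three example skills are all assumptions introduced here. The abstract is cut off before it explains what happens at runtime, so the depth-first prerequisite resolution below is just one plausible reading.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical toy state: item name -> count held by the agent.
State = Dict[str, int]


@dataclass
class Sharp:
    """Illustrative skill: a local success condition plus prerequisite skills."""
    name: str
    success: Callable[[State], bool]
    prerequisites: List["Sharp"] = field(default_factory=list)

    def active_skill(self, state: State) -> "Sharp":
        """Return the deepest unmet prerequisite, or self if all are met.

        Assumed resolution rule: walk prerequisites in order and recurse
        into the first one whose success condition does not yet hold.
        """
        for pre in self.prerequisites:
            if not pre.success(state):
                return pre.active_skill(state)
        return self

    def reward(self, state: State) -> float:
        """Sparse reward: 1.0 once the local success condition holds."""
        return 1.0 if self.success(state) else 0.0


# Hypothetical skill chain, loosely inspired by crafting environments.
collect_wood = Sharp("collect_wood", lambda s: s.get("wood", 0) >= 1)
make_pickaxe = Sharp(
    "make_pickaxe",
    lambda s: s.get("pickaxe", 0) >= 1,
    prerequisites=[collect_wood],
)
mine_stone = Sharp(
    "mine_stone",
    lambda s: s.get("stone", 0) >= 1,
    prerequisites=[make_pickaxe],
)

if __name__ == "__main__":
    for state in ({}, {"wood": 1}, {"wood": 1, "pickaxe": 1}):
        print(state, "->", mine_stone.active_skill(state).name)
```

In this sketch, asking for `mine_stone` from an empty inventory resolves to `collect_wood` first, then to `make_pickaxe` once wood is held, and to `mine_stone` itself only after the pickaxe exists. A new skill can therefore be written by naming existing skills as prerequisites rather than re-encoding their conditions.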

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification · Artificial Intelligence in Games