Braxlines: Fast and Interactive Toolkit for RL-driven Behavior Engineering beyond Reward Maximization
Shixiang Shane Gu, Manfred Diaz, Daniel C. Freeman, Hiroki Furuta, Seyed Kamyar Seyed Ghasemipour, Anton Raichuk, Byron David, Erik Frey, Erwin Coumans, Olivier Bachem

TL;DR
Braxlines is a fast, interactive toolkit that enables RL-driven behavior generation beyond reward maximization, supporting unsupervised skill learning and environment creation with minimal training time.
Contribution
The paper introduces Braxlines, a toolkit that pairs Composer, a programmatic API for building continuous control environments, with stable baselines for behavior synthesis beyond reward maximization, enabling rapid environment and behavior development.
Findings
Supports unsupervised skill learning and distribution sketching.
Enables behavior synthesis within minutes of training.
Provides standardized metrics for evaluating non-reward-based algorithms.
Abstract
The goal of continuous control is to synthesize desired behaviors. In reinforcement learning (RL)-driven approaches, this is often accomplished through careful task reward engineering for efficient exploration and running an off-the-shelf RL algorithm. While reward maximization is at the core of RL, reward engineering is not the only -- and sometimes not the easiest -- way to specify complex behaviors. In this paper, we introduce Braxlines, a toolkit for fast and interactive RL-driven behavior generation beyond simple reward maximization that includes Composer, a programmatic API for generating continuous control environments, and a set of stable and well-tested baselines for two families of algorithms -- mutual information maximization (MiMax) and divergence minimization (DMin) -- supporting unsupervised skill learning and distribution sketching as other modes of behavior specification.…
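To make the mutual information maximization idea concrete, here is a minimal sketch of the kind of intrinsic reward commonly used for unsupervised skill learning: the variational bound term log q(z|s) - log p(z), where q is a learned skill discriminator and p(z) is assumed to be a uniform prior over K discrete skills. This is a generic illustration, not the Braxlines API; the function names `log_softmax` and `mimax_reward` and the discrete-skill setup are assumptions made for this example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def mimax_reward(discriminator_logits, skill_index):
    """Hypothetical intrinsic reward r = log q(z|s) - log p(z).

    discriminator_logits: unnormalized scores over K skills for one state
        (assumed to come from a learned discriminator q(z|s)).
    skill_index: the skill z the policy was conditioned on.
    Assumes a uniform skill prior p(z) = 1/K.
    """
    num_skills = len(discriminator_logits)
    log_q = log_softmax(discriminator_logits)[skill_index]
    log_p = -math.log(num_skills)
    return log_q - log_p


# The reward is positive when the discriminator can identify the skill
# from the state, and zero when the state carries no skill information.
informative = mimax_reward([10.0, 0.0, 0.0, 0.0], 0)
uninformative = mimax_reward([0.0, 0.0, 0.0, 0.0], 2)
```

Under these assumptions, a policy trained on this reward is pushed toward states from which its skill can be inferred, which is one way behaviors can be specified without a hand-engineered task reward.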
Taxonomy
Topics: Reinforcement Learning in Robotics
