# Scalable Option Learning in High-Throughput Environments

**Authors:** Mikael Henaff, Scott Fujimoto, Michael Matthews, Michael Rabbat

arXiv: 2509.00338 · 2026-05-11

## TL;DR

This paper introduces Scalable Option Learning (SOL), a hierarchical reinforcement learning algorithm that reaches roughly 35x higher training throughput than existing hierarchical methods. It is evaluated on NetHack, MiniHack, and Mujoco.

## Contribution

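The authors identify and address several challenges in scaling online hierarchical RL to high-throughput environments. The resulting method, SOL, achieves approximately 35x higher throughput than existing hierarchical methods and is trained on 30 billion frames of NetHack experience, with further validation on MiniHack and Mujoco.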

## Key findings

- SOL achieves approximately 35x higher throughput than existing hierarchical RL methods.
- Hierarchical agents trained with SOL on 30 billion frames of NetHack experience significantly surpass flat agents.
- SOL shows positive scaling trends on NetHack and is also validated on MiniHack and Mujoco.
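
To give a sense of what a throughput multiplier means at this scale, the sketch below converts the ~35x figure and the 30-billion-frame budget quoted in the abstract into wall-clock time. The baseline rate `BASELINE_FPS` is a hypothetical placeholder chosen only for illustration; it is not a number reported by the paper, and the function name is likewise an assumption.

```python
SECONDS_PER_DAY = 86_400


def wall_clock_days(total_frames: float, frames_per_second: float) -> float:
    """Days needed to consume total_frames at a constant frame rate."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return total_frames / frames_per_second / SECONDS_PER_DAY


TOTAL_FRAMES = 30e9        # from the abstract: 30 billion frames
SPEEDUP = 35.0             # from the abstract: ~35x higher throughput
BASELINE_FPS = 10_000.0    # HYPOTHETICAL baseline rate, not from the paper

baseline_days = wall_clock_days(TOTAL_FRAMES, BASELINE_FPS)
faster_days = wall_clock_days(TOTAL_FRAMES, BASELINE_FPS * SPEEDUP)

print(f"baseline: {baseline_days:.1f} days, with 35x speedup: {faster_days:.2f} days")
```

Whatever baseline rate is assumed, the ratio between the two durations equals the speedup factor, so only the absolute numbers depend on the placeholder.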

## Abstract

Hierarchical reinforcement learning (RL) has the potential to enable effective decision-making over long timescales. Existing approaches, while promising, have yet to realize the benefits of large-scale training. In this work, we identify and solve several key challenges in scaling online hierarchical RL to high-throughput environments. We propose Scalable Option Learning (SOL), a highly scalable hierarchical RL algorithm which achieves a ~35x higher throughput compared to existing hierarchical methods. To demonstrate SOL's performance and scalability, we train hierarchical agents using 30 billion frames of experience on the complex game of NetHack, significantly surpassing flat agents and demonstrating positive scaling trends. We also validate SOL on MiniHack and Mujoco environments, showcasing its general applicability. Our code is open sourced at: github.com/facebookresearch/sol.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2509.00338/full.md

## Figures

34 figures with captions in the complete paper: https://tomesphere.com/paper/2509.00338/full.md

## References

75 references — full list in the complete paper: https://tomesphere.com/paper/2509.00338/full.md

---
Source: https://tomesphere.com/paper/2509.00338