Parameter-Efficient Distributional RL via Normalizing Flows and a Geometry-Aware Cramér Surrogate

Simo Alami C.; Rim Kaddah; Jesse Read; Marie-Paule Cani

arXiv:2505.04310·cs.AI·May 6, 2026

TL;DR

NFDRL is a parameter-efficient, flow-based distributional RL method that models complex return distributions over an adaptive support. Unlike fixed-support categorical approaches, its parameter footprint does not grow with the resolution of the distribution.

Contribution

The paper presents NFDRL, a flow-based distributional RL architecture trained with a Cramér-inspired, geometry-aware objective, offering parameter efficiency and theoretical guarantees.

Findings

01 NFDRL recovers multi-modal return landscapes on toy MDPs.

02 NFDRL achieves competitive performance on the Atari-5 benchmark.

03 NFDRL offers better parameter efficiency than categorical methods.

Abstract

Distributional Reinforcement Learning (DistRL) improves upon expectation-based methods by modeling full return distributions, but standard approaches often remain far from parsimonious. Categorical methods (e.g., C51) rely on fixed supports where parameter counts scale linearly with resolution, while quantile methods approximate distributions as discrete mixtures whose piecewise-constant densities can be wasteful when modeling complex multi-modal or heavy-tailed returns. We introduce NFDRL, a parsimonious architecture that models return distributions using continuous normalizing flows. Unlike categorical baselines, our flow-based model maintains a compact parameter footprint that does not grow with the effective resolution of the distribution, while providing a dynamic, adaptive support for returns. To train this continuous representation, we propose a Cramér-inspired, geometry-aware…
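The scaling argument in the abstract can be made concrete with a rough count of output-head parameters. The sketch below is illustrative only and is not taken from the paper: it assumes both heads are a single linear layer on top of a shared feature vector, and the function names, the hidden width, the number of actions, and the number of flow parameters per action are all hypothetical choices. The actual NFDRL architecture may condition its flow differently.

```python
def categorical_head_params(hidden: int, n_actions: int, n_atoms: int) -> int:
    """Weights plus biases of a linear layer emitting n_atoms logits per action
    (a C51-style head; assumed layout, not the paper's exact network)."""
    out = n_actions * n_atoms
    return hidden * out + out


def flow_head_params(hidden: int, n_actions: int, n_flow_params: int) -> int:
    """Weights plus biases of a linear layer emitting a fixed number of flow
    parameters per action (hypothetical conditioning scheme)."""
    out = n_actions * n_flow_params
    return hidden * out + out


# Illustrative values only; none of these numbers come from the paper.
HIDDEN, ACTIONS, FLOW_PARAMS = 512, 6, 16

for atoms in (51, 101, 201):
    cat = categorical_head_params(HIDDEN, ACTIONS, atoms)
    flow = flow_head_params(HIDDEN, ACTIONS, FLOW_PARAMS)
    # The categorical count grows with the number of atoms; the flow count
    # depends only on the size of the flow's own parameter vector.
    print(f"atoms={atoms:4d}  categorical={cat:8d}  flow={flow:8d}")
```

Under these assumptions, doubling the support resolution roughly doubles the categorical head, while the flow head stays the same size, which is the sense in which a flow's footprint "does not grow with the effective resolution".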
