Beyond Scalar Rewards: Distributional Reinforcement Learning with Preordered Objectives for Safe and Reliable Autonomous Driving

Ahmed Abouelazm, Jonas Michel, Daniel Bogdoll, Philip Schörner, and J. Marius Zöllner

arXiv:2603.20230 · cs.RO · March 24, 2026

Open Access

TL;DR

This paper introduces a hierarchical multi-objective reinforcement learning framework for autonomous driving, using distributional RL with a novel comparison metric to prioritize safety and efficiency without collapsing objectives into a scalar.

Contribution

The paper proposes the Preordered Multi-Objective MDP (Pr-MOMDP) and the Quantile Dominance (QD) metric, which together enable hierarchical decision-making and safer policies in autonomous driving.

Findings

01. Improved success rates in CARLA simulations

02. Fewer collisions and off-road events

03. More robust policies compared to baselines

Abstract

Autonomous driving involves multiple, often conflicting objectives such as safety, efficiency, and comfort. In reinforcement learning (RL), these objectives are typically combined through weighted summation, which collapses their relative priorities and often yields policies that violate safety-critical constraints. To overcome this limitation, we introduce the Preordered Multi-Objective MDP (Pr-MOMDP), which augments standard MOMDPs with a preorder over reward components. This structure enables reasoning about actions with respect to a hierarchy of objectives rather than a scalar signal. To make this structure actionable, we extend distributional RL with a novel pairwise comparison metric, Quantile Dominance (QD), that evaluates action return distributions without reducing them into a single statistic. Building on QD, we propose an algorithm for extracting optimal subsets, the subset…
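To make the contrast between weighted summation and a hierarchy of objectives concrete, here is a minimal Python sketch. It is not the paper's QD metric or its subset-extraction algorithm, neither of which is defined in the abstract shown here. As a stand-in it assumes a simple pairwise score (the fraction of quantile levels at which one action's return quantile is at least the other's), a strict total order over objectives rather than a general preorder, and a hypothetical `tolerance` threshold. All function names and numbers are illustrative.

```python
import numpy as np


def fraction_not_worse(q_a, q_b):
    """Illustrative pairwise score (an assumption, not the paper's QD):
    fraction of quantile levels where action a's return quantile >= action b's."""
    q_a = np.sort(np.asarray(q_a, dtype=float))
    q_b = np.sort(np.asarray(q_b, dtype=float))
    if q_a.shape != q_b.shape:
        raise ValueError("both actions need the same number of quantiles")
    return float(np.mean(q_a >= q_b))


def filter_by_priority(quantiles, tolerance=0.5):
    """Keep actions objective by objective, highest priority first.

    quantiles: array of shape (n_objectives, n_actions, n_quantiles),
    with objectives sorted from most to least important (assumed total order).
    An action survives an objective if its score against every other
    remaining action is at least `tolerance` (hypothetical threshold).
    """
    candidates = list(range(quantiles.shape[1]))
    for per_objective in quantiles:
        survivors = [
            a for a in candidates
            if all(
                fraction_not_worse(per_objective[a], per_objective[b]) >= tolerance
                for b in candidates if b != a
            )
        ]
        if survivors:  # never filter down to an empty set
            candidates = survivors
    return candidates


# Made-up return quantiles for 3 actions under 2 objectives.
quantiles = np.array([
    # safety (highest priority): action 0 has a heavy left tail
    [[-10.0, -5.0, -2.0, 0.0],
     [-1.0, 0.0, 0.0, 0.0],
     [-1.0, 0.0, 0.0, 0.0]],
    # efficiency: action 0 looks best when safety is ignored
    [[9.0, 10.0, 11.0, 12.0],
     [1.0, 2.0, 3.0, 4.0],
     [2.0, 3.0, 4.0, 5.0]],
])

# Scalarized baseline: equal-weight sum of mean returns per action.
weighted_sum_choice = int(np.argmax(quantiles.mean(axis=2).sum(axis=0)))

# Hierarchical selection: safety first, efficiency only among the safe actions.
hierarchical_choice = filter_by_priority(quantiles)

print(weighted_sum_choice, hierarchical_choice)
```

With these made-up numbers the scalarized baseline selects action 0 because its efficiency gain outweighs its poor safety tail, while the priority-ordered filter discards action 0 on safety and only then uses efficiency to choose among the remaining actions.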

Taxonomy

Topics: Reinforcement Learning in Robotics · Autonomous Vehicle Technology and Safety · Adversarial Robustness in Machine Learning