RE-SAC: Disentangling aleatoric and epistemic risks in bus fleet control: A stable and robust ensemble DRL approach
Yifan Zhang, Liang Zheng

TL;DR
RE-SAC is an ensemble DRL method that explicitly disentangles aleatoric and epistemic uncertainty, improving the robustness and stability of bus fleet control under stochastic traffic conditions.
Contribution
The paper proposes RE-SAC, a robust ensemble soft actor-critic framework that separately addresses aleatoric and epistemic risks using IPM-based regularization and diversified Q-ensembles.
Findings
RE-SAC outperforms vanilla SAC in simulation, achieving higher cumulative rewards.
RE-SAC reduces Q-value estimation error by up to 62% in out-of-distribution states.
The dual mechanism prevents misidentification of noise as data gaps, improving robustness.
Abstract
Bus holding control is challenging due to stochastic traffic and passenger demand. While deep reinforcement learning (DRL) shows promise, standard actor-critic algorithms suffer from Q-value instability in volatile environments. A key source of this instability is the conflation of two distinct uncertainties: aleatoric uncertainty (irreducible noise) and epistemic uncertainty (data insufficiency). Treating these as a single risk leads to value underestimation in noisy states, causing catastrophic policy collapse. We propose a robust ensemble soft actor-critic (RE-SAC) framework to explicitly disentangle these uncertainties. RE-SAC applies Integral Probability Metric (IPM)-based weight regularization to the critic network to hedge against aleatoric risk, providing a smooth analytical lower bound for the robust Bellman operator without expensive inner-loop perturbations. To address…
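The abstract describes two separate hedges: a weight-norm penalty on the critic for aleatoric risk, and an ensemble of Q-functions for epistemic risk. The following is a minimal sketch of how two such terms could be kept apart when forming a pessimistic value estimate. It is not the paper's formulation: the linear critics, the use of ensemble standard deviation as the epistemic signal, the function name `robust_ensemble_value`, and the coefficients `aleatoric_coef` and `epistemic_coef` are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def robust_ensemble_value(weights, features,
                          aleatoric_coef=0.1, epistemic_coef=1.0):
    """Pessimistic value from an ensemble of linear critics (illustrative).

    weights  : list of K weight vectors, one hypothetical linear critic each
    features : feature vector of a single state-action pair
    Returns (pessimistic_value, epistemic_spread).
    """
    robust_qs = []
    for w in weights:
        # Nominal Q-value of this ensemble member.
        q = sum(wi * xi for wi, xi in zip(w, features))
        # Aleatoric hedge (assumed form): subtract a multiple of the
        # weight norm, a closed-form penalty with no inner perturbation loop.
        norm = math.sqrt(sum(wi * wi for wi in w))
        robust_qs.append(q - aleatoric_coef * norm)

    # Epistemic hedge (assumed form): disagreement across ensemble members,
    # measured as the population standard deviation.
    epistemic_spread = pstdev(robust_qs)
    pessimistic = mean(robust_qs) - epistemic_coef * epistemic_spread
    return pessimistic, epistemic_spread


# Two members that agree on the value: only the norm penalty applies.
value, spread = robust_ensemble_value([[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0])
print(value, spread)
```

In this sketch the norm penalty is applied per member before disagreement is measured, so a noisy but well-covered state is penalized only through the weight norm, while a poorly covered state is additionally penalized through ensemble spread.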
Taxonomy
Topics: Traffic control and management · Reinforcement Learning in Robotics · Autonomous Vehicle Technology and Safety
