Self-Organizing Dual-Buffer Adaptive Clustering Experience Replay (SODACER) for Safe Reinforcement Learning in Optimal Control

Roya Khalili Amirabadi; Mohsen Jalaeian Farimani; Omid Solaymani Fard

arXiv:2601.06540·eess.SY·April 14, 2026

TL;DR

This paper introduces SODACER, a reinforcement learning framework whose experience replay uses two buffers, one of them pruned by adaptive clustering. It is combined with Control Barrier Functions for safety and the Sophia optimizer to improve learning in safety-critical nonlinear control tasks.

Contribution

The paper presents a novel experience replay mechanism with adaptive clustering and safety integration, enhancing convergence, safety, and efficiency in reinforcement learning for nonlinear systems.

Findings

01

SODACER achieves faster convergence than baseline methods.

02

The approach maintains safety constraints throughout learning.

03

It improves sample efficiency and reduces redundancy in experience replay.

Abstract

This paper proposes a novel reinforcement learning framework, named Self-Organizing Dual-buffer Adaptive Clustering Experience Replay (SODACER), designed to achieve safe and scalable optimal control of nonlinear systems. The proposed SODACER mechanism consists of a Fast-Buffer for rapid adaptation to recent experiences and a Slow-Buffer equipped with a self-organizing adaptive clustering mechanism to maintain diverse and non-redundant historical experiences. The adaptive clustering mechanism dynamically prunes redundant samples, optimizing memory efficiency while retaining critical environmental patterns. The approach integrates SODACER with Control Barrier Functions (CBFs) to guarantee safety by enforcing state and input constraints throughout the learning process. To enhance convergence and stability, the framework is combined with the Sophia optimizer, enabling adaptive…
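The dual-buffer idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `DualBufferReplay`, the fixed distance threshold `radius`, the rule of keeping one transition per cluster, and the `fast_fraction` sampling split are all assumptions made for illustration. The paper's self-organizing clustering is adaptive, whereas this sketch uses a simple fixed-radius rule.

```python
import random
from collections import deque

import numpy as np


class DualBufferReplay:
    """Hypothetical sketch: a FIFO fast buffer plus a cluster-pruned slow buffer."""

    def __init__(self, fast_capacity=4, radius=0.5, seed=0):
        self.fast = deque(maxlen=fast_capacity)  # most recent transitions
        self.centers = []  # one fixed feature vector per cluster (assumption)
        self.slow = []     # one stored transition per cluster
        self.radius = radius
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        transition = (
            np.asarray(state, dtype=float),
            np.asarray(action, dtype=float),
            float(reward),
            np.asarray(next_state, dtype=float),
        )
        # When the fast buffer is full, its oldest entry is about to be
        # evicted, so hand it to the slow buffer first.
        if len(self.fast) == self.fast.maxlen:
            self._insert_slow(self.fast[0])
        self.fast.append(transition)

    def _insert_slow(self, transition):
        # Cluster on the concatenated (state, action) vector.
        feature = np.concatenate([transition[0].ravel(), transition[1].ravel()])
        for i, center in enumerate(self.centers):
            if np.linalg.norm(feature - center) <= self.radius:
                # Redundant sample: overwrite the cluster's stored transition
                # instead of growing the buffer.
                self.slow[i] = transition
                return
        # Novel sample: open a new cluster.
        self.centers.append(feature)
        self.slow.append(transition)

    def sample(self, batch_size, fast_fraction=0.5):
        # Mix recent and historical experience; sizes are capped by what
        # each buffer currently holds.
        n_fast = min(len(self.fast), int(round(batch_size * fast_fraction)))
        n_slow = min(len(self.slow), batch_size - n_fast)
        return self.rng.sample(list(self.fast), n_fast) + self.rng.sample(self.slow, n_slow)


buf = DualBufferReplay(fast_capacity=2, radius=0.5)
for s in [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]:
    buf.add([s], [0.0], 0.0, [s])
batch = buf.sample(4)
```

In this toy run, the three near-identical states around 0 collapse into a single slow-buffer entry, while the state at 5.0 opens a second cluster. That is the kind of redundancy pruning the abstract refers to, here in a deliberately simplified form.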
