Self Paced Gaussian Contextual Reinforcement Learning

Mohsen Sahraei Ardakani; Rui Song

arXiv:2603.23755 · cs.LG · March 26, 2026

TL;DR

This paper introduces SPGL, a scalable curriculum learning method for reinforcement learning that uses a closed-form Gaussian update to improve efficiency and stability in high-dimensional, partially observable environments.

Contribution

The paper presents SPGL, a self-paced curriculum learning approach that replaces the costly inner-loop optimization of earlier self-paced methods with a closed-form update for Gaussian context distributions, improving scalability and stability in complex RL tasks.

Findings

01. SPGL matches or outperforms existing curriculum methods.
02. SPGL achieves more stable context distribution convergence.
03. SPGL is effective in hidden context scenarios.

Abstract

Curriculum learning improves reinforcement learning (RL) efficiency by sequencing tasks from simple to complex. However, many self-paced curriculum methods rely on computationally expensive inner-loop optimizations, limiting their scalability in high-dimensional context spaces. In this paper, we propose Self-Paced Gaussian Curriculum Learning (SPGL), a novel approach that avoids costly numerical procedures by leveraging a closed-form update rule for Gaussian context distributions. SPGL maintains the sample efficiency and adaptability of traditional self-paced methods while substantially reducing computational overhead. We provide theoretical guarantees on convergence and validate our method across several contextual RL benchmarks, including the Point Mass, Lunar Lander, and Ball Catching environments. Experimental results show that SPGL matches or outperforms existing curriculum…
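The abstract does not state SPGL's update rule, so the sketch below does not try to reproduce it. It only illustrates, under stated assumptions, what a Gaussian context distribution looks like in code: task contexts are sampled from a Gaussian, and the distance to a target task distribution is measured with the standard closed-form KL divergence between Gaussians. The diagonal covariance, the 2-D context space, and every name here (`DiagGaussian`, `kl_divergence`, `current`, `target`) are hypothetical choices made for illustration.

```python
import math
import random
from dataclasses import dataclass
from typing import List


@dataclass
class DiagGaussian:
    """Gaussian over task contexts; diagonal covariance is an assumption."""
    mean: List[float]
    var: List[float]

    def sample(self, rng: random.Random) -> List[float]:
        # Draw one context vector, one independent coordinate at a time.
        return [rng.gauss(m, math.sqrt(v)) for m, v in zip(self.mean, self.var)]


def kl_divergence(p: DiagGaussian, q: DiagGaussian) -> float:
    """Closed-form KL(p || q) for two diagonal Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(p.mean, p.var, q.mean, q.var):
        total += vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
    return 0.5 * total


# Hypothetical 2-D context space: a broad "easy" starting distribution
# and a narrow target distribution over the tasks of actual interest.
current = DiagGaussian(mean=[0.0, 0.0], var=[1.0, 1.0])
target = DiagGaussian(mean=[2.0, -1.0], var=[0.25, 0.25])

rng = random.Random(0)
contexts = [current.sample(rng) for _ in range(5)]  # tasks for the next episodes
gap = kl_divergence(current, target)                # how far the curriculum still is
```

Because both the sampling step and the divergence are available in closed form for Gaussians, neither requires a numerical solver; this is the general property that makes a Gaussian parameterization attractive when avoiding inner-loop optimization.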

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Robot Manipulation and Learning