Self Paced Gaussian Contextual Reinforcement Learning
Mohsen Sahraei Ardakani, Rui Song

TL;DR
This paper introduces SPGL, a scalable curriculum learning method for reinforcement learning that uses a closed-form Gaussian update to improve efficiency and stability in high-dimensional, partially observable environments.
Contribution
The paper presents SPGL, a self-paced curriculum learning approach that replaces the computationally expensive inner-loop optimization of earlier self-paced methods with a closed-form update of a Gaussian context distribution, improving scalability and stability in complex RL tasks.
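This summary does not state SPGL's update rule, so the sketch below is not the authors' method. It only illustrates what a closed-form Gaussian update can look like in general, using a standard weighted maximum-likelihood fit: given sampled contexts and non-negative weights, the new mean and covariance follow directly from weighted averages, with no inner optimization loop. The function name `weighted_gaussian_update`, the `jitter` parameter, and the choice of weights are all assumptions made for illustration.

```python
import numpy as np


def weighted_gaussian_update(contexts, weights, jitter=1e-6):
    """Closed-form weighted maximum-likelihood fit of a Gaussian.

    Illustrative only: this is a generic technique, not the SPGL rule.
    contexts: array of shape (n, d), sampled context vectors.
    weights:  array of shape (n,), non-negative with a positive sum.
    jitter:   small diagonal term (hypothetical) that keeps the
              covariance positive definite.
    """
    contexts = np.asarray(contexts, dtype=float)
    weights = np.asarray(weights, dtype=float)
    w = weights / weights.sum()              # normalize weights to sum to 1
    mean = w @ contexts                      # weighted mean, shape (d,)
    centered = contexts - mean
    cov = (centered * w[:, None]).T @ centered   # weighted covariance, (d, d)
    cov += jitter * np.eye(contexts.shape[1])
    return mean, cov


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sampled = rng.normal(size=(256, 2))      # hypothetical context samples
    returns = -np.sum(sampled ** 2, axis=1)  # hypothetical per-context returns
    # Exponentiated returns as weights: an assumption, not taken from the paper.
    mean, cov = weighted_gaussian_update(sampled, np.exp(returns))
    print(mean)
    print(cov)
```

Because the fit reduces to a handful of matrix products, its cost grows polynomially with the context dimension, which is the kind of saving a closed-form rule offers over an iterative numerical solver.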
Findings
SPGL matches or outperforms existing curriculum methods.
SPGL achieves more stable context distribution convergence.
SPGL is effective in hidden context scenarios.
Abstract
Curriculum learning improves reinforcement learning (RL) efficiency by sequencing tasks from simple to complex. However, many self-paced curriculum methods rely on computationally expensive inner-loop optimizations, limiting their scalability in high-dimensional context spaces. In this paper, we propose Self-Paced Gaussian Curriculum Learning (SPGL), a novel approach that avoids costly numerical procedures by leveraging a closed-form update rule for Gaussian context distributions. SPGL maintains the sample efficiency and adaptability of traditional self-paced methods while substantially reducing computational overhead. We provide theoretical guarantees on convergence and validate our method across several contextual RL benchmarks, including the Point Mass, Lunar Lander, and Ball Catching environments. Experimental results show that SPGL matches or outperforms existing curriculum…
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Robot Manipulation and Learning
