ProgAgent: A Continual RL Agent with Progress-Aware Rewards
Jinzhou Tan, Gabriel Adineera, Jinoh Kim

TL;DR
ProgAgent is a continual reinforcement learning system that learns dense, progress-aware rewards from unlabeled videos, improving lifelong robotic learning by reducing forgetting and enhancing skill acquisition through a unified, high-throughput architecture.
Contribution
ProgAgent combines a progress-aware reward learning method with a JAX-native system for scalable, stable continual RL, addressing both the cost of reward specification and catastrophic forgetting.
Findings
Reduces forgetting in continual learning scenarios
Speeds up learning and improves performance on benchmarks
Successfully transfers to real-robot manipulation tasks
Abstract
We present ProgAgent, a continual reinforcement learning (CRL) agent that unifies progress-aware reward learning with a high-throughput, JAX-native system architecture. Lifelong robotic learning grapples with catastrophic forgetting and the high cost of reward specification. ProgAgent tackles these by deriving dense, shaped rewards from unlabeled expert videos through a perceptual model that estimates task progress across initial, current, and goal observations. We theoretically interpret this as a learned state-potential function, delivering robust guidance in line with expert behaviors. To maintain stability amid online exploration - where novel, out-of-distribution states arise - we incorporate an adversarial push-back refinement that regularizes the reward model, curbing overconfident predictions on non-expert trajectories and countering distribution shift. By embedding this reward…
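The abstract's reading of the progress estimate as a state-potential function can be illustrated with a small sketch. The abstract does not give the exact reward formula, so the code below assumes the standard potential-based shaping form r + gamma * Phi(s') - Phi(s). The `progress` function is a hypothetical geometric stand-in for the learned perceptual model (which in the paper operates on video observations, not coordinates), and the names `progress`, `shaped_reward`, and `gamma` are illustrative only.

```python
import math


def progress(initial, current, goal):
    """Hypothetical stand-in for the learned progress model.

    Returns the fraction of the initial-to-goal distance already covered,
    clipped to [0, 1]. The real model would estimate this from images.
    """
    total = math.dist(initial, goal)
    if total == 0.0:
        return 1.0  # already at the goal
    remaining = math.dist(current, goal)
    return min(1.0, max(0.0, 1.0 - remaining / total))


def shaped_reward(env_reward, initial, state, next_state, goal, gamma=0.99):
    """Potential-based shaping (assumed form): r + gamma * Phi(s') - Phi(s)."""
    phi_s = progress(initial, state, goal)
    phi_next = progress(initial, next_state, goal)
    return env_reward + gamma * phi_next - phi_s


# Usage: a step toward the goal earns a positive shaping term,
# a step away from it earns a negative one.
start, goal = (0.0, 0.0), (1.0, 0.0)
toward = shaped_reward(0.0, start, (0.0, 0.0), (0.5, 0.0), goal, gamma=1.0)
away = shaped_reward(0.0, start, (0.5, 0.0), (0.0, 0.0), goal, gamma=1.0)
```

Under this assumed form, the shaping terms telescope along a trajectory, which is why a potential-style reward can give dense guidance without changing which policies are optimal for the underlying task.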
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
