PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement
Tewodros Ayalew, Xiao Zhang, Kevin Yuanbo Wu, Tianchong Jiang, Michael Maire, Matthew R. Walter

TL;DR
PROGRESSOR is a self-supervised, perceptually guided reward estimator that learns task progress from videos and refines rewards online, enabling robots to learn complex behaviors without manual supervision.
Contribution
PROGRESSOR introduces a self-supervised reward-learning framework that refines its rewards adversarially during online RL, so that robots can learn from videos without fine-tuning on in-domain, task-specific data.
Findings
Enables robots to learn complex behaviors without external supervision.
Outperforms existing methods in real-robot offline RL tasks.
Requires no fine-tuning on in-domain task-specific data.
Abstract
We present PROGRESSOR, a novel framework that learns a task-agnostic reward function from videos, enabling policy training through goal-conditioned reinforcement learning (RL) without manual supervision. Underlying this reward is an estimate of the distribution over task progress as a function of the current, initial, and goal observations that is learned in a self-supervised fashion. Crucially, PROGRESSOR refines rewards adversarially during online RL training by pushing back predictions for out-of-distribution observations, to mitigate distribution shift inherent in non-expert observations. Utilizing this progress prediction as a dense reward together with an adversarial push-back, we show that PROGRESSOR enables robots to learn complex behaviors without any external supervision. Pretrained on large-scale egocentric human video from EPIC-KITCHENS, PROGRESSOR requires no fine-tuning on…
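The self-supervised progress signal described in the abstract can be illustrated with a minimal sketch. It assumes, since the abstract does not spell it out, that a training example is an ordered triplet of frames (initial, current, goal) drawn from one video, and that the progress target is the current frame's normalized position between the other two. The push-back is modeled here as scaling a predicted progress value by a factor `beta` for non-expert observations. The function names, the sampling scheme, and `beta` are illustrative assumptions, not the authors' implementation.

```python
import random


def sample_triplet(num_frames, rng):
    """Pick ordered frame indices (initial, current, goal) from one video.

    Assumed sampling scheme: initial < goal, and initial <= current <= goal.
    """
    if num_frames < 2:
        raise ValueError("need at least two frames")
    initial, goal = sorted(rng.sample(range(num_frames), 2))
    current = rng.randint(initial, goal)  # inclusive on both ends
    return initial, current, goal


def progress_label(initial, current, goal):
    """Fraction of the way from the initial frame to the goal frame, in [0, 1]."""
    return (current - initial) / (goal - initial)


def pushback_target(predicted_progress, beta=0.9):
    """Assumed push-back rule: shrink the progress estimate for an
    out-of-distribution (non-expert) observation by a factor beta in (0, 1)."""
    return beta * predicted_progress


if __name__ == "__main__":
    rng = random.Random(0)
    i, j, k = sample_triplet(100, rng)
    print(i, j, k, progress_label(i, j, k))
```

In this sketch the labels come only from frame order, so no human annotation is needed. A learned model would regress these targets from the three images rather than from their indices.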
Taxonomy
Topics: Image and Video Quality Assessment · Neural Networks and Applications · Stock Market Forecasting Methods
