Sample-efficient Unsupervised Policy Cloning from Ensemble Self-supervised Labeled Videos
Xin Liu, Yaran Chen, and Haoran Li

TL;DR
This paper introduces UPESV, a novel framework that lets machines learn policies from unlabeled videos without rewards or expert supervision, much as humans efficiently pick up skills by watching internet videos.
Contribution
UPESV is presented as the first method to learn policies from action-free videos, using self-supervised tasks to build a robust understanding of dynamics and cloning the expert policy without rewards.
Findings
Achieves state-of-the-art performance in 12 out of 16 environments.
Outperforms five baseline methods on interaction-limited policy learning.
Learns effective policies solely from unlabeled videos without additional supervision.
Abstract
Current advanced policy learning methodologies have demonstrated the ability to develop expert-level strategies when provided with enough information. However, their requirements, including task-specific rewards, action-labeled expert trajectories, and large numbers of environment interactions, can be expensive or even unavailable in many scenarios. In contrast, humans can efficiently acquire skills within a few rounds of trial and error by imitating easily accessible internet videos, in the absence of any other supervision. In this paper, we try to let machines replicate this efficient watching-and-learning process through Unsupervised Policy from Ensemble Self-supervised labeled Videos (UPESV), a novel framework to efficiently learn policies from action-free videos without rewards or any other expert supervision. UPESV trains a video labeling model to infer the expert actions in expert videos through…
Taxonomy
Topics: Hydrological Forecasting Using AI · Advanced Technologies in Various Fields · Data Stream Mining Techniques
