Prior Reinforce: Mastering Agile Tasks with Limited Trials
Yihang Hu, Pingyue Sheng, Yuyang Liu, Shengjie Wang, Yang Gao

TL;DR
Prior Reinforce is a scalable learning method inspired by human imitation, enabling robots to master agile, dynamic tasks like basketball shots with minimal trials and high precision.
Contribution
It introduces a simple approach that combines few-shot motion learning with iterative trial-based refinement for complex dynamic tasks.
Findings
Reaches the goal in fewer than 10 trials for basketball shooting
Demonstrates human-level precision in real-world tasks
Is effective across diverse goal-conditioned dynamic tasks
Abstract
Embodied robots can already handle many real-world manipulation tasks. However, other real-world tasks that involve dynamic processes (e.g., shooting a basketball into a hoop) are highly agile and demand high precision in their outcomes, which poses additional challenges for methods designed primarily for quasi-static manipulation. Addressing them typically requires costly data collection, laborious reward design, or complex motion planning. Such tasks, however, are far less challenging for humans. For example, a novice basketball player typically needs only about 10 attempts to make a first successful shot, by roughly imitating some motion priors and then iteratively adjusting the motion based on past outcomes. Inspired by this human learning paradigm, we propose Prior Reinforce (P.R.), a simple and scalable approach which first learns a motion pattern from…
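The "start from a prior, then adjust based on past outcomes" loop described in the abstract can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a drag-free projectile launched at a fixed 45° angle, a single tunable parameter (release speed), and a simple proportional correction. The names `landing_distance` and `refine_from_prior`, and the values of `gain`, `tol`, and `max_trials`, are hypothetical and chosen only for illustration.

```python
G = 9.81  # gravitational acceleration in m/s^2


def landing_distance(speed: float) -> float:
    """Range of a drag-free projectile launched at 45 degrees (toy model)."""
    return speed ** 2 / G


def refine_from_prior(prior_speed: float, target: float,
                      gain: float = 0.5, tol: float = 0.05,
                      max_trials: int = 10) -> tuple[float, int]:
    """Start from an imitated prior and correct it after each trial.

    Returns the final release speed and the number of trials used.
    """
    speed = prior_speed
    for trial in range(1, max_trials + 1):
        # Execute one trial and observe how far short or long it landed.
        error = target - landing_distance(speed)
        if abs(error) < tol:
            return speed, trial
        # Proportional adjustment: throw harder if short, softer if long.
        speed += gain * error
    return speed, max_trials


if __name__ == "__main__":
    speed, trials = refine_from_prior(prior_speed=5.0, target=4.0)
    print(f"release speed {speed:.3f} m/s after {trials} trials")
```

In this toy setting the rough prior already lands in the right neighbourhood, so a handful of outcome-based corrections is enough to reach the tolerance. A real system would adjust a much higher-dimensional motion and would not have access to a closed-form outcome model.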
Taxonomy
Topics: Software Engineering Techniques and Practices
