Learning Robot Activities from First-Person Human Videos Using Convolutional Future Regression
Jangwon Lee, Michael S. Ryoo

TL;DR
This paper presents a deep learning approach enabling robots to learn new activities from unlabeled first-person human videos by predicting future scene states and transferring this knowledge for real-time robot execution.
Contribution
It introduces a novel convolutional future regression model that predicts future hand and object locations from first-person videos, facilitating robot activity learning without labeled data.
Findings
Robots can learn new activities from unlabeled first-person human videos.
The model predicts hand and object positions in future frames (e.g., 1-2 seconds ahead).
Robots execute learned activities in real time based on camera input.
Abstract
We design a new approach that allows robot learning of new activities from unlabeled human example videos. Given videos of humans executing the same activity from a human's viewpoint (i.e., first-person videos), our objective is to make the robot learn the temporal structure of the activity as its future regression network, and learn to transfer such a model for its own motor execution. We present a new deep learning model: we extend the state-of-the-art convolutional object detection network for the representation/estimation of human hands in training videos, and newly introduce the concept of using a fully convolutional network to regress (i.e., predict) the intermediate scene representation corresponding to the future frame (e.g., 1-2 seconds later). Combining these allows direct prediction of future locations of human hands and objects, which enables the robot to infer the motor…
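To make the "unlabeled" part of future regression concrete, here is a minimal Python sketch of how training pairs could be built from a video alone: each frame's scene representation is paired with the representation a fixed horizon later, so the future frame itself acts as the regression target. Everything named here is an assumption for illustration and not taken from the paper: the helper names (`make_future_pairs`, `l2_loss`, `identity_regressor`), the toy single-channel feature maps, the squared-error loss, and the frame rate. The placeholder regressor stands in for the fully convolutional network the abstract describes.

```python
from typing import List, Sequence, Tuple

# Toy H x W map; a real intermediate representation would also have channels.
FeatureMap = List[List[float]]


def make_future_pairs(frames: Sequence[FeatureMap], fps: float,
                      horizon_s: float) -> List[Tuple[FeatureMap, FeatureMap]]:
    """Pair each frame's representation with the one `horizon_s` seconds later."""
    offset = round(fps * horizon_s)
    if offset < 1:
        raise ValueError("horizon must span at least one frame")
    return [(frames[t], frames[t + offset]) for t in range(len(frames) - offset)]


def l2_loss(predicted: FeatureMap, target: FeatureMap) -> float:
    """Sum of squared differences between two equally sized maps (assumed loss)."""
    return sum((p - q) ** 2
               for row_p, row_q in zip(predicted, target)
               for p, q in zip(row_p, row_q))


def identity_regressor(current: FeatureMap) -> FeatureMap:
    """Hypothetical placeholder for the learned network: predicts no change."""
    return [row[:] for row in current]


# Toy video: a single "hand" activation moving one cell to the right per frame.
video = [[[1.0 if col == t else 0.0 for col in range(5)]] for t in range(5)]

# With an assumed 2 fps and a 1-second horizon, targets are 2 frames ahead.
pairs = make_future_pairs(video, fps=2.0, horizon_s=1.0)

# Loss of the "nothing changes" baseline; a trained regressor should do better.
baseline_loss = sum(l2_loss(identity_regressor(x), y) for x, y in pairs)
```

No annotation is needed to compute this loss, because the target comes from the same video. In a real system the maps would be produced by the hand and object detection network, and the predicted future map would be decoded back into hand and object locations.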
Taxonomy
Topics: Human Pose and Action Recognition · Robot Manipulation and Learning · Multimodal Machine Learning Applications
