TL;DR
This paper presents a recurrent neural network approach that predicts multiple possible human action sequences in human-robot cooperation, utilizing gaze and body pose cues, to improve robot understanding and anticipation of human actions.
Contribution
It introduces an encoder-decoder RNN model for predicting variable-length multiple action sequences and demonstrates the importance of this for better human-robot cooperation.
Findings
Effective prediction of multiple action sequences achieved
Gaze and body pose are validated as key predictive cues
Model trained successfully on human motion datasets
Abstract
Close human-robot cooperation is a key enabler for new developments in advanced manufacturing and assistive applications. Close cooperation requires robots that can predict human actions and intent, and understand human non-verbal cues. Recent approaches based on neural networks have led to encouraging results in the human action prediction problem both in continuous and discrete spaces. Our approach extends the research in this direction. Our contributions are three-fold. First, we validate the use of gaze and body pose cues as a means of predicting human action through a feature selection method. Next, we address two shortcomings of existing literature: predicting multiple and variable-length action sequences. This is achieved by introducing an encoder-decoder recurrent neural network topology in the discrete action prediction problem. In addition, we theoretically demonstrate the…
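To make the encoder-decoder idea concrete, here is a minimal sketch, not the paper's implementation. It assumes a plain tanh RNN, untrained random weights, a hypothetical four-token action vocabulary, and a four-dimensional feature vector standing in for gaze and body pose cues. Beam search is used here only as one common way to obtain several candidate sequences; the summary above does not say how the paper produces multiple predictions. An end-of-sequence token is what lets the output length vary.

```python
import math
import random

# Hypothetical action vocabulary; "<eos>" ends a sequence, so lengths can vary.
ACTIONS = ["reach", "grasp", "place", "<eos>"]
EOS = len(ACTIONS) - 1
HIDDEN = 8  # assumed hidden-state size


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(w_in, w_h, x, h):
    # One tanh RNN step: h' = tanh(W_in x + W_h h)
    return [math.tanh(p + q) for p, q in zip(matvec(w_in, x), matvec(w_h, h))]


def log_softmax(logits):
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


class Seq2SeqSketch:
    """Untrained encoder-decoder RNN; the weights are random placeholders."""

    def __init__(self, n_features, seed=0):
        rng = random.Random(seed)
        self.enc_in = rand_matrix(HIDDEN, n_features, rng)
        self.enc_h = rand_matrix(HIDDEN, HIDDEN, rng)
        self.dec_in = rand_matrix(HIDDEN, len(ACTIONS), rng)
        self.dec_h = rand_matrix(HIDDEN, HIDDEN, rng)
        self.out = rand_matrix(len(ACTIONS), HIDDEN, rng)

    def encode(self, frames):
        # Fold the observed feature frames into a single hidden state.
        h = [0.0] * HIDDEN
        for x in frames:
            h = rnn_step(self.enc_in, self.enc_h, x, h)
        return h

    def decode_beam(self, h0, beam_width=3, max_len=5):
        # Each beam entry: (log_prob, action_ids, hidden_state, finished)
        beams = [(0.0, [], h0, False)]
        for _ in range(max_len):
            candidates = []
            for lp, toks, h, done in beams:
                if done:
                    candidates.append((lp, toks, h, True))
                    continue
                # Feed the previous action back in as a one-hot vector
                # (all zeros at the first step).
                prev = [0.0] * len(ACTIONS)
                if toks:
                    prev[toks[-1]] = 1.0
                h_new = rnn_step(self.dec_in, self.dec_h, prev, h)
                logp = log_softmax(matvec(self.out, h_new))
                for a, l in enumerate(logp):
                    candidates.append((lp + l, toks + [a], h_new, a == EOS))
            candidates.sort(key=lambda c: c[0], reverse=True)
            beams = candidates[:beam_width]
            if all(b[3] for b in beams):
                break
        return [([ACTIONS[t] for t in toks if t != EOS], lp)
                for lp, toks, _, _ in beams]


model = Seq2SeqSketch(n_features=4)
# Two made-up observation frames (placeholder gaze/pose features).
frames = [[0.1, 0.2, 0.0, 0.5], [0.2, 0.1, 0.1, 0.4]]
predictions = model.decode_beam(model.encode(frames))
for seq, lp in predictions:
    print(seq, round(lp, 3))
```

Because the weights are random, the printed sequences carry no meaning; the sketch only shows the data flow: observed frames are encoded into one state, and the decoder unrolls from that state into several ranked action sequences of differing lengths.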
