Generative Hybrid Representations for Activity Forecasting with No-Regret Learning
Jiaqi Guan, Ye Yuan, Kris M. Kitani, Nicholas Rhinehart

TL;DR
This paper introduces a deep generative model that jointly forecasts a person's future discrete actions and continuous motions. On the large-scale egocentric dataset EPIC-KITCHENS, it generates high-quality, diverse samples, generalizes better than related models, and can learn continually from streaming data.
Contribution
The work presents a novel hybrid generative model combining discrete and continuous representations for activity forecasting, with a continual learning variant and theoretical analysis.
Findings
High-quality, diverse sample generation
Better generalization than related models
Effective continual learning from streaming data
Abstract
Automatically reasoning about future human behaviors is a difficult problem but has significant practical applications to assistive systems. Part of this difficulty stems from learning systems' inability to represent all kinds of behaviors. Some behaviors, such as motion, are best described with continuous representations, whereas others, such as picking up a cup, are best described with discrete representations. Furthermore, human behavior is generally not fixed: people can change their habits and routines. This suggests these systems must be able to learn and adapt continuously. In this work, we develop an efficient deep generative model to jointly forecast a person's future discrete actions and continuous motions. On a large-scale egocentric dataset, EPIC-KITCHENS, we observe our method generates high-quality and diverse samples while exhibiting better generalization than related…
Videos
Generative Hybrid Representations for Activity Forecasting With No-Regret Learning · YouTube
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
