Going Deeper into First-Person Activity Recognition
Minghuang Ma, Haoqi Fan, Kris M. Kitani

TL;DR
This paper introduces a twin-stream deep CNN architecture for egocentric (first-person) activity recognition. One stream analyzes appearance and the other analyzes motion, and the combined network improves accuracy by an average of 6.6% over previous state-of-the-art methods.
Contribution
It proposes a twin-stream network whose appearance stream is explicitly trained to segment hands and localize objects, so that the learned features capture object attributes and hand-object configurations and improve first-person activity recognition.
Findings
Achieved an average 6.6% accuracy increase over state-of-the-art methods.
Joint recognition of objects, actions, and activities improves individual task performance.
Extensive ablation studies highlight the importance of network design choices.
Abstract
We bring together ideas from recent work on feature design for egocentric action recognition under one framework by exploring the use of deep convolutional neural networks (CNN). Recent work has shown that features such as hand appearance, object attributes, local hand motion and camera ego-motion are important for characterizing first-person actions. To integrate these ideas under one framework, we propose a twin stream network architecture, where one stream analyzes appearance information and the other stream analyzes motion information. Our appearance stream encodes prior knowledge of the egocentric paradigm by explicitly training the network to segment hands and localize objects. By visualizing certain neuron activation of our network, we show that our proposed architecture naturally learns features that capture object attributes and hand-object configurations. Our extensive…
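The twin-stream idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `TwinStreamSketch`, the feature dimensions, the random weights, fusion by concatenation, and the assignment of the object head to the appearance stream and the action head to the motion stream are all assumptions made for illustration. Each stream is represented only by a precomputed feature vector, standing in for the output of a CNN.

```python
import math
import random


def linear(x, weights, bias):
    """Apply a dense layer: one output per (row, bias) pair."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Hypothetical random initialisation; a real model would learn these weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class TwinStreamSketch:
    """Toy two-stream classifier with object, action, and activity outputs."""

    def __init__(self, appearance_dim, motion_dim, n_objects, n_actions, n_activities, seed=0):
        rng = random.Random(seed)
        # Assumption: objects are predicted from appearance, actions from motion.
        self.object_head = init_layer(rng, n_objects, appearance_dim)
        self.action_head = init_layer(rng, n_actions, motion_dim)
        # Assumption: activities are predicted from the concatenated features.
        self.activity_head = init_layer(rng, n_activities, appearance_dim + motion_dim)

    def forward(self, appearance_feat, motion_feat):
        fused = list(appearance_feat) + list(motion_feat)
        return {
            "object": softmax(linear(appearance_feat, *self.object_head)),
            "action": softmax(linear(motion_feat, *self.action_head)),
            "activity": softmax(linear(fused, *self.activity_head)),
        }


# Usage: made-up feature vectors standing in for the two CNN stream outputs.
model = TwinStreamSketch(appearance_dim=4, motion_dim=3, n_objects=6, n_actions=5, n_activities=7)
scores = model.forward([0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7])
print({task: len(probs) for task, probs in scores.items()})
```

In this sketch the three heads would be trained together under a shared loss, which is one plausible way to realize the joint recognition of objects, actions, and activities mentioned in the findings.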
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Anomaly Detection Techniques and Applications
