Motion Guided Attention Fusion to Recognize Interactions from Videos
Tae Soo Kim, Jonathan Jones, Gregory D. Hager

TL;DR
This paper introduces a dual-pathway method for recognizing interactions in video. It models static and dynamic features explicitly in separate object detection and motion pathways, then combines them with a novel Motion-Guided Attention Fusion module to improve action recognition, especially when an actor interacts with previously unseen objects.
Contribution
The paper proposes a new dual-pathway approach with Motion-Guided Attention Fusion, enhancing generalization to unseen objects and outperforming state-of-the-art methods on multiple datasets.
Findings
Outperforms existing state-of-the-art methods on the compositional action recognition task of the Something-Something-v2 dataset.
Achieves state-of-the-art results on the IKEA-ASM dataset.
Generalizes well to real-world interaction tasks.
Abstract
We present a dual-pathway approach for recognizing fine-grained interactions from videos. We build on the success of prior dual-stream approaches, but make a distinction between the static and dynamic representations of objects and their interactions explicit by introducing separate motion and object detection pathways. Then, using our new Motion-Guided Attention Fusion module, we fuse the bottom-up features in the motion pathway with features captured from object detections to learn the temporal aspects of an action. We show that our approach can generalize across appearance effectively and recognize actions where an actor interacts with previously unseen objects. We validate our approach using the compositional action recognition task from the Something-Something-v2 dataset where we outperform existing state-of-the-art methods. We also show that our method can generalize well to real…
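The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' module: it assumes, purely for illustration, that each frame has one motion-pathway feature vector acting as an attention query over that frame's object-detection feature vectors, and that the attended result is added back to the motion feature. The function name `motion_guided_fusion`, the scaled dot-product scoring, and the residual sum are all hypothetical choices.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def motion_guided_fusion(motion_feats, object_feats):
    """Fuse per-frame motion features with per-frame object features.

    motion_feats: list of T vectors (each of dimension D), one per frame.
    object_feats: list of T lists, each holding N object vectors of dimension D.
    Returns a list of T fused vectors of dimension D.

    Assumption (not from the paper): the motion vector is the query, the
    object vectors are both keys and values, and the output is a residual sum.
    """
    fused = []
    for query, objects in zip(motion_feats, object_feats):
        scale = math.sqrt(len(query))
        # Attention weights: how relevant each detected object is to the motion cue.
        weights = softmax([dot(query, key) / scale for key in objects])
        # Weighted average of the object vectors.
        attended = [
            sum(w * obj[d] for w, obj in zip(weights, objects))
            for d in range(len(query))
        ]
        # Residual combination of motion and attended object information.
        fused.append([q + a for q, a in zip(query, attended)])
    return fused


# Toy example: 2 frames, 2 objects per frame, feature dimension 2.
motion = [[1.0, 0.0], [0.0, 1.0]]
objects = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
fused = motion_guided_fusion(motion, objects)
```

In this toy input, each frame's motion vector aligns with one of the two objects, so that object receives the larger attention weight and dominates the fused feature for that frame.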
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Multimodal Machine Learning Applications
