Multi-view Action Recognition via Directed Gromov-Wasserstein Discrepancy
Hoang-Quan Nguyen, Thanh-Dat Truong, Khoa Luu

TL;DR
This paper introduces a multi-view attention consistency approach that uses Directed Gromov-Wasserstein Discrepancy to improve the accuracy and reliability of action recognition. It incorporates Neural Radiance Fields for multi-view feature rendering and reports state-of-the-art results.
Contribution
It proposes a novel multi-view attention consistency method built on a new metric, Directed Gromov-Wasserstein Discrepancy, and integrates Neural Radiance Fields into a Video Transformer-based action recognition model.
Findings
Achieves state-of-the-art performance on Jester, Something-Something V2, and Kinetics-400 datasets.
Demonstrates improved focus on relevant action subjects through attention consistency.
Validates the effectiveness of multi-view features and Neural Radiance Fields in action recognition.
Abstract
Action recognition has become a popular research topic in computer vision. Various methods based on Convolutional Networks and self-attention mechanisms such as Transformers address both the spatial and temporal dimensions of action recognition and achieve competitive performance. However, these methods do not guarantee that the model attends to the correct action subject, i.e., they do not ensure that an action recognition model focuses on the proper action subject to make a reasonable action prediction. In this paper, we propose a multi-view attention consistency method that computes the similarity between two attention maps from two different views of an action video using Directed Gromov-Wasserstein Discrepancy. Furthermore, our approach applies the idea of Neural Radiance Field to implicitly render the features from novel views when training…
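To make the idea of comparing attention across views more concrete, the sketch below evaluates the standard (undirected) Gromov-Wasserstein objective for a given coupling between two small attention distributions. It is not the paper's Directed Gromov-Wasserstein Discrepancy, which the abstract does not define, and it does not solve the minimization over couplings. The token coordinates, attention weights, and function names are all hypothetical, chosen only for illustration.

```python
import math

def pairwise_distances(points):
    """Euclidean distance matrix between tokens within one view."""
    return [[math.dist(a, b) for b in points] for a in points]

def gw_cost(C1, C2, T):
    """Standard Gromov-Wasserstein objective for a fixed coupling T:
    sum over i, j, k, l of (C1[i][k] - C2[j][l])^2 * T[i][j] * T[k][l]."""
    n, m = len(C1), len(C2)
    total = 0.0
    for i in range(n):
        for j in range(m):
            if T[i][j] == 0.0:
                continue  # zero-mass pairs contribute nothing
            for k in range(n):
                for l in range(m):
                    total += (C1[i][k] - C2[j][l]) ** 2 * T[i][j] * T[k][l]
    return total

def product_coupling(p, q):
    """Independent coupling p q^T; always feasible, rarely optimal."""
    return [[pi * qj for qj in q] for pi in p]

# Hypothetical data: three attended tokens seen from two views.
# View B is view A rotated by 90 degrees, so intra-view distances match.
tokens_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
tokens_b = [(0.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
p = [0.5, 0.3, 0.2]  # assumed attention weights in view A (sum to 1)
q = [0.5, 0.3, 0.2]  # assumed attention weights in view B (sum to 1)

C_a = pairwise_distances(tokens_a)
C_b = pairwise_distances(tokens_b)

# Coupling that matches token i in view A to token i in view B.
identity_coupling = [[p[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

cost_matched = gw_cost(C_a, C_b, identity_coupling)
cost_product = gw_cost(C_a, C_b, product_coupling(p, q))
print(cost_matched, cost_product)
```

Because the objective compares distances within each view rather than coordinates across views, the matched coupling has zero cost even though the two views are rotated relative to each other. That invariance is what makes Gromov-Wasserstein-style quantities a natural fit for comparing attention from different viewpoints.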
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Gait Recognition and Analysis
