Transformer-based Action recognition in hand-object interacting scenarios

Hoseong Cho; Seungryul Baek

arXiv:2210.11387·cs.CV·October 21, 2022·1 cites

TL;DR

This report presents a Transformer-based framework for recognizing hand-object interaction actions in egocentric videos. It placed 2nd in the ECCV 2022 HBHA Action Recognition challenge with a top-1 accuracy of 87.19% on the test set.

Contribution

It introduces a Transformer-based keypoint estimator that predicts the keypoints of two hands and an object in egocentric video, and it recognizes actions from those estimated keypoints.
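
The keypoints-then-action data flow can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the use of 3D points, and the flattening scheme are hypothetical and are not taken from the report, which does not specify its feature layout on this page. The sketch only shows how per-frame keypoints of two hands and one object could be packed into a sequence of feature vectors that an action classifier would consume.

```python
from typing import List, Sequence, Tuple

# Assumption: each keypoint is a 3D point (x, y, z). The report's actual
# keypoint format is not stated on this page.
Point = Tuple[float, float, float]
Frame = Tuple[Sequence[Point], Sequence[Point], Sequence[Point]]


def frame_feature(
    left_hand: Sequence[Point],
    right_hand: Sequence[Point],
    obj: Sequence[Point],
) -> List[float]:
    """Flatten the keypoints of two hands and one object into one vector."""
    feature: List[float] = []
    for group in (left_hand, right_hand, obj):
        for point in group:
            feature.extend(point)
    return feature


def clip_features(frames: Sequence[Frame]) -> List[List[float]]:
    """Turn a clip (a sequence of frames) into a sequence of feature vectors."""
    return [frame_feature(left, right, obj) for left, right, obj in frames]


# Toy clip: 2 frames, 2 keypoints per hand and 1 object keypoint (hypothetical sizes).
toy_frame: Frame = (
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # left hand
    [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],  # right hand
    [(0.5, 0.5, 1.0)],                   # object
)
toy_clip_features = clip_features([toy_frame, toy_frame])
```

In this toy setup each frame yields a vector of (2 + 2 + 1) × 3 = 15 values; a real system would use however many keypoints its estimator predicts.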

Findings

1. Achieved 87.19% top-1 accuracy on the test set.

2. Placed 2nd in the ECCV 2022 HBHA challenge (Action Recognition).

3. Demonstrated the effectiveness of a Transformer architecture for keypoint estimation.

Abstract

This report describes the 2nd place solution to the ECCV 2022 Human Body, Hands, and Activities (HBHA) from Egocentric and Multi-view Cameras Challenge: Action Recognition. This challenge aims to recognize hand-object interaction in an egocentric view. We propose a framework that estimates keypoints of two hands and an object with a Transformer-based keypoint estimator and recognizes actions based on the estimated keypoints. We achieved a top-1 accuracy of 87.19% on the test set.
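
For readers unfamiliar with the metric, top-1 accuracy is the percentage of samples whose highest-scoring predicted class equals the ground-truth class. The sketch below is a generic illustration of that definition, not the challenge's evaluation code; the function name and the toy scores are hypothetical.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose arg-max score matches the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest score is the top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example with 4 samples and 2 classes (made-up numbers).
toy_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
toy_labels = [1, 0, 0, 0]
toy_accuracy = top1_accuracy(toy_scores, toy_labels)  # 3 of 4 correct
```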

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · COVID-19 diagnosis using AI