Fusing Hand and Body Skeletons for Human Action Recognition in Assembly
Dustin Aganian, Mona Köhler, Benedict Stephan, Markus Eisenbach, Horst-Michael Gross

TL;DR
This paper introduces a method that combines coarse body skeletons with detailed hand skeletons using CNNs and transformers to improve human action recognition in assembly tasks for collaborative robots.
Contribution
It proposes a novel fusion approach of body and hand skeletons with attention mechanisms, enhancing recognition accuracy for assembly actions.
Findings
Improved action recognition accuracy in assembly scenarios.
Transformers effectively fuse information from body and hand skeletons.
Demonstrated effectiveness of the approach in real-world tasks.
Abstract
As collaborative robots (cobots) continue to gain popularity in industrial manufacturing, effective human-robot collaboration becomes crucial. Cobots should be able to recognize human actions to assist with assembly tasks and act autonomously. To achieve this, skeleton-based approaches are often used due to their ability to generalize across various people and environments. Although body skeleton approaches are widely used for action recognition, they may not be accurate enough for assembly actions where the worker's fingers and hands play a significant role. To address this limitation, we propose a method in which less detailed body skeletons are combined with highly detailed hand skeletons. We investigate CNNs and transformers, the latter of which are particularly adept at extracting and combining important information from both skeleton types using attention. This paper demonstrates…
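To make the idea of attention-based fusion concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: it assumes a single frame, 3D joint coordinates, hypothetical joint counts (5 body joints, 21 hand joints), a two-value type flag appended to each joint, and identity query/key/value projections in place of the learned projections a real transformer would use. The names `self_attention` and `fuse_skeletons` are illustrative only.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), purely for illustration.
    """
    d = len(tokens[0])
    outputs, weights = [], []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of all tokens: body and hand joints mix here.
        outputs.append(
            [sum(w[j] * tokens[j][c] for j in range(len(tokens))) for c in range(d)]
        )
    return outputs, weights


def fuse_skeletons(body_joints, hand_joints):
    """Fuse coarse body joints and detailed hand joints into one feature vector."""
    # Append a type flag so body tokens and hand tokens stay distinguishable.
    tokens = [list(j) + [1.0, 0.0] for j in body_joints]
    tokens += [list(j) + [0.0, 1.0] for j in hand_joints]
    attended, weights = self_attention(tokens)
    # Mean-pool the attended tokens into a single fused descriptor.
    d = len(attended[0])
    fused = [sum(t[c] for t in attended) / len(attended) for c in range(d)]
    return fused, weights


# Hypothetical input: random coordinates standing in for real skeleton data.
rng = random.Random(0)
body = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(5)]
hands = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(21)]
fused, weights = fuse_skeletons(body, hands)
```

Each row of `weights` shows how strongly one joint attends to every other joint, so a hand joint can draw on body context and vice versa. A trained model would add learned projections, multiple heads, and a temporal dimension across frames.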
Taxonomy
Topics: Human Pose and Action Recognition · Robot Manipulation and Learning · Hand Gesture Recognition Systems
