GlovEgo-HOI: Bridging the Synthetic-to-Real Gap for Industrial Egocentric Human-Object Interaction Detection
Alfio Spoto, Rosario Leonardi, Francesco Ragusa, and Giovanni Maria Farinella

TL;DR
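This paper introduces GlovEgo-HOI, a benchmark dataset for industrial egocentric human-object interaction detection, and GlovEgo-Net, a model that uses hand pose information. Training data is built by combining synthetic data with diffusion-based augmentation that adds realistic Personal Protective Equipment (PPE) to real-world images, with the aim of improving robustness in safety-critical environments.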
Contribution
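The paper presents a data generation framework that combines synthetic data with diffusion-based PPE augmentation of real-world images, a new industrial EHOI benchmark dataset (GlovEgo-HOI), and a model (GlovEgo-Net) that leverages hand pose cues for better interaction detection.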
Findings
Synthetic data augmentation improves model robustness.
GlovEgo-Net outperforms baseline methods.
Public release of dataset and models facilitates future research.
Abstract
Egocentric Human-Object Interaction (EHOI) analysis is crucial for industrial safety, yet the development of robust models is hindered by the scarcity of annotated domain-specific data. We address this challenge by introducing a data generation framework that combines synthetic data with a diffusion-based process to augment real-world images with realistic Personal Protective Equipment (PPE). We present GlovEgo-HOI, a new benchmark dataset for industrial EHOI, and GlovEgo-Net, a model integrating Glove-Head and Keypoint-Head modules to leverage hand pose information for enhanced interaction detection. Extensive experiments demonstrate the effectiveness of the proposed data generation framework and GlovEgo-Net. To foster further research, we release the GlovEgo-HOI dataset, augmentation pipeline, and pre-trained models at: GitHub project.
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Robot Manipulation and Learning
