UAHOI: Uncertainty-aware Robust Interaction Learning for HOI Detection
Mu Chen, Minghan Chen, Yi Yang

TL;DR
This paper introduces UAHOI, an uncertainty-aware approach to HOI detection that estimates prediction uncertainty during training to improve accuracy and robustness in complex interaction scenarios.
Contribution
The paper proposes a novel uncertainty estimation method integrated into HOI detection models, enhancing prediction confidence and detection performance without hand-designed components.
Findings
Improved accuracy on V-COCO and HICO-DET benchmarks.
Enhanced robustness in complex interaction scenarios.
Effective uncertainty modeling improves detection confidence.
Abstract
This paper focuses on Human-Object Interaction (HOI) detection, which aims to identify and understand the interactions between humans and objects within a given image or video frame. Spearheaded by Detection Transformer (DETR), recent developments have led to significant improvements by replacing traditional region proposals with a set of learnable queries. However, despite the powerful representation capabilities of Transformers, existing HOI detection methods still yield low confidence levels when dealing with complex interactions and are prone to overlooking interactive actions. To address these issues, we propose UAHOI (Uncertainty-aware Robust Human-Object Interaction Learning), a novel approach that explicitly estimates prediction uncertainty during training to refine both detection and interaction predictions.…
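The abstract as shown here does not say how UAHOI computes uncertainty. As a purely illustrative sketch, and not the authors' method, one common proxy for prediction uncertainty is the normalized entropy of a classifier's softmax output over interaction classes. The function names and example logits below are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def normalized_uncertainty(logits):
    """Entropy scaled to [0, 1]; assumes at least two classes.

    This is an assumed proxy for uncertainty, not UAHOI's definition.
    """
    probs = softmax(logits)
    return predictive_entropy(probs) / math.log(len(logits))

# Hypothetical interaction logits for two human-object pairs.
confident = normalized_uncertainty([8.0, 0.0, 0.0])   # one class dominates
ambiguous = normalized_uncertainty([1.0, 1.0, 1.0])   # all classes tied
```

Under this proxy, a prediction with one dominant class scores near 0 and a tied prediction scores 1. A training procedure could use such a score to reweight or filter predictions, though whether UAHOI does so is not stated in this excerpt.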
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Context-Aware Activity Recognition Systems · Human Pose and Action Recognition
Methods: Attention Is All You Need · Sparse Evolutionary Training · Linear Layer · Layer Normalization · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Byte Pair Encoding · Softmax · Absolute Position Encodings
