B-RIGHT: Benchmark Re-evaluation for Integrity in Generalized Human-Object Interaction Testing
Yoojin Jang, Junsu Kim, Hayeon Kim, Eun-ki Lee, Eun-sol Kim, Seungryul Baek, Jaejun Yoo

TL;DR
This paper introduces B-RIGHT, a new balanced benchmark dataset for human-object interaction, which improves evaluation fairness by addressing class imbalance and includes a zero-shot test set for unseen scenarios.
Contribution
The paper presents a systematic approach to create a class-balanced HOI dataset and a balanced zero-shot test set, enhancing the reliability of model evaluation.
Findings
Re-evaluation with B-RIGHT reduces score variance.
Model performance rankings change under balanced evaluation.
B-RIGHT provides more reliable and fair comparisons.
Abstract
Human-object interaction (HOI) is an essential problem in artificial intelligence (AI) that aims to understand the visual world, which involves complex relationships between humans and objects. However, current benchmarks such as HICO-DET face the following limitations: (1) severe class imbalance and (2) varying number of train and test sets for certain classes. These issues can inflate or deflate model performance during evaluation, ultimately undermining the reliability of evaluation scores. In this paper, we propose a systematic approach to develop a new class-balanced dataset, Benchmark Re-evaluation for Integrity in Generalized Human-object Interaction Testing (B-RIGHT), that addresses these imbalance problems. B-RIGHT achieves class balance by leveraging a balancing algorithm and automated generation-and-filtering processes, ensuring an equal number…
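To make the idea of class balancing concrete, here is a minimal Python sketch that equalizes the number of samples per interaction class by random subsampling. It is not the balancing algorithm from the paper, which the abstract does not specify here. The function name `balance_by_class`, the `(sample_id, class_label)` input format, and the choice to drop classes that fall short of the target are all assumptions made for illustration. The abstract indicates that B-RIGHT instead uses generation-and-filtering processes, and real HOI images can carry several interaction labels at once, which this sketch ignores.

```python
import random
from collections import defaultdict


def balance_by_class(samples, per_class, seed=0):
    """Return {class_label: [sample_id, ...]} with exactly `per_class` ids per class.

    Hypothetical helper: `samples` is a list of (sample_id, class_label) pairs,
    with one label per sample. Classes with fewer than `per_class` samples are
    dropped (an assumption of this sketch, not the paper's approach).
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible

    # Group sample ids by their class label.
    by_class = defaultdict(list)
    for sample_id, label in samples:
        by_class[label].append(sample_id)

    # Keep only classes with enough samples and subsample each to the target size.
    balanced = {}
    for label, ids in by_class.items():
        if len(ids) >= per_class:
            balanced[label] = sorted(rng.sample(ids, per_class))
    return balanced


if __name__ == "__main__":
    # Toy data: one frequent class, one class at the target size, one rare class.
    toy = (
        [(i, "ride bicycle") for i in range(10)]
        + [(100 + i, "hold cup") for i in range(3)]
        + [(200, "kick ball")]
    )
    for label, ids in balance_by_class(toy, per_class=3).items():
        print(label, len(ids))
```

With every retained class contributing the same number of instances, a per-class average score is no longer dominated by how many test samples a class happens to have, which is the evaluation property the abstract argues for.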
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Context-Aware Activity Recognition Systems
Methods: Sparse Evolutionary Training
