iSafetyBench: A video-language benchmark for safety in industrial environment

Raiyaan Abdullah, Yogesh Singh Rawat, and Shruti Vyas

arXiv:2508.00399·cs.CV·August 15, 2025

TL;DR

iSafetyBench is a new video-language benchmark designed to evaluate AI models' ability to recognize routine and hazardous activities in industrial environments, highlighting current models' limitations in safety-critical scenarios.

Contribution

The paper introduces iSafetyBench, a comprehensive dataset and benchmark for assessing vision-language models in industrial safety contexts, addressing a gap in existing video understanding benchmarks.

Findings

01 State-of-the-art models perform poorly on hazardous activity recognition.

02 Models struggle with multi-label and safety-critical scenarios.

03 Significant performance gaps highlight the need for safety-aware models.

Abstract

Recent advances in vision-language models (VLMs) have enabled impressive generalization across diverse video understanding tasks under zero-shot settings. However, their capabilities in high-stakes industrial domains, where recognizing both routine operations and safety-critical anomalies is essential, remain largely underexplored. To address this gap, we introduce iSafetyBench, a new video-language benchmark specifically designed to evaluate model performance in industrial environments across both normal and hazardous scenarios. iSafetyBench comprises 1,100 video clips sourced from real-world industrial settings, annotated with open-vocabulary, multi-label action tags spanning 98 routine and 67 hazardous action categories. Each clip is paired with multiple-choice questions for both single-label and multi-label evaluation, enabling fine-grained assessment of VLMs in both standard and…
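
The single-label and multi-label multiple-choice evaluation mentioned in the abstract could be scored roughly as follows. This is a minimal sketch, not the authors' evaluation code: the function names, the use of option letters as answers, and the choice of exact match and Jaccard overlap as multi-label metrics are all assumptions made for illustration, since this page does not state which metrics the paper reports.

```python
from typing import Dict, Sequence, Set


def single_label_accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of questions where the predicted option equals the correct option.

    Hypothetical helper: assumes each question has exactly one correct option letter.
    """
    if len(predictions) != len(answers) or not answers:
        raise ValueError("predictions and answers must be non-empty and equal in length")
    correct = sum(1 for p, a in zip(predictions, answers) if p == a)
    return correct / len(answers)


def multi_label_scores(predicted: Set[str], gold: Set[str]) -> Dict[str, float]:
    """Score one multi-label question where several options may be correct.

    Hypothetical helper: reports strict exact match and the more lenient
    Jaccard overlap (intersection over union) between the two option sets.
    """
    union = predicted | gold
    jaccard = len(predicted & gold) / len(union) if union else 1.0
    return {"exact_match": float(predicted == gold), "jaccard": jaccard}
```

Exact match gives credit only when every correct option is selected and no incorrect one is, while Jaccard overlap gives partial credit, so reporting both separates "nearly right" predictions from complete misses on clips that contain several simultaneous actions.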

Code & Models

Datasets

raiyaanabdullah/isafety-bench
dataset · 351 downloads
