SuPEr-SAM: Using the Supervision Signal from a Pose Estimator to Train a Spatial Attention Module for Personal Protective Equipment Recognition
Adrian Sandru, Georgian-Emilian Duta, Mariana-Iuliana Georgescu, Radu Tudor Ionescu

TL;DR
This paper introduces a deep learning approach for PPE detection that leverages pose estimation during training to enhance a spatial attention classifier, achieving accurate PPE recognition with minimal inference overhead.
Contribution
It presents a novel training method in which a body pose estimator supervises the spatial attention module of a PPE classifier. Because the pose estimator is used only at training time, inference needs just the person detector and the classifier.
Findings
Improved PPE detection accuracy over baseline methods
Effective use of pose supervision during training
Minimal additional computational overhead during inference
Abstract
We propose a deep learning method to automatically detect personal protective equipment (PPE), such as helmets, surgical masks, reflective vests, boots and so on, in images of people. Typical approaches for PPE detection based on deep learning are (i) to train an object detector for items such as those listed above or (ii) to train a person detector and a classifier that takes the bounding boxes predicted by the detector and discriminates between people wearing and people not wearing the corresponding PPE items. We propose a novel and accurate approach that uses three components: a person detector, a body pose estimator and a classifier. Our novelty consists in using the pose estimator only at training time, to improve the prediction performance of the classifier. We modify the neural architecture of the classifier by adding a spatial attention mechanism, which is trained using…
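The training idea described in the abstract can be illustrated with a minimal sketch in plain Python. The abstract is truncated, so the details here are assumptions rather than the authors' formulation: the target map built from keypoints, the mean-squared-error attention loss, the binary cross-entropy term, and the `weight` parameter are all hypothetical, as are the function names. The sketch only shows how a pose-derived map could supervise an attention map during training while the classifier alone runs at inference.

```python
import math

def keypoints_to_target(keypoints, height, width, radius=1):
    """Build an H x W target map with ones around (row, col) keypoints.

    Assumption: keypoints come from a pose estimator run at training time only.
    """
    target = [[0.0] * width for _ in range(height)]
    for row, col in keypoints:
        for i in range(max(0, row - radius), min(height, row + radius + 1)):
            for j in range(max(0, col - radius), min(width, col + radius + 1)):
                target[i][j] = 1.0
    return target

def apply_attention(features, attention):
    """Scale every channel of a C x H x W feature map by an H x W attention map."""
    return [
        [[v * a for v, a in zip(f_row, a_row)]
         for f_row, a_row in zip(channel, attention)]
        for channel in features
    ]

def attention_loss(attention, target):
    """Mean squared error between predicted attention and the pose-derived target."""
    cells = [
        (a - t) ** 2
        for a_row, t_row in zip(attention, target)
        for a, t in zip(a_row, t_row)
    ]
    return sum(cells) / len(cells)

def classification_loss(prob, label, eps=1e-7):
    """Binary cross-entropy for a single 'wearing the PPE item' probability."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def training_loss(prob, label, attention, target, weight=1.0):
    """Hypothetical combined objective: classification term plus attention supervision."""
    return classification_loss(prob, label) + weight * attention_loss(attention, target)

# Training-time usage: a single (hypothetical) head keypoint on a 3 x 3 grid.
target = keypoints_to_target([(0, 1)], height=3, width=3, radius=0)
attention = [[0.1, 0.8, 0.1], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
loss = training_loss(prob=0.9, label=1, attention=attention, target=target)
```

At inference, only `apply_attention` and the classifier head would be evaluated in this sketch; `keypoints_to_target` and `attention_loss` are never called, which is why the pose estimator adds no cost once training is done.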
