Visual Appearance Based Person Retrieval in Unconstrained Environment Videos
Hiren Galiyawala, Mehul S Raval, Shivansh Dave

TL;DR
This paper presents a method for person retrieval in surveillance videos using soft biometrics like height and clothing attributes, employing adaptive patch extraction and deep learning models to improve accuracy.
Contribution
It introduces an adaptive torso patch extraction and bounding box regression approach combined with fine-tuned deep models for enhanced person retrieval in unconstrained environments.
Findings
Achieves an 11.35% improvement over state-of-the-art methods, measured by average Intersection over Union (IoU).
Uses fine-tuned Mask R-CNN for person detection and DenseNet-169 for attribute classification.
Evaluated on the AVSS 2018 challenge II dataset.
Abstract
Visual appearance-based person retrieval is a challenging problem in surveillance. It uses attributes like height, cloth color, cloth type and gender to describe a human. Such attributes are known as soft biometrics. This paper proposes person retrieval from surveillance video using height, torso cloth type, torso cloth color and gender. The approach introduces an adaptive torso patch extraction and bounding box regression to improve the retrieval. The algorithm uses fine-tuned Mask R-CNN and DenseNet-169 for person detection and attribute classification, respectively. The performance is analyzed on the AVSS 2018 challenge II dataset, and it achieves an 11.35% improvement over the state of the art based on the average Intersection over Union measure.
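The average Intersection over Union measure named in the abstract can be illustrated with a minimal sketch. The box format (x1, y1, x2, y2), the function names `iou` and `average_iou`, and the one-box-per-frame averaging are assumptions made for illustration; the paper and the challenge protocol may define the details differently.

```python
from typing import Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (if any).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_iou(predicted: Sequence[Box], ground_truth: Sequence[Box]) -> float:
    """Mean IoU over paired boxes (hypothetically, one pair per frame)."""
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have equal length")
    if not predicted:
        return 0.0
    total = sum(iou(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` has an intersection of area 1 and a union of area 7, so a perfect localization scores 1.0 while this partial overlap scores 1/7.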
Taxonomy
Methods: Region Proposal Network · Softmax · Convolution · RoIAlign · Mask R-CNN
