Large-scale Multi-modal Person Identification in Real Unconstrained Environments
Jiajie Ye, Yisheng Guan, Junfa Liu, Xinghong Huang, Hong Zhang

TL;DR
This paper addresses the challenge of person identification in noisy, real-world environments by proposing a multi-modal feature fusion framework using deep learning to improve accuracy over traditional single-modal methods.
Contribution
It introduces a novel fusion module that combines multiple feature types to enhance person identification accuracy in unconstrained settings.
Findings
Improved identification accuracy with multi-modal fusion.
Effective decision and feature layer fusion strategies.
Robust performance in real-world noisy environments.
Abstract
Person identification (P-ID) in real, unconstrained, noisy environments is a major challenge. In multiple-feature learning with Deep Convolutional Neural Networks (DCNNs) or other machine learning methods for large-scale person identification in the wild, the key is to design an appropriate strategy for decision-layer fusion or feature-layer fusion that enhances discriminative power. It is necessary to extract different types of valid features and establish a reasonable framework to fuse the different types of information. Traditional methods identify persons based on single-modal features, such as face, audio, or head features. These methods cannot achieve highly accurate person identification in real unconstrained environments. This study proposes a fusion module to fuse multi-modal features for person…
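The abstract distinguishes feature-layer fusion from decision-layer fusion without detailing either in the text available here. The following is a minimal sketch of the two general ideas, not the authors' fusion module: feature-layer fusion is illustrated as concatenating L2-normalized per-modality vectors, and decision-layer fusion as a weighted average of per-modality identity scores. The function names, modality names, weights, and toy numbers are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def feature_level_fusion(features):
    """Concatenate L2-normalized per-modality feature vectors into one vector.

    `features` maps a modality name to its feature vector. Modalities are
    visited in sorted-name order so fused vectors are comparable across samples.
    """
    fused = []
    for name in sorted(features):
        fused.extend(l2_normalize(features[name]))
    return fused


def decision_level_fusion(scores, weights):
    """Combine per-modality identity scores with a weighted average.

    `scores` maps modality -> {identity: score}; `weights` maps modality -> weight.
    Every modality is assumed to score the same set of identities.
    """
    total = sum(weights[m] for m in scores)
    identities = next(iter(scores.values())).keys()
    return {
        pid: sum(weights[m] * scores[m][pid] for m in scores) / total
        for pid in identities
    }


# Toy inputs, made up purely for illustration.
features = {"face": [3.0, 4.0], "audio": [0.0, 2.0]}
fused = feature_level_fusion(features)

scores = {
    "face": {"alice": 0.9, "bob": 0.1},
    "audio": {"alice": 0.4, "bob": 0.6},
}
weights = {"face": 0.75, "audio": 0.25}  # hypothetical modality weights
combined = decision_level_fusion(scores, weights)
best = max(combined, key=combined.get)
```

In this sketch the fused feature vector would typically feed a single downstream classifier, whereas decision-level fusion keeps one classifier per modality and merges only their outputs. Which variant (or combination) the paper adopts is not stated in the truncated abstract.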
Taxonomy
Topics: Video Surveillance and Tracking Methods · Face recognition and analysis · Gait Recognition and Analysis
