Online Multi-modal Person Search in Videos

Jiangyue Xia; Anyi Rao; Qingqiu Huang; Linning Xu; Jiangtao Wen; Dahua Lin

arXiv:2008.03546·cs.CV·August 11, 2020

TL;DR

This paper introduces an online multi-modal person search framework that recognizes people in videos in real time. It dynamically updates a multimodal memory bank with a policy learned by reinforcement learning, and it outperforms existing offline and online methods on a large movie dataset.

Contribution

The paper presents an online person search method built on a dynamic multimodal memory bank and an update policy learned by reinforcement learning, enabling real-time recognition in videos.

Findings

01. Effective in real-time person recognition in videos

02. Outperforms existing online and offline methods

03. Demonstrated on a large movie dataset

Abstract

The task of searching for certain people in videos has seen increasing potential in real-world applications, such as video organization and editing. Most existing approaches are devised to work in an offline manner, where identities can only be inferred after an entire video is examined. This working manner precludes such methods from being applied to online services or to applications that require real-time responses. In this paper, we propose an online person search framework, which can recognize people in a video on the fly. This framework maintains a multimodal memory bank at its heart as the basis for person recognition, and updates it dynamically with a policy obtained by reinforcement learning. Our experiments on a large movie dataset show that the proposed method is effective, not only achieving remarkable improvements over online schemes but also outperforming offline methods.
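The recognize-then-update loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class `MemoryBank`, its methods, the modality names ("face", "audio"), and the momentum blend are all hypothetical, and a fixed confidence threshold stands in for the update policy that the paper learns with reinforcement learning.

```python
import numpy as np


def l2_normalize(x):
    """Return x scaled to unit length (epsilon guards against division by zero)."""
    x = np.asarray(x, dtype=float)
    return x / (np.linalg.norm(x) + 1e-12)


class MemoryBank:
    """Toy online memory bank: one unit feature vector per identity per modality."""

    def __init__(self, update_threshold=0.8, momentum=0.5):
        self.bank = {}  # identity -> {modality name: unit vector}
        self.update_threshold = update_threshold
        self.momentum = momentum

    def register(self, identity, features):
        """Seed the bank with initial features for a known identity."""
        self.bank[identity] = {m: l2_normalize(f) for m, f in features.items()}

    def score(self, identity, features):
        """Mean cosine similarity over the modalities both sides have."""
        stored = self.bank[identity]
        shared = [m for m in features if m in stored]
        if not shared:
            return -1.0
        sims = [stored[m] @ l2_normalize(features[m]) for m in shared]
        return float(np.mean(sims))

    def recognize(self, features):
        """Return the best-matching identity and its score."""
        if not self.bank:
            raise ValueError("memory bank is empty")
        scores = {i: self.score(i, features) for i in self.bank}
        best = max(scores, key=scores.get)
        return best, scores[best]

    def observe(self, features):
        """Recognize one incoming instance, then decide whether to update memory.

        ASSUMPTION: the threshold rule below is a placeholder for the
        learned (reinforcement learning) update policy in the paper.
        """
        identity, s = self.recognize(features)
        updated = s >= self.update_threshold
        if updated:
            entry = self.bank[identity]
            for m, f in features.items():
                new = l2_normalize(f)
                if m in entry:
                    # Blend old and new features, then renormalize.
                    blend = self.momentum * entry[m] + (1 - self.momentum) * new
                    entry[m] = l2_normalize(blend)
                else:
                    # A modality seen for the first time is stored as-is.
                    entry[m] = new
        return identity, s, updated
```

Each call to `observe` handles a single incoming instance, so identities are assigned as the video streams in rather than after the whole video has been examined. Confident matches also enrich the stored entry, including with modalities that were missing at registration.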

Taxonomy

Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Face recognition and analysis