Natural Language Person Search Using Deep Reinforcement Learning

Ankit Shah; Tyler Vuong

arXiv:1809.00365 · cs.CV · September 5, 2018

TL;DR

This paper proposes a deep reinforcement learning approach to natural language person search: an agent localizes a person in an image by iteratively refining a bounding box, guided by the text description and the image pixels, with the aim of improving efficiency and accuracy.

Contribution

It introduces a constrained deep reinforcement learning method specifically designed for person search, focusing on bounding box refinement guided by natural language descriptions.
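
To make the idea of bounding box refinement concrete, here is a minimal Python sketch of the kind of loop such an agent might run. This summary does not state the paper's actual action set, reward, or policy, so everything below is an assumption for illustration: the action names in `ACTIONS`, the 10% step size, the sign-of-IoU-change reward, and the `greedy_refine` helper, which stands in for a learned policy by peeking at the ground-truth box.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

# Hypothetical discrete action set; not taken from the paper.
ACTIONS = ("left", "right", "up", "down", "grow", "shrink", "stop")


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def apply_action(box: Box, action: str, step: float = 0.1) -> Box:
    """Move or rescale the box by a fraction `step` of its current size."""
    x1, y1, x2, y2 = box
    dx, dy = step * (x2 - x1), step * (y2 - y1)
    if action == "left":
        return (x1 - dx, y1, x2 - dx, y2)
    if action == "right":
        return (x1 + dx, y1, x2 + dx, y2)
    if action == "up":
        return (x1, y1 - dy, x2, y2 - dy)
    if action == "down":
        return (x1, y1 + dy, x2, y2 + dy)
    if action == "grow":
        return (x1 - dx / 2, y1 - dy / 2, x2 + dx / 2, y2 + dy / 2)
    if action == "shrink":
        return (x1 + dx / 2, y1 + dy / 2, x2 - dx / 2, y2 - dy / 2)
    if action == "stop":
        return box
    raise ValueError(f"unknown action: {action}")


def step_reward(box: Box, new_box: Box, target: Box) -> float:
    """Assumed reward: +1 if the move increased IoU with the target, else -1."""
    return 1.0 if iou(new_box, target) > iou(box, target) else -1.0


def greedy_refine(box: Box, target: Box, max_steps: int = 50) -> Box:
    """Stand-in for a learned policy: pick the move that most improves IoU.

    It uses the ground-truth box directly, so it only exercises the
    refinement loop; a trained agent would choose actions from the image
    and the text description instead.
    """
    for _ in range(max_steps):
        best = max(ACTIONS[:-1], key=lambda a: iou(apply_action(box, a), target))
        candidate = apply_action(box, best)
        if iou(candidate, target) <= iou(box, target):
            break  # no move helps any more, equivalent to "stop"
        box = candidate
    return box
```

In a real system the greedy helper would be replaced by a policy network conditioned on image features and an encoding of the description, with the reward used only during training.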

Findings

01. Effective localization of persons using RL-based bounding box adjustments

02. Reduced computational resources compared to unconstrained object detection

03. Improved accuracy in person search tasks

Abstract

A recent success of deep reinforcement learning is an agent that learned to play Go and beat the world champion without any prior knowledge of the game. In that task, the agent has to decide which action to take based on the positions of the pieces. Person search has recently been explored using natural-language text descriptions of images for video surveillance applications (S. Li et al.). Fu et al. provide an end-to-end approach for object-based retrieval using deep reinforcement learning, without constraints on which objects are being detected. However, we believe that for real-world applications such as person search, defining specific constraints that identify a person, as opposed to starting with general object detection, will have benefits in terms of performance and the computational resources required. In our task, deep reinforcement learning would localize the…

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition