Tracking as Online Decision-Making: Learning a Policy from Streaming Videos with Reinforcement Learning

James Steven Supancic III; Deva Ramanan

arXiv:1707.04991 · cs.CV · July 18, 2017

TL;DR

This paper models object tracking as an online decision-making task and uses deep reinforcement learning to learn a policy that decides where to look, when to reinitialize, and when to update the appearance model, so that the tracker can be trained efficiently on streaming videos.

Contribution

The paper formulates tracking as a POMDP and trains policies with deep reinforcement learning from sparse rewards, which lets training scale to large streaming video datasets.

Findings

01

Learned policies decide where to look, when to reinitialize, and when to update the appearance model, replacing hand-tuned heuristics.

02

Sparse rewards, supplied only when the track has gone awry, enable fast training on massive video datasets.

03

Unified evaluation on streaming Internet videos.

Abstract

We formulate tracking as an online decision-making process, where a tracking agent must follow an object despite ambiguous image frames and a limited computational budget. Crucially, the agent must decide where to look in the upcoming frames, when to reinitialize because it believes the target has been lost, and when to update its appearance model for the tracked object. Such decisions are typically made heuristically. Instead, we propose to learn an optimal decision-making policy by formulating tracking as a partially observable Markov decision process (POMDP). We learn policies with deep reinforcement learning algorithms that need supervision (a reward signal) only when the track has gone awry. We demonstrate that sparse rewards allow us to quickly train on massive datasets, several orders of magnitude more than past work. Interestingly, by treating the data source of Internet videos…
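
To make the decision structure described in the abstract concrete, here is a minimal sketch of the interface such a tracking agent might expose. It is not the authors' implementation. The names `Action`, `sparse_reward`, `threshold_policy`, and `run_episode`, the scalar confidence observation, the threshold values, and the -1 penalty are all assumptions made for illustration. The hand-written `threshold_policy` stands in for the kind of heuristic the paper proposes to replace with a learned policy.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical action set mirroring the three decisions in the abstract."""
    TRACK = "track"      # look near the previous location, keep the model fixed
    REINIT = "reinit"    # target believed lost: search again from scratch
    UPDATE = "update"    # track and also update the appearance model


def sparse_reward(track_ok: bool) -> float:
    """Assumed reward: zero while the track is fine, a penalty when it has gone awry."""
    return 0.0 if track_ok else -1.0


def threshold_policy(confidence: float, low: float = 0.3, high: float = 0.8) -> Action:
    """Heuristic baseline (illustrative thresholds); a learned policy would replace this."""
    if confidence < low:
        return Action.REINIT
    if confidence > high:
        return Action.UPDATE
    return Action.TRACK


def run_episode(policy, confidences, track_ok_flags):
    """Roll a policy over a stream of frames.

    The agent observes only a noisy confidence score per frame (partial
    observability); whether the track is actually correct is hidden from it
    and is used here only to compute the reward.
    """
    actions = []
    total_reward = 0.0
    for confidence, track_ok in zip(confidences, track_ok_flags):
        actions.append(policy(confidence))
        total_reward += sparse_reward(track_ok)
    return actions, total_reward


if __name__ == "__main__":
    actions, total = run_episode(
        threshold_policy,
        confidences=[0.9, 0.5, 0.1],
        track_ok_flags=[True, True, False],
    )
    print([a.value for a in actions], total)
```

In this sketch only frames where the track has failed contribute a non-zero reward, which is one simple way to read "supervision only when the track has gone awry"; the reward the paper actually uses may differ.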
