RATM: Recurrent Attentive Tracking Model

Samira Ebrahimi Kahou; Vincent Michalski; Roland Memisevic

arXiv:1510.08660 · cs.LG · April 29, 2016

TL;DR

The paper introduces RATM, a modular neural network that uses soft attention for object tracking. It focuses computation on task-relevant image regions and generalizes to related but previously unseen sequences.

Contribution

It presents a recurrent attention framework built from three modules: an attention module that controls where to look, a feature-extraction module that represents what is seen, and an objective module that formalizes why the model learns its attentive behavior.

Findings

01 Performs well on the bouncing ball, moving digits, and KTH data sets.

02 Generalizes to related but previously unseen sequences in a challenging tracking dataset.

03 Uses soft attention, so the model can be trained with gradient descent.

Abstract

We present an attention-based modular neural framework for computer vision. The framework uses a soft attention mechanism allowing models to be trained with gradient descent. It consists of three modules: a recurrent attention module controlling where to look in an image or video frame, a feature-extraction module providing a representation of what is seen, and an objective module formalizing why the model learns its attentive behavior. The attention module allows the model to focus computation on task-related information in the input. We apply the framework to several object tracking tasks and explore various design choices. We experiment with three data sets, bouncing ball, moving digits and the real-world KTH data set. The proposed Recurrent Attentive Tracking Model performs well on all three tasks and can generalize to related but previously unseen sequences from a challenging…
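
To make the idea of soft attention concrete, here is a minimal sketch of one way a "where to look" glimpse could be computed. The abstract does not specify the form of the attention mechanism, so the grid of Gaussian filters used here is an assumption, and the names `gaussian_filterbank` and `soft_glimpse` and the parameters `cx`, `cy`, `stride`, and `sigma` are hypothetical. The point it illustrates is that the glimpse is a smooth weighted sum over pixels, so in an automatic differentiation framework gradients could flow back to the attention parameters.

```python
import math


def gaussian_filterbank(center, stride, sigma, n_filters, size):
    """Return n_filters rows of normalized Gaussian weights over `size` pixels."""
    bank = []
    for i in range(n_filters):
        # Filter means are spread evenly around the attention center.
        mu = center + (i - (n_filters - 1) / 2.0) * stride
        row = [math.exp(-((a - mu) ** 2) / (2.0 * sigma ** 2)) for a in range(size)]
        total = sum(row) + 1e-8  # guard against division by zero
        bank.append([w / total for w in row])
    return bank


def soft_glimpse(image, cx, cy, stride, sigma, n):
    """Extract an n x n glimpse as F_y * image * F_x^T (hypothetical helper)."""
    height, width = len(image), len(image[0])
    f_y = gaussian_filterbank(cy, stride, sigma, n, height)
    f_x = gaussian_filterbank(cx, stride, sigma, n, width)
    # First take the weighted sum over image rows, then over image columns.
    rows = [
        [sum(f_y[i][r] * image[r][c] for r in range(height)) for c in range(width)]
        for i in range(n)
    ]
    return [
        [sum(rows[i][c] * f_x[j][c] for c in range(width)) for j in range(n)]
        for i in range(n)
    ]


# Toy 8x8 frame with a single bright pixel at row 2, column 5.
frame = [[0.0] * 8 for _ in range(8)]
frame[2][5] = 1.0

# A glimpse centered on the bright pixel, and one centered far away from it.
on_target = soft_glimpse(frame, cx=5.0, cy=2.0, stride=1.0, sigma=0.5, n=3)
off_target = soft_glimpse(frame, cx=1.0, cy=6.0, stride=1.0, sigma=0.5, n=3)
```

In this toy setup the glimpse centered on the bright pixel responds strongly at its middle cell, while the glimpse placed elsewhere is close to zero. A recurrent attention module would, under the same assumption, emit `cx`, `cy`, `stride`, and `sigma` at each time step.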
