Attention Augmented ConvLSTM for Environment Prediction

Bernard Lange; Masha Itkina; Mykel J. Kochenderfer

arXiv:2010.09662 · cs.CV · September 14, 2021

TL;DR

This paper introduces two attention-augmented ConvLSTM models for environment prediction in robotic systems. Compared with baseline ConvLSTM architectures, they reduce blurring and better preserve moving objects in predicted occupancy grids, as demonstrated on the real-world KITTI and Waymo datasets.

Contribution

The paper proposes TAAConvLSTM and SAAConvLSTM, novel extensions to ConvLSTM incorporating attention mechanisms for better spatiotemporal environment prediction.

Findings

1. Improved prediction accuracy on KITTI and Waymo datasets.
2. Reduced blurring and better object preservation in predictions.
3. Enhanced suitability for safety-critical robotic applications.

Abstract

Safe and proactive planning in robotic systems generally requires accurate predictions of the environment. Prior work on environment prediction applied video frame prediction techniques to bird's-eye view environment representations, such as occupancy grids. ConvLSTM-based frameworks used previously often result in significant blurring and vanishing of moving objects, thus hindering their applicability for use in safety-critical applications. In this work, we propose two extensions to the ConvLSTM to address these issues. We present the Temporal Attention Augmented ConvLSTM (TAAConvLSTM) and Self-Attention Augmented ConvLSTM (SAAConvLSTM) frameworks for spatiotemporal occupancy prediction, and demonstrate improved performance over baseline architectures on the real-world KITTI and Waymo datasets.
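To make the "bird's-eye view occupancy grid" representation concrete, here is a minimal sketch of how 2D points around a vehicle could be rasterized into a binary grid. It is an illustration only, not the paper's preprocessing pipeline: the function name `points_to_occupancy_grid`, the parameters `grid_size` and `cell_m`, and the binary occupied/free encoding are all assumptions made for this example.

```python
from typing import Iterable, List, Tuple


def points_to_occupancy_grid(
    points: Iterable[Tuple[float, float]],
    grid_size: int = 8,   # hypothetical: number of cells per side
    cell_m: float = 1.0,  # hypothetical: cell edge length in metres
) -> List[List[int]]:
    """Rasterize 2D points (x, y in metres, ego vehicle at the origin)
    into a square bird's-eye-view grid. 1 = occupied, 0 = free."""
    half_extent = grid_size * cell_m / 2.0
    grid = [[0] * grid_size for _ in range(grid_size)]
    for x, y in points:
        # Skip points that fall outside the grid's spatial extent.
        if not (-half_extent <= x < half_extent and -half_extent <= y < half_extent):
            continue
        # Shift so the grid's corner is the origin, then bin into cells.
        row = int((x + half_extent) // cell_m)
        col = int((y + half_extent) // cell_m)
        grid[row][col] = 1
    return grid


# Two points land inside the 8 m x 8 m grid; the third is out of range.
example = points_to_occupancy_grid([(0.5, 0.5), (-3.2, 2.9), (10.0, 0.0)])
```

A sequence of such grids, one per time step, plays the role that video frames play in frame prediction: the model receives past grids and predicts future ones.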

Code & Models

Repositories

sisl/AttentionAugmentedConvLSTM (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Video Surveillance and Tracking Methods · Human Pose and Action Recognition

Methods: Sigmoid Activation · Convolution · Tanh Activation · ConvLSTM
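
For orientation, the listed methods (sigmoid, tanh, convolution) combine in the standard ConvLSTM cell roughly as follows. This is the commonly used formulation without peephole connections, given here as background; it is not the paper's attention-augmented variant, and the symbol names are assumptions of this sketch. Here $*$ denotes convolution, $\circ$ the element-wise product, $X_t$ the input grid at time $t$, and $H_t$, $C_t$ the hidden and cell states.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + b_o\right) \\
g_t &= \tanh\left(W_{xg} * X_t + W_{hg} * H_{t-1} + b_g\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ g_t \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```

Because every gate is computed by convolution rather than a dense matrix product, the states $H_t$ and $C_t$ keep the spatial layout of the input grid.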