Forecast-MAE: Self-supervised Pre-training for Motion Forecasting with Masked Autoencoders

Jie Cheng; Xiaodong Mei; Ming Liu

arXiv:2308.09882 · cs.RO · August 22, 2023

TL;DR

Forecast-MAE is a self-supervised pre-training framework for motion forecasting based on masked autoencoders. It exploits the interconnections between agents' trajectories and the road network, and achieves competitive results on the Argoverse 2 motion forecasting benchmark.

Contribution

It presents a novel self-supervised pre-training method for motion forecasting that uses masked autoencoders tailored to trajectory and road network data.

Findings

01. Significantly outperforms previous self-supervised methods.

02. Achieves performance competitive with state-of-the-art supervised models.

03. Uses standard Transformer blocks with minimal inductive bias.

Abstract

This study explores the application of self-supervised learning (SSL) to the task of motion forecasting, an area that has not yet been extensively investigated despite the widespread success of SSL in computer vision and natural language processing. To address this gap, we introduce Forecast-MAE, an extension of the masked autoencoders framework that is specifically designed for self-supervised learning of the motion forecasting task. Our approach includes a novel masking strategy that leverages the strong interconnections between agents' trajectories and road networks, involving complementary masking of agents' future or history trajectories and random masking of lane segments. Our experiments on the challenging Argoverse 2 motion forecasting benchmark show that Forecast-MAE, which utilizes standard Transformer blocks with minimal inductive bias, achieves competitive performance compared…
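The masking strategy described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `sample_masks`, the 50/50 split between history and future masking, and the default lane mask ratio are all assumptions chosen for illustration. It only shows the idea that each agent keeps exactly one of its history or future visible, while lane segments are masked at random.

```python
import random


def sample_masks(num_agents, num_lanes, lane_mask_ratio=0.5, seed=None):
    """Sample visibility flags for one scene (illustrative sketch only).

    Returns three lists of booleans, where True means "visible to the encoder":
    keep_history[i], keep_future[i] for each agent, and keep_lane[j] for each
    lane segment. The ratios used here are hypothetical, not from the paper.
    """
    rng = random.Random(seed)

    # Complementary masking: for each agent, mask either the future or the
    # history, never both and never neither.
    mask_future = [rng.random() < 0.5 for _ in range(num_agents)]
    keep_history = list(mask_future)            # history visible if future is masked
    keep_future = [not m for m in mask_future]  # future visible otherwise

    # Random masking of lane segments at a fixed ratio.
    num_masked = int(num_lanes * lane_mask_ratio)
    masked_ids = set(rng.sample(range(num_lanes), num_masked))
    keep_lane = [j not in masked_ids for j in range(num_lanes)]

    return keep_history, keep_future, keep_lane


if __name__ == "__main__":
    kh, kf, kl = sample_masks(num_agents=6, num_lanes=10, seed=0)
    print("history visible:", kh)
    print("future visible: ", kf)
    print("lanes visible:  ", kl)
```

In a masked-autoencoder setup, the visible tokens would be fed to the encoder and the masked trajectories and lane segments would be the reconstruction targets; those steps are omitted here.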

Taxonomy

Topics: Traffic Prediction and Management Techniques · Autonomous Vehicle Technology and Safety · Traffic and Road Safety

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Label Smoothing · Layer Normalization · Softmax · Dense Connections