ParkPredict+: Multimodal Intent and Motion Prediction for Vehicles in Parking Lots with CNN and Transformer

Xu Shen; Matthew Lacayo; Nidhir Guggilla; Francesco Borrelli

arXiv:2204.10777·cs.CV·January 12, 2023

TL;DR

This paper introduces ParkPredict+, a multimodal vehicle intent and trajectory prediction model for parking lots using CNN and Transformer networks, leveraging a new 4K parking lot dataset.

Contribution

It presents a novel multimodal prediction approach combining CNN and Transformer architectures and introduces the first public high-resolution parking lot driving dataset.

Findings

01. Outperforms existing models in accuracy
02. Supports an arbitrary number of modes and complex multi-agent scenarios
03. Adapts to different parking map layouts

Abstract

The problem of multimodal intent and trajectory prediction for human-driven vehicles in parking lots is addressed in this paper. Using models designed with CNN and Transformer networks, we extract temporal-spatial and contextual information from trajectory history and local bird's eye view (BEV) semantic images, and generate predictions about intent distribution and future trajectory sequences. Our methods outperform existing models in accuracy, while allowing an arbitrary number of modes, encoding complex multi-agent scenarios, and adapting to different parking maps. To train and evaluate our method, we present the first public 4K video dataset of human driving in parking lots with accurate annotation, high frame rate, and rich traffic scenarios.
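
The abstract's "intent distribution" and "arbitrary number of modes" can be illustrated with a minimal sketch. It assumes, purely for illustration, that the intent model emits one raw score per candidate intent, that a softmax turns those scores into a distribution, and that the k most probable intents are kept as prediction modes. The names `softmax`, `top_k_intents`, the labels, and the scores are hypothetical and are not taken from the paper or its repository.

```python
import math


def softmax(scores):
    """Convert raw scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_intents(scores, labels, k):
    """Return the k most probable intents as (label, probability) pairs."""
    probs = softmax(scores)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical scores for three candidate spots plus a "keep driving" option.
labels = ["spot_A", "spot_B", "spot_C", "keep_driving"]
scores = [2.0, 0.5, 1.0, -1.0]

# k is a free parameter here, so the number of modes is not fixed in advance.
modes = top_k_intents(scores, labels, k=2)
for label, prob in modes:
    print(f"{label}: {prob:.3f}")
```

In a full pipeline of this shape, each retained intent would presumably condition a separate future trajectory, which is one way to obtain multimodal trajectory predictions; that step is omitted here.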

Code & Models

Repositories

xushenlz/parksim (PyTorch, official)

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Dense Connections · Softmax · Label Smoothing · Dropout · Adam