Large Language Models Powered Context-aware Motion Prediction in Autonomous Driving

Xiaoji Zheng, Lixiu Wu, Zhijie Yan, Yuanrong Tang, Hao Zhao, Chen Zhong, Bokui Chen, and Jiangtao Gong

arXiv:2403.11057 · cs.CV · July 31, 2024 · 1 citation

TL;DR

This paper leverages Large Language Models to improve traffic scene understanding and motion prediction accuracy in autonomous driving by integrating rich context information through prompt engineering and a cost-effective deployment strategy.

Contribution

It introduces a novel method of using LLMs with prompt engineering to enhance traffic context understanding for motion prediction in autonomous driving.

Findings

01 Complex traffic environments and historical trajectories can be visualized as image prompts (TC-Maps), from which an LLM extracts traffic context.

02 Integrating LLM-derived context improves motion prediction accuracy.

03 A cost-effective strategy enhances prediction performance with minimal LLM-augmented data.

Abstract

Motion prediction is among the most fundamental tasks in autonomous driving. Traditional methods of motion forecasting primarily encode vector information of maps and historical trajectory data of traffic participants, lacking a comprehensive understanding of overall traffic semantics, which in turn affects the performance of prediction tasks. In this paper, we utilized Large Language Models (LLMs) to enhance the global traffic context understanding for motion prediction tasks. We first conducted systematic prompt engineering, visualizing complex traffic environments and historical trajectory information of traffic participants into image prompts -- Transportation Context Map (TC-Map), accompanied by corresponding text prompts. Through this approach, we obtained rich traffic context information from the LLM. By integrating this information into the motion prediction model, we…
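The prompt-construction step described in the abstract (rendering map and trajectory data into an image prompt and pairing it with a text prompt) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `rasterize`, `build_tc_map`, and `build_text_prompt`, the two-layer grid layout, the grid size, and the prompt wording are all assumptions made for this example, and the paper's actual TC-Map rendering and prompts are not given on this page.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Polyline = List[Point]
Grid = List[List[int]]


def rasterize(polylines: List[Polyline], size: int = 16, extent: float = 40.0) -> Grid:
    """Draw ego-centred polylines (metres) onto a size x size grid of 0/1 cells."""
    grid = [[0] * size for _ in range(size)]
    scale = size / (2 * extent)  # cells per metre
    for line in polylines:
        for (x0, y0), (x1, y1) in zip(line, line[1:]):
            # Sample each segment densely enough that no cell is skipped.
            steps = max(1, int(max(abs(x1 - x0), abs(y1 - y0)) * scale * 2))
            for i in range(steps + 1):
                t = i / steps
                col = int((x0 + t * (x1 - x0) + extent) * scale)
                row = int((extent - (y0 + t * (y1 - y0))) * scale)
                if 0 <= row < size and 0 <= col < size:
                    grid[row][col] = 1
    return grid


def build_tc_map(lanes: List[Polyline], histories: List[Polyline],
                 size: int = 16, extent: float = 40.0) -> Dict[str, Grid]:
    """Hypothetical stand-in for a TC-Map: one raster layer per information type."""
    return {
        "lanes": rasterize(lanes, size, extent),
        "history": rasterize(histories, size, extent),
    }


def build_text_prompt(num_agents: int, history_seconds: float) -> str:
    """Hypothetical text prompt accompanying the image prompt."""
    return (
        f"The image shows lane centerlines and the past {history_seconds:.1f} s "
        f"trajectories of {num_agents} traffic participants. "
        "Describe the scene and the likely intention of the target agent."
    )


# Toy scene: one horizontal lane and one agent that has been moving north.
tc_map = build_tc_map(
    lanes=[[(-40.0, -2.0), (39.0, -2.0)]],
    histories=[[(0.0, 10.0), (0.0, 20.0)]],
)
prompt = build_text_prompt(num_agents=2, history_seconds=2.0)
```

In a real pipeline the layers would be rendered as a colour image and sent to a vision-capable LLM together with the text prompt; that call is omitted here because the model and API used are not specified in this listing.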

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications