GPT-Driver: Learning to Drive with GPT

Jiageng Mao; Yuxi Qian; Junjie Ye; Hang Zhao; Yue Wang

arXiv:2310.01415 · cs.CV · December 6, 2023 · 39 cites

TL;DR

GPT-Driver transforms GPT-3.5 into a motion planner for autonomous vehicles by reformulating trajectory planning as a language modeling task, leveraging LLM reasoning and prompting strategies to improve generalization and interpretability.

Contribution

This paper introduces a novel approach that uses large language models for motion planning in autonomous driving, a perspective not previously explored.

Findings

1. Effective trajectory generation on the nuScenes dataset
2. Enhanced generalization to unseen scenarios
3. Improved interpretability of the decision-making process

Abstract

We present a simple yet effective approach that can transform the OpenAI GPT-3.5 model into a reliable motion planner for autonomous vehicles. Motion planning is a core challenge in autonomous driving, aiming to plan a driving trajectory that is safe and comfortable. Existing motion planners predominantly leverage heuristic methods to forecast driving trajectories, yet these approaches demonstrate insufficient generalization capabilities in the face of novel and unseen driving scenarios. In this paper, we propose a novel approach to motion planning that capitalizes on the strong reasoning capabilities and generalization potential inherent to Large Language Models (LLMs). The fundamental insight of our approach is the reformulation of motion planning as a language modeling problem, a perspective not previously explored. Specifically, we represent the planner inputs and outputs as…
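To make the "motion planning as language modeling" idea concrete, here is a minimal sketch, assuming the planner output is a short list of (x, y) waypoints written out as plain text. The function names, the bracketed text format, and the two-decimal precision are all hypothetical choices made for illustration; the truncated abstract above does not specify them, and this is not the authors' implementation.

```python
import re
from typing import List, Tuple

Waypoint = Tuple[float, float]

# Matches "(x, y)" pairs with optional sign and decimals (hypothetical format).
_PAIR = re.compile(r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")


def waypoints_to_text(waypoints: List[Waypoint]) -> str:
    """Serialize waypoints into a string a language model could be trained to emit."""
    return "[" + ", ".join(f"({x:.2f}, {y:.2f})" for x, y in waypoints) + "]"


def text_to_waypoints(text: str) -> List[Waypoint]:
    """Parse every "(x, y)" pair found in a model completion back into floats."""
    return [(float(x), float(y)) for x, y in _PAIR.findall(text)]


if __name__ == "__main__":
    plan = [(0.0, 1.5), (0.12, 3.04), (-0.3, 4.6)]
    encoded = waypoints_to_text(plan)
    print(encoded)
    print(text_to_waypoints("Planned trajectory: " + encoded))
```

Parsing with a tolerant regular expression, rather than requiring an exact string match, lets the numeric trajectory be recovered even when the completion wraps it in explanatory text.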

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 5 (marginally below the acceptance threshold) · Confidence 4

Strengths

- A very relevant problem being evaluated.
- Interesting and novel approach being proposed.
- Promising experimental results.

Weaknesses

- The method does not seem to be very feasible for online execution.
- The method seems to critically depend on the existing SOTA methods as its integral part, making the overall system quite complex.
- The experimental section can be improved.

Reviewer 02 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

1. The proposed method is simple, straightforward, and well-performing in the studied driving scenarios.
2. It provides informative insights on the feasibility and performance of steering LLMs into motion planners producing numerical waypoints. It is particularly interesting and promising that the authors show that GPT-Driver can outperform existing approaches after few-shot fine-tuning.

Weaknesses

While the proposed GPT-Driver has demonstrated impressive performance, the paper lacks in-depth analysis to help the audience gain a deeper understanding of the model's performance and limitations:

1. For example, an ablation study should be conducted to evaluate the benefit of having chain-of-thought reasoning in the LLM's output. While it is well-known that chain-of-thought reasoning boosts LLM's performance, it is worth evaluating its contribution to the motion planning task.
2. Also, ther…

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

- How to best utilize LLMs for autonomous vehicles is an interesting and timely question that I think is of great interest to the community.
- That the authors achieved such a huge improvement with fine tuning is a surprising insight (going from worst to best results), but potentially useful if it checks out.

Weaknesses

- The benchmark comparison seems a bit apples to oranges. You compare against end-to-end computer vision approaches from CVPR, but your approach assumes the object detections are given and only solves the planning part. Even if you use the detections from a competing CV method, it is unclear to me how strong this result is as planning is perhaps not the main focus of their approach. It would have been good to have some conventional planning stack as baseline as well.
- Your portrayal of related…

Code & Models

Repositories

pointscoder/gpt-driver · Official

Taxonomy

Topics: Natural Language Processing Techniques · Multimodal Machine Learning Applications · Topic Modeling

Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Linear Warmup With Cosine Annealing · Layer Normalization · Softmax · Byte Pair Encoding