GPT-Driver: Learning to Drive with GPT
Jiageng Mao, Yuxi Qian, Junjie Ye, Hang Zhao, Yue Wang

TL;DR
GPT-Driver transforms GPT-3.5 into a motion planner for autonomous vehicles by reformulating trajectory planning as a language modeling task, leveraging LLM reasoning and prompting strategies to improve generalization and interpretability.
Contribution
This paper introduces a novel approach that uses large language models for motion planning in autonomous driving, a perspective not previously explored.
Findings
Effective trajectory generation on the nuScenes dataset
Enhanced generalization to unseen scenarios
Improved interpretability of decision-making process
Abstract
We present a simple yet effective approach that can transform the OpenAI GPT-3.5 model into a reliable motion planner for autonomous vehicles. Motion planning is a core challenge in autonomous driving, aiming to plan a driving trajectory that is safe and comfortable. Existing motion planners predominantly leverage heuristic methods to forecast driving trajectories, yet these approaches demonstrate insufficient generalization capabilities in the face of novel and unseen driving scenarios. In this paper, we propose a novel approach to motion planning that capitalizes on the strong reasoning capabilities and generalization potential inherent to Large Language Models (LLMs). The fundamental insight of our approach is the reformulation of motion planning as a language modeling problem, a perspective not previously explored. Specifically, we represent the planner inputs and outputs as…
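The reformulation described above can be illustrated with a small sketch of how numerical planner data might be turned into text and back. This is not the authors' implementation: the abstract is truncated before it specifies the representation, so the text layout, the two-decimal precision, and the helper names `waypoints_to_text`, `text_to_waypoints`, and `build_prompt` are all assumptions made for illustration. No language model is called here; the sketch covers only the serialization on either side of one.

```python
import re
from typing import List, Tuple

Waypoint = Tuple[float, float]  # (x, y) in metres; the unit is an assumption

# Matches "(x, y)" pairs of decimal numbers inside free-form text.
_PAIR = re.compile(r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")


def waypoints_to_text(waypoints: List[Waypoint]) -> str:
    """Serialize waypoints as text, e.g. "[(0.00, 1.50), (0.12, 3.00)]"."""
    return "[" + ", ".join(f"({x:.2f}, {y:.2f})" for x, y in waypoints) + "]"


def text_to_waypoints(text: str) -> List[Waypoint]:
    """Recover numerical waypoints from a textual model response."""
    return [(float(x), float(y)) for x, y in _PAIR.findall(text)]


def build_prompt(ego_history: List[Waypoint],
                 objects: List[Tuple[str, Waypoint]]) -> str:
    """Assemble a hypothetical planner prompt from ego history and detections."""
    lines = ["Past ego positions: " + waypoints_to_text(ego_history),
             "Detected objects:"]
    for name, (x, y) in objects:
        lines.append(f"- {name} at ({x:.2f}, {y:.2f})")
    lines.append("Planned trajectory:")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_prompt([(0.0, -1.5), (0.0, 0.0)], [("car", (3.0, 10.0))])
    print(prompt)
    # A hand-written stand-in for a model response, parsed back into numbers.
    print(text_to_waypoints("Trajectory: [(0.00, 1.50), (-0.12, 3.00)]"))
```

In a setup like this, the parsed waypoints would be what gets compared against a reference trajectory, and the regular-expression parser tolerates any free-form text the model emits around the coordinate list.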
Peer Reviews
Decision: Submitted to ICLR 2024
- A very relevant problem being evaluated.
- Interesting and novel approach being proposed.
- Promising experimental results.
- The method does not seem to be very feasible for online execution.
- The method seems to critically depend on the existing SOTA methods as its integral part, making the overall system quite complex.
- The experimental section can be improved.
1. The proposed method is simple, straightforward, and well-performing in the studied driving scenarios.
2. It provides informative insights on the feasibility and performance of steering LLMs into motion planners producing numerical waypoints. It is particularly interesting and promising that the authors show that GPT-Driver can outperform existing approaches after few-shot fine-tuning.
While the proposed GPT-Driver has demonstrated impressive performance, the paper lacks in-depth analysis to help the audience gain a deeper understanding of the model's performance and limitations:
1. For example, an ablation study should be conducted to evaluate the benefit of having chain-of-thought reasoning in the LLM's output. While it is well-known that chain-of-thought reasoning boosts LLM's performance, it is worth evaluating its contribution to the motion planning task.
2. Also, ther
- How to best utilize LLMs for autonomous vehicles is an interesting and timely question that I think is of great interest to the community.
- That the authors achieved such a huge improvement with fine tuning is a surprising insight (going from worst to best results), but potentially useful if it checks out.
- The benchmark comparison seems a bit apples to oranges. You compare against end-to-end computer vision approaches from CVPR, but your approach assumes the object detections are given and only solves the planning part. Even if you use the detections from a competing CV method, it is unclear to me how strong this result is as planning is perhaps not the main focus of their approach. It would have been good to have some conventional planning stack as baseline as well.
- Your portrayal of related
Taxonomy
Topics: Natural Language Processing Techniques · Multimodal Machine Learning Applications · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Linear Warmup With Cosine Annealing · Layer Normalization · Softmax · Byte Pair Encoding
