A Unified Perception-Language-Action Framework for Adaptive Autonomous Driving

Yi Zhang; Erik Leo Haß; Kuo-Yi Chao; Nenad Petrovic; Yinglei Song; Chengdong Wu; Alois Knoll

arXiv:2507.23540 · cs.RO · August 1, 2025

TL;DR

This paper introduces a unified perception-language-action framework for autonomous driving that combines multi-sensor data with large language model reasoning to improve adaptability, interpretability, and safety in complex environments.

Contribution

The paper presents an integrated architecture in which a GPT-4.1-powered reasoning core unifies perception, language understanding, and action planning for autonomous vehicles.

Findings

01. Superior trajectory tracking and speed prediction in urban scenarios

02. Enhanced adaptive planning in complex environments

03. Demonstrated improved safety and interpretability

Abstract

Autonomous driving systems face significant challenges in achieving human-like adaptability, robustness, and interpretability in complex, open-world environments. These challenges stem from fragmented architectures, limited generalization to novel scenarios, and insufficient semantic extraction from perception. To address these limitations, we propose a unified Perception-Language-Action (PLA) framework that integrates multi-sensor fusion (cameras, LiDAR, radar) with a large language model (LLM)-augmented Vision-Language-Action (VLA) architecture, specifically a GPT-4.1-powered reasoning core. This framework unifies low-level sensory processing with high-level contextual reasoning, tightly coupling perception with natural language-based semantic understanding and decision-making to enable context-aware, explainable, and safety-bounded autonomous driving. Evaluations on an urban…
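
The perception, language, and action stages described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the `Detection` and `Action` types, the function names, the 10 m stopping rule, and the speed and steering limits are all hypothetical, and the `reason` function is a fixed rule standing in for the LLM reasoning core rather than a call to any model. It only illustrates how fused detections might be turned into text, passed to a reasoning step, and then clipped by a safety layer.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str         # e.g. "pedestrian"
    distance_m: float  # range to the object in metres
    source: str        # "camera", "lidar", or "radar"


@dataclass
class Action:
    target_speed_mps: float
    steering_rad: float
    rationale: str     # natural-language explanation of the decision


# Hypothetical safety limits, chosen only for illustration.
MAX_SPEED_MPS = 13.9     # roughly 50 km/h
MAX_STEERING_RAD = 0.5


def describe_scene(detections):
    """Language step: summarise fused detections as plain text, nearest first."""
    if not detections:
        return "No objects detected."
    ordered = sorted(detections, key=lambda d: d.distance_m)
    parts = [f"{d.label} at {d.distance_m:.1f} m ({d.source})" for d in ordered]
    return "Objects: " + "; ".join(parts) + "."


def reason(scene_text, detections):
    """Stand-in for the LLM reasoning core: a fixed rule, NOT a model call."""
    nearest = min((d.distance_m for d in detections), default=float("inf"))
    if nearest < 10.0:
        return Action(0.0, 0.0, f"Stop: object within 10 m. {scene_text}")
    # Deliberately proposes a speed above the limit so the safety layer has work to do.
    return Action(20.0, 0.0, f"Proceed: path clear. {scene_text}")


def clamp(value, low, high):
    return max(low, min(high, value))


def bound_action(action):
    """Safety step: clip whatever the reasoning step proposes to fixed limits."""
    return Action(
        clamp(action.target_speed_mps, 0.0, MAX_SPEED_MPS),
        clamp(action.steering_rad, -MAX_STEERING_RAD, MAX_STEERING_RAD),
        action.rationale,
    )


def plan(detections):
    """Run the three steps in order: describe, reason, bound."""
    scene = describe_scene(detections)
    return bound_action(reason(scene, detections))
```

Calling `plan([Detection("pedestrian", 6.0, "camera")])` yields a stop action with a textual rationale, while a clear road yields a proposed speed that the safety layer clips to `MAX_SPEED_MPS`. Keeping the bound outside the reasoning step means the limits hold regardless of what the reasoning step returns.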

Taxonomy

Topics: Robotic Path Planning Algorithms · Robotics and Automated Systems