KLDrive: Fine-Grained 3D Scene Reasoning for Autonomous Driving based on Knowledge Graph
Ye Tian, Jingyi Zhang, Zihao Wang, Xiaoyuan Ren, Xiaofan Yu, Onat Gungor, Tajana Rosing

TL;DR
KLDrive introduces a novel framework combining knowledge graphs and large language models to improve fine-grained reasoning and question answering in autonomous driving, reducing hallucinations and enhancing accuracy.
Contribution
KLDrive is the first framework to integrate a scene fact construction module with an LLM for reliable reasoning in autonomous driving scenarios.
Findings
Achieves 65.04% accuracy on NuScenes-QA
Attains 42.45 SPICE score on GVQA
Reduces hallucinations in factual reasoning tasks
Abstract
Autonomous driving requires reliable reasoning over fine-grained 3D scene facts. Fine-grained question answering over multi-modal driving observations provides a natural way to evaluate this capability, yet existing perception pipelines and driving-oriented large language model (LLM) methods still suffer from unreliable scene facts, hallucinations, opaque reasoning, and heavy reliance on task-specific training. We present KLDrive, the first knowledge-graph-augmented LLM reasoning framework for fine-grained question answering in autonomous driving. KLDrive addresses these problems with two tightly coupled components: an energy-based scene fact construction module that consolidates multi-source evidence into a reliable scene knowledge graph, and an LLM agent that performs fact-grounded reasoning over a constrained action space under explicit structural constraints. By combining…
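To make the idea of fact-grounded reasoning over a constrained action space concrete, here is a minimal sketch in Python. It is not KLDrive's implementation: the `Fact` triple, the `SceneGraph` class, the `ALLOWED_ACTIONS` set, and the `run_action` helper are all hypothetical names chosen for illustration, and the energy-based fact construction step is not modeled. The sketch only shows how restricting an agent to a small set of graph queries keeps every answer traceable to stored facts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One scene fact as a (subject, relation, object) triple (hypothetical schema)."""
    subject: str
    relation: str
    obj: str


class SceneGraph:
    """A toy scene knowledge graph: just a set of facts."""

    def __init__(self, facts):
        self.facts = set(facts)

    def find(self, relation, obj):
        """Return the sorted subjects that have `relation` to `obj`."""
        return sorted(
            f.subject for f in self.facts
            if f.relation == relation and f.obj == obj
        )


# Assumed action set for illustration; the paper's actual action space is not given here.
ALLOWED_ACTIONS = {"filter", "count", "exists"}


def run_action(graph, action, relation, obj):
    """Execute one agent action, rejecting anything outside the allowed set."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"action {action!r} is not in the constrained action space")
    matches = graph.find(relation, obj)
    if action == "filter":
        return matches
    if action == "count":
        return len(matches)
    return bool(matches)  # "exists"


# Example scene with made-up object identifiers.
scene = SceneGraph([
    Fact("obj_1", "is_a", "car"),
    Fact("obj_2", "is_a", "car"),
    Fact("obj_3", "is_a", "pedestrian"),
    Fact("obj_1", "status", "moving"),
])

car_count = run_action(scene, "count", "is_a", "car")
bus_present = run_action(scene, "exists", "is_a", "bus")
```

In this toy setup, a question such as "How many cars are there?" maps to a single `count` action, and an unsupported action raises an error instead of producing an ungrounded answer.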
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Advanced Neural Network Applications
