Understanding Driving Risks using Large Language Models: Toward Elderly Driver Assessment

Yuki Yoshihara; Linjing Jiang; Nihan Karatas; Hitoshi Kanamori; Asuka Harada; Takahiro Tanaka

arXiv:2507.08367·cs.CV·July 14, 2025

TL;DR

This paper explores the use of ChatGPT-4o, a multimodal large language model, to interpret traffic scenes for elderly driver assessment, demonstrating that prompt design significantly influences performance and interpretability.

Contribution

It introduces a novel application of LLMs for traffic scene interpretation in elderly driver assessment, emphasizing prompt strategies and interpretability.

Findings

01 Prompt design greatly impacts model performance.

02 Model achieves high precision but moderate recall in stop-sign detection.

03 Output explanations improve interpretability and trust.

Abstract

This study investigates the potential of a multimodal large language model (LLM), specifically ChatGPT-4o, to perform human-like interpretations of traffic scenes using static dashcam images. Herein, we focus on three judgment tasks relevant to elderly driver assessments: evaluating traffic density, assessing intersection visibility, and recognizing stop signs. These tasks require contextual reasoning rather than simple object detection. Using zero-shot, few-shot, and multi-shot prompting strategies, we evaluated the performance of the model with human annotations serving as the reference standard. Evaluation metrics included precision, recall, and F1-score. Results indicate that prompt design considerably affects performance, with recall for intersection visibility increasing from 21.7% (zero-shot) to 57.0% (multi-shot). For traffic density, agreement increased from 53.5%…
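As a rough illustration of the evaluation metrics named in the abstract, the sketch below computes precision, recall, and F1-score for a binary judgment (for example, "stop sign present") with human annotations as the reference standard. The function name `precision_recall_f1` and the label lists are hypothetical and invented for this example; they are not the paper's code or data.

```python
def precision_recall_f1(reference, predicted):
    """Return (precision, recall, f1) for binary labels.

    reference: human annotations (1 = positive, 0 = negative)
    predicted: model judgments in the same order
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)        # true positives
    fp = sum(1 for r, p in pairs if not r and p)    # false positives
    fn = sum(1 for r, p in pairs if r and not p)    # false negatives

    # Guard against empty denominators by falling back to 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Made-up labels for eight images (illustrative only, not from the paper).
human = [1, 1, 1, 1, 0, 0, 0, 0]
model = [1, 1, 0, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(human, model)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy case the model never raises a false alarm but misses half of the positives, which is the "high precision, moderate recall" pattern the findings describe for stop-sign detection.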
