Multimodal LLM for Intelligent Transportation Systems
Dexter Le, Aybars Yunusoglu, Karn Tiwari, Murat Isik, I. Can Dikmen

TL;DR
This paper presents a novel LLM-based framework for multimodal data analysis in transportation, improving decision-making efficiency and accuracy across diverse sensor datasets using a unified architecture.
Contribution
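Introduces a three-dimensional framework (applications, machine learning methodologies, and hardware devices) that uses a single LLM architecture, rather than multiple machine learning algorithms, to analyze multimodal transportation data, with the aim of reducing complexity and improving performance.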
Findings
Achieves 91.33% average accuracy across datasets
Handles time-series data, images, and videos
Demonstrates versatility of LLMs in transportation applications
Abstract
In the evolving landscape of transportation systems, integrating Large Language Models (LLMs) offers a promising frontier for advancing intelligent decision-making across various applications. This paper introduces a novel 3-dimensional framework that encapsulates the intersection of applications, machine learning methodologies, and hardware devices, particularly emphasizing the role of LLMs. Instead of using multiple machine learning algorithms, our framework uses a single, data-centric LLM architecture that can analyze time series, images, and videos. We explore how LLMs can enhance data interpretation and decision-making in transportation. We apply this LLM framework to different sensor datasets, including time-series data and visual data from sources like Oxford Radar RobotCar, D-Behavior (D-Set), nuScenes by Motional, and Comma2k19. The goal is to streamline data processing…
Taxonomy
Topics: Semantic Web and Ontologies
Methods: ALIGN
