TransitGPT: A Generative AI-based framework for interacting with GTFS data using Large Language Models
Saipraneeth Devunuri, Lewis Lehe

TL;DR
TransitGPT is a novel framework that uses Large Language Models to enable natural language interaction with GTFS transit data by generating executable Python code, making transit data more accessible without requiring programming expertise.
Contribution
This paper presents TransitGPT, the first LLM-based system that translates natural language queries into executable code for GTFS data, without fine-tuning the model or giving it direct access to the data.
Findings
High accuracy in executing diverse transit data queries
Significant improvement in user accessibility to GTFS data
Effective performance demonstrated on benchmark tasks
Abstract
This paper introduces a framework that leverages Large Language Models (LLMs) to answer natural language queries about General Transit Feed Specification (GTFS) data. The framework is implemented in a chatbot called TransitGPT with open-source code. TransitGPT works by guiding LLMs to generate Python code that extracts and manipulates GTFS data relevant to a query, which is then executed on a server where the GTFS feed is stored. It can accomplish a wide range of tasks, including data retrieval, calculations, and interactive visualizations, without requiring users to have extensive knowledge of GTFS or programming. The LLMs that produce the code are guided entirely by prompts, without fine-tuning or access to the actual GTFS feeds. We evaluate TransitGPT using GPT-4o and Claude-3.5-Sonnet LLMs on a benchmark dataset of 100 tasks, to demonstrate its effectiveness and versatility. The…
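The generate-then-execute flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy feed, the `fake_llm` stub (which stands in for a real model call), and the `answer` helper are all hypothetical names introduced here, and only the standard library is used. The point it shows is that the model sees only the question and produces code, while the feed itself is touched only when that code runs on the server side.

```python
import csv
import io

# Toy stand-in for a GTFS feed: two of the standard files, held in memory.
ROUTES_TXT = """route_id,route_short_name,route_type
R1,10,3
R2,20,3
"""
TRIPS_TXT = """route_id,service_id,trip_id
R1,WK,T1
R1,WK,T2
R2,WK,T3
"""


def load_feed():
    """Parse the CSV text into lists of dicts keyed by GTFS file name."""
    files = {"routes": ROUTES_TXT, "trips": TRIPS_TXT}
    return {
        name: list(csv.DictReader(io.StringIO(text)))
        for name, text in files.items()
    }


def fake_llm(question):
    """Hypothetical stub for the LLM call: returns Python source as a string.

    It never receives the feed, mirroring the prompt-only setup in the abstract.
    """
    return (
        "counts = {}\n"
        "for trip in feed['trips']:\n"
        "    counts[trip['route_id']] = counts.get(trip['route_id'], 0) + 1\n"
        "result = counts\n"
    )


def answer(question):
    """Ask the (stubbed) model for code, then run it where the feed lives."""
    code = fake_llm(question)
    namespace = {"feed": load_feed()}
    # Assumption: a real deployment would sandbox this step; bare exec is
    # used here only to keep the sketch short.
    exec(code, namespace)
    return namespace["result"]


print(answer("How many trips does each route have?"))
# prints {'R1': 2, 'R2': 1}
```

Swapping `fake_llm` for a real model call would leave the rest of the sketch unchanged, which is the design property the abstract emphasizes: the model is guided by prompts alone and needs neither fine-tuning nor the feed contents.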
Taxonomy
Topics: Distributed and Parallel Computing Systems
