Evaluating Retrieval-Augmented Generation Strategies for Large Language Models in Travel Mode Choice Prediction

Yiming Xu; Junfeng Jiao

arXiv:2508.17527·cs.AI·August 26, 2025

TL;DR

This paper investigates how Retrieval-Augmented Generation (RAG) can improve the accuracy and generalization of LLM-based travel mode choice prediction by grounding predictions in empirical data, and reports that the resulting models outperform traditional statistical and machine learning models.

Contribution

The paper introduces a modular framework for integrating RAG into LLM-based travel mode choice prediction and evaluates four retrieval strategies across three LLM architectures (OpenAI GPT-4o, o4-mini, and o3).

Findings

01. RAG significantly improves prediction accuracy.

02. GPT-4o with balanced retrieval and re-ranking achieves 80.8% accuracy.

03. LLMs outperform traditional statistical and machine learning models.

Abstract

Accurately predicting travel mode choice is essential for effective transportation planning, yet traditional statistical and machine learning models are constrained by rigid assumptions, limited contextual reasoning, and reduced generalizability. This study explores the potential of Large Language Models (LLMs) as a more flexible and context-aware approach to travel mode choice prediction, enhanced by Retrieval-Augmented Generation (RAG) to ground predictions in empirical data. We develop a modular framework for integrating RAG into LLM-based travel mode choice prediction and evaluate four retrieval strategies: basic RAG, RAG with balanced retrieval, RAG with a cross-encoder for re-ranking, and RAG with balanced retrieval and cross-encoder for re-ranking. These strategies are tested across three LLM architectures (OpenAI GPT-4o, o4-mini, and o3) to examine the interaction between model…
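
The retrieval strategies named in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the record layout (`vec`, `mode`), the function names, the toy data, and in particular the reading of "balanced retrieval" as drawing the same number of examples from each travel mode, which the abstract does not define. The cross-encoder is replaced by a placeholder scoring function (`placeholder_score`); a real cross-encoder would score each (query, record) text pair jointly.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def basic_retrieve(query, records, k):
    """Basic RAG retrieval: top-k records by similarity, ignoring labels."""
    ranked = sorted(records, key=lambda r: cosine(query, r["vec"]), reverse=True)
    return ranked[:k]


def balanced_retrieve(query, records, per_mode):
    """Assumed meaning of 'balanced': top `per_mode` records from each mode."""
    by_mode = defaultdict(list)
    for r in records:
        by_mode[r["mode"]].append(r)
    picked = []
    for mode in sorted(by_mode):  # sorted for a deterministic order
        picked.extend(basic_retrieve(query, by_mode[mode], per_mode))
    return picked


def rerank(query, candidates, score_fn, k):
    """Re-rank candidates with a second scorer and keep the best k."""
    ranked = sorted(candidates, key=lambda r: score_fn(query, r), reverse=True)
    return ranked[:k]


def placeholder_score(query, record):
    """Stand-in for a cross-encoder: negative Euclidean distance (assumption)."""
    return -math.dist(query, record["vec"])


# Hypothetical toy survey records: an embedding plus the observed travel mode.
records = [
    {"id": 1, "vec": [1.0, 0.0], "mode": "car"},
    {"id": 2, "vec": [0.9, 0.1], "mode": "car"},
    {"id": 3, "vec": [0.8, 0.2], "mode": "car"},
    {"id": 4, "vec": [0.0, 1.0], "mode": "transit"},
    {"id": 5, "vec": [0.1, 0.9], "mode": "walk"},
]
query = [1.0, 0.0]

basic = basic_retrieve(query, records, k=3)
balanced = balanced_retrieve(query, records, per_mode=1)
reranked = rerank(query, balanced, placeholder_score, k=2)

print([r["mode"] for r in basic])     # every retrieved example is "car"
print([r["mode"] for r in balanced])  # one example per mode
print([r["id"] for r in reranked])
```

On this toy data, plain top-k retrieval returns only car trips because that mode dominates the neighbourhood of the query, whereas the balanced variant returns one example per mode. That imbalance is the kind of skew a balanced strategy would be meant to counter, under the assumption stated above.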
