Do Math Reasoning LLMs Help Predict the Impact of Public Transit Events?
Bowen Fang, Ruijian Zha, Xuan Di

TL;DR
This paper introduces a novel reinforcement learning approach with shaped rewards for predicting public transit incident durations from noisy, real-world text alerts, outperforming traditional models and math-reasoning LLMs.
Contribution
It adapts Reinforcement Learning from Verifiable Rewards (RLVR) with a tolerance-based reward to improve LLM forecasting on noisy, continuous real-world transit data, a setting that existing RLVR methods built around binary correctness do not address.
Findings
Shaped reward significantly improves model stability and performance.
Instruction-tuned LLMs outperform math-reasoning models on real-world data.
RLVR achieves a 35% relative improvement in accuracy over baselines.
Abstract
Predicting public transit incident duration from unstructured text alerts is a critical but challenging task. Addressing the domain sparsity of transit operations with standard Supervised Fine-Tuning (SFT) is difficult, as the task involves noisy, continuous labels and lacks reliable expert demonstrations for reasoning. While Reinforcement Learning from Verifiable Rewards (RLVR) excels at tasks with binary correctness, like mathematics, its applicability to noisy, continuous forecasting is an open question. This work, to our knowledge, is the first to bridge the gap between RLVR training of LLMs and the critical, real-world forecasting challenges in public transit operations. We adapt RLVR to this task by introducing a tolerance-based, shaped reward function that grants partial credit within a continuous error margin, rather than demanding a single correct answer. We systematically…
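To make the idea of a tolerance-based, shaped reward concrete, here is a minimal sketch that contrasts it with an exact-match reward. The abstract does not give the paper's reward formula, so the linear decay, the function names, and the tolerance values (5 and 30 minutes) are all assumptions chosen purely for illustration.

```python
def binary_reward(pred_minutes: float, true_minutes: float) -> float:
    """Exact-match reward in the style of RLVR for math: 1 only on an exact hit."""
    return 1.0 if pred_minutes == true_minutes else 0.0


def shaped_reward(
    pred_minutes: float,
    true_minutes: float,
    full_credit_tol: float = 5.0,   # hypothetical: errors up to this earn full reward
    zero_credit_tol: float = 30.0,  # hypothetical: errors at or beyond this earn nothing
) -> float:
    """Tolerance-based reward with partial credit.

    Illustrative shape only (linear decay between the two tolerances);
    this is not the paper's exact formula.
    """
    if not 0 <= full_credit_tol < zero_credit_tol:
        raise ValueError("require 0 <= full_credit_tol < zero_credit_tol")
    error = abs(pred_minutes - true_minutes)
    if error <= full_credit_tol:
        return 1.0
    if error >= zero_credit_tol:
        return 0.0
    # Linearly interpolate from 1 (at full_credit_tol) down to 0 (at zero_credit_tol).
    return (zero_credit_tol - error) / (zero_credit_tol - full_credit_tol)


if __name__ == "__main__":
    true_duration = 40.0  # made-up label, in minutes
    for pred in (40.0, 42.0, 57.5, 100.0):
        print(pred, binary_reward(pred, true_duration), shaped_reward(pred, true_duration))
```

Under these assumed settings, a prediction that is two minutes off earns nothing from the exact-match reward but full credit from the shaped one, and a prediction 17.5 minutes off earns half credit. That graded signal is the kind of partial credit the abstract describes for noisy, continuous labels.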
Taxonomy
Topics: Traffic Prediction and Management Techniques · Topic Modeling · Multimodal Machine Learning Applications
