Urban-Focused Multi-Task Offline Reinforcement Learning with Contrastive Data Sharing
Xinbo Zhao, Yingxue Zhang, Xin Zhang, Yu Yang, Yiqun Xie, Yanhua Li, Jun Luo

TL;DR
This paper introduces MODA, a multi-task offline reinforcement learning framework that leverages contrastive data sharing and a robust model-based approach to improve urban decision-making from limited and heterogeneous data.
Contribution
MODA is the first to combine contrastive data sharing with a robust model-based multi-task offline RL algorithm for urban applications.
Findings
MODA outperforms state-of-the-art baselines in real-world urban tasks.
Contrastive data sharing enhances data efficiency and task performance.
The robust MDP construction improves policy stability and generalization.
Abstract
Enhancing diverse human decision-making processes in an urban environment is a critical issue across various applications, including ride-sharing vehicle dispatching, public transportation management, and autonomous driving. Offline reinforcement learning (RL) is a promising approach to learn and optimize human urban strategies (or policies) from pre-collected human-generated spatial-temporal urban data. However, standard offline RL faces two significant challenges: (1) data scarcity and data heterogeneity, and (2) distributional shift. In this paper, we introduce MODA -- a Multi-Task Offline Reinforcement Learning with Contrastive Data Sharing approach. MODA addresses the challenges of data scarcity and heterogeneity in a multi-task urban setting through Contrastive Data Sharing among tasks. This technique involves extracting latent representations of human behaviors by contrasting…
Taxonomy
Topics: Traffic control and management · Transportation and Mobility Innovations
