Choice-Model-Assisted Q-learning for Delayed-Feedback Revenue Management

Owen Shen; Patrick Jaillet

arXiv:2602.02283 · cs.LG · February 3, 2026

TL;DR

This paper introduces choice-model-assisted reinforcement learning for revenue management with delayed feedback. It proves that tabular Q-learning with model-imputed targets converges to a neighborhood of the optimal Q-function and evaluates the method in a hotel-booking simulator, where its benefits and limitations depend on the accuracy of the choice model.

Contribution

It proposes Q-learning assisted by a fixed, calibrated discrete choice model for revenue management with delayed feedback, and provides a convergence guarantee together with an empirical evaluation in a simulator calibrated from 61,619 hotel bookings.

Findings

01

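Tabular Q-learning with model-imputed targets converges to an $O(\varepsilon/(1-\gamma))$ neighborhood of the optimal Q-function, where $\varepsilon$ summarizes partial-model error.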

02

Shows robustness to parameter shifts in simulations.

03

Degrades under model misspecification, indicating bias risks.

Abstract

We study reinforcement learning for revenue management with delayed feedback, where a substantial fraction of value is determined by customer cancellations and modifications observed days after booking. We propose choice-model-assisted RL: a calibrated discrete choice model is used as a fixed partial world model to impute the delayed component of the learning target at decision time. In the fixed-model deployment regime, we prove that tabular Q-learning with model-imputed targets converges to an $O(\varepsilon/(1-\gamma))$ neighborhood of the optimal Q-function, where $\varepsilon$ summarizes partial-model error, with an additional $O(t^{-1/2})$ sampling term. Experiments in a simulator calibrated from 61,619 hotel bookings (1,088 independent runs) show: (i) no statistically detectable difference from a maturity-buffer DQN baseline in stationary settings; (ii) positive…
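The idea of imputing the delayed part of the learning target can be illustrated with a minimal tabular sketch. Everything here is an assumption made for illustration and is not taken from the paper: the three price levels, the countdown state space, the logit forms of `book_prob` and `cancel_prob_model`, and all numeric constants are hypothetical. The sketch only shows the mechanism the abstract describes, where a fixed model's expectation stands in for the not-yet-observed cancellation outcome when the Q-learning target is formed.

```python
import math
import random

# All constants below are hypothetical and chosen only for illustration.
PRICES = [80.0, 100.0, 120.0]  # actions: candidate price levels
N_STATES = 4                   # states: days remaining; state 0 is terminal
GAMMA = 0.9                    # discount factor


def book_prob(price):
    """Hypothetical environment: logit probability that a customer books."""
    return 1.0 / (1.0 + math.exp((price - 100.0) / 15.0))


def cancel_prob_model(price):
    """Hypothetical fixed choice model: probability a booking is later cancelled."""
    return 1.0 / (1.0 + math.exp(-(price - 110.0) / 20.0))


def imputed_reward(price, booked):
    """Reward with the delayed cancellation outcome replaced by the model's expectation."""
    if not booked:
        return 0.0
    return price * (1.0 - cancel_prob_model(price))


def train(episodes=5000, alpha=0.1, explore=0.1, seed=0):
    """Tabular Q-learning whose targets use the model-imputed reward."""
    rng = random.Random(seed)
    q = [[0.0] * len(PRICES) for _ in range(N_STATES)]
    for _ in range(episodes):
        s = N_STATES - 1
        while s > 0:
            # Epsilon-greedy action selection over price levels.
            if rng.random() < explore:
                a = rng.randrange(len(PRICES))
            else:
                a = max(range(len(PRICES)), key=lambda i: q[s][i])
            booked = rng.random() < book_prob(PRICES[a])
            # The true cancellation is unknown at this point, so the target
            # uses the fixed model's expected retained revenue instead.
            r = imputed_reward(PRICES[a], booked)
            s_next = s - 1
            bootstrap = 0.0 if s_next == 0 else max(q[s_next])
            q[s][a] += alpha * (r + GAMMA * bootstrap - q[s][a])
            s = s_next
    return q


if __name__ == "__main__":
    for state, row in enumerate(train()):
        print(state, [round(v, 1) for v in row])
```

In this sketch, any gap between `cancel_prob_model` and the true cancellation behaviour biases every target by the same mechanism, which is the kind of partial-model error that the $\varepsilon$ term in the stated bound summarizes.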

Taxonomy

Topics: Supply Chain and Inventory Management · Advanced Queuing Theory Analysis · Consumer Market Behavior and Pricing