Inverse Reinforcement Learning for Marketing
Igor Halperin

TL;DR
This paper introduces an IRL-based method for modeling consumer preferences in marketing, offering a tractable alternative to traditional utility estimation and highlighting the impact of observational noise on demand analysis.
Contribution
It develops a maximum entropy IRL approach for dynamic consumer demand, providing a low-dimensional convex optimization framework for parameter estimation.
Findings
IRL can effectively model consumer preferences from observed behavior.
Observational noise can mimic consumer heterogeneity in demand data.
The proposed method reduces parameter estimation in dynamic demand models to a low-dimensional convex optimization problem.
Abstract
Learning customer preferences from observed behaviour is an important topic in the marketing literature. Structural models typically model forward-looking customers or firms as utility-maximizing agents whose utility is estimated using methods of Stochastic Optimal Control. We suggest an alternative approach to study dynamic consumer demand, based on Inverse Reinforcement Learning (IRL). We develop a version of Maximum Entropy IRL that leads to a highly tractable model formulation, in which the search for optimal model parameters amounts to low-dimensional convex optimization. Using simulations of consumer demand, we show that observational noise for identical customers can easily be confused with apparent consumer heterogeneity.
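The two claims in the abstract (convex estimation, and noise that looks like heterogeneity) can be illustrated with a minimal sketch. It is not the paper's model: it assumes a static, one-step maximum-entropy (softmax) choice over a hypothetical grid of demand levels, with a hypothetical quadratic reward `theta[0]*x - theta[1]*x**2`. All names, the grid, and the parameter values are illustrative assumptions. Under these assumptions the negative log-likelihood is convex in `theta`, so plain gradient descent is enough.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D array of rewards.
    z = np.exp(logits - logits.max())
    return z / z.sum()

def nll_gradient(theta, features, choices):
    # Gradient of the mean negative log-likelihood of a softmax policy:
    # model-expected features minus empirical feature means.
    policy = softmax(features @ theta)
    return policy @ features - features[choices].mean(axis=0)

def fit_theta(features, choices, lr=1.0, n_steps=500):
    # Gradient descent on a convex objective, started from zero.
    theta = np.zeros(features.shape[1])
    for _ in range(n_steps):
        theta -= lr * nll_gradient(theta, features, choices)
    return theta

# Hypothetical setup (assumption, not from the paper): 11 demand levels
# scaled to [-1, 1] and a reward that is linear in features (x, -x^2).
levels = np.linspace(-1.0, 1.0, 11)
features = np.column_stack([levels, -levels**2])
theta_true = np.array([1.0, 2.0])

# Simulate identical customers: every customer shares theta_true.
rng = np.random.default_rng(0)
n_customers, n_obs = 100, 50
choices = rng.choice(len(levels), size=(n_customers, n_obs),
                     p=softmax(features @ theta_true))

# Pooled fit uses all observations; per-customer fits use 50 each.
theta_pooled = fit_theta(features, choices.ravel())
theta_each = np.array([fit_theta(features, c) for c in choices])

print("pooled estimate:", theta_pooled)
print("per-customer spread (std):", theta_each.std(axis=0))
```

In this sketch the pooled estimate should land close to `theta_true`, while the per-customer estimates scatter noticeably even though all simulated customers are identical. The scatter comes only from sampling noise in short individual histories, which is the kind of spread that could be misread as heterogeneity.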
