The QLBS Q-Learner Goes NuQLear: Fitted Q Iteration, Inverse RL, and Option Portfolios

Igor Halperin

arXiv:1801.06077 · q-fin.CP · January 19, 2018

TL;DR

This paper extends the QLBS model by exploring fitted Q iteration, inverse reinforcement learning, and portfolio option pricing, offering data-driven solutions to option hedging, pricing, and the volatility smile problem.

Contribution

It introduces a comprehensive analysis of NuQLear topics within the QLBS framework, including performance benchmarking, IRL adaptation, and portfolio pricing methods.

Findings

1. Fitted Q Iteration performs well when benchmarked against the DP solution and the BSM model.

2. Inverse RL can infer trader rewards from observed prices and actions.

3. The model enables data-driven pricing of option portfolios.

Abstract

The QLBS model is a discrete-time option hedging and pricing model that is based on Dynamic Programming (DP) and Reinforcement Learning (RL). It combines the famous Q-Learning method for RL with the Black-Scholes(-Merton) model's idea of reducing the problem of option pricing and hedging to the problem of optimal rebalancing of a dynamic replicating portfolio for the option, which is made of a stock and cash. Here we expand on several NuQLear (Numerical Q-Learning) topics with the QLBS model. First, we investigate the performance of Fitted Q Iteration for an RL (data-driven) solution to the model, and benchmark it versus a DP (model-based) solution, as well as versus the BSM model. Second, we develop an Inverse Reinforcement Learning (IRL) setting for the model, where we only observe prices and actions (re-hedges) taken by a trader, but not rewards. Third, we outline how the QLBS model…
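The replicating-portfolio idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the QLBS Q-learner itself: it is a plain BSM delta-hedging baseline, assuming a European call, a constant rate and volatility, and a simulated geometric Brownian motion path whose drift is set equal to the rate for simplicity. All function and parameter names (`bsm_call`, `replication_error`, `simulate_gbm`) are hypothetical and chosen for illustration only.

```python
import math
import random


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_call(spot, strike, rate, sigma, tau):
    """Return (price, delta) of a European call under BSM with time to expiry tau."""
    if tau <= 0.0:
        return max(spot - strike, 0.0), (1.0 if spot > strike else 0.0)
    vol = sigma * math.sqrt(tau)
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * tau) / vol
    d2 = d1 - vol
    price = spot * norm_cdf(d1) - strike * math.exp(-rate * tau) * norm_cdf(d2)
    return price, norm_cdf(d1)


def simulate_gbm(spot, rate, sigma, maturity, n_steps, rng):
    """Simulate one stock path; drift equals the rate (an assumption of this sketch)."""
    dt = maturity / n_steps
    path = [spot]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        growth = (rate - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(growth))
    return path


def replication_error(path, strike, rate, sigma, maturity):
    """Terminal value of a stock-plus-cash hedge portfolio minus the call payoff."""
    n_steps = len(path) - 1
    dt = maturity / n_steps
    # Start with the BSM price, split into a stock position and cash.
    price, delta = bsm_call(path[0], strike, rate, sigma, maturity)
    cash = price - delta * path[0]
    for step in range(1, n_steps + 1):
        cash *= math.exp(rate * dt)  # cash accrues interest over one step
        if step < n_steps:
            tau = maturity - step * dt
            _, new_delta = bsm_call(path[step], strike, rate, sigma, tau)
            # Self-financing rebalance: stock trades are paid for from cash.
            cash -= (new_delta - delta) * path[step]
            delta = new_delta
    return delta * path[-1] + cash - max(path[-1] - strike, 0.0)


if __name__ == "__main__":
    rng = random.Random(0)
    path = simulate_gbm(100.0, 0.0, 0.2, 1.0, 250, rng)
    print(replication_error(path, 100.0, 0.0, 0.2, 1.0))
```

With discrete rebalancing the replication error is generally nonzero, which is the residual risk that a discrete-time hedging model has to account for.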

Taxonomy

Methods: Q-Learning