Bellman Meets Hawkes: Model-Based Reinforcement Learning via Temporal Point Processes

Chao Qu; Xiaoyu Tan; Siqiao Xue; Xiaoming Shi; James Zhang; Hongyuan Mei

arXiv:2201.12569·cs.LG·December 29, 2022

TL;DR

This paper introduces a novel model-based reinforcement learning framework that uses Hawkes processes to handle asynchronous stochastic events in continuous time, optimizing intervention policies for long-term rewards.

Contribution

It develops a new approach integrating Hawkes processes into the Bellman equation for continuous-time decision making, addressing a gap in RL for event-driven environments.

Findings

01 · Outperforms existing methods in synthetic simulations.

02 · Effective in real-world social media and finance scenarios.

03 · Demonstrates improved long-term reward optimization.

Abstract

We consider a sequential decision-making problem in which the agent faces an environment characterized by stochastic discrete events and seeks an optimal intervention policy that maximizes its long-term reward. This problem arises widely in social media, finance, and health informatics but is rarely investigated in conventional reinforcement learning research. To this end, we present a novel model-based reinforcement learning framework in which the agent's actions and observations are asynchronous stochastic discrete events occurring in continuous time. We model the dynamics of the environment by a Hawkes process with an external intervention control term and develop an algorithm that embeds this process in the Bellman equation, which guides the direction of the value gradient. We demonstrate the superiority of our method in both a synthetic simulator and real-world…
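To make the phrase "Hawkes process with external intervention control term" more concrete, here is a minimal Python sketch of a univariate conditional intensity. It is an illustration under stated assumptions, not the paper's model: the exponential kernel, the additive form of the control, the clamp at zero, and all names (`hawkes_intensity`, `mu`, `alpha`, `beta`, `control`) are hypothetical choices made for this example.

```python
import math


def hawkes_intensity(t, event_times, mu=0.5, alpha=0.8, beta=1.0, control=0.0):
    """Conditional intensity of a univariate Hawkes process at time t.

    Assumed form (illustrative only, not taken from the paper):
        lambda(t) = max(0, mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i)) + control)

    mu      -- baseline event rate
    alpha   -- jump in intensity caused by each past event
    beta    -- exponential decay rate of that excitation
    control -- hypothetical additive intervention chosen by the agent
    """
    # Each past event strictly before t contributes a decaying bump.
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in event_times if t_i < t
    )
    # Clamp at zero so a strongly negative control cannot yield a negative rate.
    return max(0.0, mu + excitation + control)


if __name__ == "__main__":
    history = [0.0, 0.4, 0.9]
    # Compare the event rate at t = 1.0 with and without a damping intervention.
    print("no intervention:  ", hawkes_intensity(1.0, history))
    print("with intervention:", hawkes_intensity(1.0, history, control=-0.3))
```

In this sketch, past events raise the future event rate (self-excitation), while the agent's intervention shifts that rate up or down. A reinforcement learning agent in such a setting would choose the control to shape future event arrivals, which is the kind of dynamics the abstract describes embedding in the Bellman equation.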

Code & Models

Repositories

williambug/event_driven_rl · tf · Official

Videos

Bellman Meets Hawkes: Model-Based Reinforcement Learning via Temporal Point Processes · underline

Taxonomy

Topics: Point processes and geometric inequalities · Diffusion and Search Dynamics