Fitted Q-Learning for Relational Domains
Srijita Das, Sriraam Natarajan, Kaushik Roy, Ronald Parr, Kristian Kersting

TL;DR
This paper introduces relational fitted Q-learning algorithms for approximate dynamic programming in relational domains, utilizing gradient boosting to efficiently learn value functions without domain models.
Contribution
It is the first to develop relational fitted Q-learning algorithms, applying gradient boosting to perform Bellman updates in relational settings.
Findings
Performs reasonably well on standard domains
Requires fewer training trajectories
Does not rely on domain models
Abstract
We consider the problem of Approximate Dynamic Programming in relational domains. Inspired by the success of fitted Q-learning methods in propositional settings, we develop the first relational fitted Q-learning algorithms by representing the value function and Bellman residuals. When we fit the Q-functions, we show how the two steps of the Bellman operator, the application step and the projection step, can be performed using a gradient-boosting technique. Our proposed framework performs reasonably well on standard domains without using domain models and while using fewer training trajectories.
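To make the application and projection steps concrete, the following is a minimal propositional sketch of fitted Q-learning with a boosting-style update. It is not the authors' relational algorithm: an ordinary regression tree over (state, action) pairs stands in for a relational learner, and the toy chain environment, the names step, fit_tree, and fitted_q, and the choice of one boosting stage per iteration are all assumptions made for illustration.

```python
GAMMA = 0.9
N_STATES = 5            # hypothetical chain of states 0..4; state 4 is the terminal goal
ACTIONS = (-1, 1)       # move left / move right


def step(s, a):
    """Toy deterministic dynamics, used only to generate sample transitions."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    done = s2 == N_STATES - 1
    return (1.0 if done else 0.0), s2, done


def fit_tree(X, y, depth):
    """Greedy least-squares regression tree; returns a predictor x -> float."""
    mean = sum(y) / len(y)
    if depth == 0 or len(set(y)) == 1:
        return lambda x: mean
    best = None
    for j in range(len(X[0])):
        # Skip the largest value so both sides of the split are non-empty.
        for t in sorted({x[j] for x in X})[:-1]:
            left = [i for i, x in enumerate(X) if x[j] <= t]
            right = [i for i, x in enumerate(X) if x[j] > t]
            sse = 0.0
            for side in (left, right):
                m = sum(y[i] for i in side) / len(side)
                sse += sum((y[i] - m) ** 2 for i in side)
            if best is None or sse < best[0]:
                best = (sse, j, t, left, right)
    if best is None:    # all feature vectors are identical, nothing to split on
        return lambda x: mean
    _, j, t, left, right = best
    lo = fit_tree([X[i] for i in left], [y[i] for i in left], depth - 1)
    hi = fit_tree([X[i] for i in right], [y[i] for i in right], depth - 1)
    return lambda x: lo(x) if x[j] <= t else hi(x)


def fitted_q(transitions, n_iters=100, lr=0.5, depth=8):
    """Model-free fitted Q-learning; Q is the additive model lr * sum(trees)."""
    trees = []

    def q(s, a):
        return lr * sum(tree((s, a)) for tree in trees)

    X = [(s, a) for s, a, _, _, _ in transitions]
    for _ in range(n_iters):
        # Application step: sampled Bellman backup for every stored transition.
        targets = [r if done else r + GAMMA * max(q(s2, b) for b in ACTIONS)
                   for _, _, r, s2, done in transitions]
        # Bellman residuals are the negative gradient of the squared error in Q.
        residuals = [tgt - q(s, a) for (s, a), tgt in zip(X, targets)]
        # Projection step: fit one tree to the residuals and add it to Q.
        trees.append(fit_tree(X, residuals, depth))
    return q


# One sampled transition (s, a, r, s2, done) per non-terminal state-action pair.
transitions = [(s, a, *step(s, a)) for s in range(N_STATES - 1) for a in ACTIONS]
q = fitted_q(transitions)
policy = {s: max(ACTIONS, key=lambda a: q(s, a)) for s in range(N_STATES - 1)}
print(policy)
```

The learner only sees sampled transitions and never queries a transition model, which mirrors the model-free setting described in the abstract. With the deep trees used here the projection is exact on the eight samples, so the update reduces to damped value iteration and the greedy policy moves right in every state; lowering depth makes the projection approximate, which is closer to how a boosted learner would generalize across states.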
Taxonomy
Topics: Reinforcement Learning in Robotics · Bayesian Modeling and Causal Inference · AI-based Problem Solving and Planning
Methods: Q-Learning
