Recursive Backwards Q-Learning in Deterministic Environments

Jan Diekhoff, Jörn Fischer

arXiv:2404.15822 · cs.AI · April 25, 2024

Open Access

TL;DR

This paper introduces Recursive Backwards Q-Learning (RBQL), a model-based method that efficiently solves deterministic problems by propagating values backwards from terminal states, outperforming standard Q-learning in maze navigation.

Contribution

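The paper presents RBQL, a recursive, model-based variant of Q-learning for deterministic environments. The agent builds a model of the environment while exploring and uses it to evaluate each state to its optimal value without a lengthy learning process.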

Findings

01 RBQL outperforms standard Q-learning in maze shortest path tasks.

02 RBQL efficiently propagates value information backwards from terminal states.

03 The method significantly reduces learning time in deterministic problems.

Abstract

Reinforcement learning is a popular method of finding optimal solutions to complex problems. Algorithms like Q-learning excel at learning to solve stochastic problems without a model of their environment. However, they take longer to solve deterministic problems than is necessary. Q-learning can be improved to better solve deterministic problems by introducing such a model-based approach. This paper introduces the recursive backwards Q-learning (RBQL) agent, which explores and builds a model of the environment. After reaching a terminal state, it recursively propagates its value backwards through this model. This lets each state be evaluated to its optimal value without a lengthy learning process. In the example of finding the shortest path through a maze, this agent greatly outperforms a regular Q-learning agent.
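
The backward propagation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the update rule or the traversal order, so the sketch assumes the deterministic Q-learning update with learning rate 1, Q(s, a) = r + gamma * max Q(s', a'), and swaps the recursion for an iterative breadth-first traversal from the terminal state. The names `propagate_backwards`, `greedy_path`, the discount factor 0.9, and the four-state toy maze are all hypothetical. The sketch also assumes every non-terminal step carries the same reward, so the first time a state is reached from the terminal side already gives its best value.

```python
from collections import deque


def propagate_backwards(model, terminal, gamma=0.9):
    """Spread values from `terminal` back through a learned deterministic model.

    `model` maps state -> {action: (next_state, reward)} for every transition
    the agent has observed. Returns a dict mapping (state, action) -> Q-value.
    """
    # Reverse index: for each state, the observed transitions leading into it.
    predecessors = {}
    for state, actions in model.items():
        for action, (next_state, reward) in actions.items():
            predecessors.setdefault(next_state, []).append((state, action, reward))

    q = {}
    value = {terminal: 0.0}  # no future return after the terminal state
    queue = deque([terminal])
    while queue:
        current = queue.popleft()
        for state, action, reward in predecessors.get(current, []):
            # Assumed rule: deterministic update with learning rate 1.
            q[(state, action)] = reward + gamma * value[current]
            if state not in value:
                # Assumption: uniform step rewards, so the first discovery
                # in breadth-first order is the state's best value.
                value[state] = q[(state, action)]
                queue.append(state)
    return q


def greedy_path(model, q, start, terminal):
    """Follow the highest-valued action from `start` until `terminal`."""
    path = [start]
    while path[-1] != terminal:
        state = path[-1]
        action = max(model[state], key=lambda a: q.get((state, a), float("-inf")))
        path.append(model[state][action][0])
    return path


# Hypothetical toy maze: A connects to B and C, B connects to the goal G.
model = {
    "A": {"right": ("B", 0.0), "down": ("C", 0.0)},
    "B": {"left": ("A", 0.0), "down": ("G", 1.0)},
    "C": {"up": ("A", 0.0)},
}
q = propagate_backwards(model, "G")
path = greedy_path(model, q, "A", "G")
```

After a single backward pass every observed state-action pair has its final value, so the greedy path from A goes through B to G with no further training episodes. A standard Q-learning agent would need repeated episodes for the goal reward to reach A, which is the gap the abstract describes.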

Taxonomy

Topics: Cognitive Science and Education Research · Experimental Learning in Engineering · Fault Detection and Control Systems

Methods: Q-Learning