A Novel Update Mechanism for Q-Networks Based On Extreme Learning Machines
Callum Wilson, Annalisa Riccardi, Edmondo Minisci

TL;DR
This paper introduces Extreme Q-Learning Machine (EQLM), a new reinforcement learning update method based on Extreme Learning Machines, demonstrating comparable performance to traditional Q-Networks on the cart-pole benchmark.
Contribution
The paper proposes a novel update mechanism for Q-learning using Extreme Learning Machines, expanding the methods available beyond gradient-based neural network training.
Findings
EQLM achieves similar long-term performance to Q-Networks on cart-pole.
EQLM offers a potentially faster training alternative to gradient-based updates.
The approach broadens the exploration of update mechanisms in reinforcement learning.
Abstract
Reinforcement learning is a popular machine learning paradigm that can find near-optimal solutions to complex problems. Most often, these procedures involve function approximation using neural networks, with gradient-based updates to optimise the weights for the problem being considered. While this common approach generally works well, there are other update mechanisms which are largely unexplored in reinforcement learning. One such mechanism is Extreme Learning Machines. These were initially proposed to drastically improve the training speed of neural networks and have since seen many applications. Here we attempt to apply extreme learning machines to a reinforcement learning problem in the same manner as gradient-based updates. This new algorithm is called Extreme Q-Learning Machine (EQLM). We compare its performance to a typical Q-Network on the cart-pole task - a benchmark reinforcement…
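To make the idea in the abstract concrete, the following is a minimal sketch of how an extreme-learning-machine style update could stand in for a gradient step in Q-learning. It is not the authors' EQLM algorithm, whose details are not given on this page. It only illustrates the general ELM recipe: hidden-layer weights are drawn at random and kept fixed, and the output weights are obtained from a regularised least-squares solve against Q-learning targets. The names `ELMQ`, `q_targets`, `reg`, and `gamma`, the tanh activation, and the layer sizes are all assumptions made for illustration.

```python
import numpy as np


class ELMQ:
    """Illustrative ELM-style Q-function (hypothetical, not the paper's EQLM)."""

    def __init__(self, n_inputs, n_hidden, n_actions, reg=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        # Hidden-layer weights are random and never trained (the ELM idea).
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        # Output weights start at zero and are the only fitted parameters.
        self.beta = np.zeros((n_hidden, n_actions))
        self.reg = reg  # assumed ridge regularisation strength

    def hidden(self, states):
        # Fixed random feature map of the states.
        return np.tanh(np.atleast_2d(states) @ self.W + self.b)

    def q_values(self, states):
        return self.hidden(states) @ self.beta

    def fit(self, states, targets):
        # Regularised least squares: (H^T H + reg * I) beta = H^T T.
        H = self.hidden(states)
        A = H.T @ H + self.reg * np.eye(H.shape[1])
        self.beta = np.linalg.solve(A, H.T @ targets)


def q_targets(model, states, actions, rewards, next_states, dones, gamma=0.99):
    """Standard one-step Q-learning targets for a batch of transitions."""
    targets = model.q_values(states).copy()
    best_next = model.q_values(next_states).max(axis=1)
    # Only the entry for the action actually taken is replaced by the bootstrapped target.
    rows = np.arange(len(actions))
    targets[rows, actions] = rewards + gamma * best_next * (1.0 - dones)
    return targets
```

A batch of transitions would be passed to `q_targets` and the result to `fit`, replacing the gradient step of a conventional Q-Network. For cart-pole, `n_inputs=4` and `n_actions=2` match the environment's four state variables and two actions. The single linear solve is what makes this kind of update potentially cheaper than iterative gradient descent.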
Taxonomy
Topics: Machine Learning and ELM · Advanced Memory and Neural Computing · Adaptive Dynamic Programming Control
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Q-Learning
