CostNet: An End-to-End Framework for Goal-Directed Reinforcement Learning
Per-Arne Andersen, Morten Goodwin, Ole-Christoffer Granmo

TL;DR
CostNet is a goal-directed reinforcement learning algorithm that predicts state distances and uses them as intrinsic rewards, achieving performance comparable to model-free RL with improved sample efficiency.
Contribution
The paper presents a new distance-predicting algorithm for goal-directed RL that enhances sample efficiency by using learned state distances as intrinsic rewards.
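To make the idea of using a state distance as an intrinsic reward concrete, here is a minimal sketch. It is not CostNet's implementation: the summary does not say how distances are learned or how the reward is computed from them. The sketch assumes a grid world, uses Manhattan distance as a stand-in for a learned distance predictor, and rewards the reduction in predicted distance to the goal. The names `manhattan_distance` and `intrinsic_reward` are hypothetical.

```python
from typing import Callable, Tuple

State = Tuple[int, int]


def manhattan_distance(state: State, goal: State) -> float:
    """Stand-in for a learned distance predictor (assumption, not CostNet's model)."""
    return float(abs(state[0] - goal[0]) + abs(state[1] - goal[1]))


def intrinsic_reward(
    state: State,
    next_state: State,
    goal: State,
    distance_fn: Callable[[State, State], float] = manhattan_distance,
) -> float:
    """Return the reduction in predicted distance to the goal after one step.

    Positive when the step moves the agent closer to the goal, negative when
    it moves away, and zero when the predicted distance is unchanged.
    """
    return distance_fn(state, goal) - distance_fn(next_state, goal)


if __name__ == "__main__":
    goal = (2, 2)
    print(intrinsic_reward((0, 0), (0, 1), goal))  # step toward the goal
    print(intrinsic_reward((0, 1), (0, 0), goal))  # step away from the goal
```

In a learned setting, `distance_fn` would be replaced by a trained predictor. Because the signal is available at every step, it can guide the agent even when the environment's own reward is sparse.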
Findings
Performs comparably to model-free RL in test environments.
Significantly improves sample efficiency over traditional methods.
Effective in complex environments with challenging reward functions.
Abstract
Reinforcement Learning (RL) is a general framework concerned with an agent that seeks to maximize rewards in an environment. Learning typically happens through trial and error using exploration methods such as epsilon-greedy. There are two approaches, model-based and model-free reinforcement learning, that show concrete results in several disciplines. Model-based RL learns a model of the environment and uses it to learn the policy, while model-free approaches are fully explorative and exploitative without considering the underlying environment dynamics. Model-free RL works conceptually well in simulated environments, and empirical evidence suggests that trial and error leads to near-optimal behavior with enough training. On the other hand, model-based RL aims to be sample efficient, and studies show that it requires far less training in the real environment to learn a good policy. A…
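The abstract mentions epsilon-greedy as a typical exploration method. As a minimal sketch of that general technique (not code from the paper), the function below picks a random action with probability `epsilon` and otherwise picks the action with the highest estimated value. The name `epsilon_greedy` and its parameters are illustrative assumptions.

```python
import random
from typing import Optional, Sequence


def epsilon_greedy(
    q_values: Sequence[float],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Select an action index from estimated action values.

    With probability `epsilon` a uniformly random action is chosen (explore);
    otherwise the action with the highest estimated value is chosen (exploit).
    """
    if rng is None:
        rng = random.Random()
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the largest value (first one on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    q = [0.1, 0.9, 0.3]
    print(epsilon_greedy(q, epsilon=0.0))  # always greedy when epsilon is 0
    print(epsilon_greedy(q, epsilon=0.2, rng=random.Random(0)))
```

Setting `epsilon` to 0 gives a purely greedy policy, and setting it to 1 gives a purely random one. Training commonly starts with a larger value and decays it over time.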
