Follow your Nose: Using General Value Functions for Directed Exploration in Reinforcement Learning
Durgesh Kalwar, Omkar Shelke, Somjit Nath, Hardik Meisheri, Harshad Khadilkar

TL;DR
This paper combines General Value Functions (GVFs) with a directed exploration strategy to improve sample efficiency in environments with large state spaces and sparse rewards, and reports performance advantages on navigation tasks with varying grid sizes.
Contribution
It proposes learning GVFs as auxiliary tasks and sampling actions from them to direct exploration, so that large scale and reward sparsity are addressed at the same time.
Findings
Outperforms baseline methods in navigation tasks
Enhances sample efficiency in large state spaces
Shows robustness across different grid sizes
Abstract
Improving sample efficiency is a key challenge in reinforcement learning, especially in environments with large state spaces and sparse rewards. In the literature, this is addressed either through the use of auxiliary tasks (subgoals) or through clever exploration strategies. Exploration methods have been used to sample better trajectories in large environments, while auxiliary tasks have been incorporated where the reward is sparse. However, few studies have attempted to tackle both large scale and reward sparsity at the same time. This paper explores the idea of combining exploration with auxiliary task learning using General Value Functions (GVFs) and a directed exploration strategy. We present a way to learn value functions which can be used to sample actions and provide directed exploration. Experiments on navigation tasks with varying grid sizes demonstrate the performance advantages…
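The abstract's idea of learning value functions that can then be used to sample actions can be illustrated with a small sketch. This is not the authors' algorithm or code. It assumes a hypothetical 6-cell corridor, a single hypothetical subgoal cell, a cumulant of 1 on reaching that cell, a greedy target policy for the GVF, and softmax sampling over the learned values. All names (`train_gvf`, `directed_action_probs`, `SUBGOAL`) are made up for illustration.

```python
import numpy as np

N_STATES, N_ACTIONS = 6, 2   # hypothetical corridor; actions: 0 = left, 1 = right
SUBGOAL = N_STATES - 1       # hypothetical auxiliary target cell (assumption)

def step(s, a):
    """Deterministic corridor dynamics with walls at both ends."""
    return max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)

def train_gvf(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Learn a tabular GVF with off-policy TD(0) under a random behaviour policy.

    Cumulant: 1 when the subgoal is reached, else 0.
    Continuation: 0 at the subgoal, gamma elsewhere.
    Target policy: greedy with respect to the GVF itself (an assumption).
    """
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(episodes):
        s = 0
        for _ in range(50):                      # cap on episode length
            a = int(rng.integers(N_ACTIONS))     # uniform random behaviour
            s_next = step(s, a)
            reached = s_next == SUBGOAL
            cumulant = 1.0 if reached else 0.0
            continuation = 0.0 if reached else gamma
            target = cumulant + continuation * q[s_next].max()
            q[s, a] += alpha * (target - q[s, a])
            if reached:
                break
            s = s_next
    return q

def directed_action_probs(q, s, temperature=0.1):
    """Softmax over GVF values: actions predicted to reach the subgoal sooner
    receive higher sampling probability."""
    logits = q[s] / temperature
    logits = logits - logits.max()               # numerical stability
    p = np.exp(logits)
    return p / p.sum()

if __name__ == "__main__":
    q = train_gvf()
    rng = np.random.default_rng(1)
    probs = directed_action_probs(q, 0)
    action = int(rng.choice(N_ACTIONS, p=probs))  # one directed exploration step
    print(probs, action)
```

In this toy setting the learned values grow toward the subgoal, so sampling from them biases exploration in that direction without any reward from the main task. How the paper defines its cumulants, chooses subgoals, and mixes directed actions with the main policy is not stated in the abstract and is not modelled here.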
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Advanced Multi-Objective Optimization Algorithms
