Follow your Nose: Using General Value Functions for Directed Exploration in Reinforcement Learning

Durgesh Kalwar; Omkar Shelke; Somjit Nath; Hardik Meisheri; Harshad Khadilkar

arXiv:2203.00874·cs.LG·February 28, 2023

Open Access

TL;DR

This paper combines General Value Functions (GVFs) with a directed exploration strategy to improve sample efficiency in large, sparse-reward environments, and reports performance advantages on navigation tasks with varying grid sizes.

Contribution

It proposes a novel approach integrating GVFs with exploration strategies to address both large scale and reward sparsity simultaneously.

Findings

01 Outperforms baseline methods in navigation tasks

02 Enhances sample efficiency in large state spaces

03 Shows robustness across different grid sizes

Abstract

Improving sample efficiency is a key challenge in reinforcement learning, especially in environments with large state spaces and sparse rewards. In the literature, this is addressed either through auxiliary tasks (subgoals) or through clever exploration strategies. Exploration methods have been used to sample better trajectories in large environments, while auxiliary tasks have been incorporated where the reward is sparse. However, few studies have attempted to tackle both large scale and reward sparsity at the same time. This paper explores the idea of combining exploration with auxiliary task learning using General Value Functions (GVFs) and a directed exploration strategy. We present a way to learn value functions which can be used to sample actions and provide directed exploration. Experiments on navigation tasks with varying grid sizes demonstrate the performance advantages…
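
As a rough illustration of the idea in the abstract (learning a value function for an auxiliary signal and then sampling actions from it), the sketch below shows a tabular GVF update and a softmax action sampler. It is not the authors' algorithm: the Q-learning-style target, the softmax sampling rule, and all names (`gvf_td_update`, `action_probabilities`, `sample_directed_action`, `cumulant`, `temperature`) are assumptions made for illustration.

```python
import math
import random


def gvf_td_update(q, s, a, cumulant, s_next, gamma=0.9, alpha=0.1):
    """One tabular TD(0) update of a GVF action-value table q[s][a].

    The GVF predicts the discounted sum of `cumulant`, a scalar signal
    that need not be the task reward. Assumption: greedy bootstrap target.
    """
    target = cumulant + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
    return q


def action_probabilities(q_row, temperature=1.0):
    """Softmax over GVF values; subtracting the max keeps exp() stable."""
    top = max(q_row)
    weights = [math.exp((v - top) / temperature) for v in q_row]
    total = sum(weights)
    return [w / total for w in weights]


def sample_directed_action(q, s, rng, temperature=1.0):
    """Sample an action, favouring those with a higher GVF prediction."""
    probs = action_probabilities(q[s], temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical usage: 3 states, 2 actions, one observed transition.
q = [[0.0, 0.0] for _ in range(3)]
gvf_td_update(q, s=0, a=1, cumulant=1.0, s_next=1)
action = sample_directed_action(q, 0, random.Random(0))
```

In this sketch, a lower `temperature` makes exploration follow the GVF prediction more closely, while a higher value moves the sampler toward uniform random actions.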

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Advanced Multi-Objective Optimization Algorithms