Delayed Geometric Discounts: An Alternative Criterion for Reinforcement Learning

Firas Jarboui; Ahmed Akakzia

arXiv:2209.12483 · cs.LG · September 27, 2022 · 1 citation

Open Access

TL;DR

This paper introduces delayed geometric discounts as an alternative to the standard geometric (exponentially decaying) discount in reinforcement learning, with the aim of handling sparse rewards and improving sample efficiency in complex tasks.

Contribution

The paper generalizes the discounted problem formulation with a family of delayed objective functions, derives the optimal stationary solution of the resulting RL problem, and reports better exploration and sample efficiency on RL tasks.

Findings

01 Solved hard exploration problems in tabular environments

02 Improved sample efficiency on robotics benchmarks

03 Addressed limitations of exponential discounting in RL

Abstract

The endeavor of artificial intelligence (AI) is to design autonomous agents capable of achieving complex tasks. In particular, reinforcement learning (RL) provides a theoretical framework for learning optimal behaviors. In practice, RL algorithms rely on geometric discounts to evaluate this optimality. Unfortunately, this does not cover decision processes where future returns are not exponentially less valuable. Depending on the problem, this limitation induces sample inefficiency (as feedback is exponentially decayed) and requires additional curricula/exploration mechanisms (to deal with sparse, deceptive, or adversarial rewards). In this paper, we tackle these issues by generalizing the discounted problem formulation with a family of delayed objective functions. We investigate the underlying RL problem to derive: 1) the optimal stationary solution and 2) an approximation of the optimal…
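
To make the abstract's remark about exponentially decayed feedback concrete, the sketch below computes the standard geometric discounted return for a sparse-reward episode. It illustrates only the usual geometric criterion that the paper takes as its starting point, not the paper's delayed objective, whose definition is not given on this page. The helper names `discounted_return` and `sparse_episode`, the value of `gamma`, and the chosen horizons are assumptions made for illustration.

```python
def discounted_return(rewards, gamma):
    """Standard geometric criterion: sum over t of gamma**t * rewards[t]."""
    total = 0.0
    # Accumulate from the last step backwards (Horner-style evaluation).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def sparse_episode(horizon):
    """Hypothetical sparse-reward episode: a single reward of 1.0 at the last step."""
    return [0.0] * (horizon - 1) + [1.0]


if __name__ == "__main__":
    gamma = 0.99  # illustrative discount factor, not taken from the paper
    for horizon in (10, 100, 1000):
        value = discounted_return(sparse_episode(horizon), gamma)
        print(f"horizon={horizon:5d}  discounted return={value:.6f}")
```

Under this criterion, a reward received at the final step of a length-T episode contributes gamma**(T - 1) to the return, so the learning signal reaching early decisions shrinks exponentially as the horizon grows.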

Taxonomy

Topics: Supply Chain and Inventory Management · Adversarial Robustness in Machine Learning · Blockchain Technology Applications and Security