A general Markov decision process formalism for action-state entropy-regularized reward maximization

Dmytro Grytskyy; Jorge Ramírez-Ruiz; Rubén Moreno-Bote

arXiv:2302.01098·cs.LG·February 3, 2023

Open Access

TL;DR

This paper introduces a unified dual function formalism for entropy-regularized reward maximization in Markov decision processes, turning the constrained optimization problem into an unconstrained convex one for any mixture of action and state entropies.

Contribution

It presents a general convex dual framework that transforms constrained entropy regularization problems into unconstrained convex optimization, encompassing pure and mixed entropy cases.

Findings

01 Unified formalism for action, state, and action-state entropy regularization.

02 Transforms constrained problems into unconstrained convex optimization.

03 Applicable to pure and mixed entropy scenarios.

Abstract

Previous work has separately addressed different forms of action, state and action-state entropy regularization, pure exploration and space occupation. These problems have become extremely relevant for regularization, generalization, speeding up learning and providing robust solutions at unprecedented levels. However, the solution methods for these problems are heterogeneous, ranging from convex to non-convex optimization and from unconstrained to constrained optimization. Here we provide a general dual function formalism that transforms the constrained optimization problem into an unconstrained convex one for any mixture of action and state entropies. The cases with pure action entropy and pure state entropy are understood as limits of the mixture.
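For orientation only, the pure action entropy limit mentioned in the abstract is commonly solved with soft value iteration, where the hard maximum over actions is replaced by a temperature-scaled log-sum-exp. The sketch below is that standard recursion, not the dual formalism of the paper; the function name `soft_value_iteration`, the nested-list layout of `P` and `R`, and the temperature `alpha` are assumptions made for illustration.

```python
import math

def soft_value_iteration(P, R, gamma=0.9, alpha=1.0, tol=1e-10, max_iter=10_000):
    """Action-entropy-regularized value iteration (illustrative sketch).

    P[s][a][s2] is the transition probability, R[s][a] the reward,
    alpha the weight of the action entropy (an assumed name).
    """
    n_states = len(P)

    def q_values(s, V):
        # Q(s, a) = r(s, a) + gamma * E[V(s')]
        return [R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
                for a in range(len(P[s]))]

    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = []
        for s in range(n_states):
            q = q_values(s, V)
            m = max(q)  # shift by the max for numerical stability
            V_new.append(m + alpha * math.log(sum(math.exp((x - m) / alpha) for x in q)))
        delta = max(abs(a - b) for a, b in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # The optimal policy is a softmax of Q / alpha, normalized explicitly.
    policy = []
    for s in range(n_states):
        q = q_values(s, V)
        m = max(q)
        w = [math.exp((x - m) / alpha) for x in q]
        z = sum(w)
        policy.append([x / z for x in w])
    return V, policy

# Toy example (hypothetical): two states, two deterministic actions, zero reward.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[1.0, 0.0], [0.0, 1.0]]]
R = [[0.0, 0.0], [0.0, 0.0]]
V, policy = soft_value_iteration(P, R, gamma=0.9, alpha=1.0)
```

With zero reward the agent is driven by entropy alone, so the policy is uniform and each state's value satisfies V = alpha * log(2) + gamma * V.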

Taxonomy

Topics: Neural dynamics and brain function · CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing