On the impact of MDP design for Reinforcement Learning agents in Resource Management

Renato Luiz de Freitas Cunha; Luiz Chaimowicz

arXiv:2109.03202·cs.AI·September 8, 2021

TL;DR

This paper empirically analyzes how different Markov Decision Process (MDP) designs affect Reinforcement Learning agents' performance in resource management, highlighting the importance of state representation and transferability.

Contribution

It compares four MDP variations and shows that, with Multi-Layer Perceptrons as function approximators, a compact state representation lets agents be transferred between environments, even without retraining.

Findings

01 A compact state representation allows agents to be transferred between environments.

02 Transferred agents outperform specialized agents in 80% of the tested scenarios, even without retraining.

03 MDP design decisions affect both computational requirements and agent performance.

Abstract

The recent progress in Reinforcement Learning applications to Resource Management presents MDPs without a deeper analysis of the impacts of design decisions on agent performance. In this paper, we compare and contrast four different MDP variations, discussing their computational requirements and impacts on agent performance by means of an empirical analysis. We conclude by showing that, in our experiments, when using Multi-Layer Perceptrons as the function approximator, a compact state representation allows transfer of agents between environments, and that transferred agents perform well and outperform specialized agents in 80% of the tested scenarios, even without retraining.
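
The abstract does not spell out the state designs it compares, so the following is only a hypothetical sketch of why a compact, fixed-size state can make a Multi-Layer Perceptron agent transferable. It assumes a job-scheduling setting; the `Job` fields, the `compact_state` function, and the `max_jobs` cutoff are illustrative names that do not come from the paper. The point is that the state vector has the same length whatever the cluster size, so a network trained in one environment accepts inputs from another.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    """Hypothetical job description; these fields are assumptions."""
    requested_processors: int
    requested_time: float
    wait_time: float


def compact_state(queue: List[Job], free_processors: int,
                  total_processors: int, max_jobs: int = 4) -> List[float]:
    """Build a fixed-length state vector from a variable-size environment.

    Processor counts are normalized by the cluster size, and only the
    first `max_jobs` queued jobs are encoded, so the result always has
    1 + 3 * max_jobs entries regardless of cluster or queue size.
    """
    features = [free_processors / total_processors]
    for job in queue[:max_jobs]:
        features.extend([
            job.requested_processors / total_processors,
            job.requested_time,
            job.wait_time,
        ])
    # Pad with zeros when fewer than `max_jobs` jobs are waiting.
    missing = max_jobs - min(len(queue), max_jobs)
    features.extend([0.0] * (3 * missing))
    return features


# Two environments of very different size yield same-length states.
small = compact_state([Job(2, 10.0, 1.0)], free_processors=8,
                      total_processors=16)
large = compact_state([Job(64, 10.0, 1.0)] * 10, free_processors=512,
                      total_processors=1024)
```

Because `small` and `large` have the same length, the same MLP input layer could consume either one. Whether the learned policy actually transfers well is the empirical question the paper investigates.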

Taxonomy

Topics: Auction Theory and Applications · Optimization and Search Problems · Reinforcement Learning in Robotics