Visualizing the Loss Landscape of Actor Critic Methods with Applications in Inventory Optimization

Recep Yusuf Bekci, Mehmet Gümüş

arXiv:2009.02391 · cs.LG · September 8, 2020



TL;DR

This paper visualizes and analyzes the loss landscapes of actor-critic reinforcement learning methods, demonstrating their characteristics and applying insights to complex inventory optimization problems.

Contribution

It introduces low-dimensional visualizations of actor loss landscapes and applies this approach to multi-store inventory control, linking loss shape to optimal policies.

Findings

01. Loss landscapes vary across algorithms and relate to performance.

02. Visualizations reveal characteristics of the optimization process.

03. Application to inventory control shows the method's practical relevance.

Abstract

Continuous control is a widely applicable area of reinforcement learning. The dominant approaches in this area are actor-critic methods, which commonly use policy gradients of neural approximators. Our study focuses on characterizing the actor loss function, which is the essential part of the optimization. We exploit low-dimensional visualizations of the loss function and compare the loss landscapes of various algorithms. Furthermore, we apply our approach to multi-store dynamic inventory control, a notoriously difficult problem in supply chain operations, and explore the shape of the loss function associated with the optimal policy. We modeled and solved the problem using reinforcement learning, obtaining a loss landscape that favors optimality.
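The abstract does not spell out how the low-dimensional visualizations are built. One common way to obtain such a view, shown here only as a minimal sketch and not as the authors' procedure, is to evaluate the loss on a two-dimensional slice of parameter space spanned by two random directions around a reference parameter vector. The names `loss_surface` and `toy_actor_loss` are hypothetical, and the quadratic loss is a stand-in for a real actor loss.

```python
import numpy as np

def loss_surface(loss_fn, theta, radius=1.0, steps=21, seed=0):
    """Evaluate loss_fn on a 2D slice through theta (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Two random directions in parameter space define the slice.
    d1 = rng.standard_normal(theta.shape)
    d2 = rng.standard_normal(theta.shape)
    # Rescale each direction to the norm of theta (or 1.0 if theta is zero)
    # so the step size is comparable to the size of the parameters.
    scale = np.linalg.norm(theta) or 1.0
    d1 *= scale / np.linalg.norm(d1)
    d2 *= scale / np.linalg.norm(d2)
    alphas = np.linspace(-radius, radius, steps)
    betas = np.linspace(-radius, radius, steps)
    surface = np.empty((steps, steps))
    for i, a in enumerate(alphas):
        for j, b in enumerate(betas):
            surface[i, j] = loss_fn(theta + a * d1 + b * d2)
    return alphas, betas, surface

def toy_actor_loss(params):
    # Assumption: a simple quadratic bowl standing in for an actor loss.
    return float(np.sum(params ** 2))

theta_star = np.zeros(10)  # minimizer of the toy loss
alphas, betas, surface = loss_surface(toy_actor_loss, theta_star)
```

The resulting `surface` array can be passed to any contour or 3D surface plotting routine. For a real actor network, `loss_fn` would need to load the perturbed parameter vector into the network and evaluate the actor loss on a fixed batch of states.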


Taxonomy

Topics: Scheduling and Optimization Algorithms · Reinforcement Learning in Robotics · Stock Market Forecasting Methods