Deep Q-Learning versus Proximal Policy Optimization: Performance Comparison in a Material Sorting Task

Reuf Kozlica; Stefan Wegenkittl; Simon Hirländer

arXiv:2306.01451·cs.AI·June 5, 2023·1 cites

Open Access

TL;DR

This study compares Deep Q-Learning and Proximal Policy Optimization in a simulated material sorting task, showing PPO's superior performance across multiple metrics in a production system context.

Contribution

It provides a comparative analysis of DQN and PPO in a production environment using a Petri Net simulation, highlighting PPO's advantages in high-dimensional spaces.

Findings

01. PPO outperforms DQN in accuracy, episode length, and success rate.

02. Policy-based algorithms are more effective in high-dimensional state and action spaces.

03. The study offers insights into algorithm suitability for production system tasks.

Abstract

This paper presents a comparison between two well-known deep Reinforcement Learning (RL) algorithms, Deep Q-Learning (DQN) and Proximal Policy Optimization (PPO), in a simulated production system. We utilize a Petri Net (PN)-based simulation environment, which was previously proposed in related work. The performance of the two algorithms is compared based on several evaluation metrics, including average percentage of correctly assembled and sorted products, average episode length, and percentage of successful episodes. The results show that PPO outperforms DQN in terms of all evaluation metrics. The study highlights the advantages of policy-based algorithms in problems with high-dimensional state and action spaces. The study contributes to the field of deep RL in the context of production systems by providing insights into the effectiveness of different algorithms and their suitability for…
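
The three evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the `EpisodeRecord` fields and the `summarize` function are hypothetical names introduced here, and the sketch assumes that each evaluation episode logs how many products were handled correctly, how many steps it took, and whether it counted as successful.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EpisodeRecord:
    """Hypothetical per-episode log; field names are assumptions."""
    correct_products: int  # products assembled and sorted correctly
    total_products: int    # products the episode was expected to handle
    steps: int             # episode length in environment steps
    success: bool          # whether the episode met the success criterion


def summarize(episodes):
    """Aggregate the three metrics over a list of evaluation episodes."""
    if not episodes:
        raise ValueError("at least one episode is required")
    return {
        # Average percentage of correctly assembled and sorted products.
        "avg_pct_correct": mean(
            100.0 * e.correct_products / e.total_products for e in episodes
        ),
        # Average episode length.
        "avg_episode_length": mean(e.steps for e in episodes),
        # Percentage of successful episodes.
        "pct_successful": 100.0 * sum(e.success for e in episodes) / len(episodes),
    }


# Made-up numbers for illustration only; not results from the paper.
example = [
    EpisodeRecord(correct_products=4, total_products=4, steps=10, success=True),
    EpisodeRecord(correct_products=2, total_products=4, steps=20, success=False),
]
report = summarize(example)
```

Running `summarize` once per trained agent (for example, one list of episodes for DQN and one for PPO) would give directly comparable numbers, provided both agents are evaluated on the same set of environment configurations.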

Taxonomy

Topics: Business Process Modeling and Analysis · Flexible and Reconfigurable Manufacturing Systems · Scheduling and Optimization Algorithms

Methods: Q-Learning · Entropy Regularization · Convolution · Dense Connections · Deep Q-Network · Proximal Policy Optimization