ORL: Reinforcement Learning Benchmarks for Online Stochastic Optimization Problems

Bharathan Balaji; Jordan Bell-Masterson; Enes Bilgin; Andreas Damianou; Pablo Moreno Garcia; Arpit Jain; Runfei Luo; Alvaro Maggiar; Balakrishnan Narayanaswamy; Chun Ye

arXiv:1911.10641·cs.LG·December 3, 2019·22 cites

TL;DR

This paper introduces RL benchmarks for three online stochastic optimization problems (Bin Packing, Newsvendor, and Vehicle Routing) and shows that trained RL policies are competitive with or superior to the corresponding baselines, highlighting RL's potential in practical dynamic resource allocation tasks.

Contribution

The paper establishes standardized benchmarks for applying RL to canonical online stochastic optimization problems, so that proposed approaches can be compared rigorously in terms of performance, scale, and generalizability.

Findings

01

RL algorithms perform competitively with, or better than, traditional baselines.

02

RL approaches require minimal domain knowledge.

03

Benchmarks facilitate comparison and future research in RL for optimization.

Abstract

Reinforcement Learning (RL) has achieved state-of-the-art results in domains such as robotics and games. We build on this previous work by applying RL algorithms to a selection of canonical online stochastic optimization problems with a range of practical applications: Bin Packing, Newsvendor, and Vehicle Routing. While there is a nascent literature that applies RL to these problems, there are no commonly accepted benchmarks which can be used to compare proposed approaches rigorously in terms of performance, scale, or generalizability. This paper aims to fill that gap. For each problem we apply both standard approaches as well as newer RL algorithms and analyze results. In each case, the performance of the trained RL policy is competitive with or superior to the corresponding baselines, while not requiring much in the way of domain knowledge. This highlights the potential of RL in…

Taxonomy

Topics: Optimization and Search Problems · Scheduling and Optimization Algorithms · Reinforcement Learning in Robotics