SOLO: Search Online, Learn Offline for Combinatorial Optimization Problems

Joel Oren; Chana Ross; Maksym Lefarov; Felix Richter; Ayal Taitler; Zohar Feldman; Christian Daniel; Dotan Di Castro

arXiv:2104.01646 · cs.LG · May 19, 2021

TL;DR

This paper introduces SOLO, a hybrid RL and planning approach for combinatorial problems like scheduling and routing, capable of handling both offline and online variants with improved efficiency and robustness.

Contribution

The paper presents a generic, scalable method combining Deep Q-Learning and Monte Carlo Tree Search for online and offline combinatorial optimization.

Findings

01 Outperforms traditional solvers and heuristics in speed and quality

02 Effective in online settings with dynamic problem components

03 Improves robustness of learned policies with search algorithms

Abstract

We study combinatorial problems with real-world applications such as machine scheduling, routing, and assignment. We propose a method that combines Reinforcement Learning (RL) and planning. This method applies equally to the offline and online variants of the combinatorial problem; in the online variant, the problem components (e.g., jobs in scheduling problems) are not known in advance, but rather arrive during the decision-making process. Our solution is generic, scalable, and leverages distributional knowledge of the problem parameters. We frame the solution process as an MDP, and take a Deep Q-Learning approach wherein states are represented as graphs, thereby allowing our trained policies to deal with arbitrary changes in a principled manner. Though learned policies work well in expectation, small deviations can have substantial negative effects in combinatorial…
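To make the MDP framing in the abstract concrete, here is a minimal sketch on a toy single-machine scheduling instance. It is not the paper's implementation. Everything in it is an assumption made for illustration: the job data, the weighted-completion-time objective, the function names, the hand-written `q_estimate` (a stand-in for a trained Q-network), and the depth-limited exhaustive lookahead (a stand-in for Monte Carlo Tree Search). The graph-based state representation and online job arrivals are not modelled.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical toy instance: job id -> (processing_time, weight).
Jobs = Dict[int, Tuple[int, int]]
# State of the MDP: (current time, ids of jobs not yet scheduled).
State = Tuple[int, FrozenSet[int]]

EXAMPLE_JOBS: Jobs = {0: (1, 1), 1: (2, 10), 2: (3, 2)}


def step(jobs: Jobs, state: State, action: int) -> Tuple[State, float]:
    """MDP transition: run job `action` next.

    The reward is minus the job's weighted completion time, so maximizing
    return minimizes total weighted completion time.
    """
    time, remaining = state
    proc, weight = jobs[action]
    finish = time + proc
    return (finish, remaining - {action}), -float(weight * finish)


def q_estimate(jobs: Jobs, state: State, action: int) -> float:
    """Stand-in for a learned Q-value (assumption, not a trained network).

    Takes `action`, then schedules the remaining jobs shortest-first and
    returns the total reward of that rollout.
    """
    (time, remaining), total = step(jobs, state, action)
    for job in sorted(remaining, key=lambda j: jobs[j][0]):
        (time, remaining), reward = step(jobs, (time, remaining), job)
        total += reward
    return total


def lookahead_value(jobs: Jobs, state: State, action: int, depth: int) -> float:
    """Search `depth` further steps exactly; use q_estimate at the leaves."""
    if depth == 0:
        return q_estimate(jobs, state, action)
    next_state, reward = step(jobs, state, action)
    if not next_state[1]:  # no jobs left: terminal state
        return reward
    return reward + max(
        lookahead_value(jobs, next_state, a, depth - 1) for a in next_state[1]
    )


def plan(jobs: Jobs, depth: int) -> Tuple[List[int], float]:
    """Build a schedule step by step, choosing each job by lookahead value.

    Returns the job order and its total weighted completion time.
    """
    state: State = (0, frozenset(jobs))
    order: List[int] = []
    total = 0.0
    while state[1]:
        best = max(
            sorted(state[1]),
            key=lambda a: lookahead_value(jobs, state, a, depth),
        )
        state, reward = step(jobs, state, best)
        order.append(best)
        total += reward
    return order, -total


if __name__ == "__main__":
    print(plan(EXAMPLE_JOBS, depth=0))                  # value estimate only
    print(plan(EXAMPLE_JOBS, depth=len(EXAMPLE_JOBS)))  # exhaustive search
```

With `depth=0` the schedule relies entirely on the value estimate. Increasing `depth` lets search correct the estimate's mistakes at the cost of more computation, which loosely mirrors the abstract's motivation for pairing a learned policy with planning.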

Taxonomy

Methods: Q-Learning