Safe Learning for Near Optimal Scheduling

Damien Busatto-Gaston; Debraj Chakraborty; Shibashis Guha; Guillermo A. Pérez; Jean-François Raskin

arXiv:2005.09253·cs.AI·July 14, 2021

TL;DR

This paper presents a novel approach combining synthesis, model-based learning, and online sampling to develop safe, near-optimal schedulers for large preemptible task systems, overcoming limitations of existing model checkers.

Contribution

It introduces model-learning algorithms with PAC guarantees and extends Monte-Carlo tree search with safety advice, enabling safe online exploration of MDPs too large for state-of-the-art probabilistic model checkers.

Findings

01

Algorithms outperform shielded deep Q-learning on large systems

02

Provides PAC guarantees for model learning

03

Handles MDPs with over 10^20 states

Abstract

In this paper, we investigate the combination of synthesis, model-based learning, and online sampling techniques to obtain safe and near-optimal schedulers for a preemptible task scheduling problem. Our algorithms can handle Markov decision processes (MDPs) that have 10^20 states and beyond, which cannot be handled with state-of-the-art probabilistic model-checkers. We provide probably approximately correct (PAC) guarantees for learning the model. Additionally, we extend Monte-Carlo tree search with advice, computed using safety games or obtained using the earliest-deadline-first scheduler, to safely explore the learned model online. Finally, we implemented and compared our algorithms empirically against shielded deep Q-learning on large task systems.
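
The abstract names the earliest-deadline-first (EDF) scheduler as one source of advice. As a rough illustration of the textbook EDF rule only, and not the authors' implementation, the sketch below picks the pending job whose deadline is nearest. The `Job` fields and the `edf_pick` helper are hypothetical names introduced here, and time is assumed to be measured in discrete units.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Job:
    # All field names are assumptions made for this illustration.
    name: str               # hypothetical job identifier
    remaining_time: int     # execution time units still needed
    time_to_deadline: int   # time units left before the deadline


def edf_pick(jobs: Sequence[Job]) -> Optional[Job]:
    """Return the unfinished job with the nearest deadline, or None if idle."""
    pending = [j for j in jobs if j.remaining_time > 0]
    if not pending:
        return None
    # min() returns the first minimal element, so ties follow list order.
    return min(pending, key=lambda j: j.time_to_deadline)


jobs = [Job("a", 2, 5), Job("b", 1, 3), Job("c", 0, 1)]
chosen = edf_pick(jobs)
print(chosen.name if chosen else "idle")  # prints "b"
```

Job "c" has the nearest deadline but no remaining work, so it is skipped and "b" is chosen. How such a choice is turned into advice for Monte-Carlo tree search is described in the paper itself and is not modeled here.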

Taxonomy

Topics: Real-Time Systems Scheduling · Software Reliability and Analysis Research · Age of Information Optimization