# Metareasoning in Modular Software Systems: On-the-Fly Configuration using Reinforcement Learning with Rich Contextual Representations

**Authors:** Aditya Modi, Debadeepta Dey, Alekh Agarwal, Adith Swaminathan, Besmira Nushi, Sean Andrist, Eric Horvitz

arXiv: 1905.05179 · 2019-05-15

## TL;DR

This paper introduces a reinforcement learning-based metareasoning approach that configures the modules of a pipeline on the fly, and reports significant improvements in overall system utility on both real-world and synthetic pipelines.

## Contribution

It presents novel metareasoning techniques that use rich contextual representations to optimize module configurations dynamically, addressing the combinatorial challenge of system-wide optimization.

## Key findings

- Significant improvement on both real-world and synthetic pipelines
- Gains hold across a variety of reinforcement learning techniques
- Enhanced system utility through dynamic configuration adjustments

## Abstract

Assemblies of modular subsystems are being pressed into service to perform sensing, reasoning, and decision making in high-stakes, time-critical tasks in such areas as transportation, healthcare, and industrial automation. We address the opportunity to maximize the utility of an overall computing system by employing reinforcement learning to guide the configuration of the set of interacting modules that comprise the system. The challenge of doing system-wide optimization is a combinatorial problem. Local attempts to boost the performance of a specific module by modifying its configuration often lead to losses in overall utility of the system's performance as the distribution of inputs to downstream modules changes drastically. We present metareasoning techniques which consider a rich representation of the input, monitor the state of the entire pipeline, and adjust the configuration of modules on-the-fly so as to maximize the utility of a system's operation. We show significant improvement in both real-world and synthetic pipelines across a variety of reinforcement learning techniques.
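The idea of choosing a joint module configuration from context can be illustrated with a small toy example. The sketch below is not the authors' method: it assumes a tabular epsilon-greedy contextual bandit as a stand-in learner, and the two-module pipeline, the `easy`/`hard` contexts, and the utility function are all hypothetical. It only shows why the best configuration can depend on the input and on how module settings interact.

```python
import itertools
import random

# Hypothetical two-module pipeline: each module has a "fast" (0) or "accurate" (1) setting.
CONFIGS = list(itertools.product([0, 1], repeat=2))  # all joint configurations
CONTEXTS = ["easy", "hard"]  # crude stand-in for a rich input representation


def pipeline_utility(context, config, rng):
    """Toy end-to-end utility (assumed): task quality minus latency cost, plus noise."""
    first, second = config
    latency_cost = 0.3 * first + 0.2 * second
    if context == "easy":
        quality = 0.8 + 0.05 * first + 0.05 * second
    else:
        # On hard inputs, quality is high only when both modules are accurate,
        # so upgrading a single module in isolation lowers overall utility.
        quality = 0.2 + 0.7 * first * second
    return quality - latency_cost + rng.gauss(0.0, 0.05)


class EpsilonGreedyMetareasoner:
    """Tabular contextual bandit over joint configurations (illustrative only)."""

    def __init__(self, epsilon, rng):
        self.epsilon = epsilon
        self.rng = rng
        self.value = {(c, k): 0.0 for c in CONTEXTS for k in CONFIGS}
        self.count = {(c, k): 0 for c in CONTEXTS for k in CONFIGS}

    def best(self, context):
        # Configuration with the highest estimated utility for this context.
        return max(CONFIGS, key=lambda k: self.value[(context, k)])

    def choose(self, context):
        # Explore a random configuration with probability epsilon, else exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(CONFIGS)
        return self.best(context)

    def update(self, context, config, utility):
        # Incremental running mean of the observed utility.
        key = (context, config)
        self.count[key] += 1
        self.value[key] += (utility - self.value[key]) / self.count[key]


def train(rounds=4000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    agent = EpsilonGreedyMetareasoner(epsilon, rng)
    for _ in range(rounds):
        context = rng.choice(CONTEXTS)
        config = agent.choose(context)
        agent.update(context, config, pipeline_utility(context, config, rng))
    return agent


agent = train()
learned = {c: agent.best(c) for c in CONTEXTS}
print(learned)
```

In this toy setup the learner should settle on the fast setting for both modules on `easy` inputs and the accurate setting for both on `hard` inputs. Setting only one module to accurate on `hard` inputs adds latency without improving quality, which mirrors the abstract's point that local tuning of a single module can reduce system-wide utility.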

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.05179/full.md

## Figures

24 figures with captions in the complete paper: https://tomesphere.com/paper/1905.05179/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1905.05179/full.md

---
Source: https://tomesphere.com/paper/1905.05179