# Runtime QoS service for application-driven adaptation in network computing

**Authors:** Feras Al-Hawari, Elias Manolakos

arXiv: 1907.12986 · 2019-07-31

## TL;DR

This paper presents a runtime QoS service, backed by lightweight monitoring middleware, that lets a distributed application on a Network of Workstations adapt itself for performance and fault tolerance while adding only minor overhead.

## Contribution

The paper introduces QoS middleware that monitors the state and load of machines, tasks, and logical network links, together with an anonymous, simple-to-use QoS API through which application tasks can adapt to faults and load changes at runtime.

## Key findings

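- The QoS middleware has a minor impact on the performance of the application it services, as measured in various scenarios.
- A Manager-Worker application uses the fault tolerance QoS API to adapt to Worker faults and avoid deadlock at runtime.
- A dynamic application-level scheduler can use the QoS API to find efficient schedules.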

## Abstract

A distributed application executing on a Network of Workstations (NOW) needs to be resource state aware to possibly adapt itself accordingly in order to keep satisfying the desired Quality of Service (QoS) demands throughout its lifespan. We implemented a QoS service to enable application-driven adaptation for performance and fault tolerance at runtime. The service is associated with lightweight middleware that monitors the state and load of all application entities (e.g., machines, tasks, and logical network links). Moreover, it makes its services available to an application task via an anonymous and simple to use QoS API. We present a Manager-Worker application that uses our fault tolerance QoS API to adapt for Worker faults in order to avoid application deadlock at runtime. Moreover, we show how a dynamic application-level scheduler can easily utilize the QoS API to find efficient schedules. Furthermore, we quantified the overhead of the QoS middleware in various scenarios to demonstrate that it has minor impact on the performance of the application it is servicing.
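The Manager-Worker adaptation and load-aware scheduling described in the abstract can be illustrated with a minimal sketch. It is not the authors' API: the `QoSService` class, its `report`, `is_alive`, and `least_loaded` methods, the `schedule` and `recover` helpers, and the unit cost per task are all hypothetical names and assumptions, and the monitor is simulated in memory rather than running on a real Network of Workstations.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerState:
    alive: bool = True
    load: float = 0.0  # assumed: a load figure reported by the monitor


@dataclass
class QoSService:
    """Hypothetical in-memory stand-in for a QoS monitoring service."""
    workers: dict = field(default_factory=dict)

    def report(self, name, alive=True, load=0.0):
        # Record the latest observed state of a worker.
        self.workers[name] = WorkerState(alive, load)

    def is_alive(self, name):
        return name in self.workers and self.workers[name].alive

    def least_loaded(self):
        # Pick the live worker with the smallest load (ties broken by name).
        live = [(s.load, n) for n, s in self.workers.items() if s.alive]
        if not live:
            raise RuntimeError("no live workers available")
        return min(live)[1]


def schedule(qos, tasks):
    """Assign each task to the currently least-loaded live worker."""
    assignment = {}
    for task in tasks:
        worker = qos.least_loaded()
        assignment[task] = worker
        qos.workers[worker].load += 1.0  # assumed unit cost per task
    return assignment


def recover(qos, assignment):
    """Reassign tasks held by failed workers so the manager does not block on them."""
    for task, worker in list(assignment.items()):
        if not qos.is_alive(worker):
            new_worker = qos.least_loaded()
            assignment[task] = new_worker
            qos.workers[new_worker].load += 1.0
    return assignment


qos = QoSService()
qos.report("w1", load=0.0)
qos.report("w2", load=0.5)
plan = schedule(qos, ["t1", "t2", "t3"])  # initial load-aware placement
qos.report("w1", alive=False)             # simulated Worker fault
plan = recover(qos, plan)                 # Manager adapts instead of waiting
```

In this sketch the Manager queries the service instead of waiting indefinitely on a failed Worker, which is the deadlock scenario the abstract mentions. A real deployment would need the monitor to detect faults itself, which is outside what this example models.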

---
Source: https://tomesphere.com/paper/1907.12986