# Delay Comparison of Delivery and Coding Policies in Data Clusters

**Authors:** Virag Shah, Anne Bouillard, Francois Baccelli

arXiv: 1706.02384 · 2017-06-12

## TL;DR

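This paper develops a unified model, based on evolution equations, for analyzing and comparing delays under different coding, dissemination, and delivery policies in cloud data clusters. It shows that a workload-aware delivery policy outperforms a workload-agnostic one in the increasing convex stochastic order, and it characterizes how job delay scales with request size.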

## Contribution

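It introduces a set of evolution equations as a unified model of file chunking, replication or erasure coding, and dissemination across servers. The equations support both efficient simulation and mathematical analysis of several delivery policies under general statistical assumptions.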

## Key findings

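- A workload-aware delivery policy outperforms a workload-agnostic one in the increasing convex stochastic order. A sample-path comparison does not hold in general in a dynamic or stochastic setting.
- With equal-size chunks that are replicated or erasure-coded and placed on servers at random, job delay grows sub-logarithmically in the request size for small and medium-sized files and linearly for large files.
- The ordering result yields computable bounds on delay performance.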
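
The scaling in the second finding can be given some intuition with a toy experiment. The sketch below is not the paper's evolution-equation model: it assumes idle servers, one unit of service time per chunk, a single copy of each chunk, and uniformly random placement, so a request's delay equals the largest number of its chunks that land on one server. The function names, `num_servers`, and `trials` are hypothetical choices made for this illustration.

```python
import random
from collections import Counter


def max_chunks_per_server(num_chunks, num_servers, rng):
    """Place each chunk on a uniformly random server and return the
    largest number of chunks assigned to any single server."""
    loads = Counter(rng.randrange(num_servers) for _ in range(num_chunks))
    return max(loads.values())


def mean_max_load(num_chunks, num_servers, trials=50, seed=0):
    """Average the maximum per-server load over independent placements.

    Under the toy assumptions (idle servers, unit service time per chunk),
    this average stands in for the mean delay of one request.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same numbers
    total = sum(
        max_chunks_per_server(num_chunks, num_servers, rng)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    num_servers = 100  # hypothetical cluster size
    for num_chunks in (10, 100, 1_000, 10_000):
        delay = mean_max_load(num_chunks, num_servers)
        print(f"chunks={num_chunks:>6}  mean max load={delay:.2f}")
```

In this simplified setting the mean should grow slowly while a request has fewer chunks than there are servers, and roughly in proportion to `num_chunks / num_servers` once chunks far outnumber servers. The paper's result covers a dynamic system with queueing, which this static sketch does not capture.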

## Abstract

A key function of cloud infrastructure is to store and deliver diverse files, e.g., scientific datasets, social network information, videos, etc. In such systems, for the purpose of fast and reliable delivery, files are divided into chunks, replicated or erasure-coded, and disseminated across servers. It is neither known in general how delays scale with the size of a request nor how delays compare under different policies for coding, data dissemination, and delivery.

Motivated by these questions, we develop and explore a set of evolution equations as a unified model which captures the above features. These equations allow for both efficient simulation and mathematical analysis of several delivery policies under general statistical assumptions. In particular, we quantify in what sense a workload aware delivery policy performs better than a workload agnostic policy. Under a dynamic or stochastic setting, the sample path comparison of these policies does not hold in general. The comparison is shown to hold under the weaker increasing convex stochastic ordering, still stronger than the comparison of averages.

This result further allows us to obtain insightful computable performance bounds. For example, we show that in a system where files are divided into chunks of equal size, replicated or erasure-coded, and disseminated across servers at random, the job delays increase sub-logarithmically in the request size for small and medium-sized files but linearly for large files.
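
For readers unfamiliar with the ordering named in the abstract, its standard textbook definition is recalled here. The symbols $X$, $Y$, and $\varphi$ are notation introduced for this note and are not taken from the paper.

$$
X \le_{\mathrm{icx}} Y
\quad\Longleftrightarrow\quad
\mathbb{E}[\varphi(X)] \le \mathbb{E}[\varphi(Y)]
\ \text{ for every increasing convex } \varphi \text{ for which both expectations exist.}
$$

Taking $\varphi(x) = x$ gives $\mathbb{E}[X] \le \mathbb{E}[Y]$, so the ordering implies a comparison of averages. Because it only tests increasing convex functions rather than all increasing functions, it is in turn implied by the usual stochastic order, and hence by a sample-path comparison.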

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.02384/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1706.02384/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1706.02384/full.md

---
Source: https://tomesphere.com/paper/1706.02384