Dependability Evaluation of Middleware Technology for Large-scale Distributed Caching
Domenico Cotroneo, Roberto Natella, Stefano Rosiello

TL;DR
This paper evaluates the dependability of three popular distributed caching middleware platforms, analyzing their availability and performance under various fault conditions to identify trade-offs and failure scenarios.
Contribution
It provides a comparative dependability analysis of Twemproxy, Mcrouter, and Dynomite, highlighting their strengths, weaknesses, and failure propagation risks.
Findings
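The three platforms achieve different availability and performance trade-offs
A few faulty components can cause cascading failures of the whole distributed system
Faults evaluated include Memcached node failures and congestion from unbalanced workloads and network link bandwidth bottlenecks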
Abstract
Distributed caching systems (e.g., Memcached) are widely used by service providers to serve requests from millions of concurrent clients. Given their large scale, modern distributed systems rely on a middleware layer to manage caching nodes, to make applications easier to develop, and to apply load balancing and replication strategies. In this work, we performed a dependability evaluation of three popular middleware platforms, namely Twemproxy by Twitter, Mcrouter by Facebook, and Dynomite by Netflix, to assess availability and performance under faults, including failures of Memcached nodes and congestion due to unbalanced workloads and network link bandwidth bottlenecks. We point out the different availability and performance trade-offs achieved by the three platforms, and scenarios in which a few faulty components cause cascading failures of the whole distributed system.
