Understanding System Characteristics of Online Erasure Coding on Scalable, Distributed and Large-Scale SSD Array Systems
Sungjoon Koh, Jie Zhang, Miryeong Kwon, Jungyeon Yoon, David Donofrio, Namsung Kim, Myoungsoo Jung

TL;DR
This paper investigates the effects of erasure coding, specifically Reed-Solomon codes, on the performance and system behavior of large-scale SSD array systems, providing detailed analysis and real trace data.
Contribution
It offers a comprehensive evaluation of erasure coding impacts on SSD arrays, including performance, overheads, and network traffic, with real-world trace data for further research.
Findings
Erasure coding reduces storage costs compared to replication.
RS coding impacts I/O performance and network traffic.
Physical data layout significantly affects RS-coded SSD array performance.
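The storage-cost finding follows from simple arithmetic, sketched here in Python. An RS(k, m) code splits an object into k data chunks and adds m coding chunks, so it stores (k + m) / k raw bytes per byte of user data and survives the loss of any m chunks. n-way replication stores n raw bytes per user byte and survives the loss of n - 1 copies. The function names and the parameter choices (3-way replication, RS(6, 3), RS(10, 4)) are illustrative assumptions, not values taken from this summary.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes stored per byte of user data under n-way replication."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return Fraction(copies)


def rs_overhead(k: int, m: int) -> Fraction:
    """Raw bytes stored per byte of user data under RS(k, m).

    k is the number of data chunks, m the number of coding chunks.
    """
    if k < 1 or m < 0:
        raise ValueError("need k >= 1 and m >= 0")
    return Fraction(k + m, k)


if __name__ == "__main__":
    # Hypothetical configurations chosen only for illustration:
    # (label, storage overhead, number of tolerated losses).
    schemes = [
        ("3-way replication", replication_overhead(3), 2),
        ("RS(6, 3)", rs_overhead(6, 3), 3),
        ("RS(10, 4)", rs_overhead(10, 4), 4),
    ]
    for name, overhead, failures in schemes:
        print(f"{name}: {float(overhead):.2f}x raw storage, "
              f"tolerates {failures} lost chunks/copies")
```

Under these assumptions RS(6, 3) needs half the raw capacity of 3-way replication while tolerating one more loss. The sketch covers capacity only and says nothing about the I/O, CPU, or network costs listed in the other findings.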
Abstract
Large-scale systems with arrays of solid state disks (SSDs) have become increasingly common in many computing segments. To make such systems resilient, we can adopt erasure coding such as Reed-Solomon (RS) code as an alternative to replication because erasure coding can offer a significantly lower storage cost than replication. To understand the impact of using erasure coding on system performance and other system aspects such as CPU utilization and network traffic, we build a storage cluster consisting of approximately one hundred processor cores with more than fifty high-performance SSDs, and evaluate the cluster with a popular open-source distributed parallel file system, Ceph. Then we analyze behaviors of systems adopting erasure coding from the following five viewpoints, compared with those of systems using replication: (1) storage system I/O performance; (2) computing and software…
