Pangolin: A Fault-Tolerant Persistent Memory Programming Library
Lu Zhang, Steven Swanson

TL;DR
Pangolin is a fault-tolerant library for non-volatile memory that ensures data integrity with minimal overhead, enabling reliable persistent data structures with high performance.
Contribution
Pangolin introduces a fault-tolerance mechanism for non-volatile main memory (NVMM) that combines checksums, parity, and micro-buffering. It protects objects of any size with low overhead and detects errors online.
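To make the checksum-plus-parity idea concrete, here is a minimal sketch in Python. It is not Pangolin's API or its actual on-media layout. The function names are hypothetical, CRC32 stands in for whatever checksum the library uses, and all objects are assumed to be the same size. A per-object checksum detects which object is corrupted, and a single XOR parity block rebuilds it from the surviving objects.

```python
import zlib
from functools import reduce


def checksum(data: bytes) -> int:
    # CRC32 is a stand-in here; the checksum the library actually uses may differ.
    return zlib.crc32(data)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    # Byte-wise XOR of two equal-length buffers.
    return bytes(x ^ y for x, y in zip(a, b))


def compute_parity(objects: list[bytes]) -> bytes:
    # XOR of all objects (assumed equal-sized for this sketch).
    return reduce(xor_bytes, objects)


def recover(objects: list[bytes], checksums: list[int], parity: bytes) -> list[bytes]:
    """Return a repaired copy of `objects`, fixing at most one corrupted entry."""
    bad = [i for i, (obj, c) in enumerate(zip(objects, checksums))
           if checksum(obj) != c]
    if not bad:
        return list(objects)
    if len(bad) > 1:
        raise ValueError("a single parity block cannot repair more than one object")
    i = bad[0]
    survivors = [o for j, o in enumerate(objects) if j != i]
    # parity XOR all surviving objects yields the missing object.
    rebuilt = reduce(xor_bytes, survivors, parity)
    if checksum(rebuilt) != checksums[i]:
        raise ValueError("rebuilt object fails its checksum; parity may be damaged")
    repaired = list(objects)
    repaired[i] = rebuilt
    return repaired


# Usage: corrupt one object, then detect and repair it.
original = [b"alpha___", b"bravo___", b"charlie_"]
sums = [checksum(o) for o in original]
parity = compute_parity(original)

damaged = list(original)
damaged[1] = b"XXXXXXXX"  # simulated media error
restored = recover(damaged, sums, parity)
```

The checksum and the parity play different roles in this sketch: the checksum only locates the damage, while the parity supplies the redundancy needed to undo it. That split is why one small parity block can cover many objects.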
Findings
Achieves performance comparable to existing solutions
Incurs only 1% storage overhead for gigabyte-scale pools
Provides automatic detection and recovery from data corruption
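The 1% figure is easier to picture with a little arithmetic. The sketch below assumes a layout in which one parity row protects a fixed number of equal-sized data rows. The row count of 100 is a hypothetical choice that yields 1%, not a figure taken from the summary above, and per-object checksum bytes are ignored.

```python
def parity_overhead(data_rows: int, parity_rows: int = 1) -> float:
    # Extra storage as a fraction of the protected data.
    # Assumes all rows are the same size (hypothetical layout).
    if data_rows <= 0:
        raise ValueError("data_rows must be positive")
    return parity_rows / data_rows


def parity_bytes(pool_bytes: int, data_rows: int, parity_rows: int = 1) -> float:
    # Absolute parity cost for a pool holding `pool_bytes` of data.
    return pool_bytes * parity_overhead(data_rows, parity_rows)


one_gib = 2 ** 30
overhead = parity_overhead(100)    # fraction of extra space with 100 data rows
extra = parity_bytes(one_gib, 100) # parity bytes for 1 GiB of data
```

Under these assumptions the overhead depends only on the ratio of parity rows to data rows, so a wider parity group lowers the cost but also leaves more data exposed to a second error in the same group.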
Abstract
Non-volatile main memory (NVMM) allows programmers to build complex, persistent, pointer-based data structures that can offer substantial performance gains over conventional approaches to managing persistent state. This programming model removes the file system from the critical path, which improves performance, but it also places these data structures out of reach of file system-based fault-tolerance mechanisms (e.g., block-based checksums or erasure coding). Without fault tolerance, using NVMM to hold critical data will be much less attractive. This paper presents Pangolin, a fault-tolerant persistent object library designed for NVMM. Pangolin uses a combination of checksums, parity, and micro-buffering to protect an application's objects from both media errors and corruption due to software bugs. It provides these protections for objects of any size and supports automatic, online…
Taxonomy
Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Distributed systems and fault tolerance
