Billion-files File Systems (BfFS): A Comparison
Sohail Shaikh

TL;DR
This paper compares the performance and limitations of popular Linux filesystems by creating and analyzing one billion files, providing insights into throughput, storage efficiency, and degradation effects.
Contribution
It offers a comprehensive empirical comparison of five Linux filesystems at an unprecedented scale of one billion files, highlighting their performance characteristics and limitations.
Findings
XFS and BtrFS show higher throughput for large-scale file operations.
Filesystem performance degrades both during and after the creation of one billion files.
Storage efficiency varies across filesystems, affecting space utilization.
Abstract
The volume of data being produced is growing at an exponential rate and must be processed quickly, so the data needs to be available very close to the compute devices to reduce transfer latency. Because of this need, local filesystems are receiving close attention aimed at understanding their inner workings, their performance, and, more importantly, their limitations. This study analyzes a few popular Linux filesystems (EXT4, XFS, BtrFS, ZFS, and F2FS) by creating, storing, and then reading back one billion files from the local filesystem. The study also captures and analyzes read/write throughput, storage block usage, disk space utilization and overheads, and other metrics useful to system designers and integrators. Furthermore, the study explores side effects such as filesystem performance degradation during and after the creation of these large numbers of files and folders.
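The create, read-back, and measure workflow described in the abstract can be illustrated with a small Python sketch. This is not the study's benchmark harness: the function names (`shard_path`, `create_files`, `read_files`, `block_usage`), the two-level directory sharding, the one-byte payload, and the 10,000-file demo count are all assumptions chosen for illustration. The block accounting assumes Linux, where `st_blocks` is reported in 512-byte units.

```python
import os
import tempfile
import time


def shard_path(root, index, fanout=1000):
    """Map a file index to a two-level directory path (hypothetical layout)."""
    top = index // (fanout * fanout)
    mid = (index // fanout) % fanout
    return os.path.join(root, f"{top:04d}", f"{mid:04d}", f"f{index:010d}")


def create_files(root, count, payload=b"x", fanout=1000):
    """Create `count` small files and return the creation rate in files/second."""
    last_dir = None
    start = time.perf_counter()
    for i in range(count):
        path = shard_path(root, i, fanout)
        parent = os.path.dirname(path)
        if parent != last_dir:  # only touch the directory tree when it changes
            os.makedirs(parent, exist_ok=True)
            last_dir = parent
        with open(path, "wb") as fh:
            fh.write(payload)
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def read_files(root, count, fanout=1000):
    """Read every file back; return (files/second, total bytes read)."""
    total = 0
    start = time.perf_counter()
    for i in range(count):
        with open(shard_path(root, i, fanout), "rb") as fh:
            total += len(fh.read())
    elapsed = time.perf_counter() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return rate, total


def block_usage(root):
    """Return (logical bytes, allocated bytes) summed over regular files.

    Assumes Linux semantics: st_blocks counts 512-byte units.
    """
    logical = allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.stat(os.path.join(dirpath, name))
            logical += st.st_size
            allocated += st.st_blocks * 512
    return logical, allocated


if __name__ == "__main__":
    count = 10_000  # demo scale only; the study itself used one billion files
    with tempfile.TemporaryDirectory() as root:
        print(f"create: {create_files(root, count):.0f} files/s")
        rate, total = read_files(root, count)
        print(f"read:   {rate:.0f} files/s ({total} bytes)")
        logical, allocated = block_usage(root)
        print(f"logical={logical} B, allocated={allocated} B")
```

To compare filesystems with a sketch like this, `root` would point at a directory on a mount of the filesystem under test instead of a temporary directory. Results at this small scale are dominated by the page cache, so they say little about behavior at one billion files.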
Taxonomy
Topics: Advanced Data Storage Technologies
