The Design, Implementation, and Deployment of a System to Transparently Compress Hundreds of Petabytes of Image Files for a File-Storage Service
Daniel Reiter Horn, Ken Elkabany, Chris Lesniewski-Laas, Keith Winstein

TL;DR
This paper presents Lepton, a fault-tolerant system that losslessly compresses JPEG images and decodes them more than nine times faster than the best prior work, which made large-scale deployment on Dropbox's file-storage backend practical.
Contribution
Lepton introduces a parallelized arithmetic coding approach to JPEG compression, achieving faster decoding while maintaining high compression efficiency.
Findings
Compressed over 203 PiB of images, saving 46 PiB of storage.
Decodes more than nine times faster than the best prior work, in a streaming manner, while matching its compression efficiency.
Deployed on the Dropbox file-storage backend for a year and released as open-source software.
Abstract
We report the design, implementation, and deployment of Lepton, a fault-tolerant system that losslessly compresses JPEG images to 77% of their original size on average. Lepton replaces the lowest layer of baseline JPEG compression (a Huffman code) with a parallelized arithmetic code, so that the exact bytes of the original JPEG file can be recovered quickly. Lepton matches the compression efficiency of the best prior work, while decoding more than nine times faster and in a streaming manner. Lepton has been released as open-source software and has been deployed for a year on the Dropbox file-storage backend. As of February 2017, it had compressed more than 203 PiB of user JPEG files, saving more than 46 PiB.
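The deployment totals in the abstract can be checked against the quoted 77% average with a little arithmetic. The sketch below is only a sanity check: the variable and function names are illustrative, and because both totals are stated as "more than", the resulting ratio is approximate.

```python
# Figures quoted in the abstract. Both are given as "more than", so the
# ratios computed here are approximations, not exact deployment statistics.
original_pib = 203.0  # user JPEG data compressed as of February 2017
saved_pib = 46.0      # storage saved by compression


def savings_fraction(original: float, saved: float) -> float:
    """Return the fraction of the original bytes removed by compression."""
    return saved / original


fraction_saved = savings_fraction(original_pib, saved_pib)
fraction_remaining = 1.0 - fraction_saved

print(f"saved:     {fraction_saved:.1%}")      # saved:     22.7%
print(f"remaining: {fraction_remaining:.1%}")  # remaining: 77.3%
```

The roughly 77% of bytes remaining agrees with the average compressed size reported in the abstract.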
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed and Parallel Computing Systems · Advanced Image and Video Retrieval Techniques
