Improving Raw Image Storage Efficiency by Exploiting Similarity
Binqi Zhang, Chen Wang, Bing Bing Zhou, Albert Y. Zomaya

TL;DR
This paper explores how leveraging content-based similarity among raw images can enhance compression efficiency and facilitate faster data retrieval in large-scale storage systems.
Contribution
It shows that photo tags and local features can identify content-similar raw images, which then compress more efficiently and can be stored together in distributed storage environments.
Findings
Higher image similarity correlates with better compression results.
Storing similar images together reduces fragmentation and speeds up retrieval.
Content-based similarity measures can guide storage optimization.
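As an illustration of how a tag-based similarity measure could guide placement, here is a minimal sketch. It assumes Jaccard similarity over tag sets and a greedy grouping rule; the paper summary does not say which measure or grouping strategy the authors used, and the names `jaccard`, `group_by_tags`, and the `threshold` value are hypothetical.

```python
def jaccard(tags_a, tags_b):
    """Jaccard similarity of two tag collections: |A & B| / |A | B|."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # convention chosen here: two untagged photos are not "similar"
    return len(a & b) / len(a | b)


def group_by_tags(photos, threshold=0.5):
    """Greedily group photos whose tags resemble a group's first member.

    `photos` maps a photo name to its list of tags. The threshold is an
    illustrative assumption, not a value taken from the paper.
    """
    groups = []
    for name, tags in photos.items():
        for group in groups:
            seed_tags = photos[group[0]]
            if jaccard(tags, seed_tags) >= threshold:
                group.append(name)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([name])
    return groups
```

Photos that end up in the same group would be candidates for being stored and compressed together.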
Abstract
To improve temporal and spatial storage efficiency, researchers have intensively studied various techniques, including compression and deduplication. Through our evaluation, we find that methods such as photo tags or local features help to identify the content-based similarity between raw images. The images can then be compressed more efficiently to achieve better storage space savings. Furthermore, when the images are stored in a distributed, large-scale environment, storing similar raw images together reduces fragmentation and so enables rapid data sorting, searching and retrieval. In this paper, we evaluated compressibility by designing experiments and observing the results. We found that, on a statistical basis, the more similar the photos are, the better the compression results. This research provides a clue for future large-scale storage system design.
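The relationship between similarity and compressibility can be illustrated with a small experiment. The sketch below is not the authors' setup: it assumes synthetic byte buffers as stand-ins for raw images and uses `zlib` as the compressor, neither of which is stated in the abstract. The helper names are hypothetical.

```python
import random
import zlib


def random_buffer(rng, size):
    """A pseudo-random byte buffer standing in for raw image data (assumption)."""
    return bytes(rng.getrandbits(8) for _ in range(size))


def perturb(rng, data, changes):
    """Return a copy of `data` with `changes` bytes flipped, i.e. a 'similar' buffer."""
    out = bytearray(data)
    for pos in rng.sample(range(len(out)), changes):
        out[pos] ^= 0xFF  # XOR with 0xFF guarantees the byte actually changes
    return bytes(out)


def compressed_size(*buffers):
    """Size in bytes after compressing the buffers concatenated as one stream."""
    return len(zlib.compress(b"".join(buffers), 9))


rng = random.Random(0)
size = 8 * 1024  # kept below zlib's 32 KiB window so the second buffer can reference the first

base = random_buffer(rng, size)
similar = perturb(rng, base, changes=80)  # roughly 1% of bytes differ
unrelated = random_buffer(rng, size)

size_similar = compressed_size(base, similar)
size_unrelated = compressed_size(base, unrelated)
print("similar pair:  ", size_similar, "bytes")
print("unrelated pair:", size_unrelated, "bytes")
```

Because the compressor can encode most of the second buffer as back-references into the first, the similar pair compresses to a smaller size than the unrelated pair. Real raw images and real codecs will behave differently in detail; this only shows the direction of the effect.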
Taxonomy
Topics: Advanced Data Storage Technologies · Advanced Image and Video Retrieval Techniques · Cloud Data Security Solutions
