Finding Synchronization Codes to Boost Compression by Substring Enumeration

Dany Vohl; Claude-Guy Quimper; Danny Dubé

arXiv:1605.08102·cs.IT·May 27, 2016·2 cites

Open Access

TL;DR

This paper introduces two constraint models that compute the shortest synchronization codes, i.e. those that add the fewest synchronization bits to the data, for use with Compression by Substring Enumeration (CSE).

Contribution

It presents novel constraint models for computing minimal synchronization codes for blocks up to 64 bits, improving CSE compression effectiveness.

Findings

01. Successfully computed shortest synchronization codes for blocks up to 64 bits.

02. Demonstrated that inserting synchronization codes improves CSE compression performance.

03. Provided a new approach to optimize synchronization code length for bit-oriented compression schemes.

Abstract

Synchronization codes are frequently used in numerical data transmission and storage. Compression by Substring Enumeration (CSE) is a new lossless compression scheme that has turned into a new and unusual application for synchronization codes. CSE is an inherently bit-oriented technique. However, since the usual benchmark files are all byte-oriented, CSE incurred a penalty due to a problem called phase unawareness. Subsequent work showed that inserting a synchronization code inside the data before compressing it improves the compression performance. In this paper, we present two constraint models that compute the shortest synchronization codes, i.e. those that add the fewest synchronization bits to the original data. We find synchronization codes for blocks of up to 64 bits.
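The phase unawareness mentioned above can be illustrated with a small sketch. It is not the paper's method and does not construct a synchronization code; it only shows, under the assumption that a bit-oriented scheme looks at substrings starting at every bit offset, how such a scheme can see patterns that never occur at byte boundaries. The helper names `to_bits` and `count_occurrences` are hypothetical and chosen for this illustration.

```python
def to_bits(data: bytes) -> str:
    """Render bytes as a string of '0'/'1' characters, most significant bit first."""
    return "".join(f"{byte:08b}" for byte in data)


def count_occurrences(bits: str, pattern: str, step: int = 1) -> int:
    """Count occurrences of `pattern` in `bits`, testing every `step`-th start offset.

    step=1 inspects every bit offset (a phase-unaware view);
    step=8 inspects only byte-aligned offsets (a phase-aware view).
    """
    last_start = len(bits) - len(pattern)
    return sum(
        1
        for start in range(0, last_start + 1, step)
        if bits.startswith(pattern, start)
    )


# Two identical bytes 0x55 give the bit string 0101010101010101.
bits = to_bits(bytes([0x55, 0x55]))
pattern = to_bits(bytes([0xAA]))  # 10101010, a byte value absent from the data

unaligned = count_occurrences(bits, pattern, step=1)
aligned = count_occurrences(bits, pattern, step=8)
print(f"all bit offsets: {unaligned}, byte-aligned offsets only: {aligned}")
```

In this toy input the byte value 0xAA never appears in the data, yet its bit pattern is found at several unaligned offsets. A synchronization code inserted into the data is meant to let a bit-oriented compressor tell the phases apart; how the shortest such codes are computed is the subject of the paper's constraint models and is not shown here.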

Taxonomy

Topics: Algorithms and Data Compression · Cellular Automata and Applications · Advanced Data Storage Technologies