Text Compression using Abstract Numeration System on a Regular Language
Ryoma Sin'ya

TL;DR
This paper introduces an ANS-based text compression method built on regular languages. It shows that the average compression ratio is computable from the language, that compression runs in sublinear time in the length of the input string, and that the method extends to block-based compression.
Contribution
It defines a novel ANS-based compression scheme, analyzes its average compression ratio, and extends it to block-based compression using factorial languages.
Findings
Average compression ratio is computable from the language
ANS-based compression runs in sublinear time with respect to the length of the input string
Extension to block-based compression with factorial languages
Abstract
An abstract numeration system (ANS) is a numeration system that provides a one-to-one correspondence between the natural numbers and a regular language. In this paper, we define an ANS-based compression as an extension of this correspondence. In addition, we show the following results: 1) an average compression ratio is computable from a language, 2) an ANS-based compression runs in sublinear time with respect to the length of the input string, and 3) an ANS-based compression can be extended to block-based compression using a factorial language.
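The one-to-one correspondence described in the abstract can be illustrated with a small sketch. It is not the paper's compression scheme, only the underlying numbering of a regular language, and it rests on assumptions not taken from the source: the words are enumerated in length-then-lexicographic order, and the example language (words over {a, b} with no two consecutive b's) and all names such as `DELTA`, `rank`, and `unrank` are hypothetical choices made for illustration.

```python
from functools import lru_cache

# Hypothetical example DFA: words over {a, b} without the factor "bb".
# State 0: last letter was not 'b'; state 1: last letter was 'b'.
ALPHABET = "ab"                      # letters in increasing order
DELTA = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0}  # missing entries reject
START = 0
ACCEPT = {0, 1}


@lru_cache(maxsize=None)
def count(state, n):
    """Number of words of length n accepted when starting from `state`."""
    if n == 0:
        return 1 if state in ACCEPT else 0
    return sum(count(DELTA[(state, c)], n - 1)
               for c in ALPHABET if (state, c) in DELTA)


def rank(word):
    """Index of `word` in the language, ordered by length, then lexicographically."""
    r = sum(count(START, k) for k in range(len(word)))  # all shorter words
    state = START
    for i, ch in enumerate(word):
        rest = len(word) - i - 1
        for c in ALPHABET:
            if c == ch:
                break
            if (state, c) in DELTA:
                # words sharing the prefix so far but continuing with a smaller letter
                r += count(DELTA[(state, c)], rest)
        if (state, ch) not in DELTA:
            raise ValueError("word is not in the language")
        state = DELTA[(state, ch)]
    if state not in ACCEPT:
        raise ValueError("word is not in the language")
    return r


def unrank(n):
    """Inverse of rank; assumes the language is infinite so the loop terminates."""
    length = 0
    while n >= count(START, length):
        n -= count(START, length)
        length += 1
    state, out = START, []
    for rest in range(length - 1, -1, -1):
        for c in ALPHABET:
            if (state, c) not in DELTA:
                continue
            k = count(DELTA[(state, c)], rest)
            if n < k:
                out.append(c)
                state = DELTA[(state, c)]
                break
            n -= k
    return "".join(out)


print([unrank(i) for i in range(6)])  # ['', 'a', 'b', 'aa', 'ab', 'ba']
print(rank("bab"))                    # 10
```

In this sketch every word of the language gets exactly one natural number and vice versa, which is the property the abstract attributes to an ANS. Writing that number in a denser numeration, for example binary, is one way to see how such a correspondence could shorten strings drawn from a restricted language.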
Taxonomy
Topics: Semigroups and automata theory · Algorithms and Data Compression · Computability, Logic, AI Algorithms
