Two-Dimensional Source Coding by Means of Subblock Enumeration
Takahiro Ota, Hiroyoshi Morita

Takahiro Ota
Dept. of Computer & Systems Engineering
Nagano Prefectural Institute of Technology
813-8, Shimonogo, Ueda, Nagano, 386-1211, JAPAN
Hiroyoshi Morita
Graduate School of Informatics and Engineering
The University of Electro-Communications
1-5-1, Chofugaoka, Chofu, Tokyo, 182-8585, JAPAN
Abstract
Lossless compression via substring enumeration (CSE) attains compression ratios comparable to those of popular lossless compressors for one-dimensional (1D) sources. The CSE encodes an input source using a probabilistic model built from the circular string of the source. The CSE is applicable to two-dimensional (2D) sources such as images by treating each line of pixels of the 2D source as a symbol of an extended alphabet. At the initial step of the CSE encoding process, however, the number of occurrences of every symbol of the extended alphabet must be output, so the time complexity increases exponentially as the source grows. To reduce the time complexity, we propose a new CSE that encodes a 2D source block by block instead of line by line. The proposed CSE uses the flat torus of an input 2D source, instead of the circular string of the source, as the probabilistic model for encoding. Moreover, we analyze the limit of the average codeword length of the proposed CSE for general sources.
I Introduction
In 2010, Dubé and Beaudoin proposed an efficient off-line data compression algorithm for binary sources known as Compression via Substring Enumeration (CSE) [1]. In [2], Yokoo proposed a universal CSE algorithm for binary sources, and various versions of the CSE for binary sources have been proposed since [3, 4, 5]. The performance of the CSE [4] is reported to be comparable to that of an efficient off-line data compression algorithm using the Burrows-Wheeler transformation (BWT) [6]. In [7], it is proved that, for a binary source, the encoder of the CSE, which is a deterministic finite automaton, is isomorphic to the encoder without sinks of antidictionary coding [8]. Moreover, an antidictionary coding proposed in [9] provided the first CSE for -ary () alphabet sources as a byproduct. Iwata and Arimura proposed a modified algorithm and evaluated the maximum redundancy rate of the CSE for th order Markov sources [10].
To encode an input source, the CSE uses a probabilistic model built from the circular string, which is obtained by joining the last symbol of the source to its first symbol. A probabilistic model of the circular string is also useful for the BWT and antidictionary coding [7, 9], and [11] shows that an antidictionary built from the circular string is useful for comparing genomes such as deoxyribonucleic acid (DNA) sequences. However, for a 2D source such as an image, the computational time of the CSE is exponential in the line length because the CSE works line by line: it treats each line of the 2D source as a symbol of an extended alphabet, and at the initial step of the encoding process it must output the frequencies of all symbols of the extended alphabet.
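To illustrate the circular-string model, the following minimal sketch counts how often a pattern occurs when a 1D source is read cyclically. The function name `circular_count` and the example string are assumptions made for illustration and are not taken from the paper.

```python
def circular_count(x: str, w: str) -> int:
    """Count the start positions of pattern w in the circular string of x."""
    n, k = len(x), len(w)
    if k > n:
        # Assumption of this sketch: patterns longer than the source are not counted.
        return 0
    # Indices wrap around modulo n, so the last symbol is followed by the first.
    return sum(
        all(x[(i + t) % n] == w[t] for t in range(k))
        for i in range(n)
    )


x = "0110"
print(circular_count(x, "00"))  # the two zeros are adjacent only through the wrap-around
print(sum(circular_count(x, w) for w in ("00", "01", "10", "11")))  # equals len(x)
```

Because every position of the circular string starts exactly one pattern of each length, the counts of all patterns of a fixed length sum to the source length.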
To reduce the computational time, we propose a new CSE for 2D sources that uses the flat torus of an input 2D source as a probabilistic model instead of the circular string of the source. In the initial step, the total number of output blocks is constant because the new CSE works block by block. Moreover, we evaluate the limit of the average codeword length of the proposed algorithm for general sources.
II Basic Notations and Definitions
II-A Alphabet and Block
Let be a finite source alphabet and let be the cardinality of , that is . Let be the set of all finite blocks over , where is the element of at -coordinate. Furthermore, let be where includes the empty block when at least one of and is $0$. For convenience, and are defined as and , respectively. For , let and be the length of row (the height) and the length of column (the width), respectively. For example, when , Fig. 2 illustrates where .
II-B Subblock, Concatenation, and Dictionary
For , a subblock is defined as
[TABLE]
where , , , and . Hereafter, unless stated otherwise, we assume that the height and width of are respectively given by and . In particular, subblocks and are denoted by and , respectively. Moreover, subblocks and are denoted by and , respectively. For example, for in Fig. 2, Fig. 2 shows , , , and from the left-hand side.
For , the dictionary of is defined as the set of all the subblocks of , that is,
[TABLE]
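As an illustration of the dictionary, the following minimal sketch enumerates every non-empty rectangular subblock of a small block. The names `subblock` and `dictionary`, the 1-based inclusive coordinates, and the omission of the empty block are assumptions of this sketch rather than the paper's notation.

```python
from typing import Set, Tuple

Block = Tuple[Tuple[int, ...], ...]  # a block stored as a tuple of rows


def subblock(x: Block, i1: int, i2: int, j1: int, j2: int) -> Block:
    """Rows i1..i2 and columns j1..j2 of x (1-based, inclusive; an assumption)."""
    return tuple(row[j1 - 1:j2] for row in x[i1 - 1:i2])


def dictionary(x: Block) -> Set[Block]:
    """Set of all non-empty subblocks of x (the empty block is omitted here)."""
    m, n = len(x), len(x[0])
    return {
        subblock(x, i1, i2, j1, j2)
        for i1 in range(1, m + 1) for i2 in range(i1, m + 1)
        for j1 in range(1, n + 1) for j2 in range(j1, n + 1)
    }


x = ((0, 1),
     (1, 0))
print(len(dictionary(x)))  # 7 distinct subblocks
```

For this 2 x 2 example the seven subblocks are two 1 x 1 blocks, two 1 x 2 blocks, two 2 x 1 blocks, and the block itself.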
We now define the column-wise concatenation of blocks as follows: for two blocks such that , define to be the block obtained by concatenating at the end of in columns. Similarly, we define the row-wise concatenation of blocks as follows: for two blocks such that , define to be the block obtained by concatenating at the end of in rows.
II-C Flat Torus, Primitive, and Frequencies of Subblocks
For , a flat torus of , denoted by , is constructed by joining the leftmost column (resp. the top row) to the rightmost column (resp. the bottom row) of . The flat torus can be treated as an infinite pattern such that for non-negative integer .
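The wrap-around behaviour of the flat torus can be sketched with modular indexing. The helper name `torus_window` and the 0-based top-left coordinates are assumptions made for this illustration only.

```python
from typing import Tuple

Block = Tuple[Tuple[int, ...], ...]  # a block stored as a tuple of rows


def torus_window(x: Block, i: int, j: int, h: int, w: int) -> Block:
    """Read an h-by-w window whose top-left corner is (i, j) on the flat torus of x.

    Row and column indices wrap modulo the height and width of x, so the window
    may run past the bottom row or the rightmost column.
    """
    m, n = len(x), len(x[0])
    return tuple(
        tuple(x[(i + di) % m][(j + dj) % n] for dj in range(w))
        for di in range(h)
    )


x = ((0, 1, 1),
     (1, 0, 0))
print(torus_window(x, 1, 2, 2, 2))  # ((0, 1), (1, 0)), wrapping in both directions
```

Shifting the corner by a multiple of the height or the width returns the same window, which is the sense in which the torus behaves like an infinite periodic pattern.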
For and , if there exist positive integers and such that is satisfied, then the equivalence relation is denoted as . Note that is a subblock of . Let be the set of all the blocks such that ,
[TABLE]
If , is called primitive. Hereafter, unless stated otherwise, we assume that is primitive. For example, shown in Fig. 2 is primitive.
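One plausible reading of primitivity can be sketched as follows, under the assumption that a block of height m and width n is primitive exactly when all m·n cyclic shifts of it on the flat torus are distinct. Both this reading and the helper names `cyclic_shifts` and `is_primitive` are assumptions of the sketch; the defining equation above takes precedence.

```python
from typing import Set, Tuple

Block = Tuple[Tuple[int, ...], ...]  # a block stored as a tuple of rows


def cyclic_shifts(x: Block) -> Set[Block]:
    """All blocks obtained by cyclically shifting the rows and columns of x."""
    m, n = len(x), len(x[0])
    return {
        tuple(
            tuple(x[(i + di) % m][(j + dj) % n] for dj in range(n))
            for di in range(m)
        )
        for i in range(m) for j in range(n)
    }


def is_primitive(x: Block) -> bool:
    """Assumed criterion: every one of the m*n shifts yields a different block."""
    return len(cyclic_shifts(x)) == len(x) * len(x[0])


print(is_primitive(((0, 0), (0, 1))))  # True: the four shifts are all different
print(is_primitive(((0, 1), (1, 0))))  # False: a diagonal shift reproduces the block
```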
For and ( and ,
[TABLE]
where ( or ). For convenience, we often adopt the notation instead of . For , , and ,
[TABLE]
Moreover, for and ,
[TABLE]
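Frequencies of subblocks on the flat torus can be illustrated with a brute-force count. The sketch below assumes that the frequency of a subblock is the number of top-left positions of the source at which it occurs with wrap-around; the names `torus_count` and `append_column` are hypothetical. It also checks one consistency relation that such counts satisfy: the count of a block equals the sum of the counts of all its one-column extensions.

```python
from itertools import product
from typing import Tuple

Block = Tuple[Tuple[int, ...], ...]  # a block stored as a tuple of rows


def torus_count(x: Block, v: Block) -> int:
    """Number of positions (i, j) of x at which v occurs on the flat torus of x."""
    m, n = len(x), len(x[0])
    h, w = len(v), len(v[0])
    return sum(
        all(x[(i + di) % m][(j + dj) % n] == v[di][dj]
            for di in range(h) for dj in range(w))
        for i in range(m) for j in range(n)
    )


def append_column(v: Block, col: Tuple[int, ...]) -> Block:
    """Extend v by one column on the right; col holds one symbol per row of v."""
    return tuple(row + (c,) for row, c in zip(v, col))


x = ((0, 1, 1),
     (1, 0, 0))
v = ((1,),)
extended = sum(
    torus_count(x, append_column(v, col))
    for col in product((0, 1), repeat=len(v))
)
print(torus_count(x, v), extended)  # both are 3
```

Relations of this kind are what allow a decoder to infer some counts from counts it already knows, so those counts need not be transmitted.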
II-D Classifications of Flat Tori and Core
For and , and ,
[TABLE]
For example, . For and fixed , is monotone decreasing with , that is . Similarly, for fixed and , . Next, we define ,
[TABLE]
We assume that the elements of are ordered by ascending height (elements of equal height are ordered by width, and elements of equal height and width are ordered lexicographically column-wise), where is the th element of . For ,
[TABLE]
For example, . For , is monotone decreasing with , that is .
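The ordering of blocks described above (by height, then by width, then lexicographically column by column) can be sketched as a sort key. The function name `block_order_key` and the sample blocks are assumptions made for illustration.

```python
from typing import List, Tuple

Block = Tuple[Tuple[int, ...], ...]  # a block stored as a tuple of rows


def block_order_key(v: Block):
    """Sort key: height first, then width, then the symbols read column by column."""
    h, w = len(v), len(v[0])
    column_wise = tuple(v[i][j] for j in range(w) for i in range(h))
    return (h, w, column_wise)


blocks: List[Block] = [
    ((1, 0),),
    ((0,), (1,)),
    ((0,),),
    ((0, 1), (1, 0)),
    ((1,),),
]
for v in sorted(blocks, key=block_order_key):
    print(v)
```

With this key the 1 x 1 blocks come first, followed by the 1 x 2 block, the 2 x 1 block, and finally the 2 x 2 block.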
A such that where is called a c-core. A such that where is called an r-core.
III Review of Conventional CSE
The conventional CSE is a lossless compression algorithm for a 1D source. For , we can regard as a 1D source over an extended alphabet , so that the CSE can encode as a 1D source . For , the CSE outputs the following triplet
[TABLE]
In (9), represents encoded by means of the Elias integer code [12], and rank() represents an index identifying in , such as the rank of in in lexicographical order. Then (rank()) represents rank() encoded with bits, and represents the sequence of encoded by an entropy coder, where represents in this subsection. In encoding, for , is selected from 2 to since and is encoded as . For ,
(C-i) in case of : Encode if ,
(C-ii) in case of : Encode if (10) holds and where such that
where is the element of having the largest index in and note that (10) was first shown in [10]. Note that in (C-i), is encoded even if .
In (C-i), can be calculated by using (3) and already encoded . Similarly, in (C-ii), such that or can be calculated by using (4) and . Therefore, they are not encoded.
[TABLE]
As for in (C-ii), satisfying (10) is equivalent to being a c-core. Moreover, since and (3) holds, the number of candidates of for encoding in (C-ii) is of polynomial order in . The details are given at the end of this section. In (C-i), satisfies the following inequality
[TABLE]
In (C-ii), satisfies the following inequality [9]
[TABLE]
The left-hand side term in (10) is given by the difference between the 3rd term and the 1st term in (12). Therefore, if (10) does not hold, then the 1st and the 3rd terms are equal. In other words, holds, so that can be calculated. Hence, is not encoded if (10) does not hold.
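The idea that a count needs no encoding once its lower and upper bounds coincide can be illustrated in the simplest setting, the binary 1D CSE of [1]. The sketch below is not the paper's inequality (12). It uses the well-known binary bounds max(0, C(0w) - C(w1)) <= C(0w0) <= min(C(0w), C(w0)) on circular counts, and the names `ccount` and `bounds` are hypothetical.

```python
from itertools import product
from typing import Tuple


def ccount(x: str, w: str) -> int:
    """Occurrences of w in the circular string of x (0 if w is longer than x)."""
    n, k = len(x), len(w)
    if k > n:
        return 0
    return sum(all(x[(i + t) % n] == w[t] for t in range(k)) for i in range(n))


def bounds(x: str, w: str) -> Tuple[int, int]:
    """Bounds on C(0w0) computable from the counts of shorter strings."""
    lower = max(0, ccount(x, "0" + w) - ccount(x, w + "1"))
    upper = min(ccount(x, "0" + w), ccount(x, w + "0"))
    return lower, upper


# Exhaustive check on all binary strings of length 6 and all w of length 0 to 3.
determined = 0
for bits in product("01", repeat=6):
    x = "".join(bits)
    for k in range(4):
        for wb in product("01", repeat=k):
            w = "".join(wb)
            lower, upper = bounds(x, w)
            assert lower <= ccount(x, "0" + w + "0") <= upper
            determined += lower == upper
print(determined)  # number of cases in which the count is fixed by its bounds
```

Whenever the two bounds are equal, the decoder can compute the count by itself, which mirrors the argument that nothing is encoded when (10) does not hold.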
Let be where is the left-hand term of (10). For encoding by an entropy coding, a probability is assigned to as follows [2].
[TABLE]
The assigned probabilities are encoded by an entropy coding such as an arithmetic coding [13].
Encoding a 2D source with the conventional CSE poses a problem with respect to computational time. In (C-i), the number of encoded is exponential with respect to since is . In practice, is greater than 1000 for an image , so the number is greater than even if . Note that in (C-ii), the number of encoded is not exponential with respect to and , for the following reason. Since is a c-core, from (3) and (4), the total number of c-cores is of polynomial order with respect to and . Moreover, since and in (10), also hold. From (3) and (4), never exceeds . Hence, the total number of candidates for encoding in (C-ii) is of polynomial order with respect to and . In other words, the set of all the candidates can be used instead of in (C-ii) in practice; is used in this paper only to simplify the explanation. As for the compression ratio, only the relation on columns shown in (10) is used, and no relation on rows is used.
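The exponential growth in (C-i) can be made concrete with a short calculation. The values used below, a binary alphabet and a line of 1000 pixels, are assumptions chosen for illustration, and the function name `extended_alphabet_size` is hypothetical.

```python
def extended_alphabet_size(alphabet_size: int, line_length: int) -> int:
    """Number of distinct lines when each line is one symbol of the extended alphabet."""
    return alphabet_size ** line_length


size = extended_alphabet_size(2, 1000)
print(len(str(size)))     # 302 decimal digits
print(size > 10 ** 300)   # True
```

Even for a binary image, listing one frequency per possible line is therefore infeasible, which is the motivation for encoding block by block.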
IV Proposed Algorithm
For , we assume that . Let and be and , respectively.
We divide into four disjoint parts with respect to size of its elements.
[TABLE]
Elements of are ordered by ascending height (elements of equal height are ordered by width, and elements of equal height and width are ordered lexicographically column-wise). Then, elements of are reordered with . For ,
(P-i) in case of : Encode if ,
(P-ii) in case of :
1) if : Encode if (10) holds and where such that ,
2) if : Encode if (16) holds and where such that ,
3) if and : Encode if both (10) and (16) hold where and ,
where and are the element of and having the largest index in , respectively.
[TABLE]
As for in 2) and 3), satisfying (16) is equivalent to being an r-core. As shown in the discussion in Sec. III, the number of candidates of for encoding in (P-ii) is of polynomial order in and . The details are given at the end of this section.
The conventional CSE uses only condition (10), which concerns columns, while the proposed algorithm uses conditions (10) and (16), which concern columns and rows, respectively, for encoding . In 1) and 2), is one row and one column, respectively, so only (10) and only (16) are used. In (P-i), satisfies . In (P-ii), such that satisfies a modified (12) which is obtained by replacing by , and such that satisfies the following inequality
[TABLE]
As described on (10), similarly, the left-hand side term in (16) is given by the difference between the 3rd term and the 1st term in (17). Therefore, if (16) does not hold, then the 1st and the 3rd terms are equal. In other words, holds, so that can be calculated. Hence, is not encoded if (16) does not hold. Therefore, in 3), is encoded if both (10) and (16) hold.
Let be where is the left-hand term of (16). For encoding by an entropy coding, a probability is assigned to as follows.
[TABLE]
The assigned probabilities are encoded by an entropy coding such as an arithmetic coding. For , the proposed algorithm outputs the following quartet
[TABLE]
In (21), and represent and encoded by means of the Elias integer code, respectively, and rank() represents an index identifying in , such as the rank of in in column-wise lexicographical order. Then (rank()) represents rank() encoded with bits, and represents the sequence of encoded by an entropy coder as described in Sec. III.
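Encoding the height and width with an Elias integer code can be sketched as follows. The text does not say which Elias code is meant, so the choice of the Elias gamma code here is an assumption, as are the function names and the example dimensions.

```python
from typing import Tuple


def elias_gamma_encode(k: int) -> str:
    """Elias gamma codeword of a positive integer, as a string of '0' and '1'."""
    if k < 1:
        raise ValueError("Elias gamma coding is defined for positive integers only")
    binary = bin(k)[2:]                      # binary expansion without leading zeros
    return "0" * (len(binary) - 1) + binary  # unary length prefix, then the expansion


def elias_gamma_decode(bits: str) -> Tuple[int, str]:
    """Decode one codeword from the front of bits; return (value, remaining bits)."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    end = 2 * zeros + 1
    return int(bits[zeros:end], 2), bits[end:]


# Hypothetical dimensions: height 5 and width 12, sent back to back.
stream = elias_gamma_encode(5) + elias_gamma_encode(12)
height, rest = elias_gamma_decode(stream)
width, rest = elias_gamma_decode(rest)
print(stream, height, width)  # 001010001100 5 12
```

Because the gamma code is a prefix code, the two dimensions can be concatenated without a separator and still be decoded unambiguously.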
In the proposed algorithm, the number of encoded in (P-i) is , which is a constant, while that in (C-i) is exponential with respect to , that is, . As for (P-ii), the number of candidates for encoding is of polynomial order with respect to and , for the following reason. Case 1) is the same as (C-ii). In cases 2) and 3), since is an r-core, the discussion of c-cores in Sec. III shows that the total number of candidates for encoding is of polynomial order in and . In other words, the set of all the candidates can be used instead of in (P-ii) in practice; as before, is used in this paper only to simplify the explanation. Hence, for a 2D source , the total number of output blocks of the proposed algorithm is polynomial with respect to and , while that of the conventional CSE is exponential with respect to .
V Evaluation of the Proposed Algorithm
A general source is defined as
[TABLE]
where a random variable takes a value in the Cartesian product of [14]. The probability distribution of a random variable is denoted by . For , the sup-entropy rate of is defined as
[TABLE]
For , let be a codeword length of the proposed algorithm. Let be the total codeword length of , , and (rank()) in (21). The codeword length of consists of three parts , , and where , , and are the total codeword length of for , , and , respectively. Here, .
Theorem 1 is one of our main results. To prove Theorem 1, we show three lemmas. Lemma 2 is a 2D version of Lemma 3 [2], and the proofs of Lemmas 2 and 3 are omitted in this paper.
Theorem 1
For a general source ,
[TABLE]
Lemma 2
For , , and
[TABLE]
Lemma 3
If such that does not satisfy (10) or such that does not satisfy (16), then
Lemma 4
[TABLE]
Proof.
For , can be written as
[TABLE]
where and are and , respectively, and is a coordinate. For , let be . Moreover, can be written as where from (2). Since and are respectively and , converges to as and go to infinity. Since ,
[TABLE]
∎
(Proof of Theorem 1).
As for , from the assumption, since , where and are costs of Elias integer code for and (rank()), respectively. As for , the cost of in (P-i) is bits from (18), so that . As for , since and , costs of and are at most bits. Moreover, since and ,
[TABLE]
Therefore,
[TABLE]
As for , from (20), the cost of is bits.
The cost of the next encoded , where has been encoded immediately before, is . From Lemma 3, . Therefore, can be written as . Hence, the denominator for is equal to the previous numerator for , so they cancel. Moreover, since ,
[TABLE]
where is the index of the first block which is encoded by arithmetic coding. From Lemma 3, . Therefore,
[TABLE]
[TABLE]
Therefore,
[TABLE]
From Jensen’s inequality, . Therefore, from Lemma 4,
[TABLE]
[TABLE]
The proposed code is a prefix code, so that Kraft’s inequality is satisfied. Therefore, . ∎
From Remark 1.7.3 [14], if is a stationary source, can be expressed by , that is the entropy rate of . Therefore, if is a stationary source, the average codeword length of the proposed algorithm converges to as and go to infinity.
VI Conclusion
To reduce computational time, we proposed a new CSE for 2D sources that uses the flat torus of the source as a probabilistic model, whereas the conventional CSE uses the circular string of the source. The total number of output blocks of the new CSE is polynomial in the source size, while that of the conventional CSE is exponential. The new CSE encodes the source block by block, while the conventional CSE encodes it line by line. Moreover, we proved that an upper bound on the average codeword length of the proposed CSE converges to the sup-entropy rate for a general source as the size of the input source goes to infinity. Furthermore, if the general source is stationary, then the length converges to the entropy rate of the source as the size goes to infinity.
- [1] D. Dubé and V. Beaudoin, “Lossless data compression via substring enumeration,” in Proc. of the Data Compression Conference 2010, pp. 229–238, Mar. 2010.
- [2] H. Yokoo, “Asymptotic optimal lossless compression via the CSE technique,” in Proc. of the Data Compression, Communications and Processing 2011, pp. 11–18, June 2011.
- [3] D. Dubé and H. Yokoo, “The universality and linearity of compression by substring enumeration,” in Proc. of the 2012 IEEE International Symposium on Information Theory, pp. 1619–1623, Aug. 2011.
- [4] D. Dubé and V. Beaudoin, “Improving compression via substring enumeration by explicit phase awareness,” in Proc. of the Data Compression Conference 2014, pp. 26–28, Mar. 2014.
- [5] S. Kanai, H. Yokoo, K. Yamazaki, and H. Kaneyasu, “Efficient implementation and empirical evaluation of compression by substring enumeration,” IEICE Transactions on Fundamentals, vol. E99-A, no. 2, pp. 601–611, 2016.
- [6] M. Burrows and D. Wheeler, “A block-sorting lossless data compression algorithm,” SRC Research Report, pp. 73–93, May 1994.
- [7] T. Ota and H. Morita, “On antidictionary coding based on compacted substring automaton,” in Proc. of the 2013 IEEE International Symposium on Information Theory, pp. 1754–1758, July 2013.
- [8] M. Crochemore, F. Mignosi, A. Restivo, and S. Salemi, “Data compression using antidictionaries,” in Proc. of IEEE, pp. 1756–1768, Nov. 2000.
