The meet operation in the imbalance lattice of maximal instantaneous codes: alternative proof of existence
Stephan Foldes, D. Stott Parker, Sándor Radeleczki

TL;DR
This paper presents an alternative proof for the existence of greatest lower bounds in the imbalance order of binary maximal instantaneous codes, using a novel balancing operation instead of traditional methods.
Contribution
It introduces a new proof technique for the existence of greatest lower bounds in the imbalance lattice of maximal instantaneous codes.
Findings
Proof of existence of greatest lower bounds using a single balancing operation
Simplifies previous proofs by avoiding expansion and contraction
Provides a new perspective on the structure of the imbalance lattice
Abstract
An alternative proof is given of the existence of greatest lower bounds in the imbalance order of binary maximal instantaneous codes of a given size. These codes are viewed as maximal antichains of a given size in the infinite binary tree of 0-1 words. The proof proposed makes use of a single balancing operation instead of expansion and contraction as in the original proof of the existence of glb.
Keywords and phrases: Instantaneous code, prefix code, path-length sequence, imbalance lattice, balancing operation, Kraft sum, binary tree, canonical tree
AMS Classification 2010: Primary: 06A07, 94A45
1 Terminology of codes and introduction
The set of all finite sequences (words) of the symbols $0$ and $1$ is partially ordered by the prefix order $\le$ defined by
$$v \le w \iff w = vu \text{ for some word } u.$$
The prefix-ordered set of words is an infinite binary tree having the empty word as root. Instantaneous codes are defined as the finite antichains in this tree. (This finiteness shall be assumed throughout the paper, thus excluding infinite prefix-free sets.) By lexicographic (lex) order we mean the (only) linear extension of the prefix order in which words incomparable in the prefix order are compared by the "telephone book principle", i.e. $u0v$ always precedes (is smaller than) $u1w$. We call an instantaneous code lex monotone if the sequence of lengths of the codewords taken in lex order is monotone (non-decreasing). With respect to a given lex monotone instantaneous code whose codewords in lex order are $w_1, \dots, w_n$, each codeword $w_i$ is then identified by its index $i$.
It is well known that for every maximal instantaneous code there is a unique lex monotone maximal instantaneous code with the same multiset of codeword lengths. (See e.g. [3] for more general statements.) Thus the multiset of codeword lengths, displayed as a list of numbers with repetitions in non-decreasing order, can be used to denote the maximal instantaneous lex monotone code; these are the path-length sequences appearing in [7]. For example, the only lex monotone maximal instantaneous code of size $3$ is $\{0, 10, 11\}$, and its path-length sequence is $1, 2, 2$. The code can also be displayed by the binary tree of all prefixes of the codewords (called a *canonical tree* by Elsholtz, Heuberger and Prodinger in [1] under assumption of lex monotonicity), and the path-length sequence is then the sequence of lengths of root-to-leaf paths of this tree.
For any word $w$ we can use the simplified notation
$$2^{-w} = 2^{-|w|}$$
to denote the number obtained by raising $1/2$ to a power equal to the length $|w|$ of $w$. The Kraft sum of any instantaneous code $C$ is then the sum
$$\sum_{w \in C} 2^{-w}.$$
The Kraft sum is always at most $1$, and it is equal to $1$ if and only if the instantaneous code is maximal (Kraft [4]).
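The notions of this section can be made concrete with a small sketch. The helper names `canonical_code`, `kraft_sum` and `is_antichain` are illustrative and not taken from the paper; the sketch assumes that the input is a non-decreasing list of codeword lengths whose Kraft sum is 1, and it builds the corresponding lex monotone code by the usual canonical assignment of codewords.

```python
from fractions import Fraction

def canonical_code(lengths):
    """Return the lex monotone code (0-1 words in lex order) whose codeword
    lengths are the given non-decreasing path-length sequence."""
    words, value, prev = [], 0, 0
    for length in lengths:
        value <<= length - prev      # pad the running word with zeros
        words.append(format(value, "b").zfill(length) if length else "")
        value += 1                   # next word of this length in lex order
        prev = length
    return words

def kraft_sum(words):
    """Sum of 2^(-|w|) over the codewords, as an exact fraction."""
    return sum(Fraction(1, 2 ** len(w)) for w in words)

def is_antichain(words):
    """True if no codeword is a proper prefix of another one."""
    return not any(u != v and v.startswith(u) for u in words for v in words)

code = canonical_code([1, 2, 3, 3])
print(code)             # ['0', '10', '110', '111']
print(kraft_sum(code))  # 1
```

Exact fractions are used so that the test "Kraft sum equals 1" is not affected by floating-point rounding.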
2 Statements and proofs
With the above terminology and notation we rephrase the definition of imbalance order and the result that it is a lattice (Stott Parker and Prasad Ram [7]) as follows.
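**Definition of Imbalance Order by Majorization.** Let $\mathcal{L}_n$ be the set of lex monotone maximal instantaneous codes of a same given size $n$. For codes $A, B \in \mathcal{L}_n$ lexicographically enumerated
$$A = \{a_1, \dots, a_n\}, \qquad B = \{b_1, \dots, b_n\} \tag{1}$$
$A$ is said to be more balanced (less imbalanced) than or equal to $B$, in symbols $A \le B$, if for all $1 \le k \le n$ we have the following inequality for the partial Kraft sums:
$$\sum_{i=1}^{k} 2^{-a_i} \le \sum_{i=1}^{k} 2^{-b_i}.$$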
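As a hedged illustration of the majorization definition, the following sketch compares two codes given by their path-length sequences. The function names are hypothetical; the codes are assumed to be represented by non-decreasing length lists of equal size, which determines the lex monotone code uniquely.

```python
from fractions import Fraction
from itertools import accumulate

def partial_kraft_sums(lengths):
    """Partial Kraft sums of a code given by its path-length sequence."""
    return list(accumulate(Fraction(1, 2 ** l) for l in lengths))

def more_balanced_or_equal(a, b):
    """True if a <= b in the imbalance order, i.e. every partial Kraft sum
    of a is at most the corresponding partial Kraft sum of b."""
    assert len(a) == len(b), "codes must have the same size"
    return all(x <= y for x, y in zip(partial_kraft_sums(a), partial_kraft_sums(b)))

print(more_balanced_or_equal([2, 2, 2, 2], [1, 2, 3, 3]))  # True
print(more_balanced_or_equal([1, 2, 3, 3], [2, 2, 2, 2]))  # False

# A pair of size 7 that is incomparable in the imbalance order
a, b = [1, 3, 3, 4, 4, 4, 4], [2, 2, 2, 3, 4, 5, 5]
print(more_balanced_or_equal(a, b), more_balanced_or_equal(b, a))  # False False
```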
A characterization of the imbalance order via ternary exchanges was also given in [7], and in the sequel we shall give another characterization by comparing indices in the enumerations (1).
The size $|\mathcal{L}_n|$ of the poset $\mathcal{L}_n$ of lex monotone maximal instantaneous codes of size $n$ is an exponentially growing function of the parameter $n$, which has been the object of combinatorial studies since the 1960's (see [1], where further references are also given). While a closed formula for $|\mathcal{L}_n|$ is not available, Elsholtz, Heuberger and Prodinger [1] gave a new and very tight asymptotic estimate of $|\mathcal{L}_n|$.
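For small sizes the poset can simply be enumerated. The sketch below is not from the paper; it relies on the standard fact that every maximal code of size $n+1$ arises from one of size $n$ by replacing some codeword $w$ with $w0$ and $w1$, and it represents each code by its path-length sequence. The function name `path_length_sequences` is an assumption made for illustration.

```python
def path_length_sequences(n):
    """All path-length sequences (sorted tuples) of maximal codes of size n >= 2,
    obtained by repeatedly expanding one leaf of a smaller code."""
    current = {(1, 1)}
    for _ in range(n - 2):
        current = {
            tuple(sorted(seq[:i] + (seq[i] + 1, seq[i] + 1) + seq[i + 1:]))
            for seq in current
            for i in range(len(seq))
        }
    return sorted(current)

print(path_length_sequences(4))   # [(1, 2, 3, 3), (2, 2, 2, 2)]
print([len(path_length_sequences(n)) for n in range(2, 8)])  # [1, 1, 2, 3, 5, 9]
```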
**Lattice Property of Imbalance Order.** The imbalance-ordered set of lex monotone maximal instantaneous codes of the same given size is a lattice.
Due to the lattice property the construction of optimal codes becomes an optimization problem on a lattice. In a context very different from that of binary codes, the balance concept introduced in [7] has also been shown by O’Keefe, Pajoohesh and Schellekens [5] to be relevant in studying the efficiency of algorithms that involve a bifurcation at each step, as a root-to-leaf path in the decision tree of an algorithm corresponds to the succession of steps of the algorithm on a particular input, and path-length corresponds to running time on that input. Besides pointing to the analogy between the concepts of average codeword length and average running time, [5] also shows that, with the exception of the very small lattices, the imbalance lattices are not modular. Pajoohesh [6] characterizes the most balanced and the most imbalanced trees in terms of the semilattice structure of the trees themselves.
In the present paper an alternative proof of the above lattice property is given, based not on induction on the common size of the codes in the imbalance-ordered set of codes, but on applying the abstract Criterion below for a poset to be a lattice. An earlier alternative explanation of the lattice property, in fact closer to the techniques of the original proof of the result in [7], was given by two of the present authors in [2].
**Criterion for Lattice Property.** *For any finite partially ordered set with minimum and maximum the following conditions are equivalent:*
(i) the poset is a lattice,
(ii) for every pair of distinct elements $a, b$, one of them - say $a$ - can be replaced by a lesser element $a' < a$, such that $a'$ and $b$ have the same common lower bounds as $a$ and $b$,
(iii) for every pair of distinct elements $a, b$ there is an element $c$ that is less than $a$ or $b$, and such that $a, b, c$ have the same common lower bounds as $a, b$.
**Proof.** Condition (ii) obviously holds in any lattice, while the existence of a greatest lower bound of $a$ and $b$ is obtained, using condition (ii), by induction on the number of elements that are below at least one of $a$ and $b$. Condition (iii) is a re-phrasing of (ii).
The characterization of the imbalance order given below, by comparing indices, and a reduction lemma, shall make the above Criterion applicable. The characterization is based on the following description of the comparabilities between elements of any two finite maximal instantaneous codes:
**Interval Decomposition Lemma for Two Codes.** For any two maximal instantaneous codes $A$ and $B$ there is a unique positive integer $m$ and unique partitions of the lexicographically ordered codes into pairwise disjoint non-empty intervals consecutive in the lexicographic order
$$A = A_1 \cup \dots \cup A_m, \qquad B = B_1 \cup \dots \cup B_m \tag{2}$$
such that any words $a \in A_i$ and $b \in B_j$ are comparable in the prefix order if and only if $i = j$.
**Proof.** Two elements of $A$ belong to the same interval if and only if there is some element of $B$ comparable with both. The intervals of $B$ are defined similarly. The fact that this defines interval decompositions of the two codes with the same number of intervals is verified without difficulty. The claimed properties and uniqueness are also straightforward.
The interval decompositions (2) also have the following properties:
(i) for every $1 \le i \le m$, at least one of $A_i$ or $B_i$ is a singleton, both are singletons if and only if $A_i = B_i$, otherwise they are disjoint,
(ii) if the interval $A_i$ (respectively $B_i$) is a singleton, then its unique element is a prefix of the words in $B_i$ (respectively in $A_i$),
(iii) for every $1 \le i \le m$ we have the equality of the corresponding interval Kraft sums, $\sum_{a \in A_i} 2^{-a} = \sum_{b \in B_i} 2^{-b}$,
(iv) if $i < j$, $a \in A_i$, $b \in B_j$, then $a$ and $b$ are incomparable in the prefix order and $a$ precedes $b$ lexicographically.
With a view of referring to these interval decompositions in the sequel, we call the intervals $A_i$ (respectively $B_i$) in (2) the (comparability) blocks of $A$ with respect to $B$ (of $B$ with respect to $A$). A block $A_i$ (respectively $B_i$) is said to be dominating if it is a singleton but $B_i$ (respectively $A_i$) is not. In that case the sole element of $A_i$ (respectively of $B_i$) is a proper prefix of every word in $B_i$ (respectively in $A_i$). Note that if $A_i$ and $B_i$ are not coinciding singletons, then exactly one of them is a dominating block.
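The block decomposition can be computed by one simultaneous pass over the two codes. The following is a minimal sketch under the assumption that both inputs are maximal instantaneous codes given as lists of 0-1 strings in lex order; the function name `comparability_blocks` is illustrative. At each step the shorter of the two current words is a prefix of the other one, and the block on each side consists of the words extending it.

```python
from fractions import Fraction

def comparability_blocks(a_words, b_words):
    """Return the list of paired blocks (A_i, B_i) of two maximal codes,
    each code being given as a list of 0-1 words in lex order."""
    blocks, i, j = [], 0, 0
    while i < len(a_words) and j < len(b_words):
        a, b = a_words[i], b_words[j]
        root = a if len(a) <= len(b) else b   # sole element of a singleton block
        a_block, b_block = [], []
        while i < len(a_words) and a_words[i].startswith(root):
            a_block.append(a_words[i])
            i += 1
        while j < len(b_words) and b_words[j].startswith(root):
            b_block.append(b_words[j])
            j += 1
        blocks.append((a_block, b_block))
    return blocks

A = ["0", "100", "101", "1100", "1101", "1110", "1111"]    # lengths 1,3,3,4,4,4,4
B = ["00", "01", "10", "110", "1110", "11110", "11111"]    # lengths 2,2,2,3,4,5,5
for a_block, b_block in comparability_blocks(A, B):
    print(a_block, b_block)
```

In this example the first, second, third and fifth blocks contain a dominating singleton, while the fourth pair consists of coinciding singletons.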
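**Characterization of the Imbalance Order by Comparing Indices.** *For lexicographically enumerated maximal instantaneous codes*
$$A = \{a_1, \dots, a_n\}, \qquad B = \{b_1, \dots, b_n\}$$
*of the same size, we have $A \le B$ in the imbalance order ($A$ is more balanced than or equal to $B$) if and only if whenever $a_i$ and $b_j$ are comparable codewords in the prefix order, for their indices we have $i \ge j$.*
**Proof.** Suppose that $A \le B$ and for some codewords $a_i$ and $b_j$ comparable in the prefix order we have $i < j$. We shall derive a contradiction. Consider the interval decompositions for the two codes into blocks $A_1, \dots, A_m$ and $B_1, \dots, B_m$, as in (2). Due to the comparability of the two codewords, they belong to corresponding intervals, i.e. there is an index $h$ such that $a_i \in A_h$ and $b_j \in B_h$. If $A_h$ is a singleton, $A_h = \{a_i\}$, then
$$\sum_{s \le i} 2^{-a_s} = \sum_{b \in B_1 \cup \dots \cup B_h} 2^{-b} \ge \sum_{s \le j} 2^{-b_s} > \sum_{s \le i} 2^{-b_s}$$
contradicts majorization. If $B_h$ is a singleton, $B_h = \{b_j\}$, then majorization is contradicted by
$$\sum_{s \le j-1} 2^{-a_s} > \sum_{a \in A_1 \cup \dots \cup A_{h-1}} 2^{-a} = \sum_{s \le j-1} 2^{-b_s}.$$
Suppose conversely that whenever $a_i$ and $b_j$ are comparable codewords in the prefix order, for their indices we have $i \ge j$, but majorization fails for some index $k$,
$$\sum_{s \le k} 2^{-a_s} > \sum_{s \le k} 2^{-b_s}. \tag{3}$$
Let the block indices $u, v$ be determined by $a_k \in A_u$, $b_k \in B_v$. If $A_u$ is a singleton, $A_u = \{a_k\}$, then (3) requires that the lex last word $b_t$ in $B_u$ lexicographically follow $b_k$. Then $k < t$ contradicts the comparability of $a_k$ and $b_t$. If $B_v$ is a singleton, $B_v = \{b_k\}$, then (3) requires that $a_k$ lexicographically follow all words in $A_v$. Let $a_i$ be any word in $A_v$. Now $i < k$ contradicts the comparability of $a_i$ and $b_k$. Finally, if neither $A_u$ nor $B_v$ is a singleton, then $B_u = \{b_j\}$ is a singleton, and $j \le k$ by the comparability of $a_k$ and $b_j$, so that
$$\sum_{s \le k} 2^{-a_s} \le \sum_{a \in A_1 \cup \dots \cup A_u} 2^{-a} = \sum_{s \le j} 2^{-b_s} \le \sum_{s \le k} 2^{-b_s},$$
contradicting (3).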
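The two descriptions of the imbalance order can be compared directly on small codes. The sketch below is illustrative only: the function names are hypothetical, codes are represented by non-decreasing path-length sequences with all lengths at least 1, and the index test uses 0-based indices, which does not affect the comparison $i \ge j$.

```python
from fractions import Fraction
from itertools import accumulate

def canonical_code(lengths):
    """Lex monotone code (words in lex order) for a non-decreasing length list."""
    words, value, prev = [], 0, 0
    for length in lengths:
        value <<= length - prev
        words.append(format(value, "b").zfill(length))
        value += 1
        prev = length
    return words

def leq_by_majorization(a, b):
    """a <= b via the partial Kraft sums of the two path-length sequences."""
    sums = lambda seq: accumulate(Fraction(1, 2 ** l) for l in seq)
    return all(x <= y for x, y in zip(sums(a), sums(b)))

def leq_by_indices(a, b):
    """a <= b via indices: i >= j whenever a_i and b_j are prefix-comparable."""
    a_words, b_words = canonical_code(a), canonical_code(b)
    return all(
        i >= j
        for i, u in enumerate(a_words)
        for j, v in enumerate(b_words)
        if u.startswith(v) or v.startswith(u)
    )

a, b = (2, 2, 2, 2), (1, 2, 3, 3)
print(leq_by_majorization(a, b), leq_by_indices(a, b))  # True True
print(leq_by_majorization(b, a), leq_by_indices(b, a))  # False False
```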
**Reduction Lemma.** If $A, B$ are two distinct lexicographically monotone maximal instantaneous codes of the same size, then there is a lex monotone maximal instantaneous code $C$ that is (strictly) more balanced than at least one of $A$ or $B$, and such that in the imbalance order $A, B, C$ have the same common lower bounds as $A, B$.
**Proof.** Let the given codes be enumerated in lex order as
$$A = \{a_1, \dots, a_n\}, \qquad B = \{b_1, \dots, b_n\}. \tag{4}$$
Since $A$ and $B$ are distinct, there must exist elements $a \in A$ and $b \in B$ such that
(i) $a$ is a proper prefix of some element of $B$,
(ii) $b$ is a proper prefix of some element of $A$.
Without loss of generality we can assume that the first such $a$ lexicographically precedes the first such $b$. Let $q$ denote the index in $B$ of the lexicographically first element $b_q$ satisfying condition (ii). With $q$ thus fixed, let $p$ denote the index in $A$ of the lexicographically last element $a_p$ among those elements of $A$ which lexicographically precede $b_q$ and satisfy condition (i). Thus a word $a_p$ in $A$ has also been chosen, and it is easy to see that $p < q$.
With reference to the terminology of decompositions according to the Interval Decomposition Lemma for Two Codes, $b_q$ is the sole element of the first dominating block of $B$ with respect to $A$, and $a_p$ is the sole element of the last block of $A$ that is dominating and precedes all non-singleton blocks of $A$.
The code $C$ is now constructed as follows. It is obtained from $A$ by a single balancing operation, in the sense of , chosen to take into account the relationship of $A$ with $B$, and the choice of the codewords $a_p$ and $b_q$. Referring to the indexed enumeration of $A$ in lex order appearing in (4), let $a_r, a_{r+1}$ be the first two among the elements of $A$ admitting $b_q$ as a prefix which have the same length. It is not difficult to see that $a_r$ and $a_{r+1}$ are twin sons (one letter extensions) of some word $w$, $a_r = w0$, $a_{r+1} = w1$.
Let $C = (A \setminus \{a_p, a_r, a_{r+1}\}) \cup \{a_p0, a_p1, w\}$. Obviously in the imbalance order $C < A$.
We claim that if an arbitrary lex monotone maximal instantaneous code $D$ with lex enumerated codewords $d_1, \dots, d_n$ is more balanced than (or equal to) $A$ and $B$ then it is also more balanced than or equal to $C$. This will show that the statement of the Lemma holds.
The elements of $C$, enumerated as a sequence of words in lex order as $c_1, \dots, c_n$, are partitioned into five consecutive subsequences:
$c_1 = a_1, \dots, c_{p-1} = a_{p-1}$ (empty subsequence if $p = 1$)
$c_p = a_p0$, $c_{p+1} = a_p1$
$c_{p+2} = a_{p+1}, \dots, c_r = a_{r-1}$ (empty subsequence if $r = p+1$, non-empty if $r > p+1$)
$c_{r+1} = w$
$c_{r+2} = a_{r+2}, \dots, c_n = a_n$ (empty subsequence if $r+1 = n$ is the common size of the codes)
The subsequence $c_{p+2}, \dots, c_r$ in turn consists of two (possibly empty) consecutive subsequences: the first of these consists of elements also belonging to $B$, and the second consists of elements that have $b_q$ as a proper prefix. In the first subsequence the index in $B$ of any element is (strictly) larger than its index in $A$, and hence at least its index in $C$. In the second subsequence the last symbol of each element is $0$ (i.e. it is not of the form $u1$, for otherwise $u0$ would have to be also in this second subsequence, contradicting the definition of $r$).
In view of the Characterization of the Imbalance Order by Comparing Indices, we need to verify that if some codeword $d_i$ in $D$ is comparable in the prefix order to a codeword in $C$, i.e. to some $c_j$ having index $j$ in the lex enumeration of $C$, then the index $i$ of $d_i$ in $D$ is at least $j$. This is obvious for $j < p$ and $j > r+1$, since then $c_j = a_j$. In the following examination of the remaining cases comparability will always refer to comparability in the prefix order of the tree of words.
For $j = p$, if an element $d_i$ of $D$ is comparable to $c_p = a_p0$, then it is also comparable to its prefix $a_p$. The assumption $D \le A$ then implies $i \ge p$.
For $j = p+1$, if an element $d_i$ of $D$ is comparable to $c_{p+1} = a_p1$, it is of course also comparable to $a_p$, and we claim that it is comparable as well to some member $b_k$ of $B$ with index $k \ge p+1$. Note that both $a_p0$ and $a_p1$ must be the prefixes of words in $B$, and no two of the codewords in $a_1, \dots, a_p$ can be prefixes of the same word in $B$ (while each one of $a_1, \dots, a_p$ is the prefix of at least one word in $B$ and $a_p$ is a prefix of at least two). From this we can conclude that $a_p1$ must be the prefix of some $b_k$ in $B$, and all such elements of $B$ have index $k \ge p+1$. Now it follows that the element $d_i$ of $D$ is comparable to at least one such $b_k$ with index $k \ge p+1$. But then, as $D \le B$ in the imbalance order, the index $i$ of $d_i$ in $D$ is at least $p+1$.
In the interval $p+2 \le j \le r$, if it is not empty, let $j$ be the smallest index such that for some $i < j$ the elements $d_i$ and $c_j$ are comparable - we shall derive a contradiction. For $j$ thus fixed, let $i$ be as small as possible. If $c_j$ belongs to $B$, then its index in $B$ is at least $j$, hence (strictly) greater than $i$, thus by $D \le B$ in the imbalance order it could not be comparable to $d_i$. Therefore $c_j$ has $b_q$ as a proper prefix. Also its length is (strictly) larger than that of $c_{j-1}$. This implies that the last symbol of $c_j$ must be $0$. But now, since the last symbol of $c_j = a_{j-1}$ is $0$, if the word $d_i$ were a proper prefix of $c_j$, then it would be a proper prefix of $a_j$ also, implying $i \ge j$ by $D \le A$, which is contrary to assumption. Thus $c_j$ is a prefix of $d_i$ and $c_j$ is the sole element of $C$ comparable to $d_i$. Now $c_{j-1}$ is comparable to one or more elements $d_h$ of $D$, and all such indices $h$ must be at least $j-1$ by the minimality assumption on $j$ (or, when $j = p+2$, by the case $j = p+1$ already treated). But as $d_i$ cannot be comparable with $c_{j-1}$, it must come later in the lex order on $D$ than all the elements $d_h$ of $D$ comparable with $c_{j-1}$, i.e. $i > h$ for all such $h$. Therefore $i > j-1$ and thus $i$ is at least $j$, a contradiction.
The argument is similar for $j = r+1$. If the element $d_i$ of $D$ comparable with $c_{r+1} = w$ is comparable with $a_{r+1}$ then we are done, since $D \le A$. Else $d_i$ must have $w$ and $a_r$ as a prefix and cannot be comparable with $c_r$. Therefore $d_i$ must come later in the lex order on $D$ than all the elements $d_h$ of $D$ comparable with $c_r$, i.e. $i > h$ for all such $h$. But we already know that the indices in $D$ of these latter elements are at least $r$, forcing $i \ge r+1$.
We have thus shown that in the imbalance order all common lower bounds of $A$ and $B$ are also lower bounds of the code $C$ constructed from these latter two, completing the proof of the Lemma and thus providing an alternative proof of the Lattice Property.
*Remark.* Repeated application of the construction of $C$ in the proof of the Reduction Lemma provides an algorithm for constructing the meet of any two codes $A$ and $B$ in the imbalance lattice. (The repetition is to be applied to the reduced pair of codes gotten by replacing $A$ or $B$ by $C$, according to whether $C$ is more balanced than $A$ or $B$.) As simple examples with incomparable $A$ and $B$, we can take, using the lattice diagrams on p. 7 of [2] with path-length sequence representation of codes of size
(where $C$ is in fact the meet of $A$ and $B$)
or we can take as example of codes of size
(where $C$ is still less balanced than the meet of $A$ and $B$).
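The repeated reduction can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: codes are represented by non-decreasing path-length sequences with lengths at least 1, the function names `dominating`, `balance` and `meet` are hypothetical, and the balancing step is read as replacing one shallow codeword by its two one-letter extensions while merging a pair of deeper twin codewords into their common parent. The size 7 pair used below is an illustrative choice, not necessarily one of the examples of the Remark.

```python
def canonical_code(lengths):
    """Lex monotone code (words in lex order) for a non-decreasing length list."""
    words, value, prev = [], 0, 0
    for length in lengths:
        value <<= length - prev
        words.append(format(value, "b").zfill(length))
        value += 1
        prev = length
    return words

def dominating(xs, ys):
    """Words of xs that are proper prefixes of some word of ys, in lex order."""
    return [x for x in xs if any(y != x and y.startswith(x) for y in ys)]

def balance(a, b):
    """One balancing step applied to a, guided by b. Assumes that the first
    dominating word of a lexicographically precedes the first one of b."""
    A, B = canonical_code(a), canonical_code(b)
    dom_a, first_b = set(dominating(A, B)), dominating(B, A)[0]
    # last dominating word of A that precedes the first dominating word of B
    p = max(i for i, x in enumerate(A) if x in dom_a and x < first_b)
    # first pair of equal-length words of A below that word of B (twin leaves)
    under = [i for i, x in enumerate(A) if x.startswith(first_b)]
    r = next(i for i in under[:-1] if len(A[i]) == len(A[i + 1]))
    rest = [l for k, l in enumerate(a) if k not in (p, r, r + 1)]
    return tuple(sorted(rest + [a[p] + 1, a[p] + 1, a[r] - 1]))

def meet(a, b):
    """Meet of two path-length sequences of equal size by repeated reduction."""
    a, b = tuple(a), tuple(b)
    while a != b:
        A, B = canonical_code(a), canonical_code(b)
        if dominating(A, B)[0] < dominating(B, A)[0]:
            a = balance(a, b)
        else:
            b = balance(b, a)
    return a

print(meet((1, 3, 3, 4, 4, 4, 4), (2, 2, 2, 3, 4, 5, 5)))  # (2, 2, 2, 4, 4, 4, 4)
```

In this run the first step replaces the first code by the sequence 2, 2, 2, 4, 4, 4, 4, and the second step replaces the second code by the same sequence, at which point both codes coincide with the meet.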
Acknowledgements.
Part of this work has been co-funded by Marie Curie Actions and supported by the National Development Agency (NDA) of Hungary and the Hungarian Scientific Research Fund (OTKA, contract number 84593), within a project hosted by the University of Miskolc, Department of Analysis.
References
[1] C. Elsholtz, C. Heuberger, H. Prodinger, The number of Huffman codes, compact trees, and sums of unit fractions, *IEEE Trans. Information Theory* 59 (2) (2013) 1065-1075
[2] S. Foldes, S. Radeleczki, On the imbalance lattice of path-length sequences of binary trees, ArXiv 2013 (https://arxiv.org/abs/1307.0161)
[3] S. Foldes, N.M. Singhi, On instantaneous codes, *Journal of Combinatorics, Information & System Sciences* 31 (2006) 317-326
[4] L.G. Kraft, A Device for Quantizing, Grouping, and Coding Amplitude Modulated Pulses, M.S. Thesis, MIT 1949
[5] M. O’Keefe, H. Pajoohesh, M. Schellekens, Decision trees of algorithms and a semivaluation to measure their distance, *Electr. Notes Comput. Sc.* 161 (2006) 175-183
[6] H. Pajoohesh, Topological and categorical properties of binary trees, *Applied Gen. Topology* 9 (1) (2008) 1-14
[7] D. Stott Parker, Prasad Ram, The Construction of Huffman Codes is a Submodular ("Convex") Optimization Problem Over a Lattice of Binary Trees, *SIAM J. Comput.* 28 (5) (1999) 1875-1905
Authors’ addresses:
S. Foldes
D.S. Parker
UCLA Computer Science Department
S. Radeleczki
University of Miskolc, Institute of Mathematics
