TL;DR
This paper presents an efficient algorithm for counting nonequivalent compact Huffman codes and related structures, significantly improving previous computational bounds by using power series division.
Contribution
It introduces a method to compute the sequence for all n < N with nearly linear complexity in N, surpassing earlier cubic and quartic bounds.
Findings
Efficient computation of the sequence for all n < N
Reduction of complexity from O(N^3) to approximately N^{1+ε}
Applicable to various combinatorial structures related to Huffman codes
Abstract
It is known that the following five counting problems lead to the same integer sequence $f_t(n)$: the number of nonequivalent compact Huffman codes of length $n$ over an alphabet of $t$ letters, the number of "nonequivalent" canonical rooted $t$-ary trees (level-greedy trees) with $n$ leaves, the number of "proper" words, the number of bounded degree sequences, and the number of ways of writing $1 = \frac{1}{t^{x_1}} + \dots + \frac{1}{t^{x_n}}$ with integers $0 \le x_1 \le x_2 \le \dots \le x_n$. In this work, we show that one can compute this sequence for all $n < N$ with essentially one power series division. In total we need at most $N^{1+\varepsilon}$ additions and multiplications of integers of $cN$ bits, $c < 1$, or $N^{2+\varepsilon}$ bit operations, respectively. This improves an earlier bound by Even and Lempel who needed $O(N^3)$ operations in the integer ring or…
| Task | Ring operations | Bit operations |
|---|---|---|
| addition | ||
| multiplication | ||
| division | | |
Algorithmic counting of nonequivalent compact Huffman codes
Christian Elsholtz
Institute of Analysis and Number Theory
Graz University of Technology
Kopernikusgasse 24, A-8010 Graz, Austria
Clemens Heuberger
Department of Mathematics
Alpen-Adria-Universität Klagenfurt
Universitätsstraße 65–67, A-9020 Klagenfurt am Wörthersee, Austria
Daniel Krenn
Department of Mathematics
Paris Lodron University of Salzburg
Hellbrunnerstraße 34, A-5020 Salzburg, Austria
Abstract.
It is known that the following five counting problems lead to the same integer sequence $f_t(n)$:
- (1) the number of nonequivalent compact Huffman codes of length $n$ over an alphabet of $t$ letters,
- (2) the number of “nonequivalent” complete rooted $t$-ary trees (level-greedy trees) with $n$ leaves,
- (3) the number of “proper” words (in the sense of Even and Lempel),
- (4) the number of bounded degree sequences (in the sense of Komlós, Moser, and Nemetz), and
- (5) the number of ways of writing
$$1 = \frac{1}{t^{x_1}} + \dots + \frac{1}{t^{x_n}}$$
with integers $0 \le x_1 \le x_2 \le \dots \le x_n$.
In this work, we show that one can compute this sequence for all $n < N$ with essentially one power series division. In total we need at most $N^{1+\varepsilon}$ additions and multiplications of integers of $cN$ bits (for a positive constant $c < 1$ depending on $t$ only) or $N^{2+\varepsilon}$ bit operations, respectively, for any $\varepsilon > 0$. This improves an earlier bound by Even and Lempel who needed $O(N^3)$ operations in the integer ring or $O(N^4)$ bit operations, respectively.
Key words and phrases:
unit fractions, Huffman codes, $t$-ary trees, counting, generating function
2020 Mathematics Subject Classification:
05A15; 05C05, 05C30, 11D68, 68P30
C. Elsholtz is supported by the Austrian Science Fund (FWF): W1230 and by Project Arithrand of the Austrian Science Fund (FWF): I 4945-N and of ANR-20-CE91-0006. C. Heuberger and D. Krenn are supported by the Austrian Science Fund (FWF): P28466-N35.
1. Introduction
Motivation
The purpose of this paper is to study the complexity of a counting problem, namely determining the number of nonequivalent compact Huffman codes of length $r$ over an alphabet of $t$ letters, and several equivalent combinatorial or number theoretic objects; see below and in particular (1.1) for a precise definition. The fastest algorithm in the published literature is due to Even and Lempel [10] (1972) and has a complexity of $O(N^3)$ operations in the ring of integers.
When actually computing the number of such compact Huffman codes, we experimentally observed that an approach of evaluating a generating function (first studied by Flajolet and Prodinger [11] in 1987) appears to be very fast. A detailed analysis (see Theorem 1) shows that the complexity is indeed only $N^{1+\varepsilon}$ (for any $\varepsilon > 0$) additions and multiplications of integers of size $cN$ bits (see (3.1)), where $c < 1$ is a positive constant depending on $t$ only.
In this paper, we will first describe the different but equivalent objects that we count and then present a quite detailed analysis of computing the number of these objects.
Codes, unit fractions and more
For a fixed integer $t \ge 2$, Elsholtz, Heuberger and Prodinger [9] studied the number
$$f_t(r) = \#\Bigl\{(x_1,\dots,x_r)\in\mathbb{Z}^r : 0\le x_1\le\dots\le x_r,\ \sum_{i=1}^{r}\frac{1}{t^{x_i}} = 1\Bigr\}, \tag{1.1}$$
i.e., the number of partitions of $1$ into nonpositive powers of $t$. It is known that this counting problem is equivalent to several other counting problems, namely the number of “nonequivalent” complete rooted $t$-ary trees (also called “level-greedy trees”; see [9, 11, 18]), the number of “proper words” (in the sense of Even and Lempel [10]), the number of bounded degree sequences (in the sense of Komlós, Moser, and Nemetz [27]), and the number of nonequivalent compact Huffman codes of length $r$ over an alphabet of $t$ letters. (A Huffman code over an alphabet of $t$ letters is a prefix-free subset (the set of “code words”) of the set of finite words over this alphabet, i.e., no code word is a prefix of another code word. It is said to be compact if no further code word can be added without violating the prefix-freeness condition. Two compact Huffman codes are considered to be equivalent if the multisets of the lengths of the code words are equal, and one can choose a representative where shorter words are lexicographically smaller than longer words.)
For a detailed survey of the existing results, applications and literature on these sequences, see [9]. As a small concrete example, we note that for , we have , as can be seen from working out the following:
[TABLE]
As discussed in [9], $f_t(r)$ is positive only when $r \equiv 1 \pmod{t-1}$, so it is more convenient to study $g_t(n) = f_t(1+n(t-1))$ instead. For the values of $g_t(n)$ start with
[TABLE]
for with
[TABLE]
and the first terms of $g_t(n)$ are
[TABLE]
These are sequences A002572, A176485 and A176503 in the On-Line Encyclopedia of Integer Sequences [30].
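To make the definition concrete, the following is a minimal brute-force sketch that counts the representations of $1$ as a sum of $r$ powers of $1/t$ directly from the definition. It is not the algorithm analysed in this paper and is only practical for small $r$. The function names `count_unit_fraction_sums` and `g` are ours and purely illustrative; `g` uses the convention $f_t(1+n(t-1))$ from the text.

```python
from functools import lru_cache


def count_unit_fraction_sums(t: int, r: int) -> int:
    """Count multisets of r exponents x_i >= 0 with sum of t**(-x_i) equal to 1."""

    @lru_cache(maxsize=None)
    def ways(units: int, parts: int) -> int:
        # `units` is the remaining sum measured in multiples of t**(-x) for the
        # current exponent x; `parts` is the number of summands still to place.
        if units == 0:
            return 1 if parts == 0 else 0
        if units > parts:
            # Every remaining summand is at most t**(-x), i.e. at most one unit.
            return 0
        total = 0
        for used in range(units + 1):
            # Use `used` summands equal to t**(-x), then move on to exponent
            # x + 1, where the remaining sum is worth t times as many units.
            total += ways((units - used) * t, parts - used)
        return total

    return ways(1, r)


def g(t: int, n: int) -> int:
    """Illustrative helper for the shifted sequence f_t(1 + n*(t - 1))."""
    return count_unit_fraction_sums(t, 1 + n * (t - 1))


print([g(2, n) for n in range(9)])
```

The memoised state is the pair (remaining sum in current units, remaining number of summands), so the recursion never enumerates the individual representations.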
Asymptotics
It has been proved (see Elsholtz, Heuberger, Prodinger [9]) that for fixed $t$, the asymptotic growth of these sequences can be described by two main terms and an error term as
[TABLE]
where . Here all constants depend on . In particular, if , then
[TABLE]
Moreover, the authors of [9] also show that $\rho = 2 - 2^{-t-1} + O\bigl(t\,4^{-t}\bigr)$ as $t \to \infty$.
Besides the enumeration of all these objects, probabilistic questions concerning many different parameters have been studied asymptotically in [18, 19].
Algorithmic counting
As this family of sequences appears in many different contexts and as the sequences’ growth rates have been studied in detail (see the section above and the introduction of [9] for full details), it is somewhat surprising that the current record on the algorithmic complexity of determining the members of the sequence (in the case $t = 2$) appears to be a 50-year-old paper by Even and Lempel [10]. Hence it seemed worthwhile to study this complexity from a new point of view, and we succeeded in improving the upper bound on the complexity considerably; see the section on the main result below.
The algorithm of Even and Lempel [10] produces the sequence $g_t(n)$ for $n < N$. It takes $O(N^3)$ additions of integers bounded by (with ; so integers with roughly bits in size), which are $O(N^4)$ bit operations. They only studied the case $t = 2$ in detail, but mention that their result can be generalized to arbitrary $t$.
Main result
In this paper we take an entirely new approach to the problem of evaluating . Rather than thinking about an algorithm itself, as Even and Lempel [10] did, we think about how to evaluate the generating function (3.4) of established in [9] efficiently. As it turns out the cost essentially comes from one division of power series of precision222We say that a power series has precision if we can write it as with explicit coefficients . whose coefficients are integers bounded by (with ).
Estimating the cost of this evaluation strategy leads to tremendous improvement—to be precise, by a factor in both ring operations and bit operations—of the cost of using [10]. It is not obvious that the cost for evaluation of numerator and denominator of the generating function are asymptotically (much) smaller than the total cost; see Theorem 1 for details and also Section 5 providing even more details during the proof of this theorem. We in particular show that the cost for evaluating numerator and denominator are asymptotically almost (neglecting logarithmic factors) by a factor smaller.
Using the multiplication algorithms of Schönhage and Strassen [32], of Fürer [13, 14], or of Harvey and van der Hoeven [17] (see Section 2 for an overview) our algorithm leads to operations in the integer ring and consequently bit operations, where $\log^*$ denotes the iterated logarithm. (The iterated logarithm, also called log star, gives the number of applications of the logarithm so that the result is at most $1$. For example, we can define it recursively by $\log^* n = 0$ if $n \le 1$ and $\log^* n = 1 + \log^*(\log n)$ otherwise.) In Remark 6.2 a discussion of the memory requirements can be found. An implementation of this algorithm, based on FLINT [12, 16] (which is, for example, included in the SageMath mathematics software [31]), is also available at https://gitlab.com/dakrenn/count-nonequivalent-compact-huffman-codes; see also Appendix A for the relevant lines of code and remarks related to the implementation. In Appendix B, we discuss the running times of this implementation.
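The recursive description of the iterated logarithm translates directly into a loop. The sketch below uses base-2 logarithms, which is our assumption (the choice of base only changes the value by a bounded amount); the name `log_star` is ours.

```python
import math


def log_star(x: float) -> int:
    """Iterated logarithm: how often log2 must be applied until the value is <= 1."""
    count = 0
    while x > 1:
        x = math.log2(x)
        count += 1
    return count


for value in (1, 2, 16, 65536):
    print(value, log_star(value))
```

The function grows so slowly that factors of the form $2^{O(\log^* N)}$ are negligible for any input size that fits into memory.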
The literature describes a number of algorithms constructing the complete list of $t$-ary Huffman codes of length $r$; see [20, 24, 28, 29]. There is no performance analysis given. But, as the number of such codes grows exponentially in $r$, it is clear that listing all codes is not a fast method to determine only the number of such codes. The algorithm by Even and Lempel [10] computes the number without listing all codes, and is to the best of our knowledge the fastest algorithm previously known. Our algorithm relies on calculations involving power series with large integer coefficients.
It should also be emphasized that the output of the algorithm grows exponentially in (this was mentioned above), therefore the number of bits to represent is linear in whereas the input is only logarithmic in . The quite general survey paper by Klazar [25] studies classes of problems where the output needs at most a polynomial number of steps, in terms of the combined size of input and output. As we can compute efficiently, this problem falls into the class considered by Klazar.
Notes
It should be pointed out that in this article, we derive and compare upper bounds. It might be that the actual costs are smaller. However, as we compute the first $N$ coefficients all at the same time and the coefficients grow exponentially in $n$, a lower bound for the number of bit operations necessarily contains a factor $N^2$. Moreover, as multiplication of some sort is involved, lower order factors (growing with $N$) are expected as well.
We also mention that the following is open: How fast can a single coefficient $g_t(n)$ (in contrast to all coefficients with $n < N$) be computed?
2. Cost of the underlying operations
In this section, we give a brief overview on the time requirements for performing addition and multiplication of two integers and for performing multiplication and division of power series. The current state of the art is also summarized in Table 2.1.
Addition and multiplication
First, assume that we want to perform addition of two numbers bounded by $2^M$, i.e., numbers with $M$ bits. We have to look at each bit of the numbers exactly once and add those (maybe with a carry). Therefore, we need $O(M)$ bit operations.
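The linear cost of addition can be seen in a small sketch of schoolbook addition on explicit bit lists. The representation (little-endian lists of 0/1) and the helper names `add_bits`, `to_bits` and `from_bits` are ours, chosen only for illustration.

```python
from typing import List


def add_bits(a: List[int], b: List[int]) -> List[int]:
    """Add two little-endian bit lists, visiting each bit position once."""
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        result.append(s & 1)   # bit of the sum at this position
        carry = s >> 1         # carry into the next position
    if carry:
        result.append(carry)
    return result


def to_bits(x: int) -> List[int]:
    """Little-endian bit list of a nonnegative integer."""
    return [int(c) for c in reversed(bin(x)[2:])]


def from_bits(bits: List[int]) -> int:
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


print(from_bits(add_bits(to_bits(13), to_bits(7))))
```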
Next, we look at multiplication of two numbers bounded by $2^M$. It is clear that this can be achieved with $O(M^2)$ operations, but one can do better. An overview is given in the survey article by Karatsuba [21]. The Karatsuba multiplication algorithm [22, 23] has a complexity of $O(M^{\log_2 3})$. A faster generalisation of it is the Toom–Cook algorithm [4]. Combining Karatsuba multiplication with the Fast Fourier Transform algorithm (see Cooley and Tukey [5]) gives an algorithm with bit complexity ; see [1, 2, 3, 26].
The multiplication algorithm given by Schönhage and Strassen (see [32]) takes $O(M \log M \log\log M)$ time. It also uses the fast Fourier transform. An asymptotically even faster multiplication algorithm is given by Fürer [13, 14]. It has computational complexity $M \log M \, 2^{O(\log^* M)}$, where we again denote the iterated logarithm by $\log^*$. Fürer’s algorithm uses complex arithmetic. A related algorithm of the same complexity but using modular arithmetic is due to De, Kurur, Saha and Saptharishi [7, 8].
The asymptotically fastest known multiplication algorithm is due to Harvey and van der Hoeven [17]; it has a computational complexity of $O(M \log M)$.
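As an illustration of the simplest sub-quadratic method in this overview, here is a short sketch of Karatsuba's recursion on Python integers. It is not one of the FFT-based algorithms used in the analysis; the function name and the `cutoff_bits` parameter are ours.

```python
def karatsuba(x: int, y: int, cutoff_bits: int = 64) -> int:
    """Multiply two nonnegative integers with Karatsuba's three-product recursion."""
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y  # small operands: fall back to the built-in product
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask
    low = karatsuba(x_lo, y_lo, cutoff_bits)
    high = karatsuba(x_hi, y_hi, cutoff_bits)
    # One extra product recovers both cross terms:
    # (x_lo + x_hi)(y_lo + y_hi) - low - high = x_lo*y_hi + x_hi*y_lo.
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi, cutoff_bits) - low - high
    return (high << (2 * half)) + (cross << half) + low


x = 3**500 + 7
y = 7**400 + 11
print(karatsuba(x, y) == x * y)
```

Replacing four half-size products by three is what gives the exponent $\log_2 3$ instead of $2$.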
Power series operations
Let us also summarize the complexity of power series computations; for references see the books of Cormen, Leiserson, Rivest and Stein [6] or Knuth [26]. The multiplication can, again, be sped up by using the fast Fourier transform. We can use the algorithms for integer multiplication presented above; see von zur Gathen and Gerhard [33]. Also, the computational complexity can be improved: Given power series with precision $N$ (i.e., the first $N$ terms) over a ring, we can perform multiplication with ring operations using Fürer’s algorithm.
In order to perform division (inversion) of power series with precision $N$, we can use the Newton–Raphson method. We need at most $4\,\mathsf{M}(N)+N$ ring operations, where $\mathsf{M}(N)$ denotes the number of operations needed to multiply two power series with precision $N$; see von zur Gathen and Gerhard [33, Theorem 9.4] for details; the additional summand $\mathsf{M}(N)$ in comparison to that theorem comes from the multiplication with the numerator. Therefore, by using Fürer’s algorithm, we can invert/divide with ring operations.
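The Newton–Raphson step for power series inversion can be sketched as follows. This is a minimal illustration, assuming a constant coefficient equal to $1$ and using schoolbook multiplication, so it does not reach the operation counts stated above; the function names `mul_trunc`, `inverse` and `divide` are ours.

```python
from typing import List


def mul_trunc(a: List[int], b: List[int], n: int) -> List[int]:
    """Schoolbook product of two power series, truncated to precision n."""
    out = [0] * n
    for i, ai in enumerate(a[:n]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[: n - i]):
            out[i + j] += ai * bj
    return out


def inverse(f: List[int], n: int) -> List[int]:
    """Return the first n coefficients of 1/f, assuming f[0] == 1."""
    if f[0] != 1:
        raise ValueError("this sketch assumes constant coefficient 1")
    g = [1]   # correct to precision 1
    prec = 1
    while prec < n:
        prec = min(2 * prec, n)
        # Newton step g <- g * (2 - f*g), computed modulo q**prec.
        fg = mul_trunc(f, g, prec)
        correction = [-c for c in fg]
        correction[0] += 2
        g = mul_trunc(g, correction, prec)
    return g


def divide(num: List[int], den: List[int], n: int) -> List[int]:
    """First n coefficients of num/den: one inversion plus one multiplication."""
    return mul_trunc(num, inverse(den, n), n)


print(inverse([1, -1, -1], 10))
```

Each pass doubles the number of correct coefficients, so the total work is dominated by the last few multiplications; the final product with the numerator is the extra multiplication mentioned in the text.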
The bit size occurring in the ring operations for a division of power series with precision $N$ and coefficients of bit size $M$ is by the remarks after [33, Theorem 9.6]. Therefore and by assuming for simpler expressions with respect to the logarithms, we end up with
[TABLE]
bit operations.
3. Cost for extracting coefficients
Our main result gives the number of operations needed for extracting the coefficients $g_t(n)$ for all $n < N$. It reflects three different aspects: First, we count operations on a high level, for example power series multiplications. (Below we will denote this operation by .) Second, we count operations in the ring of integers. There, to stick with the example of power series multiplication, the precision of the power series is taken into account, but not the actual size of the integers. Third, we count bit operations, where the size of the coefficients (which are integers) is also taken into account.
Let us make this more precise and start with the high level operations. We denote
- an addition (or a subtraction) of two power series by ,
- a multiplication of two power series by , and
- a division of two power series by .
As we compute the first terms, we may assume that all power series are of precision .
An overview and summary of the number of ring operations and bit operations of these high level operations is provided in Section 2. Clearly, we have to deal with the size of the coefficients. We first note that for $n < N$ each coefficient $g_t(n)$ can be written with $M := \lfloor \log_2 g_t(N) \rfloor + 1$ bits and that by using the asymptotics (1.2) we can bound this by
[TABLE]
when $N$ tends to $\infty$. Here the constant depends on $t$; see [9] for details on .
Summarizing, all the operations , and are performed on power series of precision with coefficients written by bits (numbers bounded by ), and the cost (number of bit operations) are stated in Section 2. There is one important remark at this point, namely, we will see during our main proof (Section 5) that the coefficients appearing in power series additions and multiplications are actually much smaller than coefficients written by bits; we will take this into account for counting bit operations.
Beside these main power series operations, we additionally denote
- other power series operations of precision (for example, memory allocation or writing initial values) by , and
- other operations, more precisely operations of numbers with less than bits (for example additions of indices) by .
Thus, an operation is performed on numbers bounded by only (in contrast to the bounded-by--operations).
With these notions and by collecting operations as formal sums of , , , and , we can write down the precise formulation of our main theorem.
Theorem 1.
Calculating the first $N$ terms of $g_t(n)$ can be done with
[TABLE]
power series operations,
[TABLE]
operations in the ring of integers, and with
[TABLE]
bit operations.
In order to prove Theorem 1 (the complete proof can be found in Section 5), we look at the cost of calculating the first $N$ terms, which is done by extracting coefficients of the power series
[TABLE]
with
[TABLE]
This generating function (3.4) can be found in Flajolet and Prodinger [11, Theorem 2] for $t = 2$ and in Elsholtz, Heuberger and Prodinger [9, Theorem 6] for general $t$. It is derived from the equivalent formulation as a counting problem on trees, which was mentioned in the introduction.
4. Auxiliary results
When extracting the first $N$ coefficients, we do not need the “full” generating function, i.e., the infinite sums in the numerator and denominator of (3.4) can be truncated to finite sums. The following lemma tells us how many summands we need. We use this asymptotic result in our analysis of the algorithm; for the actual computer programme, we can check indices and exponents by a direct computation.
Lemma 4.1.
To calculate numerator and denominator of the generating function (3.4) with precision $N$, we need only summands with
[TABLE]
Proof.
Because of an additional factor in each summand of the numerator, it is sufficient that the largest index of the denominator is less than . Therefore, we will only look at the indices of the denominator.
Consider the summand of the denominator with index . The lowest index of a non-zero coefficient of the denominator is
[TABLE]
where the notation is defined in Equation (3.5). We only need summands with . Taking the logarithm yields
[TABLE]
As the first logarithm tends to [math] as and the second is bounded, the error term is large enough and the result follows. ∎
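For an actual programme, the truncation index can be found by direct computation instead of the asymptotic bound. The sketch below rests on two assumptions that should be checked against (3.5) and the proof above: that the bracket notation means $[i] = 1 + t + \dots + t^{i-1}$, and that the lowest non-zero coefficient of the $j$th summand of the denominator has index $[1] + \dots + [j]$. The function names are ours.

```python
def bracket(t: int, i: int) -> int:
    """Assumed meaning of the bracket notation: 1 + t + ... + t**(i-1)."""
    return (t**i - 1) // (t - 1)


def needed_summands(t: int, N: int) -> int:
    """Largest j whose lowest exponent [1] + ... + [j] is still below N."""
    j = 0
    lowest = 0
    while True:
        next_lowest = lowest + bracket(t, j + 1)
        if next_lowest >= N:
            return j
        j += 1
        lowest = next_lowest


for N in (10, 100, 1000, 10**6):
    print(N, needed_summands(2, N))
```

Because the lowest exponent grows geometrically in $j$, the number of summands grows only logarithmically in $N$.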
While the bit size of the coefficients $g_t(n)$ is linear in $n$, the size of the coefficients of numerator and denominator of (3.4) is much smaller. We make this precise by using the following lemma.
Lemma 4.2.
For , the th coefficient of
[TABLE]
with and of Lemma 4.1 as well as the th coefficients of numerator and denominator of (3.4) can be written with
[TABLE]
bits.
Proof.
We start proving the claimed result for (4.1) and postpone handling numerator and denominator of (3.4) to the end of this proof.
Each factor of (4.1) is a geometric series whose coefficients are either $0$ or $1$ and whose constant coefficient is $1$. In particular, these coefficients are nonnegative. Therefore, it suffices to show the result for .
As the coefficients are either $0$ or $1$, the $n$th coefficient of the product equals the cardinality of the set
[TABLE]
By using the crude estimate , we see that we have at most choices for because and by construction. Thus we can bound the cardinality of the set above by
[TABLE]
We use of Lemma 4.1 to obtain
[TABLE]
from which follows that the th coefficient of (4.1) is bounded by
[TABLE]
The result in terms of bit size follows by taking the logarithm.
Numerator and denominator are sums where summands are added up (or subtracted). This corresponds to an additional factor in the bound (4.3) or an additional summand in the formula (4.2), respectively. As by Lemma 4.1, this is absorbed by the error term, so the same formula holds. ∎
5. Proof of Theorem 1
We start with an overview of our strategy. For computing the first coefficients of the generating function (see (3.4)), we only need the summands of the numerator and the denominator with according to Lemma 4.1.
First, consider the denominator of . We compute the products
[TABLE]
iteratively by expanding the different terms as geometric series and perform power series multiplications. After each multiplication, we accumulate the result by using one power series addition.
We deal with the numerator in the same fashion. However, by performing the computation of numerator and denominator simultaneously, the above products only need to be evaluated once.
Finally, to obtain the first coefficients of , we need one power series division of numerator and denominator.
Pseudocode for our algorithm is given in Algorithm 1; an efficient implementation using the FLINT library is presented in Appendix A. The actual analysis of this algorithm is done by counting the operations needed, in particular the power series operations, and providing bounds for the bit sizes of the variables.
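The strategy above can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the FLINT implementation: we assume that (3.4) has the shape $\sum_{j\ge 0}(-1)^j q^{[j]}\prod_{i=1}^{j}\frac{q^{[i]}}{1-q^{[i]}} \big/ \sum_{j\ge 0}(-1)^j\prod_{i=1}^{j}\frac{q^{[i]}}{1-q^{[i]}}$ with $[j] = 1 + t + \dots + t^{j-1}$, following [9]. The names `bracket` and `count_codes` are ours, the multiplication by a geometric series is done by a simple in-place recurrence, and the final division is schoolbook, so the running time is quadratic rather than nearly linear.

```python
from typing import List


def bracket(t: int, j: int) -> int:
    """Assumed meaning of [j]: 1 + t + ... + t**(j-1), with [0] = 0."""
    return (t**j - 1) // (t - 1)


def count_codes(t: int, N: int) -> List[int]:
    """First N coefficients of numerator/denominator of the assumed series."""
    num = [0] * N
    den = [0] * N
    prod = [0] * N   # truncated coefficients of prod_{i<=j} 1/(1 - q^[i])
    if N > 0:
        prod[0] = 1
    shift = 0        # [1] + ... + [j], the power of q pulled out of the product
    sign = 1
    j = 0
    while shift < N:
        # Accumulate the j-th summand into denominator and numerator.
        extra = bracket(t, j)   # additional factor q^[j] in the numerator
        for k in range(N - shift):
            den[shift + k] += sign * prod[k]
            if shift + extra + k < N:
                num[shift + extra + k] += sign * prod[k]
        # Advance to j + 1: multiply by the geometric series 1/(1 - q^[j+1]).
        j += 1
        step = bracket(t, j)
        for k in range(step, N):
            prod[k] += prod[k - step]
        shift += step
        sign = -sign
    # One power series division (schoolbook; the constant term of den is 1).
    out = [0] * N
    for n in range(N):
        out[n] = num[n] - sum(den[k] * out[n - k] for k in range(1, n + 1))
    return out


print(count_codes(2, 12))
```

As in the overview, the products are shared between numerator and denominator, and the coefficients handled before the division stay small; only the division works with the full-size integers.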
Let us come to the actual proof.
Proof of Theorem 1.
We analyse the code of Algorithm 1; see Appendix A for the details. It starts by initialising variables (memory allocation and initial values) for the power series operations, which contributes . Further initialisation is done by operations.
For computing the first coefficients of (see (3.4)), we only need the summands of numerator and denominator with according to Lemma 4.1. Speaking in terms of our computer programme, our outer loop needs passes. We now describe what happens in each of these passes; the final cost needs then to be multiplied by .
Suppose we are in step . After some update of auxiliary variables (needing operations), we compute the product
[TABLE]
out of the product with factors up to index . Expanding as geometric series contributes at most and performing a power series multiplication contributes and additionally one swap . For obtaining the number of bit operations, we need estimates of the coefficients appearing in the multiplication. Lemma 4.2 bounds their value by
[TABLE]
bits. Therefore each of our power series multiplications needs
[TABLE]
bit operations by the results of Fürer [13, 14] and Harvey and van der Hoeven [17]; see also Section 2.
After each multiplication, we accumulate the results for numerator and denominator by using one power series addition for each of the two. For the numerator, we additionally need operations for the multiplication by performed by shifting. Concerning bit operations, we use the bound of the coefficients for numerator and denominator provided by Lemma 4.2. In terms of bit size, this leads to the number of bits given in (5.1). Therefore a power series addition needs
[TABLE]
bit operations.
In total, we end up with
[TABLE]
operations to evaluate the outer loop; these operations translate to
[TABLE]
operations in the ring of integers and to
[TABLE]
bit operations.
We are now ready to collect all costs for proving the first part of Theorem 1. In addition to the above, we divide the numerator by the denominator and need one power series division . The clean-up accounts for . This yields (3.2).
Using the Newton–Raphson-method and Fürer’s algorithm (see Section 2 and Table 2.1) a power series division results in
[TABLE]
operations in the ring. Its operands (the actual bit size during the division is ; see the end of Section 2 for details) have bit size
[TABLE]
which results in
[TABLE]
bit operations for our computations.
We note that the number of bit operations of a power series operation is linear in as the coefficients are bounded and that is an operation on numbers with bits. The error term includes all these. Collecting all bit operation results gives the upper bound (3.3). ∎
6. Remarks
In this last section, we provide some remarks related to the above proof and coefficient extraction algorithm.
Remark 6.1.
In the proof above, we have seen that the cost (bit operations) of the power series division is asymptotically roughly (not taking logarithms and smaller factors into account) a factor larger than the cost for computing numerator and denominator, and all the overhead cost.
Moreover, only focusing on the computation of numerator and denominator, the costs (again bit operations) for computing these two are asymptotically dominated by power series multiplication, albeit only by roughly (again not taking into account logarithmically smaller factors) a factor compared to addition and other power series operations.
Note that when only considering operations in the integer ring, then the multiplications performed in the evaluation of numerator and denominator take the asymptotically leading role by operations compared to ring operations of the power series division.
At the end of this article we make a short remark on the memory requirements for the presented coefficient extraction algorithm.
Remark 6.2.
Our algorithm needs units of memory (a unit stands for the memory requirements of storing a number bounded by ) plus the memory needed for the power series multiplication and division. (We have been unable to find a reference for the memory requirements of, for example, the Schönhage–Strassen algorithm. It seems that the GNU Multiple Precision Arithmetic Library (GMP) can do this with units of memory; see [15] for a comment of one of its authors.) The above means that we can bound the memory requirements by bits.
Appendix A Code
Below are the relevant lines of a programme written in C for computing the coefficients \mathop{{#1}{}}(n) with . The code can be found at https://gitlab.com/dakrenn/count-nonequivalent-compact-huffman-codes. The programme uses FLINT [12, 16]. Note that we do not use aliasing of input and output arguments in multiplication because providing our own auxiliary polynomial brings tiny performance improvements.
Appendix B Timing
The table below contains timings (in seconds) for computing the first coefficients with .
[TABLE]
Here, is the time for generating numerator and denominator, for the one power series division and .
The benchmark was executed on an Intel(R) Xeon(R) CPU E5-2630 v3 at 2.40GHz. The limiting factor for our computations is the memory requirement; it is the reason for computing at most coefficients.
The timings in the table and the theoretical result of this article fit together; we can see the running time of the algorithm in our implementation.
References
- [1] Allan Borodin and Ian Munro, The computational complexity of algebraic and numeric problems, American Elsevier Publishing Co., Inc., New York-London-Amsterdam, 1975, Elsevier Computer Science Library; Theory of Computation Series, No. 1.
- [2] Jonathan M. Borwein, Peter B. Borwein, and David H. Bailey, Ramanujan, modular equations, and approximations to pi, or how to compute one billion digits of pi, Amer. Math. Monthly 96 (1989), 201–219.
- [3] E. Oran Brigham, The fast Fourier transform, Prentice-Hall, Englewood Cliffs, NJ, 1974.
- [4] Stephen A. Cook, On the minimum computation time of functions, Ph.D. thesis, Harvard University, 1966.
- [5] James William Cooley and John Wilder Tukey, An algorithm for the machine calculation of complex Fourier series, Math. Comput. 19 (1965), 297–301.
- [6] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein, Introduction to algorithms, second ed., The MIT Press, 2001.
- [7] Anindya De, Piyush P. Kurur, Chandan Saha, and Ramprasad Saptharishi, Fast integer multiplication using modular arithmetic, STOC’08: Proceedings of the fortieth annual ACM symposium on Theory of computing, ACM, New York, 2008, pp. 499–505.
- [8] Anindya De, Piyush P. Kurur, Chandan Saha, and Ramprasad Saptharishi, Fast integer multiplication using modular arithmetic, SIAM J. Comput. 42 (2013), no. 2, 685–699.
