Planar digraphs for automatic complexity
Achilles A. Beros, Bjørn Kjos-Hanssen, Daylan Kaui Yogi

TL;DR
This paper demonstrates that the digraphs representing nondeterministic finite automata for automatic complexity can always be made planar, and studies the counting function of words with fixed complexity, showing it stabilizes and is computable.
Contribution
It proves planarity of automaton digraphs for automatic complexity and analyzes the asymptotic behavior of the number of words with fixed complexity.
Findings
Automaton digraphs for automatic complexity can always be planar.
The function counting words with fixed complexity stabilizes and is computable.
Planarity can fail when transition functions are required to be total, as in the setting studied by Shallit and Wang.
Abstract
We show that the digraph of a nondeterministic finite automaton witnessing the automatic complexity of a word can always be taken to be planar. In the case of total transition functions studied by Shallit and Wang, planarity can fail. Consider the number of binary words of a given length having a given nondeterministic automatic complexity. We show that, for each fixed complexity, this number is eventually constant as a function of the length, and that the eventual constant value is computable.
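Table 1. Number of maximally complex binary words of each length n (# = number of maximally complex words).
| n | # | 2^n | %complex | 2^n − # |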
| 0 | 1 | 1 | 100.00% | 0 |
| 1 | 2 | 2 | 100.00% | 0 |
| 2 | 2 | 4 | 50.00% | 2 |
| 3 | 6 | 8 | 75.00% | 2 |
| 4 | 8 | 16 | 50.00% | 8 |
| 5 | 24 | 32 | 75.00% | 8 |
| 6 | 30 | 64 | 46.88% | 34 |
| 7 | 98 | 128 | 76.56% | 30 |
| 8 | 98 | 256 | 38.28% | 158 |
| 9 | 406 | 512 | 79.30% | 106 |
| 10 | 344 | 1,024 | 33.59% | 680 |
| 11 | 1,398 | 2,048 | 68.26% | 650 |
| 12 | 1,638 | 4,096 | 39.99% | 2,458 |
| 13 | 5,774 | 8,192 | 70.48% | 2,418 |
| 14 | 5,116 | 16,384 | 31.23% | 11,268 |
| 15 | 23,018 | 32,768 | 70.25% | 9,750 |
| 16 | 22,476 | 65,536 | 34.30% | 43,060 |
| 17 | 86,128 | 131,072 | 65.71% | 44,944 |
| 18 | 89,566 | 262,144 | 34.17% | 172,578 |
| 19 | 351,250 | 524,288 | 67.00% | 173,038 |
| 20 | 375,710 | 1,048,576 | 35.83% | 672,866 |
| 21 | 1,461,670 | 2,097,152 | 69.70% | 635,482 |
| 22 | 1,539,164 | 4,194,304 | 36.70% | 2,655,140 |
| 23 | 5,687,234 | 8,388,608 | 67.80% | 2,701,374 |
| 24 | 6,814,782 | 16,777,216 | 40.62% | 9,962,434 |
| 25 | 24,031,676 | 33,554,432 | 71.62% | 9,522,756 |
| 26 | 27,782,964 | 67,108,864 | 41.40% | 39,325,900 |
| 27 | 97,974,668 | 134,217,728 | 73.00% | 36,243,060 |
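Table 2. Right-inextendible words / all words, by length (rows) and nondeterministic automatic complexity (columns).
| Length \ Complexity | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |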
| 22 | 8/20 | 28/58 | 86/164 | 322/502 | 1288/2846 | 6594/16024 | 44922/94732 | 220544/451368 |
| 21 | 8/20 | 28/58 | 98/176 | 292/496 | 1318/3168 | 8472/18720 | 52178/108042 | 266760/504794 |
| 20 | 8/20 | 28/58 | 86/164 | 238/430 | 1478/3814 | 11670/23328 | 54990/115896 | 278696/529148 |
| 19 | 8/20 | 28/58 | 86/164 | 402/582 | 2380/4996 | 12312/26542 | 78892/140668 | 134578/351250 |
| 18 | 8/20 | 28/58 | 110/188 | 356/598 | 2070/5692 | 14456/29990 | 68288/136024 | 0/89566 |
| 17 | 8/20 | 28/58 | 104/200 | 262/514 | 2850/7102 | 20516/37042 | 30486/86128 | |
| 16 | 8/20 | 28/58 | 80/164 | 536/752 | 2908/7738 | 14230/34320 | 0/22476 | |
| 15 | 8/20 | 28/58 | 148/226 | 578/908 | 3338/8530 | 7524/23018 | ||
| 14 | 8/20 | 28/58 | 112/244 | 774/1270 | 4442/9868 | 0/5116 | ||
| 13 | 8/20 | 28/58 | 120/250 | 1396/2076 | 1736/5774 | |||
| 12 | 8/20 | 28/58 | 158/282 | 1048/2090 | 0/1638 | |||
| 11 | 8/20 | 28/58 | 384/564 | 576/1398 | ||||
| 10 | 8/20 | 34/64 | 244/588 | 0/344 | ||||
| 9 | 8/20 | 48/78 | 112/406 | |||||
| 8 | 8/20 | 82/130 | 0/98 | |||||
| 7 | 10/22 | 38/98 | ||||||
| 6 | 14/26 | 0/30 | ||||||
| 5 | 8/24 | |||||||
| 4 | 0/8 |
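Table 3. Limiting number of binary words of each complexity, counting only words starting with 0.
| Complexity | Words | Complexity | Words |
| 1 | 1 | 21 | 64 594 576 |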
| 2 | 3 | 22 | 141 046 655 |
| 3 | 10 | 23 | 306 858 874 |
| 4 | 29 | 24 | 665 342 837 |
| 5 | 82 | 25 | 1 438 134 475 |
| 6 | 215 | 26 | 3 099 548 927 |
| 7 | 556 | 27 | 6 662 442 946 |
| 8 | 1 385 | 28 | 14 285 118 725 |
| 9 | 3 391 | 29 | 30 557 828 119 |
| 10 | 8 135 | 30 | 65 225 030 201 |
| 11 | 19 261 | 31 | 138 937 277 596 |
| 12 | 44 963 | 32 | 295 385 810 819 |
| 13 | 103 906 | 33 | 626 867 939 224 |
| 14 | 237 719 | 34 | 1 328 075 901 017 |
| 15 | 539 458 | 35 | 2 809 126 944 436 |
| 16 | 1 214 993 | 36 | 5 932 793 909 801 |
| 17 | 2 718 760 | 37 | 12 511 847 996 740 |
| 18 | 6 047 426 | 38 | 26 350 575 690 893 |
| 19 | 13 380 766 | 39 | 55 423 630 773 538 |
| 20 | 29 463 632 | 40 | 116 429 658 505 697 |
University of Hawai‘i at Mānoa, Honolulu HI 96822, USA. Email: {beros,bjoernkh,dkyogi64}@hawaii.edu
This work was partially supported by a grant from the Simons Foundation (#315188 to Bjørn Kjos-Hanssen). We are indebted to Jeff Shallit and Malik Younsi for helpful comments.
Keywords: Automatic complexity · Planar graph · Möbius function · Nondeterministic finite automata
1 Introduction
Automatic complexity, introduced by Shallit and Wang [7], is an automata-based and length-conditional analogue of Sipser’s CD complexity [8], which is in turn a computable analogue of the noncomputable Kolmogorov complexity. The nondeterministic case was taken up by Hyde and Kjos-Hanssen [3], who tabulated the number of words of each short length having a given complexity. The numbers in the table suggested (see Table 2) that the number of words of a fixed complexity may be eventually constant as the length grows. Here we establish that this is the case (Theorem 9), and show that the limit is computable (in exponential time). Moreover, we narrow down the possible automata that are needed to witness nondeterministic automatic complexity: they must have planar digraphs; in fact, their digraphs are trees of cycles in a certain sense.
We recall our basic notion.
Definition 1 ([7])
The nondeterministic automatic complexity $A_N(x)$ of a word $x$ is the minimal number of states of a nondeterministic finite automaton $M$ (without $\varepsilon$-transitions) accepting $x$ such that there is only one accepting path in $M$ of length $|x|$.
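The uniqueness condition in Definition 1 can be checked mechanically for a given automaton. The following is a minimal sketch, not the authors' code: it assumes an automaton is represented as a set of (source, symbol, target) triples, and the helper names `count_accepting_paths` and `accepts` are hypothetical.

```python
from collections import defaultdict

def count_accepting_paths(transitions, start, accepting, length):
    """Count accepting paths that use exactly `length` transitions.

    `transitions` is a set of (source, symbol, target) triples; parallel
    edges with different symbols count as different paths.
    """
    paths_to = {start: 1}  # number of paths of the current length ending in each state
    for _ in range(length):
        nxt = defaultdict(int)
        for src, _symbol, dst in transitions:
            if src in paths_to:
                nxt[dst] += paths_to[src]
        paths_to = nxt
    return sum(count for state, count in paths_to.items() if state in accepting)

def accepts(transitions, start, accepting, word):
    """Standard NFA acceptance by tracking the set of reachable states."""
    current = {start}
    for symbol in word:
        current = {dst for src, sym, dst in transitions
                   if src in current and sym == symbol}
    return bool(current & set(accepting))

# A two-state cycle reading 0 then 1: it accepts 0101 along a single
# accepting path of length 4, so the complexity of 0101 is at most 2.
cycle = {(0, "0", 1), (1, "1", 0)}
print(accepts(cycle, 0, {0}, "0101"), count_accepting_paths(cycle, 0, {0}, 4))
```

By contrast, an automaton with a loop at each of two states accepts 0011 but along several paths of length 4, so it does not witness a complexity bound for that word.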
2 Automatic complexity as chains of trees of lumps
Consider the version of automatic complexity where the transition functions are not required to be total. (Whether determinism is required is not important in the following, but in the nondeterministic case we require there to be only one accepting path, as usual.)
Then we claim that the digraphs representing the witnessing automata are planar, in fact they are “trees of cycles”. As an example, for the word , we have the following witnessing automaton:
[TABLE]
To explain this, first let us say that a cycle is a sequence of states that starts and ends with the same state. Let us say that a lump is the automaton whose transitions come from a given cycle. So if a cycle is repetitive, like 3456734567345673, then it generates the same lump as just 345673.
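As a small illustration of the remark about repetitive cycles, here is a sketch that represents a lump simply as the set of edges traversed by a cycle. The function name `lump` and the choice to ignore edge labels are assumptions made for this example.

```python
def lump(cycle):
    """Return the set of edges (source, target) used by a cycle of states."""
    if cycle[0] != cycle[-1]:
        raise ValueError("a cycle must start and end with the same state")
    return set(zip(cycle, cycle[1:]))

repetitive = lump("3456734567345673")
simple = lump("345673")
# Both consist of the five edges 3->4, 4->5, 5->6, 6->7, 7->3.
print(repetitive == simple)
```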
Consider the sequence of states visited during processing of a unique accepted word x of length n. Let us call the first visited state 0, the next distinct state 1, and so on. (So for example the permitted state sequences of length 3 are only 000, 001, 010, 011, 012.)
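The naming convention for states can be turned into a small generator. This is a sketch under the assumption that a state sequence is a tuple of integers; `canonical_state_sequences` and its optional `max_states` parameter are hypothetical names introduced here.

```python
def canonical_state_sequences(length, max_states=None):
    """Yield state sequences in which new states appear in the order 0, 1, 2, ..."""
    def extend(prefix, used):
        if len(prefix) == length:
            yield tuple(prefix)
            return
        # Either revisit one of the `used` states or open the next new state.
        limit = used + 1 if max_states is None else min(used + 1, max_states)
        for state in range(limit):
            yield from extend(prefix + [state], max(used, state + 1))

    if length == 0:
        yield ()
    else:
        yield from extend([0], 1)

print(["".join(map(str, s)) for s in canonical_state_sequences(3)])
```

For length 3 this yields exactly the five sequences listed in the text. Restricting to canonical sequences avoids examining automata that differ only by a renaming of states.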
Then the state sequence starts where is the first state that is visited twice. Now the claim is that there will never, at a later point in the state sequence, be a transition (an edge) , such that occurs within the lump generated by the cycle and such that the transition does not occur in that lump. Indeed, otherwise our state sequence would start
[TABLE]
and then there is a second accepting path of the same length where the first and second segments are switched.
Consequently, the path can only return to states that are not yet in any lumps. This leaves only two choices whenever we decide to create a new edge leading to a previously visited state:
Case 1. Go back to a state that was first visited after the last completed lump so far seen, or Case 2. Go back to a state that was first visited at some earlier time, before some of the lumps so far seen started (and in general after some of them were complete).
This gives a tree of lumps where each new lump either (Case 1) creates a new sibling for the previous lump, or (Case 2) creates a new parent for a final segment of the so far seen top-level siblings. In this tree of lumps, only the leaves (the lumps that are not anybody’s parents) can be traversed more than once by the uniquely accepted path of length n.
So if the first lump created is then next we can have two cases:
[TABLE]
[TABLE]
In Case 1, and are siblings ordered from first to second. In Case 2, denotes is a child of, which by definition is the same as sub-digraph. Now for the third lump , we have only the following possibilities:
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
In Subcase 1.2, and are siblings and is a child of . In Subcase 1.3, is a common parent of and . In Subcase 2.1, is a new sibling for , and still has as its child. In Subcase 2.2, is a parent of .
For instance, the state sequence 01234567345673456720 has the structure of Subcase 2.2, with the first lump being generated from 345673, the second from 23456734567345672, and the third from the whole sequence 01234567345673456720. The corresponding automaton is shown in an online tool (http://math.hawaii.edu/wordpress/bjoern/complexity-of-0001111011110111111/).
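One can check directly that the digraph induced by this state sequence has only one walk of the right length from the start state back to itself. The sketch below ignores edge labels, which is harmless here because no two edges join the same pair of states; `count_walks` is a hypothetical helper name.

```python
from collections import Counter

def count_walks(edges, start, end, length):
    """Count walks with exactly `length` edges from `start` to `end`."""
    counts = Counter({start: 1})
    for _ in range(length):
        step = Counter()
        for src, dst in edges:
            step[dst] += counts[src]
        counts = step
    return counts[end]

sequence = [int(c) for c in "01234567345673456720"]
edges = set(zip(sequence, sequence[1:]))  # the 10 edges on 8 states
length = len(sequence) - 1                # 19 transitions

print(len(edges), count_walks(edges, sequence[0], sequence[-1], length))
```

The count is 1: a walk of length 19 from state 0 to state 0 must traverse the outermost cycle once, the middle cycle once, and the innermost cycle 34567 exactly twice after its first pass.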
Using this planarity result, we are able to increase the speed of our algorithm for calculating nondeterministic automatic complexity. Consequently, we have been able to extend the string length in our computations. The numbers of maximally complex binary words of each length up to 27 are shown in Table 1. A similar table for shorter lengths was given in [3].
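For very short words the first rows of Table 1 can be reproduced by exhaustive search. The sketch below is a naive brute force, not the authors' faster algorithm: it tries every canonical state sequence, builds the automaton whose transitions are exactly those used by the run, and counts accepting paths. All function names are hypothetical.

```python
from itertools import product

def state_sequences(length, max_states):
    """Canonical state sequences: new states are numbered in order of first visit."""
    def extend(prefix, used):
        if len(prefix) == length:
            yield prefix
            return
        for state in range(min(used + 1, max_states)):
            yield from extend(prefix + (state,), max(used, state + 1))
    yield from extend((0,), 1)

def unique_path(states, word):
    """True if the automaton induced by the run has one accepting path of length len(word)."""
    edges = {(states[i], word[i], states[i + 1]) for i in range(len(word))}
    counts = {states[0]: 1}
    for _ in word:
        step = {}
        for src, _symbol, dst in edges:
            if src in counts:
                step[dst] = step.get(dst, 0) + counts[src]
        counts = step
    return counts.get(states[-1], 0) == 1

def complexity(word):
    """Brute-force nondeterministic automatic complexity (exponential time)."""
    for q in range(1, len(word) + 2):
        if any(unique_path(s, word) for s in state_sequences(len(word) + 1, q)):
            return q

def count_maximally_complex(n):
    """Number of binary words of length n attaining the largest complexity."""
    values = [complexity("".join(bits)) for bits in product("01", repeat=n)]
    return values.count(max(values))

print([count_maximally_complex(n) for n in range(7)])
```

The printed list should agree with the second column of Table 1 for lengths 0 through 6. The search space grows too quickly for this approach to reach the longer lengths in the table.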
3 The asymptotic number of words of given complexity
In this section, we examine the asymptotic behavior, as the length grows, of the number of words of a given length having a fixed automatic complexity.
Definition 2
A binary word is right inextendible if neither of its two one-bit extensions (obtained by appending 0 or 1) has the same nondeterministic automatic complexity as the word itself.
Inextendibility is closely related to volatility of the automatic complexity, as examined in the Complexity Option Game [5]. The number and proportion of right-inextendible words of a given length and complexity can be examined using an online database [4] and are shown in Table 2 for small lengths and complexities.
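The first entries of Table 2 can be recomputed by brute force. The sketch below assumes the reading of right inextendibility in which appending either bit strictly increases the complexity; the helper names are hypothetical and the search is only feasible for very short words.

```python
from itertools import product

def state_sequences(length, max_states):
    """Canonical state sequences: new states are numbered in order of first visit."""
    def extend(prefix, used):
        if len(prefix) == length:
            yield prefix
            return
        for state in range(min(used + 1, max_states)):
            yield from extend(prefix + (state,), max(used, state + 1))
    yield from extend((0,), 1)

def unique_path(states, word):
    """True if the automaton induced by the run has exactly one accepting path."""
    edges = {(states[i], word[i], states[i + 1]) for i in range(len(word))}
    counts = {states[0]: 1}
    for _ in word:
        step = {}
        for src, _symbol, dst in edges:
            if src in counts:
                step[dst] = step.get(dst, 0) + counts[src]
        counts = step
    return counts.get(states[-1], 0) == 1

def complexity(word):
    """Brute-force nondeterministic automatic complexity."""
    for q in range(1, len(word) + 2):
        if any(unique_path(s, word) for s in state_sequences(len(word) + 1, q)):
            return q

def inextendible_counts(n, q):
    """Return (right-inextendible, total) among binary words of length n and complexity q."""
    total = inextendible = 0
    for bits in product("01", repeat=n):
        word = "".join(bits)
        if complexity(word) != q:
            continue
        total += 1
        if complexity(word + "0") > q and complexity(word + "1") > q:
            inextendible += 1
    return inextendible, total

print(inextendible_counts(4, 3), inextendible_counts(5, 3))
```

These two pairs correspond to the Table 2 cells for lengths 4 and 5 at complexity 3.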
A basic procedure in our results will be the counting of periodic words, since a cycle containing a periodic word can be shortened and an automaton containing such a cycle will not be optimal.
Definition 3
A word $x$ is periodic if there exist a subword $y$ and an integer $k \ge 2$ such that
$$x = y^k.$$
A non-periodic word [2] is also called a primitive word and one starting with 0, in our setting, is called a Lyndon word [6].
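A direct test for periodicity, read as "the word is a proper power of a shorter word", can be sketched as follows. The function names are hypothetical, and the counts are obtained by exhaustive enumeration rather than by any formula from the paper.

```python
from itertools import product

def is_periodic(word):
    """True if word == y * k for some shorter word y and integer k >= 2."""
    n = len(word)
    return any(n % d == 0 and word == word[:d] * (n // d) for d in range(1, n))

def count_periodic(n):
    """Number of periodic binary words of length n, by exhaustive enumeration."""
    return sum(is_periodic("".join(bits)) for bits in product("01", repeat=n))

print(is_periodic("010101"), is_periodic("01011"))
print([count_periodic(n) for n in range(2, 7)])
```

For example, 010101 is periodic because it equals (01)^3, whereas 01011 is primitive.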
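Definition 4 ([1])
Let $n$ be a positive integer with $\omega(n)$ denoting the number of distinct prime factors of $n$ and $\Omega(n)$ denoting the total number of prime factors (i.e., with repetition) of $n$. The Möbius function $\mu$ is defined as
$$\mu(n) = \begin{cases} (-1)^{\omega(n)} & \text{if } \omega(n) = \Omega(n), \\ 0 & \text{otherwise.} \end{cases}$$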
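The standard Möbius function can be computed directly from the two prime-factor counts described in Definition 4. The following is a minimal sketch using trial division; the function names are chosen for this example.

```python
def prime_factors(n):
    """Prime factors of n with repetition, by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def mobius(n):
    """Moebius function: (-1)^(number of prime factors) if n is squarefree, else 0."""
    factors = prime_factors(n)
    distinct = len(set(factors))   # number of distinct prime factors
    total = len(factors)           # number of prime factors with repetition
    return (-1) ** distinct if distinct == total else 0

print([mobius(n) for n in range(1, 11)])
```

The two counts agree exactly when no prime divides the integer more than once, which is why the value is zero for integers such as 4, 8 and 9.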
Theorem 5 ([2])
The number of unique periodic binary words of length is given by and for ,
[TABLE]
Recall that a necklace is an equivalence class of non-periodic words under cyclic rotation. Thus, for instance, is a necklace. Theorem 5 is a restatement of the following classical result.
Theorem 6 (Witt’s Formula [9])
The number of necklaces of binary words of length $n$ is
$$\frac{1}{n} \sum_{d \mid n} \mu(d)\, 2^{n/d}.$$
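Witt's formula in its standard form, (1/n) times the sum over divisors d of n of μ(d)·2^(n/d), can be compared with a direct enumeration of rotation classes of non-periodic words. This is a sketch with hypothetical function names, intended only as a sanity check for small lengths.

```python
from itertools import product

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a squared prime factor
            result = -result
        p += 1
    return -result if n > 1 else result

def necklaces_by_formula(n):
    """Witt's formula for binary words of length n."""
    total = sum(mobius(d) * 2 ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

def necklaces_by_enumeration(n):
    """Count rotation classes of words whose n rotations are all distinct."""
    classes = set()
    for bits in product("01", repeat=n):
        word = "".join(bits)
        rotations = {word[i:] + word[:i] for i in range(n)}
        if len(rotations) == n:  # the word is non-periodic
            classes.add(min(rotations))
    return len(classes)

print([necklaces_by_formula(n) for n in range(1, 9)])
```

The enumeration uses the fact that a word is non-periodic exactly when all of its cyclic rotations are distinct.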
Definition 7
We define the set and .
Definition 8
Given an automaton, , whose set of states is , we define a detour to be a pair of finite non-trivial sequences of states, , such that , and . We call a detour minimal if .
Consider an automaton with a single cycle (Figure 5). Suppose the automaton has states before the cycle and states after the cycle (which implies that there are states within the cycle). We now obtain a formula for the limit of the number of binary words of given complexity .
Theorem 9
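For each fixed complexity, the number of binary words of that complexity is eventually constant as a function of the word length, with limiting value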
[TABLE]
where was defined in Theorem 5 and
[TABLE]
Proof
Consider an arbitrary automaton with states. There are a finite number of such automata. We will prove that unless has at most one minimal detour, there is an such that, for all , cannot accept a unique word of length .
We begin with the observation that we may assume that has a unique initial state and a unique accepting state.
If the automaton has at most one detour, then it has one of the following forms.
[TABLE]
If the automaton is of the type on the right and accepts a unique word of a given length, then any accepting path for that word either uses the states that comprise the top path of the detour, or uses the states that comprise the bottom path, but not both. Thus, if both and are non-zero, there is an automaton with fewer states that accepts only that word among all words of its length. We conclude that in the case of automata with at most one minimal detour, we need only consider ones of the form on the left.
Now, we consider the possibilities for automata with at least two distinct minimal detours.
Each of the twelve cases in Figure 1 falls into one of three cases.
1. On any accepting path, each detour can be used at most once ((1), (2) and (3)).
2. On any accepting path, one of the detours can be used at most once ((7), (8), (10), (11) and (12)).
3. There are accepting paths that use each of the detours an arbitrary number of times ((4), (5), (6) and (9)).
These further break down as follows:
- (1), (4), (7), (10) represent two separated cycles;
- (2), (5), (8), (11) represent overlapping cycles; and
- (3), (6), (9), (12) represent nested cycles.
If the automaton falls into the first case, then the word is also uniquely accepted among words of its length by an automaton with no more states and no detours. If it falls into the second case, then the word is uniquely accepted by an automaton with no more states and at most one detour. If it falls into the third case, then there are two cycles (although they may have common transitions) which can each be traversed an independent and arbitrary number of times on an accepting path. Thus, for large enough word length, the cycles can be traversed in different orders or different numbers of times and still reach an accepting state, thereby violating the requirement that the automaton accept exactly one word of that length.
As an example of the third case, suppose that is of the type shown in (9). has two independent cycles, one of length and the other of length . Let , where . There are at least two words of length that accepts, and for any such that accepts a word of length , must accept at least two words of length .
In conclusion, we may assume our automata have at most one detour. Thus they consist of a chain of states, followed by a single (in general multi-state) cycle, followed by another chain. Let be the number of states before the cycle and the number of states after the cycle, so that the remaining states lie within the cycle. If the bits read within the cycle do not form a necklace, we can reduce the number of states. Thus there are states within the cycle. Then an upper bound for the total number of binary words with is
[TABLE]
Let be the bit that advances the automaton from the th state to the th state (i.e., the transition that takes the automaton into the cycle) and be the bit that advances the automaton from the th state to the th state (i.e., the transition that completes the cycle). If , then it is possible to create an automaton with fewer states that accepts the same word and no other of length . A similar consideration applies upon leaving the cycle. Thus, we have
[TABLE]
possible words.
Finally, to conclude that is eventually constant, note that while the single cycle will have to be exited at different points depending on mod , where is the length of the main cycle, there will always be exactly one value of mod and hence exactly one automaton contributed from the cycle and the given “head” and “tail” words. See Figures 2, 3, and 4 for illustrations of the cases , respectively.
Remark 10
Here is perhaps a simpler view of the classification of detours in Figure 1. Suppose is an NFA that uniquely accepts some word. Now consider some shortest directed path from to the unique final state . Let us say that an alternate route is any simple directed path, edge-disjoint from , joining two vertices of .
Suppose there are two alternate routes, and , joining and , and and , respectively. If we do not worry about the direction of the paths for the moment, we may assume and . Then there are three possibilities:
1. one route precedes the other;
2. one route encompasses the other;
3. the two routes overlap.
Furthermore, for each of the two alternate routes one can choose the direction of the edges independently. This gives $3 \cdot 2 \cdot 2 = 12$ possibilities to consider.
The main proviso to Theorem 9 may be that while the number of words with given complexity reaches a limit, the set of witnessing automata does not quite. To wit:
Theorem 11
There is a such that there is no set of automata such that for all sufficiently large ,
- for each there is some of length such that and witnesses the inequality , and
- for all of length , iff the inequality is witnessed by one of the .
Proof
Let . The limiting value of is 6 as witnessed by the patterns: , , . However, for , different states will be the final state depending on the length mod 2; see Figure 2.
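The limiting value 6 for complexity 2 can be observed by exhaustive search over short words. The sketch below tries every run over at most two states starting in state 0 and checks whether the induced automaton has a unique accepting path; the function names are hypothetical and the approach is only practical for small lengths.

```python
from itertools import product

def has_unique_run(word, num_states):
    """True if some run over at most num_states states induces a unique accepting path."""
    n = len(word)
    for rest in product(range(num_states), repeat=n):
        states = (0,) + rest  # by symmetry the run may start in state 0
        edges = {(states[i], word[i], states[i + 1]) for i in range(n)}
        counts = {0: 1}
        for _ in range(n):
            step = {}
            for src, _symbol, dst in edges:
                if src in counts:
                    step[dst] = step.get(dst, 0) + counts[src]
            counts = step
        if counts.get(states[-1], 0) == 1:
            return True
    return False

def words_of_complexity_two(n):
    """Number of binary words of length n with complexity exactly 2."""
    words = ("".join(bits) for bits in product("01", repeat=n))
    return sum(1 for w in words if has_unique_run(w, 2) and not has_unique_run(w, 1))

print([words_of_complexity_two(n) for n in range(2, 8)])
```

From length 3 onward the count stays at 6. For the alternating word the witnessing run 0, 1, 0, 1, ... ends in state 0 or state 1 according to the parity of the length, which illustrates why no single fixed automaton serves all lengths.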
Theorem 12 (Number of right-inextendible words)
For , define a function by
[TABLE]
Then is eventually constant, with limiting value
[TABLE]
where refers to the function defined in Theorem 5, and .
Proof
Let be a binary word such that its accepting automaton has a single cycle, as in Figure 5. As shown in Theorem 9, we need only consider this particular case. Let be the number of states between the cycle and the accepting state of the automaton.
Suppose there are no states between the cycle and the accepting state. Then the accepting state must be one of the states within the cycle. Without loss of generality, suppose the path out of the accepting state is triggered by a 0 input. Then the word extended by 0 must have the same automatic complexity as the word itself, as appending 0 does not require any additional states, and the word is thus not inextendible. Thus, for a word to be inextendible, it is necessary that there be at least one state between the cycle and the accepting state.
Theorem 13
* is eventually bounded by .*
Proof
By Theorem 9, we can upper bound the sum by
[TABLE]
In fact, by considering the four possible truth values for the cases , , we get the upper bound
[TABLE]
[TABLE]
Remark 14
A comparison of the limiting values with the bound in Theorem 13 can be done using the computer code in Figure 6. The number in the title of this section was calculated using that Python script and using a table of values from the OEIS database. Table 3 shows an initial segment of the resulting sequence. There we count only words starting with 0, so that the full number would be twice that, matching the impression given by Table 2.
References
- [1] Bender, E.A., Goldman, J.R.: On the applications of Möbius inversion in combinatorial analysis. Amer. Math. Monthly 82(8), 789–803 (1975). https://doi.org/10.2307/2319793
- [2] Choi, J.S.: Counts of unique periodic binary strings of length n. http://oeis.org/A152061 (Sep 2011)
- [3] Hyde, K.K., Kjos-Hanssen, B.: Nondeterministic automatic complexity of overlap-free and almost square-free words. Electron. J. Combin. 22(3) (2015), paper 3.22, 18
- [4] Kjos-Hanssen, B.: Complexity lookup. http://math.hawaii.edu/wordpress/bjoern/complexity-of-0110100110010110/
- [5] Kjos-Hanssen, B.: Complexity option game. http://math.hawaii.edu/wordpress/bjoern/complexity-option-game/
- [6] Lyndon, R.C.: On Burnside’s problem. Trans. Amer. Math. Soc. 77, 202–215 (1954). https://doi.org/10.2307/1990868
- [7] Shallit, J., Wang, M.W.: Automatic complexity of strings. J. Autom. Lang. Comb. 6(4), 537–554 (2001), 2nd Workshop on Descriptional Complexity of Automata, Grammars and Related Structures (London, ON, 2000)
- [8] Sipser, M.: A complexity theoretic approach to randomness. In: Proceedings of the Fifteenth Annual ACM Symposium on Theory of Computing. pp. 330–335. STOC ’83, ACM, New York, NY, USA (1983). https://doi.org/10.1145/800061.808762
