Parikh-reducing Church-Rosser representations for some classes of regular languages
Tobias Walter

TL;DR
This paper investigates Parikh-reducing Church-Rosser systems for specific regular language classes, providing finite representations and analyzing their complexity, especially for languages with abelian group syntactic monoids.
Contribution
It demonstrates the existence of finite Parikh-reducing Church-Rosser systems for certain regular languages and constructs monoid representations with abelian subgroups.
Findings
Existence of finite systems for languages with abelian group syntactic monoids
Construction of monoid representations with all subgroups abelian
Analysis of the complexity of these representations
Abstract
In this paper the concept of Parikh-reducing Church-Rosser systems is studied. It is shown that for two classes of regular languages there exist such systems which describe the languages using finitely many equivalence classes of the rewriting system. The two classes are: 1.) the class of all regular languages such that the syntactic monoid contains only abelian groups and 2.) the class of all group languages over a two-letter alphabet. The construction of the systems yields a monoid representation such that all subgroups are abelian. Additionally, the complexity of those representations is studied.
Tobias Walter (supported by the German Research Foundation (DFG) under grant DI 435/6-1)
FMI
University of Stuttgart
1 Introduction
The class of Church-Rosser congruential languages was introduced by Narendran, McNaughton and Otto in 1988, see [Nar84, MNO88]. A language is Church-Rosser congruential if it is a finite union of equivalence classes of a finite length-reducing Church-Rosser rewriting system. It is natural to ask whether every regular language is Church-Rosser congruential. After some initial progress [Nie00, NW02, RT03, DKW12], this question was answered affirmatively, see [DKRW15]. The main idea of the solution in [DKRW15] is to prove a stronger statement: instead of proving that for every regular language there exists a length-reducing Church-Rosser system which saturates the language, it is proved that for every regular language and every weight function there exists a weight-reducing Church-Rosser system which saturates the language. The initial problem is the special case in which length is chosen as the weight function. This result on regular languages became possible by utilizing the concept of local divisors. In this paper we use the same technique of local divisors to study a stronger property. Instead of requiring a weight-reducing system for a given weight, we ask whether for every regular language there exists a Church-Rosser system which saturates the language and is weight-reducing for every weight function. We call such a rewriting system a Parikh-reducing Church-Rosser system. Some of the initial progress already satisfied the Parikh-reducing condition, namely the constructions for aperiodic languages [DKW12], for languages of polynomial density [Nie00] and for cyclic groups of order two [NW02]. Our main result comprises these results: for every language whose syntactic monoid contains only abelian groups there exists a Parikh-reducing Church-Rosser system which saturates the language. Moreover, all groups appearing in the corresponding Church-Rosser representation are abelian.
Furthermore, we show the existence of such Parikh-reducing systems for all group languages over a two-letter alphabet. Having established the existence of Parikh-reducing systems, we study the size of the resulting Church-Rosser representations. A naive analysis of the construction yields a non-primitive recursive function for this size. We introduce an alphabet reduction technique which reduces the size of the resulting Church-Rosser representations to a quadruple exponential function. On the other side of the spectrum, we prove an exponential lower bound for cyclic groups.
2 Preliminaries
Words and Languages
An alphabet is a non-empty finite set $A$. An element of $A$ is called a letter. A (finite) word is a finite concatenation of letters $a_1 \cdots a_n$. The set of finite words with letters in $A$ is denoted by $A^*$. The empty word is denoted by $1$. The set of finite words forms a monoid with the concatenation operation, the free monoid. Let $\|\cdot\| : A \to \mathbb{N}$ be a function with $\|a\| > 0$ for all $a \in A$. The unique homomorphism $A^* \to \mathbb{N}$, which extends $\|\cdot\|$, is also denoted by $\|\cdot\|$ and called a weight. A special weight is length $|\cdot|$, which is induced by $|a| = 1$ for all $a \in A$. For a letter $a \in A$ we also define $|\cdot|_a$ to be the homomorphism which is induced by
$$|b|_a = \begin{cases} 1 & \text{if } a = b, \\ 0 & \text{otherwise.} \end{cases}$$
We set $A^{\leq n}$ to be the set of words of length at most $n$.
A language is a subset $L \subseteq A^*$. Let $\varphi : A^* \to M$ be a homomorphism to a finite monoid $M$. A language $L$ is recognized by $\varphi$ if $L = \varphi^{-1}(\varphi(L))$. A language is regular if it can be recognized by some homomorphism to a finite monoid.
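As a small illustration of recognition by a homomorphism, the following Python sketch evaluates a homomorphism from words into the cyclic group $\mathbb{Z}/n\mathbb{Z}$ and tests the condition $L = \varphi^{-1}(\varphi(L))$ on all words up to a length bound. The function names, the letter images and the bound are illustrative assumptions, and a bounded check can only refute recognition, not prove it.

```python
from itertools import product


def phi(word, images, n):
    """Extend a map letter -> Z/nZ to the homomorphism A* -> (Z/nZ, +)."""
    return sum(images[a] for a in word) % n


def words_up_to(alphabet, k):
    """Yield all words over `alphabet` of length at most k."""
    for length in range(k + 1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def looks_recognized(language, images, n, alphabet, k):
    """Check L = phi^{-1}(phi(L)) on words of length <= k (bounded check only)."""
    accepted = {phi(w, images, n) for w in words_up_to(alphabet, k) if language(w)}
    return all(
        language(w) == (phi(w, images, n) in accepted)
        for w in words_up_to(alphabet, k)
    )
```

For example, with `images = {"a": 1, "b": 0}` and `n = 2`, the language of words with an even number of `a` passes the check, whereas the language of words starting with `a` fails it.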
Algebra
We want to study subclasses of regular languages which are characterized by special classes of monoids. A variety is a class of finite monoids which is closed under taking submonoids, homomorphic images and finite direct products. In particular, taking the empty direct product, every variety contains the trivial monoid. A variety which contains only groups is called a variety of groups. We assign to every variety $\mathbf{V}$ a corresponding language class $\mathcal{V}$ such that $L \in \mathcal{V}$ if and only if there exists a monoid $M \in \mathbf{V}$ and a homomorphism $\varphi : A^* \to M$ that recognizes $L$. Examples of such varieties include the variety $\mathbf{G}$ of all groups and the variety $\mathbf{Ab}$ of all abelian groups.
Let $\mathbf{H}$ be a variety of finite groups. We define
$$\overline{\mathbf{H}} = \{ M \mid \text{every subsemigroup of } M \text{ which is a group is in } \mathbf{H} \}$$
to be the maximal class of monoids whose subsemigroups, which are groups, are in $\mathbf{H}$. It turns out that $\overline{\mathbf{H}}$ is the maximal variety such that $\overline{\mathbf{H}} \cap \mathbf{G} = \mathbf{H}$, see [Eil76, Proposition V.10.4]. Our main result is concerned with the language class corresponding to $\overline{\mathbf{Ab}}$. An important concept used in this paper is that of local divisors. Let $M$ be a monoid and $c \in M$. We set $M_c = cM \cap Mc$ and introduce a multiplication $\circ$ on $M_c$ given by $xc \circ cy = xcy$. Since $xc \in cM$ and $cy \in Mc$, the result of $xc \circ cy$ is in $M_c$. The structure $(M_c, \circ, c)$ forms a monoid, the local divisor of $M$ at $c$. Indeed, $M_c$ is a divisor of $M$, that is, a homomorphic image of a submonoid of $M$, see [DK15]. If $c$ is not a unit, then $|M_c| < |M|$ since $1 \notin M_c$.
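The local divisor can be computed directly for a small finite monoid. The sketch below assumes the standard definition from [DK15], namely the carrier $M_c = cM \cap Mc$ with multiplication $xc \circ cy = xcy$; the function name `local_divisor` and the example monoid $(\mathbb{Z}/6\mathbb{Z}, \cdot)$ are illustrative choices, not taken from the paper.

```python
def local_divisor(elements, mul, c):
    """Return (carrier, circ) for the local divisor of a finite monoid at c.

    `elements` lists the monoid, `mul` is its multiplication.
    Assumed definition: carrier = cM ∩ Mc, and xc ∘ cy = xcy.
    """
    left = {mul(c, y) for y in elements}            # the set cM
    right = {mul(x, c): x for x in elements}        # element xc -> a witness x
    carrier = left & set(right)

    def circ(p, q):
        # p = xc and q = cy, hence p ∘ q = x·c·y = x·q.
        # The result does not depend on the chosen witness x.
        return mul(right[p], q)

    return carrier, circ


# Example: the multiplicative monoid of Z/6Z, localized at the non-unit c = 2.
Z6 = range(6)
carrier, circ = local_divisor(Z6, lambda x, y: (x * y) % 6, 2)
```

In this example the carrier is `{0, 2, 4}`, the element `c = 2` acts as the neutral element of `circ`, and the carrier is strictly smaller than the original monoid, as expected for a non-unit.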
Combinatorics on Words
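Let $w = xyz$ be a word. Then we call $x$ a prefix, $y$ a factor and $z$ a suffix of $w$. The factor $y$ is proper if $x$ and $z$ are not empty. The set of factors of $w$ is given by $\{ y \in A^* \mid w = xyz \text{ for some } x, z \in A^* \}$. The word $a_1 \cdots a_n$, with $a_i \in A$, is a subword of a word $w$ if $w \in A^* a_1 A^* \cdots a_n A^*$. The word $w$ is a power of the word $u$ if $w = u^k$ for some $k \in \mathbb{N}$. Let $w = a_1 \cdots a_n$ be a word with letters $a_i \in A$. We say that $p$ is a period of $w$ if $a_i = a_{i+p}$ for all $1 \leq i \leq n - p$. The theorem of Fine and Wilf describes an important property of periods.
Theorem 2.1 (Fine and Wilf, [FW65]).
Let $p, q$ be periods of some word $w$. If $p + q - \gcd(p, q) \leq |w|$, then $\gcd(p, q)$ is a period of $w$.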
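The notion of a period and the hypothesis of the theorem of Fine and Wilf can be checked mechanically. The following sketch assumes the usual formulation with the bound $p + q - \gcd(p, q) \leq |w|$; the function names are illustrative.

```python
from math import gcd


def is_period(w, p):
    """p is a period of w if w[i] == w[i + p] for every valid position i."""
    return all(w[i] == w[i + p] for i in range(len(w) - p))


def fine_wilf_applies(w, p, q):
    """True if p and q are periods of w and p + q - gcd(p, q) <= |w|.

    In that case the theorem guarantees that gcd(p, q) is a period of w.
    """
    return is_period(w, p) and is_period(w, q) and p + q - gcd(p, q) <= len(w)
```

The word `aabaa` shows that the length bound cannot simply be dropped: it has periods 3 and 4, but it is too short for the hypothesis, and 1 is not one of its periods.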
A word $w$ is called primitive if it is only a power of itself, that is, if $w = u^k$ with $k \geq 1$ implies $k = 1$. The following well-known characterization of primitive words will be useful.
Lemma 2.2.
A word $u$ is primitive if and only if $u$ is not a proper factor of $uu$.
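The characterization of Lemma 2.2 gives a short primitivity test for nonempty words: a word is primitive exactly when its only occurrences inside its own square are the prefix and the suffix. The sketch below compares this test with a direct check of the definition; both function names are illustrative.

```python
def is_primitive(w):
    """Lemma 2.2 as a test: w is primitive iff w is not a proper factor of ww."""
    if not w:
        return False
    # Searching from index 1 skips the prefix occurrence; the next occurrence
    # is at index len(w) exactly when there is no proper occurrence.
    return (w + w).find(w, 1) == len(w)


def is_primitive_by_definition(w):
    """w is primitive iff it is not a power u^k of a strictly shorter word u."""
    n = len(w)
    return n > 0 and not any(
        n % d == 0 and w[:d] * (n // d) == w for d in range(1, n)
    )
```

For instance, `abab` is the square of `ab` and is rejected by both tests, while `aba` is accepted by both.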
Rewriting systems
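A semi-Thue system over the alphabet $A$ is a finite subset $S \subseteq A^* \times A^*$. An element $(\ell, r) \in S$ is called a rule, where $\ell$ is the left side and $r$ is the right side of the rule. The idea of a semi-Thue system is that left sides of rules can be replaced by right sides of the rule. Thus, one often also calls a semi-Thue system a rewriting system. For a semi-Thue system $S$ we define the rewriting relation $\Longrightarrow_S$ given by $u \ell v \Longrightarrow_S u r v$ for $(\ell, r) \in S$ and $u, v \in A^*$, that is, $x \Longrightarrow_S y$ if $y$ results from $x$ by replacing the left side of a rule with the right side. The reflexive transitive closure of $\Longrightarrow_S$ is denoted by $\overset{*}{\Longrightarrow}_S$ and the reflexive, symmetric and transitive closure of $\Longrightarrow_S$ is denoted by $\overset{*}{\Longleftrightarrow}_S$. We write $x \Longleftarrow_S y$ for $y \Longrightarrow_S x$. A semi-Thue system $S$ is confluent or Church-Rosser, if $u \overset{*}{\Longrightarrow}_S v$ and $u \overset{*}{\Longrightarrow}_S v'$ imply that there exists a word $w$ such that $v \overset{*}{\Longrightarrow}_S w$ and $v' \overset{*}{\Longrightarrow}_S w$. It is locally confluent, if $u \Longrightarrow_S v$ and $u \Longrightarrow_S v'$ imply that there exists a word $w$ such that $v \overset{*}{\Longrightarrow}_S w$ and $v' \overset{*}{\Longrightarrow}_S w$. It is weight-reducing for a weighted alphabet $(A, \|\cdot\|)$, if $\|\ell\| > \|r\|$ for all rules $(\ell, r) \in S$, and it is Parikh-reducing, if for all $a \in A$ and all rules $(\ell, r) \in S$ it holds $|\ell|_a \geq |r|_a$ and for all rules $(\ell, r) \in S$ there exists a letter $a \in A$ such that $|\ell|_a > |r|_a$, where $|w|_a$ is the number of occurrences of $a$ in $w$. Furthermore, $S$ is subword-reducing, if $\ell \neq r$ and $r$ is a subword of $\ell$ for each rule $(\ell, r) \in S$.
The notion Parikh-reducing comes from the connection to Parikh images. The Parikh image of a word $w$ is the vector $(|w|_a)_{a \in A}$. A semi-Thue system $S$ is Parikh-reducing if and only if the Parikh image of $r$ is smaller than the Parikh image of $\ell$ for every rule $(\ell, r) \in S$. By definition every subword-reducing system is Parikh-reducing. Further, it is rather easy to see that a semi-Thue system is Parikh-reducing if and only if it is weight-reducing for every weight $\|\cdot\|$.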
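The three reduction properties can be compared on concrete rule sets. The sketch below represents a rule as a pair of Python strings and assumes that all letters of both sides belong to the given alphabet; the function names are illustrative. A rule is treated as Parikh-reducing if no letter count increases and the two Parikh images differ.

```python
from collections import Counter


def parikh(word, alphabet):
    """Parikh image: the vector of letter counts, ordered like `alphabet`."""
    counts = Counter(word)
    return tuple(counts[a] for a in alphabet)


def is_parikh_reducing(rules, alphabet):
    """No letter count increases, and at least one strictly decreases, per rule."""
    for left, right in rules:
        l, r = parikh(left, alphabet), parikh(right, alphabet)
        if l == r or any(x < y for x, y in zip(l, r)):
            return False
    return True


def is_subword(r, l):
    """True if r is a scattered subword of l."""
    remaining = iter(l)
    return all(ch in remaining for ch in r)


def is_subword_reducing(rules):
    return all(l != r and is_subword(r, l) for l, r in rules)


def is_weight_reducing(rules, weight):
    """`weight` maps each letter to a positive integer."""
    total = lambda word: sum(weight[a] for a in word)
    return all(total(l) > total(r) for l, r in rules)
```

The rule `("aab", "bb")` is length-reducing but not Parikh-reducing, since the number of `b` increases; accordingly it stops being weight-reducing once `b` is given a sufficiently large weight, for example `{"a": 1, "b": 5}`.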
A classical lemma states that $S$ is confluent if it is Parikh-reducing and locally confluent, see [BO93]. In the following we study different cases which may occur when checking for local confluence. Let $(\ell, r), (\ell', r') \in S$ be two rules and consider the word $u \ell v \ell' w$. Then
$$u\ell v\ell' w \Longrightarrow_S urv\ell' w \Longrightarrow_S urvr'w \quad \text{and} \quad u\ell v\ell' w \Longrightarrow_S u\ell vr'w \Longrightarrow_S urvr'w.$$
Thus, checking for local confluence in this case is trivial. The only non-trivial cases appear when two rules overlap. There are two different kinds of overlaps:
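1. $\ell = xy$ and $\ell' = yz$ with $y$ nonempty,
2. $\ell = x\ell' z$
for rules $(\ell, r), (\ell', r') \in S$. The resulting pairs $(rz, xr')$ and $(r, xr'z)$ are called critical pairs. The first kind is called overlap critical and the second kind is called factor critical, see also Figure 1.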
We say that a critical pair $(p, q)$ resolves if there exists a word $w$ such that both $p \overset{*}{\Longrightarrow}_S w$ and $q \overset{*}{\Longrightarrow}_S w$ hold. In summary, we obtain the following:
Lemma 2.3 ([KB70]).
A semi-Thue system $S$ is locally confluent if and only if all its critical pairs resolve.
Lemma 2.3 will be used without explicitly referring to it.
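Lemma 2.3 turns local confluence into a finite check. The sketch below uses the textbook notion of overlap critical and factor critical pairs and decides whether each pair resolves by comparing the sets of descendants of its two components. It assumes nonempty left sides and a terminating system (for example a Parikh-reducing one), so that descendant sets are finite; all function names are illustrative.

```python
def rewrite_once(word, rules):
    """Yield every word obtained from `word` by a single rule application."""
    for left, right in rules:
        start = word.find(left)
        while start != -1:
            yield word[:start] + right + word[start + len(left):]
            start = word.find(left, start + 1)


def descendants(word, rules):
    """All words reachable from `word` (finite if the system terminates)."""
    seen, stack = {word}, [word]
    while stack:
        current = stack.pop()
        for nxt in rewrite_once(current, rules):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def critical_pairs(rules):
    """Yield the pairs arising from overlapping left sides of two rules."""
    for l1, r1 in rules:
        for l2, r2 in rules:
            # Overlap critical: a proper suffix of l1 is a proper prefix of l2.
            for k in range(1, min(len(l1), len(l2))):
                if l1[-k:] == l2[:k]:
                    yield r1 + l2[k:], l1[:-k] + r2
            # Factor critical: l2 occurs as a factor of l1.
            start = l1.find(l2)
            while start != -1:
                yield r1, l1[:start] + r2 + l1[start + len(l2):]
                start = l1.find(l2, start + 1)


def is_locally_confluent(rules):
    """A pair resolves if its two components have a common descendant."""
    return all(
        descendants(p, rules) & descendants(q, rules)
        for p, q in critical_pairs(rules)
    )
```

For example, the system `[("aa", "")]` is locally confluent, whereas `[("ab", "a"), ("ba", "b")]` is not: the word `aba` rewrites to both `aa` and `ab`, and these two words have no common descendant.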
A word is irreducible in if no left-side of a rule in appears in . We denote the set of irreducible elements of by . The relation is a congruence on . Thus, one can consider the monoid . The elements of are equivalence classes of the congruence . The number of elements in is called index of . If is Parikh-reducing and (locally) confluent, then there is a bijection between and . In this case, we denote elements of the monoid with the corresponding irreducible words. In fact, we call a locally confluent Parikh-reducing system a Parikh-reducing Church-Rosser system. Let be a homomorphism and be a semi-Thue system. We say that factorizes through if for all it holds , that is, equivalence classes of map to the same element in . We also say that is -invariant if factorizes through . This notion is algebraically motivated. Let be a semi-Thue system such that factorizes through , then given by is a well-defined homomorphism. Let be the natural projection and be some language which is recognized by and be the syntactic homomorphism of . Then we obtain the situation in Figure 2. In particular, recognizes .
Since factorizes through if and only if factorizes through , we may assume that is surjective. If further is a Church-Rosser system, we call a Church-Rosser representation of (or ).
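For a Parikh-reducing Church-Rosser system the elements of the quotient monoid correspond to irreducible words, so the index and the multiplication table can be computed by enumeration. The sketch below assumes nonempty left sides and a confluent, terminating system; the function names and the `limit` parameter are illustrative. Since every factor of an irreducible word is irreducible, it suffices to extend irreducible words letter by letter and test only suffixes.

```python
def normal_form(word, rules):
    """Apply rules until no left side occurs (assumes termination)."""
    changed = True
    while changed:
        changed = False
        for left, right in rules:
            if left in word:
                word = word.replace(left, right, 1)
                changed = True
                break
    return word


def irreducible_words(alphabet, rules, limit=1000):
    """Enumerate irreducible words by length; fail if more than `limit` exist."""
    lefts = [left for left, _ in rules]
    found, frontier = [""], [""]
    while frontier:
        next_frontier = []
        for w in frontier:
            for a in alphabet:
                candidate = w + a
                # w is irreducible, so a left side can only occur as a suffix.
                if not any(candidate.endswith(left) for left in lefts):
                    next_frontier.append(candidate)
        found.extend(next_frontier)
        if len(found) > limit:
            raise ValueError("too many irreducible words; the index may be infinite")
        frontier = next_frontier
    return found


def multiplication_table(alphabet, rules):
    """Multiplication of the quotient monoid on irreducible representatives."""
    irr = irreducible_words(alphabet, rules)
    return {(u, v): normal_form(u + v, rules) for u in irr for v in irr}
```

Over the one-letter alphabet `"a"`, the system `[("aaa", "")]` has the three irreducible words `""`, `"a"` and `"aa"`, and the resulting table is that of the cyclic group of order three.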
3 Parikh-reducing Church-Rosser systems
3.1 Outline
In this subsection we give an outline on the proof strategy which will be used in Theorem 3.2. The macro structure of the proof is as follows: Given a homomorphism , we construct a system which is -invariant by induction on . The construction is based on the following lemma:
Lemma 3.1 ([DKW12, DKRW15]).
Let be an alphabet of size at least two, be a homomorphism and for some . Assume that is a Parikh-reducing Church-Rosser system of finite index which is -invariant. Let be a new alphabet and be a Parikh-reducing Church-Rosser system of finite index such that
[TABLE]
is -invariant. Then
a) is a -invariant Parikh-reducing Church-Rosser system of finite index.
b) All groups in are contained in or in .
c) The index of is .
Proof.
a) is proved in [DKRW15]. By [DKW12], is a so-called Rees extension monoid and the statement of b) follows from general properties of Rees extension monoids, see [AK16].
It remains to calculate the index of . Every irreducible word in which contains no is contained in . Conversely, every element of is irreducible in the rewriting system given by . Every irreducible word in which contains at least one is of the form for and . By the definition of the rule set every such word is also irreducible. This shows that there are exactly irreducible words in which contain at least one . ∎
For a fixed letter we remove and obtain the alphabet . Inductively, one obtains a system which factorizes through . Now, consider a new alphabet . By Lemma 3.1, it remains to construct a system . The system contains two kinds of rules: -rules and -rules. The idea of these rules is to deal with different kinds of words. The set of -rules deals with long repetitions of short words. Whenever there is no long repetition of short words, this is witnessed by a marker word . The set of -rules contains rules of the form for some normal forms . Lemma 3.6 shows that such rules appear for sufficiently large words and Lemma 3.9 shows the confluence of the constructed system.
3.2 Commutative Groups
In this section we study Parikh-reducing Church-Rosser systems for abelian groups. Let be a homomorphism in an abelian group . We construct a system for by sorting the letters and then reducing them modulo their order. Thus, we actually construct a Church-Rosser representation for the group . The situation obtained in Theorem 3.2 is shown in the commutative diagram Figure 3.
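The idea of sorting the letters and reducing them modulo their order can be sketched in a few lines. The code below models a finite abelian group as a product of cyclic groups $\mathbb{Z}/m_1\mathbb{Z} \times \cdots \times \mathbb{Z}/m_k\mathbb{Z}$ and letter images as tuples; this encoding and all names are assumptions made for illustration. It only shows why such a normal form preserves the image under the homomorphism and does not reproduce the rewriting system constructed in the proof. It requires Python 3.9 or later for `math.lcm`.

```python
from itertools import product
from math import gcd, lcm


def phi(word, images, moduli):
    """Image of `word` in Z/m_1 x ... x Z/m_k, written additively."""
    total = [0] * len(moduli)
    for a in word:
        for i, m in enumerate(moduli):
            total[i] = (total[i] + images[a][i]) % m
    return tuple(total)


def element_order(g, moduli):
    """Order of the tuple g in the product of cyclic groups."""
    return lcm(*(m // gcd(x, m) for x, m in zip(g, moduli)))


def sorted_normal_form(word, images, moduli):
    """Sort the letters and reduce each exponent modulo n,
    where n is the least common multiple of the orders of the letter images."""
    n = lcm(*(element_order(images[a], moduli) for a in images))
    return "".join(a * (word.count(a) % n) for a in sorted(images))


def preserves_image(images, moduli, max_len):
    """Check phi(w) == phi(normal form of w) for all words up to max_len."""
    return all(
        phi("".join(w), images, moduli)
        == phi(sorted_normal_form("".join(w), images, moduli), images, moduli)
        for length in range(max_len + 1)
        for w in product(sorted(images), repeat=length)
    )
```

With `moduli = (2, 3)` and `images = {"a": (1, 0), "b": (0, 1)}`, the word `babab` is sent to `aabbb`, and both words map to the identity `(0, 0)`.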
Theorem 3.2.
Let be a homomorphism to a finite commutative group . Then there exists a Parikh-reducing Church-Rosser system of finite index which factorizes through . Further, all groups contained in are isomorphic to some subgroup of .
Proof.
Let be the least common multiple of for . We proceed by induction on the number of letters . If , then we may set . This system is Parikh-reducing, locally confluent and it holds . Thus, we may assume that . Let be the alphabet and be an arbitrary letter of . We consider the alphabet . Since is smaller than , we inductively get a Parikh-reducing Church-Rosser system of finite index which factorizes through . The idea is to first reduce the words over and then work over a new alphabet . Let be the new alphabet of irreducible words over appended by the letter which poses as a separator. We will first construct a Parikh-reducing (over ) Church-Rosser system of finite index. Note that this system is not Parikh-reducing over . We will use two different sets of rules: one for long repetitions of short words and one for longer words which are not repetitions of such short words. Let us first define the set of short words as , that is, as the set of nonempty words of length at most . Let further be
[TABLE]
the system of -rules where . The choice of the parameter will be explained later. For now, the fact that is sufficient to obtain that is a Parikh-reducing (over , and thus also over ) Church-Rosser system by Lemma 3.3.
Lemma 3.3 ([DKRW15]).
Let be a set of nonempty words of length at most which is closed under nontrivial factors, and . Then
[TABLE]
is a subword-reducing Church-Rosser system. In particular, is Parikh-reducing and weight-reducing for every weight.
Next, we will introduce marker words. They basically mark the absence of a long repetition of words in , i.e., a long enough word in will either contain a marker word or a rule in . The next lemma shows that the length of such markers can be bounded by .
Lemma 3.4 ([DKRW15]).
Let be a set and let . Then is an ideal which is generated by a set of words of length at most , that is, .
Thus, letting , we obtain for some . In order to ensure that we find such a marker which does not start with a , we increase the length of a marker to . Formally, let be the set of markers.
Let be a total preorder on with the following properties:
- with and implies .
- is a total order on .
- implies .
Thus, the larger the block of ’s at the suffix of an , the smaller it is with respect to . Additionally, all elements in with a maximal block of ’s at the suffix are equivalent with respect to . In particular, and implies either or there exists with and . Let for some . We say that is a maximal -factor of , if with implies . We want to show that every long word contains sufficiently large factors which are surrounded by “locally” maximal -factors. The first step is to show the existence of -factors.
Lemma 3.5.
There exists a number such that every word with length at least has a factor for some or a factor .
Proof.
Let . If the statement is true. Thus, we assume that for all there is no factor of . There is a factorization such that is maximal and has no as a prefix.
Hence we obtain and which implies by definition of . As , there is some which does not have as prefix and is a prefix of . Consider the first factor of length of which is not in . Since is a prefix of , one must take at most additional letters left from in order to obtain a factor of which is not in , has length at most and does not start with a . Filling up with letters from the right, we obtain a factor of which is not in , has length and does not start with a , that is, . ∎
Lemma 3.6.
There exists a number such that every word of length at least contains either
- a factor for or
- a factor with , and for every with we have .
Proof.
Let be the set of -factors of and let be defined by the recursion . A quick calculation verifies the explicit formula . We prove the following statement by induction on : For every word of length at least which has at least different -factors, i.e., and which does not contain a factor for , there exists a factor of such that
- ,
- and
- is a maximal -factor of .
The case is trivial since by hypothesis every word with length at least and must contain a factor for . Consider the case . Since we require that the length of the factor is at most , we consider the prefix of of length . In particular, we can assume that every proper factor of has length smaller than .
Consider the factorization with such that is a maximal -factor of and is maximal with regard to length. If , we obtain
[TABLE]
which implies . Since and contain no factor , we can apply induction to either or . If , then has the form for a word because of and . The factor has the required properties since . This concludes the induction. We infer the statement of the lemma by setting . ∎
In particular, Lemma 3.6 shows the existence of a number such that every with contains a factor with being -maximal for this factor and . The idea is to reduce to a normal form . This is the part where commutativity of is needed. Let be a letter and be the number of occurrences of in . Define and
[TABLE]
The mapping is a normal form in the group , i.e., let be the homomorphism counting the different letters modulo , then if and only if . By choice of we have . Since for and , we obtain
[TABLE]
In particular, and . Additionally, if with , then is Parikh-reducing over since at least the number of decreases. Note that the inequality is actually the reason for the definition of . Let
[TABLE]
be the set of -rules. By definition of the set of -rules is Parikh-reducing over . Note that for a -rule, either and are minimal elements in or . By Lemma 3.6 the system has only finitely many irreducible elements. It remains to prove that is Church-Rosser. By Lemma 3.3 the set of -rules is (locally) confluent. Next, we will study properties of -rules which are crucial for showing that is Church-Rosser. First, we show that -rules preserve -maximal elements.
Lemma 3.7.
Let and let be a maximal -factor of . Then for every -factor of .
Proof.
As there are two cases for the rule set of .
In the case that there must exist a and a factorization such that . By construction, we have . Thus, every element of is a factor of if and only if it is also a factor of . Since is -maximal for , it is also -maximal for .
If , there is a factorization such that and are maximal -factors of . Since every marker in has fixed length , it remains to show that has no -factors larger than (and by , also no -factors larger than ). Note that has as prefix and suffix. Every -factor of which is not an -factor of has the form for some and is a suffix of . Since the block of ’s at the suffix of may only increase, we obtain by definition of . Since every element of has length and does not have as a prefix, there is no -factor in which is neither in nor equals . By construction, every -factor of is of the form for some . However, is a minimal element of by construction. In particular, for every -factor of . ∎
Next, as an intermediate step, we show local confluence in the case of a left side of a rule in . In particular, we show that every word of this form can be reduced to a fixed normal form.
Lemma 3.8.
Let be a word such that and are maximal -factors of and . Then implies .
Proof.
The statement is clear if , so we may assume . We prove the lemma by induction on the length of . In order to apply the induction step we show that and . The precondition that and are maximal -factors of is satisfied by Lemma 3.7.
In the case of , some rule was applied. As such rules preserve the prefixes and suffixes of length , the word must have the correct form. In the case of , some rule was applied. Since and elements of all have length , the -factors and are preserved by the application of the -rule . In both cases we conclude that for some word .
It remains to show that . Since , the case of an application of a rule in is trivial. Let stem from the application of a rule . If either or is a factor of , we have that either or is a factor of . Thus, using and for every element , we obtain
[TABLE]
It remains to prove for the situation which is depicted below.
[Diagram: two overlapping factorizations, with segments labelled $\omega$, $u$, $\omega'$ and $\mu$, $u'$, $\mu'$.]
If , then there exists such that and . However, as no element of starts with the letter , we can conclude and thus by we obtain by the same argument. In this case we have and henceforth . The case that is similar: has no as prefix and thus . Again, and holds. Hence, we may assume and .
Combining both overlaps, we obtain the following picture.
[Diagram: combined overlaps, with segments labelled $x$, $\omega$, $y$, $\mu$ and $y'$, $x'$, $\mu$.]
In the notation of the picture above we have . Thus, and by and it suffices to show . By we have that is a factor of . We conclude which implies . In summary, and holds. If , then we can directly apply the -rule with left side . Else, must be reducible by Lemma 3.6 and we can apply induction. ∎
Combining the previous lemmas we show that is locally confluent.
Lemma 3.9.
The system is locally confluent.
Proof.
Let be two rules. We have to show that every overlap of the left sides of those rules resolves. The system is locally confluent by Lemma 3.3. Hence, we may assume that . Let and consequently . Consider first the case that . If is a factor of , that is, if , then by Lemma 3.8. By definition of , the left side which contains an element of cannot be a factor of . Hence, the system resolves in the case of factor critical pairs. Consider thus the case of an overlap critical pair (the case is symmetric). Since is not a factor of and by definition, we have the following situation:
[Diagram: overlapping left sides, with segments labelled $\delta^{n}$, $\delta^{t}$ and $\omega$, $u\omega'$.]
Let and be the overlap, then
[TABLE]
Consider the case that and let . Again, if , then by Lemma 3.8. Hence, by symmetry, it suffices to consider the case . If and overlap at most positions,
[Diagram: overlapping left sides, with segments labelled $\mu u'$, $\mu'$ and $\omega$, $u\omega'$.]
then the rules can be applied independently; let again be and be the overlap, then
[TABLE]
and the system resolves in this case.
Hence, we assume that and overlap more than positions. In this case is a factor of and is a factor of . This implies that and are maximal -factors of . We conclude and by Lemma 3.8. ∎
By construction, the system is -invariant and thus the system
[TABLE]
is -invariant. By Lemma 3.6 the system is of finite index over . We can apply Lemma 3.1 and obtain a -invariant Parikh-reducing Church-Rosser system of finite index over . This concludes the proof of the first part of Theorem 3.2. It remains to study the groups in . As an intermediate step, we study the groups in .
Lemma 3.10.
Let be a subsemigroup which is a group and identify with the corresponding elements in . Then either there exists some such that is a cyclic group whose order is divisible by or there is an injective homomorphism .
Proof.
Without loss of generality, we may assume that is non-trivial. Let be the identity element of . Note that by definition of the rules and the set , the irreducible word of every word also contains an -factor. Thus, by and for all either all elements in contain some factor in or none of the elements contains an -factor. All words must have length at least by definition of the rules .
Let us first consider the case that none of the elements contain an -factor. We show that there exists some such that for all there exists such that . Let be an application of a rule in and let be a minimal factor of which is not in . By Lemma 3.4 and since , the factor is also a factor of . Thus, the number of factors in does not decrease by an application of a rule in . Consider any . Since the number of factors in does not decrease by some application of a rule in , and no rule in is applicable, we deduce that the number of factors in of and is the same. In particular, this number is zero and we obtain for all . Next, we show that for some . Since and is closed under conjugation, there exists a primitive word and such that for some prefix of . In particular, is a period of . Note that since . Consider the word . By the above, we obtain , that is, again there exists a primitive word , a prefix of and a number such that . Therefore, is a period of and, hence, also of . Since , we may use Theorem 2.1 and conclude that is a period of . Since is primitive, this implies . Since is a prefix of , this yields that is a power of which implies by primitivity of . In particular, is a period of and is a prefix of . Since is primitive this implies that is not a proper prefix of by Lemma 2.2 and we conclude that for every there exists and such that . Thus, consider with primitive words in . Again, is a period of and there must exist a period of . By Theorem 2.1 is a period of . By primitivity of , this yields that is a divisor of . In particular, since is a period of , this yields for some and a prefix of . Using Theorem 2.1 again, we see that is a period of , that is, is a divisor of by primitivity of . By symmetry, this yields and thus .
Fix some primitive word such that . Since for all and the right side of rules in have length at least and since is reducible, we conclude and thus is a subgroup of the cyclic group of order which finishes this case.
The second case is that all words in contain an -factor. Consider the maximal -factors of and factorize with maximal for such that and contains no other maximal -factors of . Since , we conclude that is some normal form. By for all and Lemma 3.8, there must exist a factorization such that is a normal form. In particular, by Lemma 3.8. Consider the homomorphism which counts the number of modulo and the function given by . Note that implies that is a homomorphism. It holds if and only if . By definition of the normal forms , it holds if and only if and therefore is injective. ∎
By Lemma 3.1, we obtain that the subgroups in are isomorphic to subgroups of and . By induction, all groups in are isomorphic to some subgroup of . All groups in are either cyclic of order divisible by or isomorphic to some subgroup of by Lemma 3.10. However, since is defined as the least common multiple of , the cyclic group of order is a subgroup of . This proves the statement. ∎
3.3 Group languages over an alphabet of size two
The same technique as in Subsection 3.2 can be used to obtain Parikh-reducing Church-Rosser systems which factorize through homomorphisms for an arbitrary group . We will only sketch the proof, as it is essentially the proof of Theorem 3.2.
Theorem 3.11.
Let be an alphabet of size two and let be a homomorphism into a finite group . Then there exists a Parikh-reducing Church-Rosser system of finite index which factorizes through . All groups in are subgroups of or of where is the exponent of .
Sketch of proof.
Let be the exponent of and let be the set of rules over the alphabet . Set . In the remainder of the sketch, we have to construct a system over . As the set of short words we choose . The corresponding set of rules is for . Note that since the system is confluent by Lemma 3.3.
Let and set . Choose a preorder on such that
- with and implies .
- is a total order on .
- implies .
In order to complete the construction, it remains to choose the normal forms . Note that every representation of needs less than a’s by the pigeonhole principle. Thus, for every there exists a word with and . For every we choose such a word such that the number of ’s is minimal. Note that by construction as a word over . This is the reason for the choice of . Furthermore, , which explains the choice of the parameter . The choice of also yields that there are no -factors in apart from , which is -minimal.
Adapting the proof of Lemma 3.5, we prove the existence of a number such that every word of length at least has a factor for a or a factor . Lemma 3.6 yields the existence of a number such that every contains a factor with being -maximal for this factor and . Again, let
[TABLE]
and . We want to apply Lemma 3.1 to obtain a system . Confluence of follows along the lines of Lemma 3.7, Lemma 3.8 and Lemma 3.9, whereas the statement about the groups in is analogous to Lemma 3.10. ∎
4 Beyond Groups
In this section we apply local divisors in order to lift the construction of Church-Rosser systems for groups to the general case of monoids. Instead of directly constructing a system over , we obtain a system inductively by passing to the local divisor. This decreases the size of the monoid, but increases the size of the alphabet. The first part of this theorem has been published in [DKRW15], whereas the second part is based on the use of Rees extensions, see [DKW12, DW16].
Theorem 4.1.
Let be a group variety such that for every homomorphism for there exists a Parikh-reducing Church-Rosser system of finite index which factorizes through . Let be a homomorphism with .
1. There exists a -invariant Parikh-reducing Church-Rosser system of finite index.
2. If every homomorphism in a group has a Church-Rosser representation in , then .
Proof.
1. We use induction on , ordered lexicographically. Since is closed under taking submonoids, we can restrict ourselves to surjective homomorphisms . If is a group, then and there exists such a system by the preconditions. Thus, we can assume that there is a letter such that is not a unit. Let . By induction the restriction
[TABLE]
admits a Parikh-reducing Church-Rosser system . Consider the set
[TABLE]
This is a prefix code and will be considered as a new alphabet. Let be the homomorphism to the local divisor at induced via . We have and and thus, by induction, there exists a Parikh-reducing Church-Rosser system of finite index, such that factorizes through . In particular, we have for a rule . We show that . For this let and . It holds
[TABLE]
Hence, the rule is -invariant. We set
[TABLE]
The system has the required properties by Lemma 3.1.
2. The statement is clear if is a group. Otherwise, the construction above is applied. By induction we may assume that and Lemma 3.1 implies that . ∎
A direct combination of Theorem 3.2 and Theorem 4.1 yields the following corollary.
Corollary 4.2.
Let be a monoid and be a homomorphism. Then there exists a Parikh-reducing Church-Rosser system such that factorizes through and . In particular, every language recognized by is given as a finite union .
In particular, Theorem 4.1 shows that one can control the groups in the Church-Rosser representation. However, in general one may not preserve other properties, for instance, commutativity.
Proposition 4.3.
Let be the homomorphism mapping each letter to the generator of . If , there is no abelian Church-Rosser representation of .
Proof.
Assume that there exists a Church-Rosser system of finite index such that is abelian and there exists a homomorphism with . Let be letters such that . Since factorizes through , we have for every rule and it holds in . Since is abelian, we obtain in . In particular, and must be a group. Let be the order of and . Then holds in and thus there must be an irreducible word with . By the argument above, there exists a number such that . Thus, either or , which contradicts the definition of . ∎
5 Complexity of Church-Rosser systems
In this section we analyze the size of a Church-Rosser representation as constructed by Theorem 4.1 and Theorem 3.2. We will restrict our analysis to the construction of the Parikh-reducing Church-Rosser representation. Similar calculations can be made for the analysis of the size of the Church-Rosser system.
Before we prove upper bounds for the size of the constructed Church-Rosser systems, we reconsider the construction. Our constructions used Lemma 3.1 as the basic building block. Let be a homomorphism. For and a system one needs a system for the alphabet . Now, unlike in the general case, we are able to reduce the alphabet itself by exploiting the structure of the alphabet. Let with and . By the pigeonhole principle there exist such that and . Thus, we may introduce the subword-reducing rule (subword-reducing seen as a rule over , not over ). If is reducible in , reduce it further in . Repeating this process yields a new alphabet for which is a subset of and therefore, if , has at most elements. One can check that the proofs of Theorem 4.1 and Theorem 3.2 also work when this reduction technique for the alphabet is added. We refrained from directly adding it to the theorems, as they are already quite technical.
Proposition 5.1.
Let be a homomorphism in , and , then there exists a Parikh-reducing Church-Rosser system such that factorizes through and
[TABLE]
Proof.
Let be the Parikh-reducing Church-Rosser system constructed using Theorem 3.2 and the reduction technique described above. Lemma 3.1 shows that for it holds
[TABLE]
where . In the case of Theorem 3.2, is constructed inductively whereas is constructed directly. By Lemma 3.6, every irreducible word in has length at most and therefore . The construction of in the proof of Lemma 3.6 shows that whereas . Since we obtain
[TABLE]
Using the alphabet reduction technique, we can assume . Note that does not yield another exponential jump. A straightforward calculation yields the existence of a constant such that
[TABLE]
Now let denote the smallest size of a Parikh-reducing Church-Rosser representation of and set
[TABLE]
to be the complexity over all possible homomorphisms with and . We have seen that the recursion
[TABLE]
holds and show inductively using this recursion. Note that and thus the inequality is true in the base case . Also and therefore we assume . For and it holds
[TABLE]
The last inequality holds since
[TABLE]
The triple exponential upper bound given by Proposition 5.1 seems huge; however, there is already a single exponential lower bound which is fairly easy to see. The lower bound comes from the fact that Church-Rosser systems cannot directly represent group identities which preserve length, such as commutation.
Proposition 5.2.
For every there exists a homomorphism into an abelian group of size such that for every length-reducing Church-Rosser system which factorizes through all words of length smaller than are irreducible, that is, . In particular, if :
[TABLE]
Proof.
Consider the cyclic group of order and the homomorphism which maps all letters to the same generator of . Let be a length-reducing Church-Rosser system which factorizes through . We show that every word of length less than is irreducible in . Let be a word with . Assume that for some word . Since is length-reducing, . However, implies . Since the order of is , this is a contradiction to and must be irreducible. ∎
Note that this proof does not use the Church-Rosser property and thus one could expect a larger size of the Church-Rosser representation.
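The counting behind this lower bound can be checked numerically. The sketch below assumes, as in the proof, that every letter is mapped to the same generator of a cyclic group, so the image of a word is its length modulo the group order. The names `n` (group order), `k` (alphabet size) and both function names are illustrative assumptions, since the symbols were not recoverable from the statement above.

```python
def min_rule_left_length(n):
    """Smallest length l for which some strictly shorter length r has the same image.

    Every letter maps to the generator 1 of Z/nZ, so a word of length l has
    image l mod n. A length-reducing rule l -> r needs r < l and r ≡ l (mod n).
    """
    l = 1
    while not any(r % n == l % n for r in range(l)):
        l += 1
    return l


def words_shorter_than(n, k):
    """Number of words of length less than n over an alphabet with k letters."""
    return sum(k ** i for i in range(n))


for n in (2, 3, 5):
    # No left-hand side can be shorter than n, so all shorter words are irreducible.
    print(n, min_rule_left_length(n), words_shorter_than(n, 2))
```

For an alphabet with at least two letters, the number of words of length less than `n` grows exponentially in `n`, which is the source of the single exponential lower bound.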
Example 5.3.
Niemann and Waldmann constructed an explicit Parikh-reducing system for the case with for all [NW02, Nie02]. Their system is given by for some arbitrary order on . The irreducible elements in are exactly the sequences which are first strictly increasing and then strictly decreasing, that is
[TABLE]
This yields which is significantly larger than the lower bound given in Proposition 5.2.
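To get a feeling for the number of such sequences, one can enumerate them by brute force for small alphabets. The sketch below rests on one reading of the description: a word counts if it is strictly increasing up to a single peak and strictly decreasing afterwards, and the empty word is included. The alphabet {0, ..., k-1} with its natural order, the parameter `k`, and the function names are assumptions made for illustration and are not taken from [NW02, Nie02].

```python
from itertools import product


def is_unimodal(word):
    """True if word is strictly increasing up to a peak, then strictly decreasing."""
    i = 0
    while i + 1 < len(word) and word[i] < word[i + 1]:
        i += 1
    while i + 1 < len(word) and word[i] > word[i + 1]:
        i += 1
    return i + 1 >= len(word)


def count_unimodal(k):
    """Count such words over an ordered alphabet with k >= 1 letters."""
    letters = range(k)
    total = 0
    # The peak occurs once and every other letter at most twice,
    # so no such word is longer than 2k - 1.
    for length in range(2 * k):
        total += sum(1 for w in product(letters, repeat=length) if is_unimodal(w))
    return total


for k in (1, 2, 3):
    print(k, count_unimodal(k))
```

Under this reading, each letter below the peak may occur on the left, on the right, on both sides, or not at all, so the count grows like 4^k in the alphabet size.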
In the monoid case, the minimal size of a Church-Rosser representation is bounded by a quadruple exponential function. This increase in complexity, compared to the group case, comes from the fact that, unlike in the group case, the system is constructed by induction. However, this is also the reason that the alphabet reduction technique is even more powerful in this case. Consider the function given by , and for . This function gives an upper bound for the maximal size of a Church-Rosser representation of a monoid of size and an alphabet of size without any optimization. Consider further the hyperoperation function , and . (The notation comes from Ackermann, since the function is a modified Ackermann function.) For fixed , the function is primitive recursive; however, the two-variable function grows faster than any primitive recursive function, see e.g. [DW83]. An induction shows that for . Hence, without the alphabet reduction the recursive formula would yield a non-primitive recursive function.
Proposition 5.4.
Let be a homomorphism in , and . Then there exists a Parikh-reducing Church-Rosser system such that factorizes through and
[TABLE]
Proof.
If , we know that there exists such a system with by Proposition 5.1. If , then there exists a system such that . In the other case we will use the local divisor construction of Theorem 4.1. Note that by the alphabet reduction technique we may assume that .
Let denote the smallest size of a Parikh-reducing Church-Rosser representation of and set
[TABLE]
to be the complexity over all possible homomorphisms with and .
The base cases are or is a group. For there exists a system of size . In all other cases we have the following recursion formula for :
[TABLE]
Note that since is not a group. Choose such that for all base cases. This is possible since the group case is in . We show that
[TABLE]
in general. Inductively, it holds
[TABLE]
The last inequality holds because for
[TABLE]
and thus .
6 Conclusion
In this paper we introduced the notion of Parikh-reducing Church-Rosser representations. We were able to construct such representations in the case of languages in and for group languages over a two-element alphabet. Furthermore, we studied algebraic properties of such representations and the complexity of the corresponding systems. Several questions remain open as future work. Most importantly, does there exist a finite Parikh-reducing Church-Rosser representation for every homomorphism into a finite group? Note that this already implies the case for every finite monoid by Theorem 4.1. Another interesting open question is which algebraic properties can be preserved by Church-Rosser representations. For example, it seems unlikely that every homomorphism into a finite group has a Church-Rosser representation which is a group again, although it may happen in some special cases. Additionally, there is a huge gap between our lower and upper bounds for the complexity. Therefore it is interesting whether there are constructions for Church-Rosser representations which yield a better upper bound and what a good lower bound for the size of a Church-Rosser representation is.
References
- [AK16] Jorge Almeida and Ondřej Klíma. On the irreducibility of pseudovarieties of semigroups. Journal of Pure and Applied Algebra, 220(4):1517–1524, 2016.
- [BO93] Ron Book and Friedrich Otto. String-Rewriting Systems. Springer-Verlag, 1993.
- [DK15] Volker Diekert and Manfred Kufleitner. A survey on the local divisor technique. Theoretical Computer Science, 610:13–23, 2015.
- [DKRW15] Volker Diekert, Manfred Kufleitner, Klaus Reinhardt, and Tobias Walter. Regular languages are Church-Rosser congruential. J. ACM, 62:39:1–39:20, November 2015.
- [DKW12] Volker Diekert, Manfred Kufleitner, and Pascal Weil. Star-free languages are Church-Rosser congruential. Theoretical Computer Science, 454:129–135, 2012.
- [DW83] Martin D. Davis and Elaine J. Weyuker. Computability, Complexity, and Languages. Academic Press, 1983.
- [DW16] Volker Diekert and Tobias Walter. Characterizing classes of regular languages using prefix codes of bounded synchronization delay. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), Leibniz International Proceedings in Informatics (LIPIcs), pages 129:1–129:13, 2016.
- [Eil76] Samuel Eilenberg. Automata, Languages, and Machines, volume B. Academic Press, New York and London, 1976.
