Proving that a Tree Language is not First-Order Definable

Martin Beaudry

arXiv:1812.01674·cs.FL·December 6, 2018


TL;DR

This paper investigates the algebraic properties of tree languages that are not definable by first-order logic with ancestor predicate, providing recursive proofs of nondefinability and extending algebraic concepts.

Contribution

It introduces a recursive proof method for non-definability of tree languages and extends algebraic notions like aperiodicity within forest algebra frameworks.

Findings

1. Recursive proofs exist for all non-definable languages.

2. Extended algebraic structures generalize existing notions like aperiodicity.

3. A new algebra of mappings aids in analyzing non-definability.


Equations

$p\equiv_{\tau,\pi}q\Leftrightarrow p=q\lor(p\geq\tau\land q\geq\tau\land(p-q)\equiv 0\pmod{\pi});$

$\lambda(s^{\alpha^{n}_{\tau,\pi}},x)=\langle\lambda(s,x),\alpha^{n}_{\tau,\pi}(\Delta(s,x)),\alpha^{n}_{\tau,\pi}(\nabla(s,x))\rangle;$

$\gamma[m;h_{1},\dots,h_{N-1}]:K\rightarrow K$ by $\gamma[m;h_{1},\dots,h_{N-1}](k)=\gamma[m](h_{1},\dots,h_{N-1},k).$

$\gamma[M\underset{Z}{\cdot}M^{\prime};h_{1},\dots,h_{N-1}]=\gamma[M;h_{1},\dots,h_{N-1}]\circ\gamma[M^{\prime};h_{1},\dots,h_{N-1}]$

and $\gamma[M\underset{Z}{\cdot}M^{\prime}](h_{1},\dots,h_{N})=\gamma[M](h_{1},\dots,h_{N-1},\gamma[M^{\prime}](h_{1},\dots,h_{N})).$

$\alpha^{1}_{\tau,\pi}[m](\xi)=\alpha^{1}_{\tau,\pi}(m)+\sum_{x\in ports(m)}\xi(\nu(x))=\sum_{a\in A}p_{1}(m)_{a}+\sum_{b\in B}\xi_{1}(b)\cdot p_{1}(m)_{b}.$

$\breve{\xi}(\Delta(m,x))\approx^{n-1}_{\tau,\pi}\breve{\xi}(\Delta(m^{\prime},x^{\prime}))$ and $\breve{\xi}(\nabla(m,x))\approx^{n-1}_{\tau,\pi}\breve{\xi}(\nabla(m^{\prime},x^{\prime})),$

$\zeta:H\rightarrow H$, $\zeta(k)=\alpha[m](h_{1},\dots,h_{N-1},k).$

$\alpha[m^{(\theta,Z)}](h_{1},\dots,h_{N})=\zeta^{\theta-1}(g).$

$\lambda(m^{\alpha},x)=\langle\lambda(m,x),\alpha(\Delta(m,x)),\alpha(\nabla^{-}(m,x))(\square+\underset{\mu(x)}{\otimes}\alpha(\Delta^{+}(m,x)))\rangle.$

$\lambda((m\underset{Z}{\cdot}n)^{\alpha},x)=\langle\lambda(m,x),\alpha(\Delta(m,x)\underset{Z}{\cdot}n),\alpha(\nabla^{-}(m,x)\underset{Z}{\cdot}n)(\square+\underset{\mu(x)}{\otimes}\alpha(\Delta^{+}(m,x)\underset{Z}{\cdot}n))\rangle$

$\lambda((m\underset{Z}{\cdot}n)^{\alpha},y)=\langle\lambda(n,y),\alpha(\Delta(n,y)),\alpha(m\underset{Z}{\cdot}\nabla^{-}(n,y))(\square+\underset{\mu(y)}{\otimes}\alpha(\Delta^{+}(n,y)))\rangle,$

$\gamma_{\%}(m)+\gamma_{\%}(m^{\prime})$

$\alpha_{\%}(m)\subseteq\alpha_{\%}(m^{\prime})$

$p_{1}(m)_{b}=\sum_{c\in C_{b}}p_{1}(\breve{\psi}(m))_{c},$

$\alpha^{n}_{\%}(m)=\alpha^{n}_{\%}(m^{\prime})$

$\alpha^{n-1}_{\tau,\pi}(\Delta(\breve{\psi}(m),x))=\alpha^{n-1}_{\tau,\pi}(\Delta(\breve{\psi}^{\prime}(m^{\prime}),x^{\prime}))$ and $\alpha^{n-1}_{\tau,\pi}(\nabla(\breve{\psi}(m),x))=\alpha^{n-1}_{\tau,\pi}(\nabla(\breve{\psi}^{\prime}(m^{\prime}),x^{\prime}));$

$\Delta(m,x)\approx^{n}_{\tau,\pi}\Delta(m^{\prime},x^{\prime})$ and $\nabla(m,x)\approx^{n}_{\tau,\pi}\nabla(m^{\prime},x^{\prime}),$

$p_{1}(m)_{b}=\sum_{c\in C_{b}}p_{1}(\breve{\psi}(m))_{c}.$

$\{a\square\mid(\varphi(a\square))g=g\}\cup\{\square+t\mid t\in L_{g}^{pathhead}\}\cup\bigcup_{h\in G_{X}}\{\square+t\mid t\in L_{h}\},$

$\forall\langle h,v,j\rangle\in H^{n}_{\tau,\pi}\times V^{n}_{\tau,\pi}\times F:\ P[H^{n}_{\tau,\pi}](t)_{h,v,j}\equiv_{\tau,\pi}P[H^{n}_{\tau,\pi}](t^{\prime})_{h,v,j}.$

$t\,\langle\star H^{n}_{\tau,\pi}\rangle\,t^{\prime}\Leftrightarrow\breve{\mu}^{n}_{\tau,\pi}(m)\approx^{n+1}_{\tau,\pi}\breve{\mu}^{n}_{\tau,\pi}(m^{\prime})\land P[H^{n}_{\tau,\pi}](t)\equiv_{\tau,\pi}P[H^{n}_{\tau,\pi}](t^{\prime}).$

$\alpha^{n}_{\tau,\pi}(\Delta^{+}(s,x))=\alpha^{n}_{\tau,\pi}(\Delta(s,x))=\alpha^{n}_{\tau,\pi}(s(x))=\mu^{n}_{\tau,\pi}(x)$

$\alpha^{n}_{\tau,\pi}(\Delta(s,y))=\alpha^{n}_{\tau,\pi}(\Delta(\breve{\mu}^{n}_{\tau,\pi}(m),y))$ and $\alpha^{n}_{\tau,\pi}(\nabla(s,y))=\alpha^{n}_{\tau,\pi}(\nabla(\breve{\mu}^{n}_{\tau,\pi}(m),y))$

$\alpha^{n}_{\tau,\pi}(\Delta(s,y))=\alpha^{n}_{\tau,\pi}(\Delta(s^{(n)}_{h,j},y))=k$

$\alpha^{n}_{\tau,\pi}(\nabla(s,y))=\alpha^{n}_{\tau,\pi}(\nabla(s,x)\cdot\nabla(s^{(n)}_{h,j},y))=v\cdot\alpha^{n}_{\tau,\pi}(\nabla(s^{(n)}_{h,j},y))=vw,$

$\breve{\mu}^{n}_{\tau,\pi}(m)\approx^{n}_{\tau,\pi}\breve{\mu}^{n}_{\tau,\pi}(m^{\prime})\Rightarrow\breve{\mu}^{n-1}_{\tau,\pi}(m)\approx^{n-1}_{\tau,\pi}\breve{\mu}^{n-1}_{\tau,\pi}(m^{\prime})$

$P[H^{n-1}_{\tau,\pi}](t)_{h,v,j}=\sum_{\sigma(k)=h,\ \sigma(w)=v}P[H^{n}_{\tau,\pi}](t)_{k,w,j}.$

$H_{Tree}=\{h\in H^{n}_{\tau,\pi}:\iota(h)\in J\land(\alpha^{n}_{\tau,\pi})^{-1}(h)\subset T_{A}\}.$

$P[J^{\sigma,\rho}](t)_{v,J,j}=|\{x\in ports(m):\beta^{\sigma,\rho}(\nabla(m,x))=v,\ \nu(x)=J,\ \psi(x)=j\}|.$

$t\,\langle\star J^{\sigma,\rho}\rangle\,t^{\prime}\Leftrightarrow(m\leftrightarrow_{\sigma,\rho}m^{\prime})\land(\forall v\in U^{\sigma,\rho},\forall(J,j)\in D:P[J^{\sigma,\rho}](t)_{v,J,j}\equiv_{\tau,\pi}P[J^{\sigma,\rho}](t^{\prime})_{v,J,j}).$


Taxonomy

Topics: semigroups and automata theory · Computability, Logic, AI Algorithms · Advanced Algebra and Logic


Martin Beaudry Département d’informatique, Université de Sherbrooke, Sherbrooke (Qc) Canada, J1K 2R1, [email protected]. Work supported by NSERC of Canada.

Abstract

We explore from an algebraic viewpoint the properties of the tree languages definable with a first-order formula involving the ancestor predicate, using the description of these languages as those recognized by iterated block products of forest algebras defined from threshold- $\tau$ , period- $\pi$ counter monoids. Ehrenfeucht-Fraïssé games, i.e. proofs of non-definability, are infinite sequences of sets of forests, one for each level of the hierarchy of quantification levels that defines the corresponding variety of languages. A proof is recursive when the forests at a given level are built by inserting forests from the previous level at the ports of a suitable set of multicontexts. We show that a recursive proof exists for the syntactic algebra of every non-definable language. We also investigate certain types of uniform recursive proofs. For this purpose, we define from a forest algebra an “algebra of mappings” and an “extended algebra”, which we also use to redefine the notion of aperiodicity in a way that generalizes the existing ones.

**Keywords:** Tree languages, forest languages, monoids, finite algebras, first-order logic.

1 Introduction

Words and trees are used almost universally in Computer Science, and logical formalisms are among the most convenient tools for specifying these objects, or sets thereof. Automata constitute another class of tools, procedural in nature, widely used to define languages; underlying this formalism is a rich algebraic theory, through which further tools from other areas of Mathematics can be used to better understand the properties of word and tree languages. Moreover, the most significant classes of these languages happen to have descriptions in several formalisms. For example, the regular word languages are exactly those that are recognized by finite automata and monoids, and those that are definable by monadic second-order formulas with a unary predicate for each letter and a “left-of” binary positional predicate. Similarly the regular tree languages are simultaneously those that are definable by monadic second-order formulas with two positional predicates (“ancestor-of” and “next-sibling-of”) and those that are recognized by finite tree automata, as well as various sorts of algebraic structures (e.g. finite term algebras, finite forest algebras, finitary preclones).

In the same vein, the word languages definable with first-order logical formulas with the “left-of” predicate have several algebraic and combinatorial descriptions, see [16, 23, 12, 22, 29]. In particular, these languages are precisely those whose syntactic monoid is aperiodic; thanks to this property, whether a regular word language is first-order definable can be determined with a straightforward algorithm. In the world of tree languages, however, none of the definitions for aperiodicity of the syntactic algebra tried so far has managed to characterize precisely the first-order definable languages [11, 4], and the techniques invented to show that certain important subclasses of these languages are indeed decidable [1, 3, 2, 17, 18] did not seem to extend to the whole class.

Forest algebras combine two monoids (horizontal and vertical) in a way which makes it easy for researchers to apply techniques from the theory of monoids and word languages. An encouraging harvest of results has already been obtained with this tool [6, 3, 4]. In this paper, we look at a counterpart, in the world of trees and forests, to the description of the aperiodic monoids as the variety of monoids generated by iterated block products of semilattices [22]. This is a description of the first-order definable languages, developed in terms of the variety of finitary preclones [9] generated by iterated block products of preclones that count occurrences of node labels, regardless of the actual tree structure. Intuitively, a block product works as the combination of two tree automata where at every node $x$ , the second automaton, besides the label of $x$ , also reads the current state reached by the first automaton after reading the subtree rooted at $x$ (“below” $x$ ) and the outcome of the processing, by the first automaton, of the context of $x$ within the tree (“above” $x$ ).

The description of first-order definable languages developed in [10, 13] suggests that the threshold-$\tau$, period-$\pi$ numerical congruences $\equiv_{\tau,\pi}$ are a fundamental feature in the combinatorics of first-order definable languages. Consistently with this, we use the same kind of counting and the corresponding quotients, the counter monoids $\mathbb{N}_{\tau,\pi}=\mathbb{N}/{\equiv_{\tau,\pi}}$ (in this notation, the Boolean OR $U_{1}$ and the cyclic group $\mathbb{Z}_{\pi}$ are respectively $\mathbb{N}_{1,1}$ and $\mathbb{N}_{0,\pi}$). In our formalism, we denote by $\mathcal{O}\!\!\!\mathcal{D}_{M}$ the one-dimensional algebra where $M$ is the horizontal monoid, by $\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ the variety of forest algebras generated by $\mathcal{O}\!\!\!\mathcal{D}_{\mathbb{N}_{\tau,\pi}}$, by $\mathbf{**}^{n}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ the variety generated by iterated block products of $n$ algebras from $\mathcal{O}\!\!\!\mathcal{D}_{\mathbb{N}_{\tau,\pi}}$, and by $\mathbf{**}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ the closure of these varieties under join. Let $\mathsf{FO}[\prec]$ and $\mathsf{FOMod_{\pi}}[\prec]$ denote both the class of forest languages definable by first-order formulas built with the “ancestor” positional predicate and the usual quantifiers only (for $\mathsf{FO}[\prec]$) or the same with the $\exists^{\pi}_{i}$, $0\leq i<\pi$, modular quantifiers, and the varieties generated by their syntactic algebras. Using the formalism of finitary preclones they introduced in [8], Ésik and Weil have established in [9] the correspondences $\mathsf{FO}[\prec]=\mathbf{**}\langle\!\langle\mathbb{N}_{1,1}\rangle\!\rangle$ and $\mathsf{FOMod_{\pi}}[\prec]=\mathbf{**}\langle\!\langle\mathbb{N}_{1,\pi}\rangle\!\rangle$.

We explore two ways of defining from $\mathcal{G}=(G,W)$ another algebra, where multicontexts are the underlying objects. The algebra of mappings $\mathcal{G}_{\#}$ , and the “multivertical monoid” of $\mathcal{G}$ derived from it enable us to define notions of pumping and aperiodicity that generalize two of the known necessary conditions for membership in $\mathsf{FO}[\prec]$ , namely aperiodicity of the vertical monoid and the “absence of vertical confusion on uniform multicontexts” defined in [4]. The extended algebra $\mathcal{G}_{\%}=(G_{\%},W_{\%})$ , where $G_{\%}$ is the powerset of $G$ , makes explicit some properties of $\mathcal{G}$ that are not directly visible in $\mathcal{G}$ or $\mathcal{G}_{\#}$ . An example is described in Section 5.4: this is a language whose syntactic algebra has aperiodic vertical and multivertical monoids, but where $W_{\%}$ is divided by the group $\mathbb{Z}_{2}$ . The language is defined with a $\mathsf{FOMod_{2}}[\prec]$ formula that, among other things, counts the parity of the length of certain node-to-leaf paths; a pair of elements of $W_{\%}$ does precisely this counting.

An algebra $\mathcal{G}$ lies outside of the variety $\mathbf{**}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ if, and only if, there exists an infinite sequence of sets $\mathcal{S}^{(n)}$, one for each $n\geq 1$, of forests belonging to different languages recognized by $\mathcal{G}$, such that the elements of $\mathcal{S}^{(n)}$ cannot be told apart by any forest algebra in $\mathbf{**}^{n}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$. This sequence is usually described through an Ehrenfeucht-Fraïssé game. We call such a sequence a proof of non-membership in $\mathbf{**}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$. An Ehrenfeucht-Fraïssé game actually builds a recursive proof, where each forest of $\mathcal{S}^{(n+1)}$ is built by inserting copies of the elements of $\mathcal{S}^{(n)}$ at the ports of the corresponding element of a set $\mathcal{M}^{(n)}$ of multicontexts. Such a proof can be specified with an infinite sequence of such sets $\mathcal{M}^{(n+1)}$; we denote by $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ the proposition that states the existence of a set $\mathcal{M}^{(n+1)}$ that has the required properties; RC stands for “recursive construction”. We prove that $\mathcal{G}\not\in\mathbf{**}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ if, and only if, $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ holds for every $n\geq 1$, that is, every algebra that is not first-order has a recursive proof of non-membership. Next, we observe that in the existing proofs, the circuit $\mathcal{M}^{(n+1)}$ is either identical to $\mathcal{M}^{(n)}$, in which case each forest of $\mathcal{S}^{(n)}$ is built from copies of the same, finite set of multicontexts (“proof-by-copy”), or $\mathcal{M}^{(n+1)}$ is obtained by pumping a starting set of multicontexts (“proof-by-pumping”). 
The questions of the existence of a proof-by-copy and of a restricted form of proof-by-pumping are both recursively enumerable.

Section 2 contains background on forests, multicontexts and circuits, and on forest algebras and the varieties ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$. In Section 3, we define the algebras $\mathcal{G}_{\#}$ and $\mathcal{G}_{\%}$, and the related notions of pumping and aperiodicity. In Section 4, we prove that an algebra is outside ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ if, and only if, this can be asserted with a recursive proof; we then explore the notions of proof-by-copy and proof-by-pumping. In Section 5, we discuss in our formalism some typical examples of non-membership proofs. We conclude with some comments and open questions.

2 Definitions and Background

2.1 Forests, Multicontexts, and Circuits

We consistently work with a finite alphabet $A$, which we assume to always contain a neutral letter $e$, such that for every forest homomorphism $\gamma$, $\gamma(e\square)$ is the identity mapping. Let $B$ be another alphabet, disjoint from $A$. A multicontext $m$ over $(A,B)$ is a sequence of trees in which a subset of the leaves consists of ports, where every non-port node $y$ carries a label $\lambda(m,y)\in A$. We denote by $nodes(m)$ the set of all nodes in $m$, by $ports(m)$ the set of its ports and by $interior(m)$ the set of the non-port nodes. We work with multicontexts where each port either has a label $\nu(x)\in B$, or has several labels, each specified as a mapping from $ports(m)$ to a set that is disjoint from $A$. A forest is a multicontext without ports; a context in the usual sense is a multicontext with a unique port that carries the special label $\square\not\in(A\cup B)$, called its $\square$-port. Throughout the paper, this port is considered apart from the others. Given $x\in nodes(m)$, we denote by $\Delta(m,x)$ the multicontext consisting of all subtrees of $m$ rooted at the sons of $x$. The subtree rooted at $x$, i.e. $\Delta(m,x)$ plus the node $x$, is denoted $\Delta^{+}(m,x)$. The context of $x$ within $m$, with notation $\nabla(m,x)$, is built from $m$ and $x$ by replacing $\Delta^{+}(m,x)$ with a $\square$-port. The ancestors of this port constitute the trunk of the context. If we deal with a set of multicontexts $M$ instead of an individual $m$, we use the notations $nodes(M)$, $\lambda(M,x)$, $\Delta(M,x)$, etc. The sets of all forests and of all contexts over $A$ are respectively denoted $\mathbb{H}_{A}$ and $\mathbb{V}_{A}$. We use the notations $\mathbb{M}_{A,B}$ and $\mathbb{C}_{A,B}$, respectively, for the set of all multicontexts over $(A,B)$ and for the set of all multicontexts with a $\square$-port (the contexts-in-multicontexts, so to speak). 
We use the standard representation for individual forests or multicontexts, where nodes are listed in preorder and where concatenation and $+$ represent the father-son relation and horizontal addition, respectively. For example, $a(b+c)$ is a tree with a root labelled $a$ and two sons labelled $b$ and $c$, while $ab+c$ is a forest of two trees, where nodes $a$ and $c$ are roots, and nodes $b$ and $c$ are leaves.

Inserting $t$ in a context $s$ consists in replacing the $\square$ -port of $s$ with a copy of $t$ ; the resulting forest is denoted $s{\cdot}t$ , or $st$ . Insertion in multicontexts is done here either on a wholesale basis, i.e. something is inserted at every port, or on a selective basis, when insertion occurs at a pre-specified set of ports. The latter method is defined in Section 3; the former is associated with circuits and the construction of witnesses, as follows.
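As an illustration only, the operations $\Delta$, $\Delta^{+}$, $\nabla$ and the insertion $s{\cdot}t$ can be sketched in Python on a nested-tuple encoding of forests. The encoding, the function names and the way a node is addressed (a path of child indices) are assumptions made for this sketch and are not part of the formalism above.

```python
PORT = "□"  # hypothetical marker used here for the □-port


def tree(label, *children):
    """Build a tree: a label plus a (possibly empty) forest of children."""
    return (label, tuple(children))


def delta(m, path):
    """Δ(m, x): the forest of subtrees rooted at the sons of x."""
    i, *rest = path
    children = m[i][1]
    return children if not rest else delta(children, rest)


def delta_plus(m, path):
    """Δ⁺(m, x): the subtree rooted at x, returned as a one-tree forest."""
    i, *rest = path
    return (m[i],) if not rest else delta_plus(m[i][1], rest)


def nabla(m, path):
    """∇(m, x): m with Δ⁺(m, x) replaced by a □-port."""
    i, *rest = path
    label, children = m[i]
    new = tree(PORT) if not rest else (label, nabla(children, rest))
    return m[:i] + (new,) + m[i + 1:]


def insert(s, t):
    """s·t: replace the □-port of the context s with a copy of the forest t."""
    out = ()
    for label, children in s:
        out += t if label == PORT else ((label, insert(children, t)),)
    return out


a_bc = (tree("a", tree("b"), tree("c")),)   # the tree a(b+c)
ab_c = (tree("a", tree("b")), tree("c"))    # the forest ab+c
x = (0, 1)                                  # the node labelled c inside a(b+c)
```

With this encoding, `insert(nabla(a_bc, x), delta_plus(a_bc, x))` rebuilds `a_bc`, which mirrors the fact that a forest is recovered by inserting $\Delta^{+}(m,x)$ in $\nabla(m,x)$.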

Let $A$, $B$ and $C$ be three sets. A circuit over $(A,B,C)$ is a set $M$ with an element $m_{c}$ for every $c\in C$; this component is a multicontext $m_{c}\in\mathbb{M}_{A,B}$. We can regard $M$ as having an input wire for every $b\in B$, an output wire for every $c\in C$, and $m_{c}$ is the result of unraveling into a tree all those nodes of $M$ from which the output wire $c$ is accessible. A set $S$ of forests over $(A,B)$ is defined similarly, with an element $s_{b}\in\mathbb{H}_{A}$ for every $b\in B$. The insertion of $S$ in a circuit $M$ over $(A,B,C)$ consists in inserting a copy of the forest $s_{\nu(x)}$ at every $x\in ports(M)$; the result is a set of forests over $(A,C)$, denoted $M{\cdot}S$. If $M$ and $M^{\prime}$ are circuits over $(A,B,B)$ then inserting $M^{\prime}$ in $M$ builds a circuit $M{\cdot}M^{\prime}$ over $(A,B,B)$. It can be verified, using standard methods, that this operation is associative.
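The wholesale insertion $M{\cdot}S$ can be sketched as follows, under the assumption that a circuit is stored as a Python dictionary from output wires $c$ to multicontexts $m_{c}$ and that a port is a small `Port` object carrying its label $\nu(x)$. The class `Port`, the function names and the sample circuits are hypothetical and serve only to check associativity on one instance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    name: str  # the label ν(x) ∈ B of the port


def insert_all(m, S):
    """Insert a copy of S[ν(x)] at every port x of the multicontext m."""
    out = []
    for node in m:
        if isinstance(node, Port):
            out.extend(S[node.name])
        else:
            label, children = node
            out.append((label, insert_all(children, S)))
    return tuple(out)


def insert_circuit(M, S):
    """M·S: apply the insertion to every component m_c of the circuit M."""
    return {c: insert_all(m_c, S) for c, m_c in M.items()}


# Two hypothetical circuits over (A, B, B) with B = {b1, b2}, and a set S of forests.
M = {"b1": (("a", (Port("b1"), Port("b2"))),),
     "b2": (Port("b1"), ("c", ()))}
M2 = {"b1": (("d", (Port("b2"),)),),
      "b2": (Port("b2"), Port("b1"))}
S = {"b1": (("x", ()),), "b2": (("y", ()),)}

left = insert_circuit(insert_circuit(M, M2), S)    # (M·M′)·S
right = insert_circuit(M, insert_circuit(M2, S))   # M·(M′·S)
```

Because the same function handles forests and multicontexts, composing circuits and inserting forests are one operation here; on this instance `left` and `right` coincide, as associativity requires.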

2.2 Forest algebras

The reader is assumed to be familiar with the notions of semigroups and monoids, and their relations with regular languages, word congruences and monoid homomorphisms (see [14, 16]). Two types of notations are used for the monoids discussed in the article. There is an additive, or “horizontal” notation where the identity and operation are denoted [math] and $+$, respectively, although this in no way implies that the latter is commutative. In the multiplicative, or “vertical” notation, the neutral element is denoted $\varepsilon$ and the operation is written with $\cdot$ or by concatenation of the arguments.

A *transformation* of a set $S$ is a mapping $S\rightarrow S$, i.e. an element of the monoid $S^{S}$. A *translation* in a monoid $M$ (with the additive notation) is a mapping $[u+\varepsilon+v]:M\rightarrow M$, where $u,v\in M$, defined by $s\mapsto u+s+v$. If $M$ is commutative, then the translations are of the form $[u+\varepsilon]$. The set $\mathcal{M}(M)=\{[u+\varepsilon+v]\ |\ u,v\in M\}$ with the composition of functions is the translation monoid of $M$.
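To make the translation monoid concrete, the following sketch enumerates $\mathcal{M}(M)$ for a small non-commutative monoid and checks that it is closed under composition. The monoid used (an identity `"0"` adjoined to a two-element left-zero semigroup) is a hypothetical example chosen for this sketch; it does not appear in the paper.

```python
from itertools import product

ELEMS = ("0", "x", "y")  # "0" is the identity of this hypothetical monoid


def add(u, v):
    """u + v in the example monoid: the leftmost non-identity argument wins."""
    return v if u == "0" else u


def translation(u, v):
    """[u + ε + v], stored as the tuple of images of ELEMS under s ↦ u + s + v."""
    return tuple(add(add(u, s), v) for s in ELEMS)


def compose(f, g):
    """(f ∘ g)(s) = f(g(s)) for transformations stored as tuples of images."""
    return tuple(f[ELEMS.index(g[i])] for i in range(len(ELEMS)))


translations = {translation(u, v) for u, v in product(ELEMS, repeat=2)}
identity = translation("0", "0")
closed = all(compose(f, g) in translations
             for f in translations for g in translations)
```

Here nine pairs $(u,v)$ yield only five distinct translations, which illustrates that $\mathcal{M}(M)$ is a set of mappings rather than a set of pairs.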

Definition 2.1

A forest algebra is a pair $\mathcal{H}=(H,V)$ where $(H,+)$ is a monoid and $(V,\cdot)$ is a submonoid of $H^{H}$ which contains $\mathcal{M}(H)$ .

Monoids $H$ and $V$ are the horizontal and vertical monoids of $\mathcal{H}$ , respectively. Because $V$ is a submonoid of $H^{H}$ , its action on $H$ is faithful. Forest algebras were introduced in [6] as pairs of abstract monoids; in that case, faithfulness has to be specified in the definition.

A forest algebra homomorphism from $\mathcal{H}=(H,V)$ to $\mathcal{G}=(G,W)$ is a pair of mappings $\alpha=(\alpha_{H},\alpha_{V})$ where $\alpha_{H}$ and $\alpha_{V}$ are monoid homomorphisms $H\rightarrow G$ and $V\rightarrow W$ , respectively, and such that $\alpha_{H}\circ w=\alpha_{V}(w)\circ\alpha_{H}$ for every $w\in V$ . The free forest algebra over $A$ is $\mathbb{F}_{A}=(\mathbb{H}_{A},\mathbb{V}_{A})$ ; since it is generated from $\{a\square\ |\ a\in A\}$ , a homomorphism $\alpha:\mathbb{F}_{A}\rightarrow\mathcal{H}$ is completely specified once $\mathcal{H}$ and every $\alpha_{V}(a\square)$ , $a\in A$ , are known. A forest congruence in $\mathbb{F}_{A}$ is a pair of equivalence relations, both denoted by $\approx$ , such that in $\mathbb{H}_{A}$ $s\approx t$ iff $ps\approx pt$ for every context $p$ , and in $\mathbb{V}_{A}$ $p\approx q$ iff $pt\approx qt$ for every forest $t$ . A congruence $\approx$ refines another congruence $\simeq$ over the same domain, when $x\approx x^{\prime}\Rightarrow x\simeq x^{\prime}$ for all $x,x^{\prime}$ . A homomorphism $\alpha$ defines its nuclear congruence: $s\approx_{\alpha}t\Leftrightarrow\alpha(s)=\alpha(t)$ , and conversely a congruence $\approx$ defines a homomorphism from $\mathbb{F}_{A}$ to $\mathbb{F}_{A}/\approx$ . A set $L\subset\mathbb{H}_{A}$ is recognized by $\mathcal{H}$ if there exist a homomorphism $\alpha:\mathbb{F}_{A}\rightarrow\mathcal{H}$ and a subset $F\subset H$ such that for all $t\in\mathbb{H}_{A}$ , $t\in L\Leftrightarrow\alpha(t)\in F$ . A context language $K\subset\mathbb{V}_{A}$ is recognized in the same way, with an accepting set $P\subset V$ . The syntactic congruences of these languages are refined by $\approx_{\alpha}$ .
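Since a homomorphism out of $\mathbb{F}_{A}$ is fixed by the images $\alpha_{V}(a\square)$, the image of a forest can be computed bottom-up. The sketch below does this for a hypothetical three-element horizontal monoid that recognizes the language “some node labelled $a$ has a descendant labelled $b$”; the algebra, the accepting set and all names are assumptions chosen for illustration.

```python
def evaluate(forest, plus, zero, action):
    """α_H(forest): sum over the trees of α_V(a□) applied to the image of the children."""
    h = zero
    for label, children in forest:
        h = plus(h, action[label](evaluate(children, plus, zero, action)))
    return h


# Hypothetical algebra. H = {0, 1, 2} with + = max and identity 0, where
# 0 = no b seen, 1 = some b but no a above a b, 2 = some a has a b below it.
ACTION = {
    "a": lambda h: 2 if h >= 1 else 0,  # α_V(a□)
    "b": lambda h: max(h, 1),           # α_V(b□)
    "c": lambda h: h,                   # α_V(c□) acts as the identity
}
ACCEPT = {2}                            # the accepting set F ⊂ H


def recognized(forest):
    """t ∈ L  ⇔  α(t) ∈ F."""
    return evaluate(forest, max, 0, ACTION) in ACCEPT


t1 = (("a", (("c", (("b", ()),)),)),)   # a(c(b)): accepted
t2 = (("b", (("a", ()),)),)             # b(a): rejected
t3 = (("a", ()), ("b", ()))             # a + b: rejected
```

The function `evaluate` never inspects anything but the label of a node and the image of its children, which is exactly what it means for $\alpha$ to be determined by the mappings $\alpha_{V}(a\square)$.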

A variety of forest algebras is a class of finite forest algebras closed under finite direct product and division. Given forest algebras $\mathcal{G}=(G,W)$ and $\mathcal{H}=(H,V)$, we say that $\mathcal{G}$ is a subalgebra of $\mathcal{H}$ iff $G\subseteq H$ and $W\subseteq V$, and that it divides $\mathcal{H}$, with notation $\mathcal{G}\prec\mathcal{H}$, if it is the homomorphic image of a subalgebra of $\mathcal{H}$. A variety of forest languages is formally defined as a mapping $\mathsf{W}$ such that, for every alphabet $A$, $\mathsf{W}(A)$ is closed under finite boolean operations, inverse homomorphism of free algebras and context quotients. With $L$ a language and $p\in\mathbb{V}_{A}$, the context quotient of $L$ by $p$ is the set $p^{-1}L=\{\ s\ |\ ps\in L\}$; a forest algebra which recognizes $L$ also recognizes $p^{-1}L$. The lattices of varieties of forest algebras and of varieties of forest languages are isomorphic [5].

Let $\mathcal{G}=(G,W)$ be a forest algebra and let $g,h\in G$. An element $h$ is accessible from $g$ when $h=wg$ for some $w\in W$. A set is strongly connected when its elements are mutually accessible; a strongly connected component of $G$ is a subset that is maximal for this property. Let $K$ be such a set: we define from it the set $W^{-1}K=\{g\in G:\exists\,w\in W,\,wg\in K\}$ of all elements from which $K$ is accessible, and its complement $I(K)$, which is an ideal, that is, a subset of $G$ closed under the action of every element of $W$. Let $K\subset G$. The leaf-completion of a multicontext $m$ through a mapping $\chi:ports(m)\rightarrow K$ is the forest $\breve{\chi}(m)\in\mathbb{H}_{A\cup K}$, obtained by labeling every port $x$ with $\chi(x)$. Consistently with this, the leaf-extension of a homomorphism $\gamma:\mathbb{F}_{A}\rightarrow\mathcal{G}$ to $\mathbb{H}_{A\cup K}$ is built by defining $\gamma(t_{k})=k$, for every $k\in K$ and one-node tree $t_{k}$ with label $k$. Then $\gamma\circ\breve{\chi}(m)$ is the image by $\gamma$ of the leaf-completion of $m$ through $\chi$.
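Accessibility, strongly connected components, $W^{-1}K$ and $I(K)$ can all be computed by a plain graph search. The sketch below assumes that accessibility is read as “$h=wg$ for some $w$”, in agreement with the definition of $W^{-1}K$, and that $W$ is given by a list of generating transformations stored as dictionaries; the example algebra and all function names are hypothetical.

```python
def reachable(g, W):
    """All elements wg with w in the monoid generated by W (identity included)."""
    seen, stack = {g}, [g]
    while stack:
        h = stack.pop()
        for w in W:
            k = w[h]
            if k not in seen:
                seen.add(k)
                stack.append(k)
    return seen


def components(G, W):
    """Strongly connected components: maximal sets of mutually accessible elements."""
    reach = {g: reachable(g, W) for g in G}
    comps = []
    for g in G:
        comp = frozenset(h for h in G if h in reach[g] and g in reach[h])
        if comp not in comps:
            comps.append(comp)
    return comps


def preimage(K, G, W):
    """W⁻¹K: the elements of G from which K is accessible."""
    return {g for g in G if reachable(g, W) & K}


# Hypothetical example: G = {0, 1, 2}, W generated by "add one, saturating at 2".
G = [0, 1, 2]
W = [{0: 1, 1: 2, 2: 2}]
K = {1}
W_inv_K = preimage(K, G, W)
I_K = set(G) - W_inv_K                    # the complement I(K)
is_ideal = all(w[g] in I_K for g in I_K for w in W)
```

In this example every component is a singleton, $W^{-1}K=\{0,1\}$ and $I(K)=\{2\}$, and `is_ideal` confirms that $I(K)$ is closed under the action of $W$.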

2.3 Block product congruences

It is known that the equivalence relation over $\mathbb{F}_{A}$ where every class consists of all forests that model the same set of formulas of quantifier depth $n$ is a forest congruence. A generalized version of this congruence is $\approx^{n}_{\tau,\pi}$, where $n,\tau,\pi\geq 1$ are integers, defined as follows:

$\bullet$

it is built around the threshold- $\tau$ , period- $\pi$ counting congruence over $\mathbb{N}$ , defined by

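$p\equiv_{\tau,\pi}q\ \Leftrightarrow\ p=q\ \lor\ (p\geq\tau\ \land\ q\geq\tau\ \land\ (p-q)\equiv 0\pmod{\pi});$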

the quotient monoid $\mathbb{N}/{\equiv_{\tau,\pi}}$ is denoted $\mathbb{N}_{\tau,\pi}$ ;

$\bullet$

given $s,t\in\mathbb{H}_{A}$ , we have $s\approx^{1}_{\tau,\pi}t$ if, and only if, for every $a\in A$ , the number of nodes with label $a$ in $s$ and in $t$ are congruent under $\equiv_{\tau,\pi}$ ; the quotient algebra $\mathbb{F}_{A}/{\approx^{1}_{\tau,\pi}}$ is denoted $\mathcal{H}^{1}_{\tau,\pi}$ ; the corresponding surjective homomorphism is $\alpha^{1}_{\tau,\pi}:\mathbb{F}_{A}\rightarrow\mathcal{H}^{1}_{\tau,\pi}$ ;

$\bullet$

for $n\geq 1$ , given that $\approx^{n}_{\tau,\pi}$ and the quotient algebra $\mathcal{H}^{n}_{\tau,\pi}=(H^{n}_{\tau,\pi},V^{n}_{\tau,\pi})$ are already known, we define a relabeling operation $s\mapsto s^{\alpha^{n}_{\tau,\pi}}$ which consists in replacing, at every node $x$ of $s$ , the label $\lambda(s,x)\in A$ with the triple

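$\lambda(s^{\alpha^{n}_{\tau,\pi}},x)=\langle\lambda(s,x),\ \alpha^{n}_{\tau,\pi}(\Delta(s,x)),\ \alpha^{n}_{\tau,\pi}(\nabla(s,x))\rangle;$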

this defines the relabeling alphabet $D^{\alpha^{n}_{\tau,\pi}}=A\times H^{n}_{\tau,\pi}\times V^{n}_{\tau,\pi}$; the same is done in a context $t\in\mathbb{V}_{A}$; however, the new label of $x$ is different depending on whether $x$ is on the trunk, so that $t^{\alpha^{n}_{\tau,\pi}}$ is a context over $E^{\alpha^{n}_{\tau,\pi}}=(A_{\text{off}}\cup A_{\text{trunk}})\times H^{n}_{\tau,\pi}\times V^{n}_{\tau,\pi}$, where $A_{\text{off}}$ and $A_{\text{trunk}}$ are disjoint copies of $A$;

$\bullet$

for $n\geq 1$ and $s,t\in\mathbb{H}_{A}$ , we have $s\approx^{n+1}_{\tau,\pi}t$ if, and only if $s\approx^{n}_{\tau,\pi}t$ and $s^{\alpha^{n}_{\tau,\pi}}\approx^{1}_{\tau,\pi}t^{\alpha^{n}_{\tau,\pi}}$ .
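The first level of this definition is easy to make executable. The sketch below implements the counting congruence $\equiv_{\tau,\pi}$, a canonical representative for each class of $\mathbb{N}_{\tau,\pi}$, and the test $s\approx^{1}_{\tau,\pi}t$ on forests encoded as nested tuples. The encoding and the function names are assumptions of this sketch, and the higher levels (which need the relabeling by $\alpha^{n}_{\tau,\pi}$) are deliberately left out.

```python
from collections import Counter


def equiv(p, q, tau, pi):
    """p ≡_{τ,π} q: equal, or both at least τ and congruent modulo π."""
    return p == q or (p >= tau and q >= tau and (p - q) % pi == 0)


def rep(p, tau, pi):
    """Canonical representative of the class of p in N_{τ,π}."""
    return p if p < tau else tau + (p - tau) % pi


def labels(forest):
    """Yield the labels of all nodes of a forest given as (label, children) pairs."""
    for label, children in forest:
        yield label
        yield from labels(children)


def signature1(forest, alphabet, tau, pi):
    """The class of a forest under ≈¹: label counts taken modulo ≡_{τ,π}."""
    counts = Counter(labels(forest))
    return tuple(rep(counts[a], tau, pi) for a in alphabet)


def approx1(s, t, alphabet, tau, pi):
    """s ≈¹_{τ,π} t."""
    return signature1(s, alphabet, tau, pi) == signature1(t, alphabet, tau, pi)


A = ("a", "b", "c")
a_bc = (("a", (("b", ()), ("c", ()))),)        # a(b+c)
flat = (("a", ()), ("b", ()), ("c", ()))       # a+b+c
aa = (("a", ()), ("a", ()))                    # a+a
one_a = (("a", ()),)                           # a
```

For instance, `approx1(aa, one_a, A, 1, 1)` holds because threshold one cannot separate one occurrence from two, while `approx1(aa, one_a, A, 2, 1)` fails. With $\tau=1,\pi=1$ the representatives are $\{0,1\}$, and with $\tau=0$ they are the residues modulo $\pi$.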

Example: we have $(a+\square)\approx^{1}_{\tau,\pi}a\square$ and $(a+\square)\not\approx^{2}_{\tau,\pi}a\square$ , which illustrates the distinction between trunk and off-trunk nodes.

The quotient algebra $\mathbb{F}_{A}/{\approx^{1}_{\tau,\pi}}$ is isomorphic to the one-dimensional algebra $\mathcal{O}\!\!\!\mathcal{D}_{\mathbb{N}_{\tau,\pi}^{|A|}}$. A one-dimensional forest algebra is a pair $(M,\mathcal{M}(M))$ such that for every homomorphism $\gamma:\mathbb{F}_{A}\rightarrow(M,\mathcal{M}(M))$ and every $a\in A$, there exists $m\in M$ such that $\gamma(a\square)=[m+\varepsilon]$. (Such algebras are also called flat algebras in previous works on the topic: the homomorphic image of a forest is the image in a monoid of a “flattened”, “one-dimensional” version of the forest. The wording is also a reference to the notion that an algebra $\mathcal{O}\!\!\!\mathcal{D}_{M}{\;{\scriptstyle{\square}}\;}\mathcal{O}\!\!\!\mathcal{D}_{N}$ recognizes forest languages that are “more two-dimensional” than those recognized by $\mathcal{O}\!\!\!\mathcal{D}_{M}$.) In such an algebra, the homomorphic image of a forest $t\in\mathbb{H}_{A}$ is independent of its structure, that is, the algebra only considers the string $\eta(t)$ of its node labels, given in a predetermined order (e.g. in preorder). Therefore, $(M,\mathcal{M}(M))$ associates to $\gamma$ a monoid homomorphism $\hat{\gamma}:A^{*}\rightarrow M$, such that $\gamma_{H}=\hat{\gamma}\circ\eta$. We denote by $\mathcal{O}\!\!\!\mathcal{D}_{M}$ the (unique) one-dimensional algebra built from $M$.
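The identity $\gamma_{H}=\hat{\gamma}\circ\eta$ can be checked on small instances. The sketch below takes $M=\mathbb{N}_{2,3}$, lets each letter $a$ act as $[m_{a}+\varepsilon]$, and compares the image of a forest with the image of its preorder label string. The parameters, the weights `m` and the function names are hypothetical choices made for this sketch.

```python
TAU, PI = 2, 3  # hypothetical threshold and period


def add(p, q):
    """Addition in N_{τ,π}, computed on canonical representatives."""
    s = p + q
    return s if s < TAU else TAU + (s - TAU) % PI


def gamma(forest, m):
    """γ_H(forest), where γ(a□) = [m_a + ε] maps h to m_a + h."""
    h = 0
    for label, children in forest:
        h = add(h, add(m[label], gamma(children, m)))
    return h


def preorder(forest):
    """η(forest): the list of node labels in preorder."""
    word = []
    for label, children in forest:
        word.append(label)
        word.extend(preorder(children))
    return word


def gamma_hat(word, m):
    """The monoid homomorphism A* → M induced by a ↦ m_a."""
    h = 0
    for a in word:
        h = add(h, m[a])
    return h


m = {"a": 1, "b": 2, "c": 4}                    # hypothetical images of the letters
a_bc = (("a", (("b", ()), ("c", ()))),)         # a(b+c)
ab_c = (("a", (("b", ()),)), ("c", ()))         # ab+c
flat = (("a", ()), ("b", ()), ("c", ()))        # a+b+c
```

The three forests have different shapes but the same labels, so `gamma` returns the same element of $\mathbb{N}_{2,3}$ for all of them, and that element equals `gamma_hat(preorder(t), m)`.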

The congruences $\approx^{n}_{\tau,\pi}$ can be defined algebraically, as follows. Let $\langle\mathbb{N}_{\tau,\pi}\rangle$ and ${{\mathbf{**}}^{1}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ denote respectively the variety of monoids generated by $\mathbb{N}_{\tau,\pi}$ and the variety of forest algebras generated by the algebras $\mathcal{O}\!\!\!\mathcal{D}_{M}$ where $M\in\langle\mathbb{N}_{\tau,\pi}\rangle$ . Then for every language $L\subseteq\mathbb{H}_{A}$ , its syntactic forest algebra $\mathcal{G}(L)$ belongs to ${{\mathbf{**}}^{1}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ if, and only if $\approx^{1}_{\tau,\pi}$ refines its syntactic congruence, or equivalently, iff $\mathcal{G}(L)$ divides $\mathcal{H}^{1}_{\tau,\pi}$ . Next, every algebra $\mathcal{H}^{n+1}_{\tau,\pi}=\mathbb{F}_{A}/{\approx^{n+1}_{\tau,\pi}}$ is a block product $\mathcal{H}^{n}_{\tau,\pi}\;{\scriptstyle{\square}}\;\mathcal{O}\!\!\!\mathcal{D}_{\mathbb{N}_{\tau,\pi}^{N}}$ , with $N\geq|D^{\alpha^{n}_{\tau,\pi}}\cup E^{\alpha^{n}_{\tau,\pi}}|$ . We use ${{\mathbf{**}}^{n+1}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ to denote the variety generated by block products of the form $\mathcal{G}\;{\scriptstyle{\square}}\;\mathcal{O}\!\!\!\mathcal{D}_{\mathbb{N}_{\tau,\pi}^{N}}$ with $\mathcal{G}\in{{\mathbf{**}}^{n}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ . Finally, ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle=\bigvee_{n\geq 1}{{\mathbf{**}}^{n}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ . We will make abundant use of the following.

Proposition 2.2

The following statements on a finite-index congruence $\simeq$ over $\mathbb{F}_{A}$ are equivalent: $\mathbb{F}_{A}/{\simeq}\in{{\mathbf{**}}^{n}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ ; $\mathbb{F}_{A}/{\simeq}\prec\mathcal{H}^{n}_{\tau,\pi}$ ; the congruence $\approx^{n}_{\tau,\pi}$ refines $\simeq$ . $\square$

Let $\mathsf{FO}[\prec]$ denote the variety of all forest languages definable with first-order logic formulas with the $\forall$ and $\exists$ quantifiers and the ‘ancestor’ positional predicate; for $q\geq 2$ , let $\mathsf{FOMod_{q}}[\prec]$ denote the variety defined in terms of the same sort of formulas, where now the $\exists^{q}_{i}$ , $0\leq i<q$ , modular quantifiers are also allowed. It was proved in [9] that the syntactic preclones of the languages in $\mathsf{FO}[\prec]$ generate the same variety as the iterated block products of preclones defined in terms of counting under threshold one (i.e. the monoid $U_{1}$ ); adding to the generating preclones those defined with counting under the congruence $\equiv_{0,\pi}$ yields a characterization for $\mathsf{FOMod_{\pi}}[\prec]$ . It can be verified that these equivalences translate into $\mathsf{FO}[\prec]=\bigvee_{\tau\geq 1}{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,1}\rangle\!\rangle$ and $\mathsf{FOMod_{\pi}}[\prec]=\bigvee_{\tau\geq 1}{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ .

Remark. Actually, $\mathsf{FO}[\prec]={{\mathbf{**}}}\langle\!\langle\mathbb{N}_{1,1}\rangle\!\rangle$ , where $\mathbb{N}_{1,1}=U_{1}$ is the Boolean OR monoid, and similarly $\mathsf{FOMod_{\pi}}[\prec]={{\mathbf{**}}}\langle\!\langle\mathbb{N}_{1,\pi}\rangle\!\rangle$ , so that working in terms of nontrivial thresholds $\tau$ is not mandatory. However, doing so makes it possible to follow more closely the counting-under-threshold that seems to be inherent to the construction of proofs of non-membership in $\mathsf{FO}[\prec]$ , and is reminiscent of the description of the first-order definable forest languages developed in [10, 13]. Note that a characterization $\mathsf{Mod_{\pi}}[\prec]={{\mathbf{**}}}\langle\!\langle\mathbb{Z}_{\pi}\rangle\!\rangle$ also exists, where $\mathbb{Z}_{\pi}$ is the cyclic group of order $\pi$ ; we put aside this special case in the current version of this paper.

3 Algebras for Multicontexts

Forest algebras were designed as tools to handle trees, forests, and contexts over $A$ . Dealing with multicontexts over $(A,B)$ as we do in this article demands that a suitable algebraic structure be developed to describe how a forest algebra works on them. A first approach consists, given a forest algebra $\mathcal{K}=(K,U)$ , in regarding a multicontext as a specification for a multivariate mapping from $K^{B}$ to $K$ . This defines the algebra of mappings $\mathcal{K}_{\#}$ ; it is used to define the notion of pumping, which underlies the construction of certain Ehrenfeucht-Fraïssé games, and to associate to $\mathcal{K}$ a threshold and a period that are consistent with those used so far in the literature. A second approach consists in considering that a port label specifies which elements of $K$ are allowed as inputs at that port. This leads to the definition of the extended algebra $\mathcal{K}_{\%}$ , which we use to generalize once more the notions of threshold, period, and aperiodicity. Necessary conditions for first-order definability, which supersede some of the existing ones, are defined from the latter.

3.1 Multicontexts

We use both $m$ and $(m,\nu)$ to denote the pair consisting of a multicontext $m$ , where every interior node carries a label $\lambda(y)\in A$ , and a port labeling $\nu(x)\in B$ . When this pair is equipped with a second port labeling $\psi$ , we denote the resulting tuple $t=(m,\nu,\psi)$ when $\psi$ is fixed and the emphasis is on $t$ as a whole, and $\breve{\psi}(m)$ when it is understood that $(m,\nu)$ is fixed and $\psi$ is one of several possible second port labelings. Next, instead of labeling a port directly with a horizontal monoid element, as it was done in the previous section, we take $\nu(x)$ and $\psi$ in sets $B$ and $C$ , respectively, such that $A$ , $B$ and $C$ are pairwise disjoint; when dealing with specific algebras, leaf extensions of the appropriate homomorphisms are then defined on $B$ and $C$ . Note that we are ultimately interested in the recognition of languages over $A$ , so that $B$ and $C$ are artefacts used in this process and the ultimate results should not depend on them. The tuples over $A$ and $B\times C$ , along with the contexts defined from them by replacing a leaf with a special port $\square$ , constitute a forest algebra $\mathbb{F}_{A,B\times C}=(\mathbb{M}_{A,B\times C},\mathbb{C}_{A,B\times C})$ ; those over $A$ and $B$ constitute $\mathbb{F}_{A,B}=(\mathbb{M}_{A,B},\mathbb{C}_{A,B})$ ; the reader can verify that both are free algebras.

Besides the insertion in a context, i.e. the monoid operations in $\mathbb{V}_{A}$ , $\mathbb{C}_{A,B}$ and $\mathbb{C}_{A,B\times C}$ , we define an operation that does multiple, simultaneous insertions in a multicontext from $\mathbb{M}_{A,B}$ . Given sets of multicontexts $M_{1}$ and $M_{2}$ and $Z\subset ports(M_{1})$ , with $Z\cap ports(m_{1})\neq\emptyset$ for every $m_{1}\in M_{1}$ , we denote by $M_{1}\,\underline{{\cdot}{\scriptstyle{Z}}}\,M_{2}$ the set of all multicontexts $m_{\text{new}}$ that can be built by taking an element $m_{1}\in M_{1}$ , inserting at each port $x\in Z\cap ports(m_{1})$ a multicontext $m(x)\in M_{2}$ and replacing the label of $x$ with the neutral letter $e$ ; with this new label, $x$ has no effect on the image by a homomorphism of $m_{\text{new}}$ , while it remains available to be used in reasonings and proofs. No other label is modified, so that in particular if $m(x)$ is a copy of $m_{2}\in M_{2}$ and $y\in nodes(m_{2})$ , then the counterpart $y^{\prime}$ of $y$ in $m(x)$ satisfies $\lambda(m_{\text{new}},y^{\prime})=\lambda(m(x),y)=\lambda(m_{2},y)$ .

Let $M\subset\mathbb{M}_{A,B}$ and let $Z\subseteq ports(M)$ . Then with $m\in M$ , we use the notations $Z(m)$ and $Z(M)$ for $Z\cap ports(m)$ and $Z\cap ports(M)$ , respectively. Given a congruence $\cong$ , we say that $Z$ is $\cong$ -stable when every pair $x,x^{\prime}$ of ports in $Z$ satisfies $\nabla(M,x)\cong\nabla(M,x^{\prime})$ .

Next, let $B^{\prime}\subseteq B$ and let $Z$ be the set of all ports with label in $B^{\prime}$ . With $\theta\geq 1$ we define the set $M^{(\theta,Z)}$ , obtained by pumping $\theta$ times the set $M$ at $Z$ , by: $M^{(1,Z)}=M$ and $M^{(\theta+1,Z)}=M^{(\theta,Z)}\,\underline{{\cdot}{\scriptstyle{Z}}}\,M$ . This definition of pumping is consistent with the definition of the vertical monoid of $\mathbb{F}_{A}/{\cong}$ (where both $B$ and $Z$ are singletons), with the “vertical confusion” defined in [4] (where $B$ and $M$ are singletons), and with the “vertical confusion on uniform multicontexts” also discussed in [4] (where $B$ and $M$ are singletons and the ports of $Z$ are indistinguishable by any congruence).
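
The recursion $M^{(1,Z)}=M$ , $M^{(\theta+1,Z)}=M^{(\theta,Z)}\,\underline{{\cdot}{\scriptstyle{Z}}}\,M$ can be sketched in Python for the special case of a singleton $M=\{m\}$ . The encoding is an assumption of this example: a multicontext is a list of `(label, children)` pairs, a port with label `b` is the leaf `(("port", b), [])`, and $Z$ is taken to be the set of all ports currently carrying the label `b`. An insertion relabels the port with the neutral letter `e` and hangs a copy of $m$ below it.

```python
NEUTRAL = "e"  # the neutral letter that replaces the label of a filled port


def port(b):
    """A port with label b, encoded as a leaf (hypothetical encoding)."""
    return (("port", b), [])


def insert(forest, b, inserted):
    """Fill every port labeled b with the forest `inserted`."""
    out = []
    for label, children in forest:
        if label == ("port", b):
            out.append((NEUTRAL, inserted))  # port relabeled, content below it
        else:
            out.append((label, insert(children, b, inserted)))
    return out


def pump(m, b, theta):
    """m^{(theta,Z)} for the singleton {m}, Z = ports labeled b, theta >= 1."""
    result = m
    for _ in range(theta - 1):
        result = insert(result, b, m)
    return result


def count(forest, label):
    """Number of nodes or ports carrying the given label."""
    return sum((lab == label) + count(ch, label) for lab, ch in forest)


m = [("a", [port("b"), port("b")])]  # a node a above two ports labeled b
pumped = pump(m, "b", 3)
print(count(pumped, "a"), count(pumped, NEUTRAL), count(pumped, ("port", "b")))
```

With two ports per copy of $m$ , each pumping step doubles the number of open ports, so the number of nodes labeled `a` grows as $2^{\theta}-1$ in this example.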

3.2 The algebra of mappings

Let $\mathcal{K}=(K,U)$ be a finite algebra and $\gamma:\mathbb{F}_{A}\rightarrow\mathcal{K}$ a surjective homomorphism. We look for a reasonable way of extending $\mathcal{K}$ to $\mathbb{F}_{A,B}$ , besides the one that consists in defining a leaf extension of $\gamma$ to $B$ . With this in mind, we define the algebra of mappings of $\mathcal{K}$ , which we denote $\mathcal{K}_{\#}$ . To do so, we show how to translate a congruence in $\mathbb{F}_{A,B}$ into a congruence in $\mathbb{F}_{A}$ , and vice versa. Define a mapping $\hat{\varepsilon}_{K}$ from $\mathbb{M}_{A,B}$ to $\mathbb{H}_{A}$ , that turns $m\in\mathbb{M}_{A,B}$ into a forest $\hat{\varepsilon}_{K}(m)$ by replacing all port labels with the neutral letter $e$ ; define $\hat{\varepsilon}_{V}$ from $\mathbb{C}_{A,B}$ to $\mathbb{V}_{A}$ in exactly the same way; both mappings constitute a surjective homomorphism from $\mathbb{F}_{A,B}$ to $\mathbb{F}_{A}$ . Thus, given a forest congruence $\approx$ over $\mathbb{F}_{A,B}$ , a forest congruence is defined in a natural way over $\mathbb{F}_{A}$ . In the other direction, let $B=\{b_{1},\ldots,b_{N}\}$ be finite and let $\xi$ be simultaneously regarded as a vector $\xi\in K^{N}$ and as a mapping $\xi:B\rightarrow K$ . Given $m\in\mathbb{M}_{A,B}$ , define $\mu:ports(m)\rightarrow K$ by $\mu=\xi\circ\nu$ . From $m$ and $\xi$ , we build a forest $\breve{\mu}(m,\xi)$ over $A\cup K$ by replacing in $m$ every port label $\nu(x)$ with $\mu(x)$ . We build $\breve{\mu}(w,\xi)$ from $w\in\mathbb{C}_{A,B}$ in the same way; the $\square$ -port of $w$ retains its label. We extend $\gamma$ to $\mathbb{H}_{A\cup K}$ by defining $\gamma(h)=h$ for every $h\in K$ , so that $\xi$ in fact is one of the $|K|^{N}$ leaf extensions for $\gamma$ that can be built on $B$ , and define a mapping $\gamma[m]:K^{N}\rightarrow K$ by $\gamma[m](\xi)=\gamma(\breve{\mu}(m,\xi))$ ; similarly, we define $\gamma[w]:K^{N}\rightarrow U$ by $\gamma[w](\xi)=\gamma(\breve{\mu}(w,\xi))$ . 
Next, we define $K_{\#}$ and $U_{\#}$ , the sets of all mappings $K^{N}\rightarrow K$ and $K^{N}\rightarrow U$ , respectively, and given $f,f^{\prime}\in K_{\#}$ and $u,u^{\prime}\in U_{\#}$ , the operations and vertical action $f+f^{\prime}:\xi\mapsto f(\xi)+f^{\prime}(\xi)$ , $uu^{\prime}:\xi\mapsto u(\xi)u^{\prime}(\xi)$ , and $uf:\xi\mapsto u(\xi)f(\xi)$ , so that the pair $\mathcal{K}_{\#}=(K_{\#},U_{\#})$ constitutes a forest algebra and $\gamma:\mathbb{F}_{A,B}\rightarrow\mathcal{K}_{\#}$ defined by $m\mapsto\gamma[m]$ and $w\mapsto\gamma[w]$ is a homomorphism. (Footnote 2: A notation that mentions $B$ , e.g. $\mathcal{K}_{\sharp B}$ , would actually be more accurate, if more cumbersome.) Let $\approx$ be the nuclear congruence of $\gamma$ : we define from it an equivalence between nodes, also denoted $\approx$ . Let $x,x^{\prime}\in nodes(M)$ where $M$ is a set of multicontexts closed under $\approx$ :

$x\approx x^{\prime}$ iff $\lambda(x)=\lambda(x^{\prime})$ and $\Delta(M,x)\approx\Delta(M,x^{\prime})$ and $\nabla(M,x)\approx\nabla(M,x^{\prime})$ ;

nodes equivalent under this relation “cannot be told apart by $\mathcal{K}_{\#}$ ”. We also write $(m,x)\approx(m^{\prime},x^{\prime})$ and $(M,x)\approx(M,x^{\prime})$ in order to specify where the nodes are located. Given $m\in M$ and $h_{1},\ldots,h_{N-1}\in H$ we define the mapping

[TABLE]

Since $M$ is closed under $\approx$ , $\gamma[m]$ is the same for every $m\in M$ and we can use the notation $\gamma[M]$ . Then with $Z=\{x\in ports(m):\nu(x)=b_{N}\}$ , we observe

[TABLE]

Therefore, every operation $\underline{{\cdot}{\scriptstyle{Z}}}$ satisfies the compatibility property for $\approx$ [7, Definition 5.1].

Proposition 3.1

Let $\mathcal{H}^{n}_{\#}$ and $\alpha^{n}_{\#}$ be built from $\mathcal{H}^{n}_{\tau,\pi}=\mathbb{F}_{A}/\approx^{n}_{\tau,\pi}$ and $\alpha^{n}_{\tau,\pi}$ . Then the nuclear congruence of $\alpha^{n}_{\#}$ is refined by the congruence $\approx^{n}_{\tau,\pi}$ built recursively over $\mathbb{F}_{A,B}$ , as in Section 2.3. Hence, $\mathcal{H}^{n}_{\#}\prec\mathbb{F}_{A,B}/\approx^{n}_{\tau,\pi}$ .

Proof. By induction on $n$ . Recall that a given $\xi$ is regarded both as a mapping $\xi:B\rightarrow H^{n}_{\tau,\pi}$ and as a vector $\xi\in(H^{n}_{\tau,\pi})^{|B|}$ . For the $n=1$ case, we associate to every $m\in\mathbb{M}_{A,B}$ a vector $p_{1}(m)$ with components in $\mathbb{N}_{\tau,\pi}$ and component labels in $A\cup B$ , where $p_{1}(m)_{a}$ , $a\in A$ , and $p_{1}(m)_{b}$ , $b\in B$ , are respectively the number of nodes $y$ with label $\lambda(y)=a$ and the number of ports $x$ with $\nu(x)=b$ . The algebra $H^{1}_{\tau,\pi}=\mathbb{F}_{A}/\approx^{1}_{\tau,\pi}$ is isomorphic to $(\mathbb{N}_{\tau,\pi})^{|A|}$ , so that, with some abuse of notation, we can write $\alpha^{1}_{\tau,\pi}(m)=p_{1}(m)$ , and given a mapping $\xi:B\rightarrow H^{1}_{\tau,\pi}$ , the image $\xi(b)$ of $b\in B$ can be represented as a vector $\xi_{1}(b)\in(\mathbb{N}_{\tau,\pi})^{|A|}$ . Within $\mathbb{M}_{A,B}$ , there is an equivalence class under $\approx^{1}_{\tau,\pi}$ for every possible value of $p_{1}(m)$ , i.e. every vector in $(\mathbb{N}_{\tau,\pi})^{|A\cup B|}$ . Then given $m\in\mathbb{M}_{A,B}$ ,

[TABLE]

From there, if $p_{1}(m)=p_{1}(m^{\prime})$ , then $\alpha^{1}_{\tau,\pi}[m]=\alpha^{1}_{\tau,\pi}[m^{\prime}]$ . With $n\geq 2$ , the induction hypothesis states that if $m\approx^{n-1}_{\tau,\pi}m^{\prime}$ , then for every mapping $\xi:B\rightarrow H^{n-1}_{\tau,\pi}$ , the leaf completions of $m$ and $m^{\prime}$ through $\xi$ satisfy $\breve{\xi}(m)\approx^{n-1}_{\tau,\pi}\breve{\xi}(m^{\prime})$ . Assume that $m\approx^{n}_{\tau,\pi}m^{\prime}$ . Two nodes or ports $x$ of $m$ and $x^{\prime}$ of $m^{\prime}$ receive the same label in the versions of $m$ and $m^{\prime}$ relabeled according to $\alpha^{n-1}_{\tau,\pi}$ iff $\lambda(m,x)=\lambda(m^{\prime},x^{\prime})$ , $\Delta(m,x)\approx^{n-1}_{\tau,\pi}\Delta(m^{\prime},x^{\prime})$ and $\nabla(m,x)\approx^{n-1}_{\tau,\pi}\nabla(m^{\prime},x^{\prime})$ . By the induction hypothesis, the last two items imply, for every $\xi$ :

[TABLE]

which means that $x$ and $x^{\prime}$ receive the same label in the versions of $\breve{\xi}(m)$ and $\breve{\xi}(m^{\prime})$ relabeled according to $\alpha^{n-1}_{\tau,\pi}$ , that is, $\breve{\xi}(m)\approx^{n}_{\tau,\pi}\breve{\xi}(m^{\prime})$ .

$\Box$

The algebras $\mathcal{H}^{n}_{\#}$ and $\mathbb{F}_{A,B}/\approx^{n}_{\tau,\pi}$ are not isomorphic, however. To see this, let $\tau=2$ and $\pi=1$ , so that $\mathbb{N}_{2,1}=\{0,1,\infty\}$ ; let $n=1$ and $A=\{a\}$ , so that $\mathbb{F}_{A}/\approx^{1}_{2,1}$ is isomorphic to $\mathcal{O}\!\!\!\mathcal{D}_{\mathbb{N}_{2,1}}$ ; and finally let $B=\{b\}$ . With $m=aab$ and $m^{\prime}=aa(b+b)$ , we have $\alpha^{1}_{2,1}(m)\neq\alpha^{1}_{2,1}(m^{\prime})$ , since $m$ has one port and $m^{\prime}$ has two, while $\alpha^{1}_{2,1}[m]=\alpha^{1}_{2,1}[m^{\prime}]$ is the constant function that maps every element of $\{0,1,\infty\}$ to $\infty$ .
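
This counterexample can be checked mechanically. The Python sketch below makes two assumptions for illustration: the element $\infty$ of $\mathbb{N}_{2,1}$ is encoded as the integer `2`, and a multicontext is summarized by its number of nodes labeled $a$ and its number of ports labeled $b$ , which is all that a one-dimensional algebra can see. The names `sat`, `p1` and `induced_mapping` are hypothetical.

```python
TAU = 2  # threshold; with period 1, N_{2,1} = {0, 1, inf}, inf encoded as 2


def sat(p):
    """Counting up to threshold TAU (every value >= TAU collapses to TAU)."""
    return min(p, TAU)


def p1(num_a, num_b):
    """Vector of node and port counts, reduced in N_{2,1}."""
    return (sat(num_a), sat(num_b))


def induced_mapping(num_a, num_b):
    """Table of the mapping xi -> image of the multicontext once every port
    labeled b is replaced by the element xi of N_{2,1}."""
    return tuple(sat(num_a + num_b * xi) for xi in range(TAU + 1))


m = (2, 1)        # m = aab: two nodes a, one port b
m_prime = (2, 2)  # m' = aa(b+b): two nodes a, two ports b

print(p1(*m), p1(*m_prime))                            # the count vectors differ
print(induced_mapping(*m), induced_mapping(*m_prime))  # the mappings coincide
```

The two nodes labeled $a$ already saturate the counter, so whatever is plugged into the ports cannot change the image; this is why the two multicontexts induce the same mapping although their port counts differ.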

3.3 Equivalence under pumping

We use the algebra of mappings to define a “threshold $\tau$ , period $\pi$ equivalence under pumping” congruence $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}=\bigcup_{i\geq 0}\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(i)}$ within $\mathbb{F}_{A,B}$ . First, let $\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}$ denote the relation where two forests are equivalent iff they are the same up to horizontal permutations within a sum (we might as well use $=$ instead of $\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}$ ). Then we consider a special case of a multicontext $m\in\mathbb{M}_{A,B}$ where any two ports $x,x^{\prime}$ satisfy $\nu(x)=\nu(x^{\prime})\Leftrightarrow\nabla(m,x)\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}\nabla(m,x^{\prime})$ , that is, their contexts within $m$ are indistinguishable. Then the $\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}$ -stable sets of ports are exactly the sets $\nu^{-1}(b)$ , $b\in B$ ; we say that $m$ is suitable for pumping. Pumping the singleton $\{m\}$ along a $\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}$ -stable set of ports $Z$ , we obtain for each $\theta\in\mathbb{N}$ a singleton $\{m\}^{(\theta,Z)}=\{m^{(\theta,Z)}\}$ . (Footnote 3: Note that this formalism also covers the case where pumping is done “horizontally”, i.e. where we are dealing with a multicontext of the form $m=c+x_{1}+\cdots+x_{k}$ and where $Z=\{x_{1},\ldots,x_{k}\}$ .) We now define $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(i)}$ , for every $i\geq 0$ . First, $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(0)}$ coincides with $\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}$ . Next, the forest congruence $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(1)}$ is generated by the pairs $(m^{(\theta,Z)},m^{(\theta^{\prime},Z)})$ and the corresponding context congruence by the pairs $(\nabla(m^{(\theta,Z)},x),\nabla(m^{(\theta^{\prime},Z)},x^{\prime}))$ , where $\theta\equiv_{\tau,\pi}\theta^{\prime}$ , $x\in Z(m^{(\theta,Z)})$ and $x^{\prime}\in Z(m^{(\theta^{\prime},Z)})$ . Then recursively for $i\geq 1$ , given a set $M$ of multicontexts closed under $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(i)}$ and a set $Z\subset ports(M)$ that is $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(i)}$ -stable, $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(i+1)}$ is the congruence generated by the pairs $(m,m^{\prime})$ and $(\nabla(m,x),\nabla(m^{\prime},x^{\prime}))$ where $\theta\equiv_{\tau,\pi}\theta^{\prime}$ , $m\in M^{(\theta,Z)}$ , $m^{\prime}\in M^{(\theta^{\prime},Z)}$ , $x\in Z(m)$ and $x^{\prime}\in Z(m^{\prime})$ . We denote by $\mathcal{J}^{\tau,\pi}=(J^{\tau,\pi},U^{\tau,\pi})$ the (infinite) quotient algebra $\mathbb{F}_{A,B}/{\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}}$ and by $\beta^{\tau,\pi}$ the corresponding surjective homomorphism.

Proposition 3.2

Every congruence of finite index over $\mathbb{F}_{A}$ is refined by a congruence $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}$ .

Proof. Let $\mathcal{H}=(H,V)=\mathbb{F}_{A}/{\approx}=\alpha(\mathbb{F}_{A})$ be finite. Let $B=\{b_{1},\ldots,b_{N}\}$ and let $m\in\mathbb{M}_{A,B}$ be suitable for pumping. Assume that $Z=\nu^{-1}(b_{N})$ ; we pump the singleton $\{m\}$ along $Z$ . Given $h_{1},\ldots,h_{N}\in H$ we define $g=\alpha[m](h_{1},\ldots,h_{N})$ and the mapping

[TABLE]

Observe that $\alpha[m^{(2,Z)}](h_{1},\ldots,h_{N})=\zeta(g)$ and in general,

[TABLE]

The mapping $\zeta$ generates a subsemigroup $\langle\zeta\rangle$ of $H^{H}$ ; from the threshold and period of $\langle\zeta\rangle$ we obtain integers $\tau$ and $\pi$ such that, for every combination of $m$ and $Z$ , we have $\alpha[m^{(\theta,Z)}]=\alpha[m^{(\theta^{\prime},Z)}]$ as soon as $\theta\equiv_{\tau,\pi}\theta^{\prime}$ .

We prove by induction on $i$ , that with these $\tau$ and $\pi$ , for every $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(i)}$ -closed set $M$ and every $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(i)}$ -stable $Z\subseteq ports(M)$ , every combination of multicontexts $\hat{m}\in M^{(\theta,Z)}$ and $\hat{m}^{\prime}\in M^{(\theta^{\prime},Z)}$ satisfies $\alpha[\hat{m}]=\alpha[\hat{m}^{\prime}]$ , whenever $\theta\equiv_{\tau,\pi}\theta^{\prime}$ . At $i=0$ the relation $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(0)}$ coincides with $\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}$ . If $M$ is $\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}$ -closed, then every pair $m,m^{\prime}\in M$ satisfies $\alpha[{m}]=\alpha[{m}^{\prime}]$ . Now let $Z\subseteq ports(M)$ be $\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}$ -stable. For every pair $\hat{m},\hat{m}^{\prime}\in M^{(\theta,Z)}$ and any $\theta\geq 2$ , we have $\alpha[\hat{m}]=\alpha[\hat{m}^{\prime}]$ , in particular when $\hat{m}=m^{(\theta,Z)}$ , which means $\alpha[\hat{m}^{\prime}]=\alpha[m^{(\theta,Z)}]$ for all $\hat{m}^{\prime}\in M^{(\theta,Z)}$ . Since $M^{(\theta,Z)}\cup M^{(\theta^{\prime},Z)}$ is $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(1)}$ -closed whenever $\theta\equiv_{\tau,\pi}\theta^{\prime}$ , every $\hat{m}^{\prime}\in M^{(\theta,Z)}\cup M^{(\theta^{\prime},Z)}$ satisfies simultaneously $\hat{m}^{\prime}\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(1)}m^{(\theta,Z)}$ and $\alpha[\hat{m}^{\prime}]=\alpha[m^{(\theta,Z)}]$ . From this we can deduce that the proposition holds for $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}^{(1)}$ . The induction step $i\geq 1$ is proved in the same way.

$\Box$
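
The threshold and period of a cyclic semigroup $\langle\zeta\rangle$ generated by a transformation of a finite set can be computed by iterating $\zeta$ until a power repeats. The Python sketch below does this for a transformation given as a tuple of images over $\{0,\ldots,n-1\}$ ; this encoding and the name `threshold_and_period` are assumptions of the example, and the way the exponent of $\zeta$ relates to the pumping parameter $\theta$ is the one fixed in the proof above, not by this code.

```python
def threshold_and_period(zeta):
    """Return the least (tau, pi), with tau >= 1 and pi >= 1, such that
    zeta^tau == zeta^(tau + pi).

    zeta is a tuple: zeta[h] is the image of h, for h in 0..len(zeta)-1.
    """
    def compose(f, g):
        # (f after g)(h) = f(g(h))
        return tuple(f[g[h]] for h in range(len(g)))

    seen = {}            # power of zeta -> exponent at which it first occurs
    power, k = zeta, 1
    while power not in seen:
        seen[power] = k
        power = compose(zeta, power)
        k += 1
    tau = seen[power]    # first exponent of the repeated power
    return tau, k - tau  # k - tau is the length of the cycle


print(threshold_and_period((1, 2, 3, 3)))  # a chain ending in a fixed point
print(threshold_and_period((1, 2, 0)))     # a cyclic permutation
```

A period of $1$ for every such $\zeta$ corresponds to the aperiodic case; a permutation with a nontrivial cycle yields a period larger than $1$ .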

Following [4], we say that a multicontext $m$ is uniform when $ports(m)$ is $\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}$ -stable, i.e. all nodes at every depth level have the same label and the same number of sons. We denote by ${}^{\underline{m}}\mathbb{V}_{A}$ the set of all uniform multicontexts over $A$ ; it can be seen as a subset of $\mathbb{M}_{A,\{b\}}$ where $b\not\in A$ . Define a symbol $\underline{{\otimes}n}$ for each $n\in\mathbb{N}$ , and let $\underline{{\otimes}\mathbb{N}}=\{\underline{{\otimes}n}:n\in\mathbb{N}\}$ ; regarding $\mathbb{V}_{A}$ as another alphabet, we can see a uniform multicontext as the image of a word of $\underline{{\otimes}\mathbb{N}}(\mathbb{V}_{A}\underline{{\otimes}\mathbb{N}})^{*}$ by a mapping $\zeta$ such that $\zeta(v)=v$ for every $v\in\mathbb{V}_{A}$ , $\zeta(\underline{{\otimes}1}w)=\zeta(w)$ , $\zeta(\underline{{\otimes}2}w)=\zeta(w)+\zeta(w)$ , etc. with notation $\zeta(\underline{{\otimes}n}w)=\underline{{\otimes}n}\zeta(w)$ , and $\zeta(ww^{\prime})=\zeta(w)\,\underline{{\cdot}{\scriptstyle{Z}}}\,\zeta(w^{\prime})$ , where $Z=ports(\zeta(w))$ . It is a standard exercise to verify that ${}^{\underline{m}}\mathbb{V}_{A}$ is a monoid of transformations of $\mathbb{H}_{A}$ and that $\zeta$ is a homomorphism. We then associate to $\mathcal{H}=(H,V)=\alpha(\mathbb{F}_{A})$ its multivertical monoid, defined as ${}^{\underline{m}}V=\alpha({}^{\underline{m}}\mathbb{V}_{A})$ .

The proof of Proposition 3.2 shows that the smallest integers $\tau$ and $\pi$ such that $\stackrel{{\scriptstyle\tau,\pi}}{{\leftrightarrow}}$ refines $\approx$ are the threshold and period of the multivertical monoid ${}^{\underline{m}}V$ . Also, the algorithm described in [4] to decide whether an algebra has vertical confusion on uniform multicontexts can be adapted to compute the values of $\tau$ and $\pi$ . We generalize the condition tested by this algorithm to ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ and the variety of monoids $\mathbf{Sol}_{\tau,\pi}$ , generated by iterated block products of elements of $\langle\mathbb{N}_{\tau,\pi}\rangle$ , which means a generalization from the aperiodic to the solvable monoids.

Proposition 3.3

If an algebra $\mathcal{G}=(G,W)$ belongs to ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ , then its multivertical monoid belongs to $\mathbf{Sol}_{\tau,\pi}$ .

Proof. It suffices to prove that the multivertical monoid of a block product of forest algebras is a block product of their multivertical monoids. Let $m\in{}^{\underline{m}}\mathbb{V}_{A}$ and let $x\in nodes(m)$ ; the subtree $s=\Delta^{+}(m,x)$ rooted at $x$ belongs to a sum of $1$ or more copies of $s$ ; let $\mu(x)$ be the number of copies of $s$ in this sum, other than $\Delta^{+}(m,x)$ itself. Let $\nabla^{-}(m,x)$ be the context in which this sum is inserted, so that $\nabla(m,x)=\nabla^{-}(m,x)(\square+\underline{{\otimes}\mu(x)}s)$ . Let $\alpha:\mathbb{F}_{A}\rightarrow\mathcal{H}$ be a surjective homomorphism: the label of $x$ in the version $m^{\alpha}$ of $m$ relabeled according to $\alpha$ is

[TABLE]

Given $n\in{}^{\underline{m}}\mathbb{V}_{A}$ and $y\in nodes(n)$ , we have

[TABLE]

and define “above” and “below” actions, such that $\lambda((m\,\underline{{\cdot}{\scriptstyle{Z}}}\,n)^{\alpha},x)=\lambda(m^{\alpha},x).\alpha(n)$ and $\lambda((m\,\underline{{\cdot}{\scriptstyle{Z}}}\,n)^{\alpha},y)=\alpha(m).\lambda(n^{\alpha},y)$ . The fact that the operation $\underline{{\cdot}{\scriptstyle{Z}}}$ is associative is used to verify that these actions are monoidal; this ensures that we indeed have a block product. $\Box$

3.4 The extended algebra

Let $\mathcal{K}=(K,U)$ be a forest algebra and let $\gamma:\mathbb{F}_{A}\rightarrow\mathcal{K}$ be surjective. Given $(m,\nu)\in\mathbb{M}_{A,B}$ , we build all forests $s$ that can be obtained by insertion of elements of $\mathbb{H}_{A}$ at the ports of $(m,\nu)$ and gather their images $\gamma(s)$ in a set $\gamma_{\%}(m)$ . This is done under a restriction: we define a mapping $\gamma_{\%}:B\rightarrow\mathcal{P}(K)$ and demand that the forest $t(x)$ inserted at $x$ satisfy $\gamma(t(x))\in\gamma_{\%}(\nu(x))$ . To simplify the reasoning and definitions, and avoid having to deal with irrelevant special cases, we assume that $\gamma_{\%}^{-1}(F)\neq\emptyset$ for every nonempty $F\subseteq K$ , and that there exists a partition $C=\bigcup_{b\in B}C_{b}$ where $|C_{b}|=|\gamma_{\%}(b)|$ for every $b\in B$ , as well as a bijective leaf extension $\gamma:C_{b}\rightarrow\gamma_{\%}(b)$ . We define the notation $D=\bigcup_{b\in B}\{b\}\times C_{b}$ , so that we can write that we work on $\mathbb{F}_{A,D}$ rather than on $\mathbb{F}_{A,B\times C}$ , and say that the $\gamma_{\%}$ defined above is the universal leaf extension of $\gamma$ to $D$ . We restrict our work to labelings $\psi:ports(m)\rightarrow C$ that are consistent with $\nu$ , in the sense that $\psi(x)\in C_{\nu(x)}$ for every port $x$ . Given $(m,\nu)\in\mathbb{M}_{A,B}$ , we denote by $\Psi(m,\nu)=\Psi(m)$ the set of all mappings $\psi$ consistent with $\nu$ , and given $\psi\in\Psi(m)$ , we use the notation $\breve{\psi}(m,\nu)$ for the multicontext $(m,\nu,\psi)\in\mathbb{M}_{A,D}$ .

At some point we will look at more than one algebra at once, e.g. $\mathcal{G}$ and $\mathcal{K}$ : then we will assume, without loss of generality, that the universal extensions $\varphi_{\%}$ and $\gamma_{\%}$ are defined in such a way that $(\varphi_{\%}^{-1}(F)\cap\gamma_{\%}^{-1}(F))\neq\emptyset$ for every nonempty $F\subseteq G\times K$ .

From the universal extension of $\gamma$ we define the mapping $\gamma_{\%}:\mathbb{M}_{A,D}\rightarrow\mathcal{P}(K)$ by $\gamma_{\%}(m,\nu)=\{\gamma(\breve{\psi}(m,\nu)):\psi\in\Psi(m,\nu)\}$ . This is a homomorphism: we verify this with the horizontal addition, leaving the rest of the proof to the reader.

[TABLE]

The image of $\mathbb{F}_{A,D}$ by $\gamma_{\%}$ is the extended algebra of $\mathcal{K}$ , with the notation $\mathcal{K}_{\%}=(K_{\%},U_{\%})$ . In the special case of ${\mathcal{H}}{}^{n}_{\tau,\pi}$ , since $\tau,\pi$ are fixed parameters most of the time, we use the notations $\mathcal{H}^{n}_{\%}$ and $\alpha^{n}_{\%}$ instead of ${\mathcal{H}}{}^{n}_{\tau,\pi}{}_{\%}$ and ${\alpha}{}^{n}_{\tau,\pi}{}_{\%}$ .
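
To make the definition of $\gamma_{\%}$ concrete, the following Python sketch treats a deliberately simple instance: $A=\{a\}$ and $K=\mathbb{N}_{\tau,\pi}$ , so that a multicontext is summarized by its number of nodes and the list of its port labels. These simplifications, the dictionary `allowed` (playing the role of the mapping $B\rightarrow\mathcal{P}(K)$ ) and the function names are assumptions of the example. The image of a multicontext is the set of values obtained over all labelings that pick, at each port, an element allowed for its label.

```python
from itertools import product


def reduce_count(p, tau, pi):
    """Smallest representative of p in N_{tau,pi}."""
    return p if p < tau else tau + (p - tau) % pi


def extended_image(num_a, port_labels, allowed, tau, pi):
    """Set of images over all labelings consistent with the port labels.

    num_a: number of nodes labeled a; port_labels: one label per port;
    allowed[b]: set of elements of N_{tau,pi} permitted at ports labeled b.
    """
    choices = [sorted(allowed[b]) for b in port_labels]
    return {reduce_count(num_a + sum(pick), tau, pi)
            for pick in product(*choices)}


tau, pi = 3, 2
allowed = {"b": {0, 1}, "c": {2}}
left = extended_image(1, ["b", "b"], allowed, tau, pi)
right = extended_image(0, ["c"], allowed, tau, pi)
both = extended_image(1, ["b", "b", "c"], allowed, tau, pi)

# Horizontal addition: the image of the sum is the set of pairwise sums.
pairwise = {reduce_count(k + k2, tau, pi) for k in left for k2 in right}
print(left, right, both, pairwise)
```

In this instance the set computed for the sum of the two multicontexts coincides with the set of pairwise sums of their images, in line with the homomorphism property verified for the horizontal addition.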

Proposition 3.4

Let $\mathcal{H}$ and $\mathcal{K}$ be finite forest algebras. If $\mathcal{K}\prec\mathcal{H}$ , then $\mathcal{K}_{\%}\prec\mathcal{H}_{\%}$ .

Proof. Let $\gamma:\mathbb{F}_{A}\rightarrow\mathcal{K}$ and $\alpha:\mathbb{F}_{A}\rightarrow\mathcal{H}$ be surjective homomorphisms. Assume that for all $t,t^{\prime}\in\mathbb{M}_{A,D}$ , $\alpha(t)=\alpha(t^{\prime})$ implies $\gamma(t)=\gamma(t^{\prime})$ . Then with $m,m^{\prime}\in\mathbb{M}_{A,B}$ , we have

[TABLE]

Inclusion in the other direction is proved in the same way, so that if $\alpha_{\%}(m)=\alpha_{\%}(m^{\prime})$ , then $\gamma_{\%}(m)=\gamma_{\%}(m^{\prime})$ . $\Box$

Proposition 3.5

For every $m,m^{\prime}\in\mathbb{M}_{A,B}$ and $n\geq 1$ , if $\alpha^{n}_{\%}(m)=\alpha^{n}_{\%}(m^{\prime})$ , then $m\approx^{n}_{\tau,\pi}m^{\prime}$ .

Proof. By induction on $n$ . For the $n=1$ case, as in the proof of Proposition 3.1, we associate to every $m\in\mathbb{M}_{A,B}$ a vector $p_{1}(m)$ with component labels in $A\cup B$ , where $p_{1}(m)_{a}$ , $a\in A$ , and $p_{1}(m)_{b}$ , $b\in B$ , are respectively the number of nodes $y$ with label $\lambda(y)=a$ and the number of ports $x$ with $\nu(x)=b$ ; we define for $t\in\mathbb{M}_{A,D}$ a vector $p_{1}(t)$ in the same way, with component labels in $A\cup C$ . Let $m\in\mathbb{M}_{A,B}$ : its image $\alpha^{1}_{\%}(m)=\{{\alpha}^{1}_{\tau,\pi}(\breve{\psi}(m)):\psi\in\Psi(m)\}$ is determined by the set of all vectors $p_{1}(\breve{\psi}(m))$ that can be obtained from $p_{1}(m)$ . While $p_{1}(m)_{a}=p_{1}(\breve{\psi}(m))_{a}$ for every $a\in A$ , given $b\in B$ we have

[TABLE]

so that for any $\psi^{\prime}\in\Psi(m^{\prime})$ , $p_{1}(\breve{\psi}(m))\equiv_{\tau,\pi}p_{1}(\breve{\psi}^{\prime}(m^{\prime}))$ implies $p_{1}(m)\equiv_{\tau,\pi}p_{1}(m^{\prime})$ , from which we conclude $\alpha^{1}_{\%}(m)=\alpha^{1}_{\%}(m^{\prime})\Rightarrow m\approx^{1}_{\tau,\pi}m^{\prime}$ .

Let $n\geq 2$ : the induction hypothesis states that if ${\alpha}^{n-1}_{\tau,\pi}(\breve{\psi}(m))={\alpha}^{n-1}_{\tau,\pi}(\breve{\psi}^{\prime}(m^{\prime}))$ for some $\psi\in\Psi(m)$ and $\psi^{\prime}\in\Psi(m^{\prime})$ , then $m\approx^{n-1}_{\tau,\pi}m^{\prime}$ . Let $p_{n}(\breve{\psi}(m))$ denote the vectors with a component for every element of the appropriate relabeling alphabet, which counts the number of its occurrences in the version of $\breve{\psi}(m)$ relabeled according to $\alpha^{n-1}_{\tau,\pi}$ . Then

[TABLE]

Two nodes $x$ and $x^{\prime}$ in $\breve{\psi}(m)$ and $\breve{\psi}^{\prime}(m^{\prime})$ carry the same element of the relabeling alphabet only if they carry the same label in $m$ and $m^{\prime}$ , and

[TABLE]

by the induction hypothesis, this implies

[TABLE]

hence $x$ and $x^{\prime}$ carry the same label in the relabeled versions of $m$ and $m^{\prime}$ . Therefore, $\alpha^{n}_{\%}(m)=\alpha^{n}_{\%}(m^{\prime})\Rightarrow m\approx^{n}_{\tau,\pi}m^{\prime}$ . $\Box$

Proposition 3.6

For every $n\geq 1$ , $\mathcal{H}^{n}_{\%}\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ .

Proof. By induction on the structure of the horizontal monoid of $\mathcal{H}^{n}_{\%}$ ; the induction base is done on $\mathcal{H}^{1}_{\%}$ , however.

As in the proofs of Propositions 3.1 and 3.5, we associate to every $m\in\mathbb{M}_{A,B}$ a vector $p_{1}(m)$ , where it was seen that for every $b\in B$ we have

[TABLE]

Then $\alpha^{1}_{\%}(m)=\alpha^{1}_{\%}(m^{\prime})$ when $p_{1}(m)_{a}\equiv_{\tau,\pi}p_{1}(m^{\prime})_{a}$ for every $a\in A$ and $p_{1}(m)_{b}\equiv_{\sigma,\pi}p_{1}(m^{\prime})_{b}$ for every $b\in B$ , where $\sigma\geq\tau{\cdot}|K|$ , so that having at least $\sigma$ ports with $\nu(x)=b$ ensures that there is enough room for $\tau$ occurrences of $\psi(x)=c$ , for every $c\in C_{b}$ . Therefore, $\mathcal{H}^{1}_{\%}\in{{\mathbf{**}}^{1}}\langle\!\langle\mathbb{N}_{\sigma,\pi}\rangle\!\rangle$ .

At the induction step, let $I$ be the minimal ideal of the horizontal monoid $H^{n}_{\tau,\pi}$ . We assume the existence of $q\geq n$ such that for every $h\not\in I$ , the language $L_{h}=\{m\in\mathbb{M}_{A,B}:h\in\alpha^{n}_{\%}(m)\}$ is recognized by an algebra $\mathcal{K}\in{{\mathbf{**}}^{q}}\langle\!\langle\mathbb{N}_{\sigma,\pi}\rangle\!\rangle$ . Let $m\in\mathbb{M}_{A,B}$ for which we know that $\alpha^{n}_{\%}(m)\cap I\neq\emptyset$ , that is, ${\alpha}^{n}_{\tau,\pi}(\breve{\psi}(m))\in I$ for at least one $\psi\in\Psi(m)$ . We say that a node $y$ of $m$ is a pathhead for $\mathcal{H}^{n}_{\%}$ when $\alpha^{n}_{\%}(\Delta^{+}(m,y))\cap I\neq\emptyset$ and $\alpha^{n}_{\%}(\Delta^{+}(m,z))\subseteq H^{n}_{\tau,\pi}\setminus I$ for every son $z$ of $y$ . We build from $m$ a multicontext $\overline{m}$ whose interior nodes are the pathheads and their ancestors, and where a port $x$ is created along every edge $(y,z)$ such that $y$ is an interior node of $\overline{m}$ and its son $z$ is not, so that $\Delta(m,x)=\Delta^{+}(m,z)$ . Given $k\in I$ and a mapping $\chi:ports(\overline{m})\rightarrow H^{n}_{\tau,\pi}\setminus I$ , the algebra $\mathcal{H}^{n}_{\tau,\pi}$ can recognize whether the resulting forest $\breve{\chi}(\overline{m})$ belongs to the set $({\alpha}^{n}_{\tau,\pi})^{-1}(k)$ . Next, for every pair $z,z^{\prime}\in ports(\overline{m})$ and $h,h^{\prime}\not\in I$ , whether $h\in{\alpha}^{n}_{\%}(\Delta(m,z))$ is determined by $\mathcal{K}$ , and the same holds for the question of whether $h^{\prime}\in{\alpha}^{n}_{\%}(\Delta(m,z^{\prime}))$ . The two hold simultaneously iff there exist labelings $\psi\in\Psi(\Delta(m,z))$ and $\psi^{\prime}\in\Psi(\Delta(m,z^{\prime}))$ that ensure $\alpha^{n}_{\tau,\pi}\circ\breve{\psi}(\Delta(m,z))=h$ and $\alpha^{n}_{\tau,\pi}\circ\breve{\psi}^{\prime}(\Delta(m,z^{\prime}))=h^{\prime}$ . 
Since the sets $ports(\Delta(m,z))$ and $ports(\Delta(m,z^{\prime}))$ are disjoint, the hypotheses on $\psi$ and $\psi^{\prime}$ are independent, and can be reworded as the existence of a suitable labeling in $\Psi(\Delta(m,z)+\Delta(m,z^{\prime}))$ .

From the induction hypothesis, whether a node is a pathhead is recognized by an algebra of the form $\mathcal{K}\;{\scriptstyle{\square}}\;\mathcal{O}\!\!\!\mathcal{D}_{M}$ , with $M\in\langle\mathbb{N}_{\sigma,\pi}\rangle$ ; a further block product $(\mathcal{H}^{n}_{\tau,\pi}\times(\mathcal{K}\;{\scriptstyle{\square}}\;\mathcal{O}\!\!\!\mathcal{D}_{M}))\;{\scriptstyle{\square}}\;\mathcal{O}\!\!\!\mathcal{D}_{M^{\prime}}$ , with $M^{\prime}\in\langle\mathbb{N}_{\sigma,\pi}\rangle$ , recognizes whether $k\in\alpha^{n}_{\%}(m)$ , where $k\in I$ . $\Box$

Proposition 3.7

If $\mathcal{G}\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ , then $\mathcal{G}_{\%}\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ and ${}^{\underline{m}}W_{\%}\in\mathbf{Sol}_{\tau,\pi}$ .

Proof. This is a consequence of Propositions 3.3, 3.4, and 3.6.

$\Box$

Let $\sigma$ and $\rho$ be the smallest integers such that $\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}$ refines the canonical congruence of $\mathcal{G}_{\%}$ : given Proposition 3.7, it makes sense to call them the threshold and period of $\mathcal{G}_{\%}$ . In the special case of ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,1}\rangle\!\rangle$ , the condition ${}^{\underline{m}}W_{\%}\in\mathbf{A}$ supersedes both the aperiodicity of $W$ and the absence of vertical confusion on uniform multicontexts. As mentioned earlier, Potthoff’s algebra, described in Section 5.4, satisfies these two conditions while its extended algebra has a non-aperiodic vertical monoid.
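
The threshold and period mentioned here refer to the congruence $\equiv_{\tau,\pi}$ on the natural numbers. The following is a minimal Python sketch of that congruence, written directly from its definition ($p=q$, or both are at least $\tau$ and differ by a multiple of $\pi$). The function names `equiv_tp` and `normal_form` are illustrative and do not come from the paper.

```python
def equiv_tp(p: int, q: int, tau: int, pi: int) -> bool:
    """p and q are equal, or both reach the threshold and differ by a multiple of pi."""
    return p == q or (p >= tau and q >= tau and (p - q) % pi == 0)


def normal_form(p: int, tau: int, pi: int) -> int:
    """Smallest non-negative integer in the congruence class of p."""
    return p if p < tau else tau + (p - tau) % pi


# With period 1 (the aperiodic case), all counts at or above the threshold agree.
assert equiv_tp(4, 9, tau=3, pi=1)
# Below the threshold, a count is only equivalent to itself.
assert not equiv_tp(1, 3, tau=2, pi=2)
```

Each class has a unique representative below $\tau+\pi$, which is why the quotient $\mathbb{N}_{\tau,\pi}$ is a finite counter monoid.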

4 Recursive Proofs

In this section, we consistently use the following notation. The algebra $\mathcal{G}=(G,W)$ is the one whose membership in the variety ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ is to be decided. We associate with $\mathcal{G}$ a surjective homomorphism $\varphi:\mathbb{F}_{A}\rightarrow\mathcal{G}$ with nuclear congruence $\simeq_{G}$ . Given $n\geq 1$ and the variety ${{\mathbf{**}}}^{n}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ , let $\approx^{n}_{\tau,\pi}$ , $\mathcal{H}^{n}_{\tau,\pi}=\mathbb{F}_{A}/{\approx^{n}_{\tau,\pi}}$ and $\alpha^{n}_{\tau,\pi}:\mathbb{F}_{A}\rightarrow\mathcal{H}^{n}_{\tau,\pi}$ be such that $\approx^{n}_{\tau,\pi}$ is the finest congruence over $\mathbb{F}_{A}$ that satisfies $\mathcal{H}^{n}_{\tau,\pi}\in{{\mathbf{**}}}^{n}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ and $\alpha^{n}_{\tau,\pi}$ is a surjective homomorphism. We use the notation ${{\mathbf{**}}}^{n}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ both for the variety of algebras and for the variety of the forest or context languages that they recognize.

4.1 $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight sets

It is already folklore that non-membership in $\mathsf{FO}[\prec]$ of an algebra $\mathcal{G}=(G,W)$ ultimately has to do with mutually accessible elements of its horizontal monoid and with $\mathcal{R}$ -classes of $W$ . The following preliminaries confirm this and introduce some definitions and facts that are used later.

For every $g\in G$ we define the language $L_{g}=\varphi^{-1}(g)$ and, for $K\subset G$ , $L_{K}=\bigcup_{k\in K}L_{k}$ . It is a basic fact that for any alphabet $A$ of size at least $2$ and variety of algebras $\mathbf{W}$ , $\mathcal{G}\not\in\mathbf{W}$ iff for every surjective homomorphism $\varphi:\mathbb{F}_{A}\rightarrow\mathcal{G}$ , there exists an element $g\in G$ such that $L_{g}\not\in\mathsf{W}$ . Therefore, $\mathcal{G}\not\in\ {{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ implies the existence of a partition of $G$ into $G_{\text{out}}=\{g\in G:L_{g}\not\in\ {{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle\}$ and its complement $G_{\text{in}}$ . Since we always work with $\tau\geq 1$ , both $L_{0}=\varphi^{-1}(0)$ and the set $\mathbb{T}_{A}$ of all trees over $A$ belong to ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ , so that $G_{\text{in}}\neq\emptyset$ . Given $g\in G_{\text{out}}$ , we define $L^{\mathsf{min}}_{g}$ as the subset of $L_{g}\cap\mathbb{T}_{A}$ consisting of the trees $t$ where $\varphi(\Delta^{+}(t,y))\not\in G_{\text{out}}$ for every non-root node $y$ of $t$ . A pathhead for $g$ in a forest $s$ is a node $x$ such that $\Delta^{+}(s,x)\in L^{\mathsf{min}}_{g}$ ; there exists an algebra in ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ that can recognize whether a node of $s$ is a pathhead for $g$ . We say that $g\in G_{\text{out}}$ is minimal when at least one forest in $\mathbb{H}_{A}$ contains a pathhead for $g$ . A subset of $G_{\text{out}}$ is minimal if it contains at least one minimal element. Therefore, there exists an algebra in ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ which recognizes every language $L_{h}$ , as well as every $L^{\mathsf{tree}}_{h}=L_{h}\cap\mathbb{T}_{A}$ , $h\in G_{\text{in}}$ and every $L^{\mathsf{min}}_{g}$ for $g\in G_{\text{out}}$ minimal.
We now show that in every $\mathcal{G}\not\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ the set $G_{\text{out}}$ has at least one minimal strongly connected subset.

Proposition 4.1

In $\mathcal{G}\not\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ , at least two elements of $G_{\text{out}}$ are mutually accessible.

Proof. Assume the existence of $g\in G_{\text{out}}$ such that $g\leq h$ does not hold for any $h\in G_{\text{out}}$ , $h\neq g$ . This implies that $g\leq g+h$ does not hold either, so that in particular, either $g+g=g$ or $g+g<g$ . We assume that $g+g=g$ ; the other case is dealt with in a similar, and simpler, manner. Consider a forest $t=s_{1}+\cdots+s_{n}$ with each $s_{i}\in\mathbb{T}_{A}$ , and let $t\in L_{g}$ ; there are two possible cases for $t$ . First, none of the $s_{i}$ ’s belongs to $L^{\mathsf{tree}}_{g}$ and whether $t\in L_{g}$ is determined by counting, for each $h\in G_{\text{in}}$ , the number of trees in $t$ that belong to $L^{\mathsf{tree}}_{h}$ ; this is done by a block product $\mathcal{H}_{\text{in}}\;{\scriptstyle{\square}}\;\mathcal{O}\!\!\!\mathcal{D}_{M}$ with $M\in\langle\mathbb{N}_{\tau,\pi}\rangle$ . In the second case, at least one tree in $t$ satisfies $s_{i}\in L^{\mathsf{tree}}_{g}$ . This means satisfying two conditions: that $s_{i}\in L_{g}^{\mathsf{pathhead}}-\bigcup_{k\neq g}L_{k}^{\mathsf{pathhead}}$ , where $L_{g}^{\mathsf{pathhead}}$ is the set of all trees that contain a pathhead for $g$ ; that the context within $s_{i}$ of every pathhead maps $g$ to itself, which makes it belong to the subset of $\mathbb{V}_{A}$ generated by

[TABLE]

where $G_{X}=\{h\in G_{\text{in}}\;|\;g+h=g\}$ . Each set mentioned here is recognizable by an algebra in ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ . Finally, a forest with at least one tree in $L^{\mathsf{tree}}_{g}$ and no tree in $L^{\mathsf{tree}}_{k}$ for any $k\in G_{\text{out}}$ belongs to $L_{g}$ iff every one of its other trees belongs to $\bigcup_{h\in G_{X}}L^{\mathsf{tree}}_{h}$ .

$\Box$
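
Proposition 4.1 is about classes of mutually accessible elements inside $G_{\text{out}}$ . As an illustration only, the sketch below groups the elements of a finite set into such classes, given an accessibility relation supplied as a set of pairs. The encoding of elements as strings, the function name `mutual_accessibility_classes` and the sample relation are all hypothetical; the sketch does not compute accessibility from a forest algebra, nor does it decide which elements belong to $G_{\text{out}}$ .

```python
from itertools import product


def mutual_accessibility_classes(elements, accessible):
    """Group elements into classes of mutually accessible elements.

    `accessible` is a set of pairs (g, h) read as "h is accessible from g".
    The relation is closed reflexively and transitively before classes are read off.
    """
    elems = list(elements)
    reach = {(g, g) for g in elems} | set(accessible)
    # Warshall-style closure: the first component k plays the role of the outer loop.
    for k, g, h in product(elems, repeat=3):
        if (g, k) in reach and (k, h) in reach:
            reach.add((g, h))
    classes, seen = [], set()
    for g in elems:
        if g in seen:
            continue
        cls = frozenset(h for h in elems if (g, h) in reach and (h, g) in reach)
        seen |= cls
        classes.append(cls)
    return classes


# Toy data (assumed): "a" and "b" reach each other, "c" is reachable from "a" only.
G = ["0", "a", "b", "c"]
G_out = {"a", "b", "c"}
edges = {("a", "b"), ("b", "a"), ("a", "c")}
classes = mutual_accessibility_classes(G, edges)
candidates = [c for c in classes if len(c) >= 2 and c <= G_out]
```

In this toy instance the only non-singleton class inside `G_out` is `{"a", "b"}`, the kind of subset the proposition asserts must exist.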

Given $J\subset G$ , we say that a set of forests $S$ is diagonal for $J$ if $\varphi$ restricts to a bijection from $S$ to $J$ , i.e. we can write $S=\{s_{j}:j\in J\}$ with $\varphi(s_{j})=j$ for every $j\in J$ . If $\mathcal{G}\not\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ , then there exist for every $n\geq 1$ a non-singleton $J^{(n)}\subseteq G$ and a set of forests $S^{(n)}=\{s_{j}^{(n)}:j\in J^{(n)}\}$ , $\approx_{\tau,\pi}^{n}$ -closed and diagonal for $J^{(n)}$ , which can serve as witnesses for $\approx_{\tau,\pi}^{n}$ . Since $G$ has finitely many subsets, at least one $J\subseteq G$ occurs infinitely often in the sequence, and since $\approx_{\tau,\pi}^{n}$ refines every $\approx_{\tau,\pi}^{m}$ , $m<n$ , a sequence of witnesses exists where $J^{(n)}=J$ for every $n$ . A strongly connected $J\subseteq G$ with this property is said to be $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight. We denote by $\mathcal{J}$ the set of all $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight subsets of $G$ .

Proposition 4.2

An algebra $\mathcal{G}=(G,W)$ is outside of ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ iff $G$ has a nonsingleton $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight subset.

Proof. It suffices to prove that if $\mathcal{G}\not\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ , then $G$ contains a $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight pair. Assume that $\mathcal{G}\not\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ and that in every set of witnesses where $\varphi(s_{i})=k$ and $\varphi(s^{\prime}_{i})=k^{\prime}$ for every $i$ , at least one of $k$ and $k^{\prime}$ is not accessible from the other. Let this be $k$ : denote by $F$ the strongly connected component to which $k$ belongs, so that $k$ is only accessible from elements of $F\cup G_{\text{in}}$ , $k^{\prime}\not\in F\cup G_{\text{in}}$ and $F\neq G_{\text{out}}$ . We denote by $\mathcal{G}^{\prime}$ the image of $\mathcal{G}$ by the homomorphism that maps $G-(F\cup G_{\text{in}})$ onto an absorbing element $\infty$ and works as the identity on $F\cup G_{\text{in}}$ . Then $\mathcal{G}^{\prime}\not\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ , $G_{\text{out}}^{\prime}=F\cup\{\infty\}$ and the hypothesis implies that $\infty$ belongs to every $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight pair of $\mathcal{G}^{\prime}$ . Since no subset $\{h,k\}$ of $F$ is $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight, there exists a large enough $n$ such that $\mathcal{H}^{n}_{\tau,\pi}=\alpha^{n}_{\tau,\pi}(\mathbb{F}_{A})$ recognizes $G_{\text{in}}$ and, given a forest $s\in\varphi^{-1}(G_{\text{out}}^{\prime})$ , determines which subset $L_{g}\cup L_{\infty}$ it belongs to. This means that for each $g\in G^{\prime}$ , there exists a first-order formula $\Phi_{g}(s)$ that takes value true iff $s\in\mathbb{H}_{A}$ satisfies $s\in L_{g}$ when $g\in G_{\text{in}}$ , and $s\in L_{g}\cup L_{\infty}$ when $g\in F$ .

Let $s\in L_{\infty}$ have a subtree $t\in L_{\infty}$ which is minimal in the sense that no strict subforest of $t$ is in $L_{\infty}$ ; this $t$ is of the form $x{\cdot}t^{\prime}$ where $x$ is the root of $t$ , $\lambda(x)=a$ and $g=\varphi(t^{\prime})$ , with $(\varphi(a\square))g=\infty$ . If $g\in G_{\text{in}}$ , then the formula $\Phi_{g}(t^{\prime})\wedge(\lambda(x)=a)$ asserts that $\varphi(s)=\infty$ . Otherwise, $g\in F$ : this holds only if for every $y\in nodes(t^{\prime})$ such that $\varphi(\Delta^{+}(t^{\prime},y))\in G_{\text{out}}^{\prime}$ , there exist $k\in F$ and $b\in A$ such that $\Phi_{k}(\Delta(t^{\prime},y))$ is true, $\lambda(y)=b$ and $(\varphi(b\square))k\neq\infty$ . Therefore, the existence in $s$ of a minimal subtree $t\in L_{\infty}$ can be asserted with a first-order formula. Next, the formulas developed in this way are used to deal with the case where $s\in L_{\infty}$ has no subtree $t\in L_{\infty}$ , i.e. $s=t_{1}+\cdots+t_{p}$ with $\varphi(t_{i})\in G_{\text{in}}\cup F$ for each $i$ . $\Box$

Proposition 4.3

Let the algebra $\mathcal{G}=(G,W)$ be outside of ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ . There exists an integer $n_{0}$ such that, for every strongly connected set $J\subseteq G_{\text{out}}$ , if there exists a set of witnesses for $J$ and ${\approx^{n_{0}}_{\tau,\pi}}$ , then $J$ is $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight.

Proof. Let $\mathcal{G}\not\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ and let $J\subseteq G_{\text{out}}$ be strongly connected. If $J$ is not $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight, then no set of witnesses exists for $J$ and ${\approx^{n}_{\tau,\pi}}$ , for some $n\in\mathbb{N}$ , and no such set exists either for any $n^{\prime}\geq n$ . Denote by $n_{J}$ the smallest integer with this property. Since $G$ is finite, there is a maximum $N$ for the value of $n_{J}$ over all subsets $J$ that are not $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight. Define $n_{0}=N+1$ : the existence of a set of witnesses for $J$ and ${\approx^{n_{0}}_{\tau,\pi}}$ implies that $J$ is $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight. $\Box$
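
The choice of $n_{0}$ in this proof is a plain finiteness argument, which can be sketched as follows. The dictionary `first_failure_level`, which maps each non-tight set $J$ to its value $n_{J}$ , is a hypothetical input: the sketch does not say how these values would be obtained, and the numbers used are made up.

```python
def choose_n0(first_failure_level: dict) -> int:
    """Return N + 1, where N is the largest n_J over the sets J that are not tight."""
    return max(first_failure_level.values(), default=0) + 1


# Hypothetical values of n_J for two strongly connected, non-tight subsets.
n_J = {frozenset({"a", "b"}): 3, frozenset({"b", "c"}): 5}
n0 = choose_n0(n_J)
```

Any set $J$ that still has witnesses at level `n0` cannot be one of the keys of `n_J`, hence is tight.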

From now on, we denote by $n_{0}$ an integer that satisfies the conditions of Proposition 4.3, and such that ${\mathcal{H}^{n_{0}}_{\tau,\pi}}$ recognizes every language $L_{h}$ , as well as every $L^{\mathsf{tree}}_{h}=L_{h}\cap\mathbb{T}_{A}$ , $h\in G_{\text{in}}$ and every $L^{\mathsf{min}}_{g}$ for $g\in G_{\text{out}}$ minimal.

4.2 Recursive proofs for non-membership

In a recursive proof, a set $\mathcal{S}^{(n+1)}$ of witnesses is built by inserting copies of elements of $\mathcal{S}^{(n)}$ in the components of a circuit $\mathcal{M}^{(n+1)}$ . Each component of the circuit is a tuple, i.e. a multicontext equipped with suitable port labelings, whose purpose is to specify, for every port $x$ , which witness from $\mathcal{S}^{(n)}$ , or rather a copy thereof, is to be inserted at $x$ . The tuples in a circuit must be related to each other, in such a way that certain of the resulting forests, mapped by $\varphi$ to distinct elements of the same $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight subset of $G$ , are indistinguishable by the congruence $\approx^{n+1}_{\tau,\pi}$ . From this constraint we define a relation $\langle\star\,\mathcal{H}^{n}_{\tau,\pi}\rangle$ between tuples, such that if $t\langle\star\,\mathcal{H}^{n}_{\tau,\pi}\rangle t^{\prime}$ , then the forests built from $t$ and $t^{\prime}$ are equivalent under $\approx^{n+1}_{\tau,\pi}$ . This leads to the definition of Condition $\mathbf{RC}(\mathcal{G}\star\mathcal{H}^{n}_{\tau,\pi})$ , which states the existence of a circuit suitable for the construction of witnesses for the congruence $\approx^{n+1}_{\tau,\pi}$ , so that $\mathbf{RC}(\mathcal{G}\star\mathcal{H}^{n}_{\tau,\pi})$ implies $\mathcal{G}\not\in{{\mathbf{**}}^{n+1}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ .

In the other direction, Lemma 4.6 shows that a circuit that satisfies $\mathbf{RC}(\mathcal{G}\star\mathcal{H}^{n}_{\tau,\pi})$ can always be extracted from a set of witnesses for $\approx^{n+1}_{\tau,\pi}$ . As a consequence, a recursive proof of non-membership exists for every algebra outside of ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ . To prove Lemma 4.6 it is convenient to work in terms of “full” proofs, where the number of witnesses in $\mathcal{S}^{(n)}$ increases with $n$ ; this does not describe the actual proofs that exist in the literature. Theorem 4.7 shows that a “slender” proof always exists for every algebra outside of ${{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ .

We define a mapping $\iota:\bigcup_{n\geq n_{0}}H^{n}_{\tau,\pi}\rightarrow\mathcal{P}(G)$ , given by $\iota(h)=\{j\in G:h\in\alpha^{n}_{\tau,\pi}(L_{j})\}$ , which is not necessarily injective. By our assumption on $n_{0}$ , either $\iota(h)\subseteq G_{\text{in}}$ and is a singleton, or $\iota(h)\subseteq G_{\text{out}}$ and may have two or more elements. If $|\iota(h)|\geq 2$ , then every forest in the set $(\alpha^{n}_{\tau,\pi})^{-1}(h)$ can be used as a witness for $\iota(h)$ and $\approx^{n}_{\tau,\pi}$ , and therefore by Proposition 4.3, $\iota(h)$ is $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight. We define $\mathcal{J}=\{\,\iota(h):|\iota(h)|\geq 2\,\}$ and, for each $n\geq 2$ the set $\mathcal{D}^{(n)}=\{\,(h,j):h\in H^{n}_{\tau,\pi}\wedge\,j\in\iota(h)\,\}$ . We work in terms of $\mathcal{D}^{(n)}$ instead of $\{\,(h,j)\in\mathcal{D}^{(n)}:\iota(h)\in\mathcal{J}\}$ , that is, we include in the discussion those $h\in H^{n}_{\tau,\pi}$ for which $\iota(h)$ is a singleton; this will be useful in the proof of Lemma 4.6. We say that a set of forests $\mathcal{S}^{(n)}$ is full when it contains a forest $s_{k,\ell}$ for every $(k,\ell)\in\mathcal{D}^{(n)}$ , with the notations $\mathcal{S}^{(n)}=\{\,s^{(n)}_{k,\ell}:(k,\ell)\in\mathcal{D}^{(n)}\,\}$ . Similarly, we say that a circuit over $(A,\mathcal{D}^{(n)},\mathcal{D}^{(n+1)})$ is full when it contains a multicontext $m^{(n+1)}_{h,j}$ for every pair $(h,j)\in\mathcal{D}^{(n+1)}$ .
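
To make the bookkeeping concrete, here is a small Python sketch that tabulates $\iota$ , the family of sets with at least two elements, and $\mathcal{D}^{(n)}$ from a finite list of pairs $(h,j)$ . Each pair is assumed to record $h=\alpha^{n}_{\tau,\pi}(s)$ and $j=\varphi(s)$ for some sample forest $s$ . The names `iota_from_samples`, `tight_candidates` and `pairs_D`, and the sample data, are hypothetical. Since $\iota(h)$ ranges over all forests, a finite sample can only under-approximate it.

```python
from collections import defaultdict


def iota_from_samples(samples):
    """samples: iterable of (h, j) pairs; returns a dict h -> frozenset of observed j."""
    table = defaultdict(set)
    for h, j in samples:
        table[h].add(j)
    return {h: frozenset(js) for h, js in table.items()}


def tight_candidates(iota):
    """The sets iota(h) with at least two elements."""
    return {js for js in iota.values() if len(js) >= 2}


def pairs_D(iota):
    """All pairs (h, j) with j in iota(h), singletons included."""
    return {(h, j) for h, js in iota.items() for j in js}


# Made-up sample: h1 is only reached from g0; h2 and h3 are reached from both a and b.
samples = [("h1", "g0"), ("h2", "a"), ("h2", "b"), ("h3", "a"), ("h3", "b")]
iota = iota_from_samples(samples)
J = tight_candidates(iota)
D = pairs_D(iota)
```

Note that `h2` and `h3` yield the same set, which illustrates why $\iota$ need not be injective and why a full set of witnesses may contain several forests for one pair $(J,j)$ .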

A sequence $\mathcal{S}^{(n)}$ , $n\geq 1$ , is a recursive proof when every forest $s^{(n+1)}_{h,j}$ , $h\in H^{n+1}_{\tau,\pi}$ , $j\in\iota(h)$ is built by inserting copies of elements of $\mathcal{S}^{(n)}$ at the ports of a multicontext $m^{(n+1)}_{h,j}$ that belongs to a circuit $\mathcal{M}^{(n+1)}$ over $(A,\mathcal{D}^{(n)},\mathcal{D}^{(n+1)})$ . Let $s$ and $m$ be shorthands for $s^{(n+1)}_{h,j}$ and $m^{(n+1)}_{h,j}$ , respectively. To every $x\in ports(m)$ , we assign two labels $\mu^{n}_{\tau,\pi}(x)\in H^{n}_{\tau,\pi}$ and $\psi(x)\in G$ . The labels are consistent when $\psi(x)\in\iota\circ\mu^{n}_{\tau,\pi}(x)$ , which is equivalent to $(\varphi^{-1}\circ\psi)(x)\cap((\alpha^{n}_{\tau,\pi})^{-1}\circ\mu^{n}_{\tau,\pi})(x)\neq\emptyset$ , and means that at least one forest $r$ exists that can be inserted at the port in a consistent way, i.e. such that $\varphi(r)=\psi(x)$ and $\alpha^{n}_{\tau,\pi}(r)=\mu^{n}_{\tau,\pi}(x)$ . Combined with $m$ , the mappings $\mu^{n}_{\tau,\pi}$ and $\psi$ define the tuple $t=(m,\mu^{n}_{\tau,\pi},\psi)$ ; it is said to be consistent if the two mappings are consistent at every port of $m$ . The leaf-completion of $m$ through each of these mappings will be used at several places.
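
The consistency condition $\psi(x)\in\iota\circ\mu^{n}_{\tau,\pi}(x)$ is a per-port membership test. A minimal sketch, assuming that ports are identified by strings and that $\iota$ , $\mu^{n}_{\tau,\pi}$ and $\psi$ are given as dictionaries (all names and values below are illustrative):

```python
def consistent_at(x, mu, psi, iota):
    """The two labels of port x are consistent when psi(x) lies in iota(mu(x))."""
    return psi[x] in iota.get(mu[x], frozenset())


def is_consistent_tuple(ports, mu, psi, iota):
    """A tuple (m, mu, psi) is consistent when its labels agree at every port of m."""
    return all(consistent_at(x, mu, psi, iota) for x in ports)


# Toy data (assumed).
iota = {"h1": frozenset({"g0"}), "h2": frozenset({"a", "b"})}
ports = ["x1", "x2"]
mu = {"x1": "h1", "x2": "h2"}
good_psi = {"x1": "g0", "x2": "b"}
bad_psi = {"x1": "a", "x2": "b"}  # "a" is not in iota("h1")
```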

We establish a relation between two consistent tuples $t=(m,\mu^{n}_{\tau,\pi},\psi)$ and $t^{\prime}=(m^{\prime},\mu^{n}_{\tau,\pi},\psi)$ . Let $\breve{\mu}^{n}_{\tau,\pi}(m)$ , a forest over $A\cup H^{n}_{\tau,\pi}$ , be the leaf-completion of $m$ through $\mu^{n}_{\tau,\pi}(x)$ ; define on it the leaf extension of $\alpha^{n}_{\tau,\pi}$ . Then in $\mathbb{F}_{A\cup H^{n}_{\tau,\pi}}$ , the nuclear congruence of $\alpha^{n}_{\tau,\pi}$ coincides with the congruence $\approx^{n+1}_{\tau,\pi}$ . Next, with $h\in H^{n}_{\tau,\pi}$ , $v\in V^{n}_{\tau,\pi}$ and $j\in G$ , let $P[\mathcal{H}^{n}_{\tau,\pi}](t)_{h,v,j}$ denote the number of ports $x\in ports(m)$ that satisfy $h=\mu^{n}_{\tau,\pi}(x)$ , $v=\alpha^{n}_{\tau,\pi}(\nabla(\breve{\mu}^{n}_{\tau,\pi}(m),x))$ and $j=\psi(x)$ . From this we define $P[\mathcal{H}^{n}_{\tau,\pi}](t)\equiv_{\tau,\pi}P[\mathcal{H}^{n}_{\tau,\pi}](t^{\prime})$ as a shorthand for

[TABLE]

Finally, we define the relation $\langle\star\mathcal{H}^{n}_{\tau,\pi}\rangle$ between tuples:

[TABLE]

We now describe the meaning of this relation. We want to determine conditions that are sufficient for two tuples $t=(m,\mu^{n}_{\tau,\pi},\psi)$ and $t^{\prime}=(m^{\prime},\mu^{n}_{\tau,\pi},\psi)$ to be suitable for the construction of witnesses $s$ and $s^{\prime}$ such that $s\approx^{n+1}_{\tau,\pi}s^{\prime}$ , while $\varphi(s)$ and $\varphi(s^{\prime})$ are distinct elements of the same $\approx^{\mathbb{N}}_{\tau,\pi}$ -tight subset of $G$ . The forests $s$ and $s^{\prime}$ are built from $t$ and $t^{\prime}$ by inserting at their ports elements of a set $\mathcal{S}^{(n)}$ of witnesses for $\approx^{n}_{\tau,\pi}$ ; the insertion at port $x$ of the forest $s(x)$ is consistent, in the sense defined above. Let $\tilde{s}$ and $\tilde{s}^{\prime}$ denote the relabeled versions of $s$ and $s^{\prime}$ , relative to $\alpha^{n}_{\tau,\pi}$ , and let $A^{n}=A\times H^{n}_{\tau,\pi}\times V^{n}_{\tau,\pi}$ be the corresponding relabeling alphabet. We assume that $\varphi(s)$ and $\varphi(s^{\prime})$ have the appropriate values and we look at how to satisfy the constraint $s\approx^{n+1}_{\tau,\pi}s^{\prime}$ . The constraint is satisfied when every symbol of $A^{n}$ occurs the same number of times (up to $\equiv_{\tau,\pi}$ ) in $\tilde{s}$ and $\tilde{s}^{\prime}$ . We verify this separately on the nodes of the multicontexts $m$ and $m^{\prime}$ and on those that belong to the copies of elements of $\mathcal{S}^{(n)}$ that are inserted at their ports. Given $x\in ports(m)$ , we have by construction

[TABLE]

where the leftmost equality comes from the fact that in $s$ , the label of $x$ is the neutral letter $e$ . Therefore, for every interior node $y$ of $m$ ,

[TABLE]

which explains the component $\breve{\mu}^{n}_{\tau,\pi}(m)\approx^{n+1}_{\tau,\pi}\breve{\mu}^{n}_{\tau,\pi}(m^{\prime})$ in the definition of $t\ \langle\star\mathcal{H}^{n}_{\tau,\pi}\rangle\ t^{\prime}$ . Next, we deal with the copies of witnesses from $\mathcal{S}^{(n)}$ inserted at the ports of $m$ and $m^{\prime}$ . Recall the counter $P[\mathcal{H}^{n}_{\tau,\pi}](t)_{h,v,j}$ and the shorthand $P[\mathcal{H}^{n}_{\tau,\pi}](t)\equiv_{\tau,\pi}P[\mathcal{H}^{n}_{\tau,\pi}](t^{\prime})$ defined above. The witness $s(x)$ is a copy of $s_{h,j}^{(n)}$ ; let $y$ be one of its nodes, with $a=\lambda(s_{h,j}^{(n)},y)$ ; we have

[TABLE]

and

[TABLE]

so that $\lambda(\tilde{s},y)=\langle a,k,vw\rangle$ . If $P[\mathcal{H}^{n}_{\tau,\pi}](t)\equiv_{\tau,\pi}P[\mathcal{H}^{n}_{\tau,\pi}](t^{\prime})$ , then for every combination of $s_{h,j}^{(n)}\in\mathcal{S}^{(n)}$ and $v\in V^{n}_{\tau,\pi}$ , the numbers of copies of $s_{h,j}^{(n)}$ that are inserted at ports $x$ such that $v=\alpha^{n}_{\tau,\pi}(\nabla(\breve{\mu}^{n}_{\tau,\pi}(m),x))$ are the same in $m$ and $m^{\prime}$ , up to $\equiv_{\tau,\pi}$ .
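
The comparison of port counters can be sketched in Python. The sketch assumes that the shorthand $P[\mathcal{H}^{n}_{\tau,\pi}](t)\equiv_{\tau,\pi}P[\mathcal{H}^{n}_{\tau,\pi}](t^{\prime})$ is read componentwise, i.e. the two counters agree up to $\equiv_{\tau,\pi}$ for every triple $(h,v,j)$ . Each port is represented by its triple, and the names `equiv_tp`, `port_counter` and `counters_equivalent`, as well as the data, are illustrative.

```python
from collections import Counter


def equiv_tp(p, q, tau, pi):
    """Threshold-period congruence on natural numbers."""
    return p == q or (p >= tau and q >= tau and (p - q) % pi == 0)


def port_counter(port_triples):
    """Count the ports of a tuple by their triple (h, v, j)."""
    return Counter(port_triples)


def counters_equivalent(P, Q, tau, pi):
    """Componentwise comparison; a triple absent from a Counter counts as 0."""
    return all(equiv_tp(P[k], Q[k], tau, pi) for k in set(P) | set(Q))


# Toy tuples (assumed): t has 3 ports of type (h, v, a), t' has 5, and each has one (h, w, b).
P_t = port_counter([("h", "v", "a")] * 3 + [("h", "w", "b")])
P_t2 = port_counter([("h", "v", "a")] * 5 + [("h", "w", "b")])
```

With $\tau=2$ and $\pi=2$ the counts 3 and 5 are congruent, so the two counters are equivalent; raising the threshold to 4 separates them.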

Definition 4.4

We denote by $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ the existence of a circuit $\mathcal{M}^{(n+1)}$ over $(A,\mathcal{D}^{(n)},\mathcal{D}^{(n+1)})$ such that

$i.$

every tuple $t^{(n+1)}_{h,j}=(m,\mu^{n}_{\tau,\pi},\psi)$ is consistent and satisfies $\alpha^{n+1}_{\tau,\pi}(\breve{\mu}^{n}_{\tau,\pi}(m))=h$ and $\varphi(\breve{\psi}(m))=j$ , and

$ii.$

every combination of $t^{(n+1)}_{h,j}$ and $t^{(n+1)}_{h,j^{\prime}}$ satisfies condition $t^{(n+1)}_{h,j}\ \langle\star\mathcal{H}^{n}_{\tau,\pi}\rangle\ t^{(n+1)}_{h,j^{\prime}}$ .

Lemma 4.5

Let $n\geq 2$ : if a circuit $\mathcal{M}^{(n+1)}$ satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ , then it also satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n-1}_{\tau,\pi})$ , and can be used to prove $\mathcal{G}\not\in{{\mathbf{**}}^{n+1}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ .

Proof. Let $\mathcal{M}^{(n+1)}$ satisfy $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ . Define the surjective homomorphism $\sigma:\mathcal{H}^{n}_{\tau,\pi}\rightarrow\mathcal{H}^{n-1}_{\tau,\pi}$ by $\sigma=\alpha^{n-1}_{\tau,\pi}\circ(\alpha^{n}_{\tau,\pi})^{-1}$ . Then from every consistent tuple $(m,\mu^{n}_{\tau,\pi},\psi)$ in $\mathcal{M}^{(n+1)}$ one can build a consistent $(m,\mu^{n-1}_{\tau,\pi},\psi)$ , where $\mu^{n-1}_{\tau,\pi}=\sigma\circ\mu^{n}_{\tau,\pi}$ . We define from $(m,\mu^{n-1}_{\tau,\pi},\psi)$ its leaf completion $\breve{\mu}^{n-1}_{\tau,\pi}(m)$ ; we have

[TABLE]

and, for every combination of $h\in H^{n}_{\tau,\pi}$ , $v\in V^{n}_{\tau,\pi}$ and $j\in G$ , the counters $P[\mathcal{H}^{n-1}_{\tau,\pi}](t)_{h,v,j}$ satisfy

[TABLE]

Therefore, if $\mathcal{S}^{(n-1)}$ is a vector of witnesses for $\approx^{n-1}_{\tau,\pi}$ , then $\mathcal{M}^{(n+1)}{\cdot}\mathcal{S}^{(n-1)}$ is a vector of witnesses for $\approx^{n}_{\tau,\pi}$ . Note that since $|H^{n+1}_{\tau,\pi}|>|H^{n}_{\tau,\pi}|$ , some pairs $(h,j)\in\mathcal{D}^{(n)}$ end up with more than one witness.

Sets of witnesses $\mathcal{S}^{(1)},\ldots,\mathcal{S}^{(n)}$ can be built from a circuit $\mathcal{M}^{(n+1)}$ that satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ , as follows. First, select a set $\mathcal{S}^{(0)}$ that contains a forest $s_{J,j}$ for every $(J,j)$ and $h$ with $J=\iota(h)$ and $(h,j)\in\mathcal{D}^{(n)}$ , such that $\varphi(s_{J,j})=j$ , and build $\mathcal{S}^{(1)}=\mathcal{M}{\cdot}\mathcal{S}^{(0)}$ : for any two tuples $t^{(n+1)}_{h,j}$ and $t^{(n+1)}_{h,j^{\prime}}$ and the corresponding forests $s^{(1)}_{h,j}$ and $s^{(1)}_{h,j^{\prime}}$ , having $t^{(n+1)}_{h,j}\ \langle\star\mathcal{H}^{n}_{\tau,\pi}\rangle\ t^{(n+1)}_{h,j^{\prime}}$ ensures that every $a\in A$ occurs the same number of times (up to $\equiv_{\tau,\pi}$ ) in $s^{(1)}_{h,j}$ and $s^{(1)}_{h,j^{\prime}}$ , so that $\mathcal{S}^{(1)}$ constitutes a set of witnesses for $\approx^{1}_{\tau,\pi}$ . Then recursively, $\mathcal{S}^{(n)}=\mathcal{M}{\cdot}\mathcal{S}^{(n-1)}=\mathcal{M}^{n}{\cdot}\mathcal{S}^{(0)}$ , for every $n\geq 2$ . $\Box$
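
The iteration $\mathcal{S}^{(n)}=\mathcal{M}{\cdot}\mathcal{S}^{(n-1)}$ can be illustrated with a toy encoding. In the sketch below a forest is a tuple of `Node` objects, a multicontext may also contain `Port` objects, and a circuit is a dictionary from keys to multicontexts. These classes, the helper names and the two-element circuit are assumptions made for illustration; in particular the keys are plain strings rather than pairs $(h,j)$ , and the same circuit is reused at every level.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str
    children: tuple = ()


@dataclass(frozen=True)
class Port:
    key: str  # names the witness to insert at this port


def plug(multicontext, witnesses):
    """Replace every port by a copy of the forest witnesses[port.key]."""
    out = []
    for item in multicontext:
        if isinstance(item, Port):
            out.extend(witnesses[item.key])
        else:
            out.append(Node(item.label, plug(item.children, witnesses)))
    return tuple(out)


def apply_circuit(circuit, witnesses):
    """One step S -> M . S: plug the current witnesses into every multicontext."""
    return {key: plug(m, witnesses) for key, m in circuit.items()}


def label_counts(forest):
    """Number of occurrences of each letter in a forest."""
    counts = Counter()
    for tree in forest:
        counts[tree.label] += 1
        counts += label_counts(tree.children)
    return counts


circuit = {
    "j": (Node("a", (Port("j"), Port("k"))),),
    "k": (Node("a", (Port("k"), Port("j"))),),
}
S0 = {"j": (Node("b"),), "k": (Node("c"),)}
S1 = apply_circuit(circuit, S0)
S2 = apply_circuit(circuit, S1)
```

In this toy instance the two forests of `S1` differ as ordered forests but contain every letter the same number of times, which is the first-level condition described above.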

Lemma 4.6

$\mathcal{G}\not\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ if and only if there exists a sequence of full circuits $\mathcal{M}^{(n+1)}$ , $n\geq 1$ , where each $\mathcal{M}^{(n+1)}$ satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ .

Proof. By the above discussion, satisfaction of $\mathbf{RC}(\mathcal{G}\star\mathcal{H}^{n}_{\tau,\pi})$ by every $\mathcal{M}^{(n+1)}$ , $n\geq 1$ , ensures that this sequence of circuits can be used to build a recursive proof. For the only if direction, assuming $\mathcal{G}\not\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ , we want to prove the existence of a sequence of circuits where each $\mathcal{M}^{(n+1)}$ , $n\geq 1$ , satisfies $\mathbf{RC}(\mathcal{G}\star\mathcal{H}^{n}_{\tau,\pi})$ . By Lemma 4.5, it suffices to prove this for $n$ above a large enough threshold: we select for this the parameter $n_{0}$ defined at the end of Section 4.1.

The proof uses the algebra $\mathcal{G}^{\prime}=\mathcal{G}\times\mathcal{H}^{n}_{\tau,\pi}$ ; since $\mathcal{G}\not\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ , we have $\mathcal{G}^{\prime}\not\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ . With the notations $\mathcal{G}^{\prime}=(G^{\prime},W^{\prime})$ and $G_{\text{out}}^{\prime}=\{g^{\prime}\in G^{\prime}:L_{g^{\prime}}\not\in{{\mathbf{**}}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle\}$ , we observe that every $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight subset of $G_{\text{out}}^{\prime}$ can be written $J^{\prime}=J\times\{h\}$ , where $J\subseteq\iota(h)$ and $\iota(h)$ is $\approx^{\mathbb{N}}_{\tau,\pi}$ -tight. To see this, consider two elements $(j,h)$ and $(k,\ell)$ of $J^{\prime}$ . If $h\neq\ell$ , then $\mathcal{H}^{n}_{\tau,\pi}$ can tell apart the languages $L_{(j,h)}$ and $L_{(k,\ell)}$ , which means that $J^{\prime}$ is not $\approx^{\mathbb{N}}_{\tau,\pi}$ -tight. Next, if $h=\ell$ but $j$ and $k$ do not belong to the same $\approx_{\tau,\pi}^{\mathbb{N}}$ -tight subset of $G_{\text{out}}$ , then by Proposition 4.3, $\alpha^{n}_{\tau,\pi}$ maps the sets $\varphi^{-1}(j)$ and $\varphi^{-1}(k)$ onto distinct subsets (actually, singletons) of $H^{n}_{\tau,\pi}$ . Meanwhile, given $h\in H^{n}_{\tau,\pi}$ , a contradiction argument shows that if $\iota(h)$ is a $\approx^{\mathbb{N}}_{\tau,\pi}$ -tight subset of $G_{\text{out}}$ , then $\iota(h)\times\{h\}$ is also $\approx^{\mathbb{N}}_{\tau,\pi}$ -tight.

The proof consists in showing that any full vector of witnesses for $\approx^{n+1}_{\tau,\pi}$ can be used to define a full circuit $\mathcal{M}^{(n+1)}$ that satisfies $\mathbf{RC}(\mathcal{G}\star\mathcal{H}^{n}_{\tau,\pi})$ . Let $s$ be such a witness; we build a multicontext $m$ from $s$ as follows. In $m$ , every leaf is a port; the interior nodes, i.e. the set $interior(m)$ , are exactly those $y\in nodes(s)$ for which $\Delta(s,y)$ contains at least one pathhead for $G_{\text{out}}^{\prime}$ ; in other words, every strict ancestor of a pathhead becomes an interior node of $m$ . Next, we insert a port $x$ along every edge $(y,z)$ where $y\in interior(m)$ and $z\in nodes(s)-interior(m)$ , and we remove the subtree of $s$ rooted at $z$ . We will prove that the resulting multicontext has the required properties. Let $y\in nodes(s)$ and let $\tilde{s}$ denote the version of $s$ relabeled relative to $\alpha^{n}_{\tau,\pi}$ : we show that the triple $\lambda(\tilde{s},y)=\langle\lambda(s,y),\,\alpha^{n}_{\tau,\pi}(\Delta(s,y)),\,\alpha^{n}_{\tau,\pi}(\nabla(s,y))\rangle$ can be used to determine whether $y$ is a pathhead for $G_{\text{out}}^{\prime}$ . Let $\mathbb{T}_{A}$ denote the set of all trees over $A$ , $\mathcal{J}$ the set of all $\approx^{\mathbb{N}}_{\tau,\pi}$ -tight subsets of $G_{\text{out}}$ , and define

[TABLE]

Since $\mathbb{T}_{A}$ is recognized by every algebra $\mathcal{H}^{2}_{\tau,\pi}$ where $\tau\geq 1$ , each set $(\alpha^{n}_{\tau,\pi})^{-1}(h)$ is either a subset of $\mathbb{T}_{A}$ or disjoint with it. Then the set of all forests that contain a pathhead for $G_{\text{out}}^{\prime}$ is recognized by $\mathcal{H}^{n}_{\tau,\pi}$ through $\alpha^{n}_{\tau,\pi}$ and the set $H_{\text{ph}}$ of all elements of $H^{n}_{\tau,\pi}$ accessible from $H_{\text{Tree}}$ . A node $y$ is a pathhead for $G_{\text{out}}^{\prime}$ when $\alpha^{n}_{\tau,\pi}(\Delta^{+}(s,y))$ belongs to $H_{\text{ph}}$ and $\alpha^{n}_{\tau,\pi}(\Delta(s,y))$ does not. The nodes of $m$ are those $y\in nodes(s)$ for which $\alpha^{n}_{\tau,\pi}(\Delta(s,y))\in H_{\text{ph}}$ , that is, $y$ is a strict ancestor of at least one pathhead. To build the ports of $m$ , we take each father-son pair $y,z$ in $s$ where $y\in interior(m)$ and $z\not\in interior(m)$ , and insert a port $x$ between the two, so that $\Delta(s,x)=\Delta^{+}(s,z)$ and also $\beta(\nabla(s,x))=\beta(\nabla(s,z))$ for every homomorphism $\beta$ . By construction, $m$ and $x$ satisfy $\alpha^{n}_{\tau,\pi}(\nabla(\breve{\mu}^{n}_{\tau,\pi}(m),x))=\alpha^{n}_{\tau,\pi}(\nabla(s,z))$ , $\alpha^{n}_{\tau,\pi}(\breve{\mu}^{n}_{\tau,\pi}(m))=\alpha^{n}_{\tau,\pi}(s)$ and $\varphi(\breve{\psi}(m))=\varphi(s)$ . We also observe that $\varphi(\Delta(s,x))=\varphi(\Delta^{+}(s,z))$ is determined by $\alpha^{n}_{\tau,\pi}(\Delta^{+}(s,z))$ . Indeed, either $z$ is not a pathhead and the set $\iota(\alpha^{n}_{\tau,\pi}(\Delta^{+}(s,z)))$ is not $\approx^{\mathbb{N}}_{\tau,\pi}$ -tight, or $z$ is a pathhead, a case discussed in the proof of Proposition 4.1. Hence, the value of $\lambda(\tilde{s},z)$ determines both $\mu^{n}_{\tau,\pi}(x)=\alpha^{n}_{\tau,\pi}(\Delta^{+}(s,z))$ and $\psi(x)=\varphi(\Delta^{+}(s,z))$ ; from there, it determines which counter $P[\mathcal{H}^{n}_{\tau,\pi}](t)_{h,v,j}$ the port $x$ contributes to. 
Given a vector of integers $\vec{P}$ , the algebra $\mathcal{H}^{n+1}_{\tau,\pi}$ can therefore recognize whether $P[\mathcal{H}^{n}_{\tau,\pi}](t)_{h,v,j}\equiv_{\tau,\pi}\vec{P}_{h,v,j}$ . As a consequence, given another witness $s^{\prime}$ and the multicontext $m^{\prime}$ built from $s^{\prime}$ , if $s^{\prime}\approx^{n+1}_{\tau,\pi}s$ , then the corresponding tuples $t$ and $t^{\prime}$ satisfy $P[\mathcal{H}^{n}_{\tau,\pi}](t)\equiv_{\tau,\pi}P[\mathcal{H}^{n}_{\tau,\pi}](t^{\prime})$ . Finally, the fact that the interior nodes of $m$ are exactly those $y\in nodes(s)$ for which $\Delta(s,y)$ contains a pathhead, means that the labels they carry in $\tilde{s}$ are distinct from those of the other nodes of $s$ . Therefore, defining $\mu^{n}_{\tau,\pi}(x)=\alpha^{n}_{\tau,\pi}(\Delta(s,x))$ for every port of $m$ , we obtain $\breve{\mu}^{n}_{\tau,\pi}(m)\approx^{n+1}_{\tau,\pi}\breve{\mu}^{n}_{\tau,\pi}(m^{\prime})$ . Defining further $\psi(x)=\varphi(\Delta(s,x))$ , we build tuples that are equivalent under $\langle\star\mathcal{H}^{n}_{\tau,\pi}\rangle$ and thus constitute a circuit that satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ . $\Box$
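
The extraction of a multicontext from a witness, as used in this proof, can be sketched on a single tree. The sketch assumes a toy encoding (`Node`, `Port`) and a user-supplied predicate `is_pathhead`; in the paper, pathheads are determined through $\alpha^{n}_{\tau,\pi}$ , which the sketch does not model. A node is kept as interior when the forest strictly below it contains a pathhead, and every edge from an interior node to a non-interior son is cut and replaced by a port.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str
    children: tuple = ()


@dataclass(frozen=True)
class Port:
    removed: Node  # the subtree that was cut off at this port


def contains_pathhead(node, is_pathhead):
    """True when the node or one of its descendants is a pathhead."""
    return is_pathhead(node) or any(contains_pathhead(c, is_pathhead) for c in node.children)


def is_interior(node, is_pathhead):
    """Interior nodes are the strict ancestors of pathheads."""
    return any(contains_pathhead(c, is_pathhead) for c in node.children)


def extract_multicontext(node, is_pathhead):
    """Keep interior nodes; replace each maximal non-interior subtree by a port."""
    if not is_interior(node, is_pathhead):
        return Port(node)
    return Node(node.label, tuple(extract_multicontext(c, is_pathhead) for c in node.children))


# Toy tree a(b(p), c), where the leaf labeled "p" is declared to be the only pathhead.
s = Node("a", (Node("b", (Node("p"),)), Node("c")))
m = extract_multicontext(s, lambda n: n.label == "p")
```

Here `a` and `b` stay as interior nodes, while `p` and `c` become ports, so every leaf of the result is a port, as stated in the construction. A root that is not interior degenerates to a single port, a case the sketch does not treat specially.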

The number of forests in a full vector of witnesses $\mathcal{S}^{(n)}$ increases with $n$ , while the existing examples of proofs describe sequences $\mathcal{S}^{(n)}$ , $n\geq 1$ , where all vectors have the same size. We say that a proof is slender when, for every combination of a $\approx^{\mathbb{N}}_{\tau,\pi}$ -tight set $J$ and of $j\in J$ , it contains at most one witness $s^{(n)}_{h,j}$ such that $\iota(h)=J$ .

Theorem 4.7

We have $\mathcal{G}\not\in{\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ iff $\mathcal{G}$ has a slender recursive proof of non-membership which, for each $n$ , involves a circuit that satisfies Condition $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ . $\square$

Proof. It suffices to prove the “only if” direction. Recall that $\mathcal{J}$ denotes the set of all $\approx^{\mathbb{N}}_{\tau,\pi}$ -tight subsets of $G$ . Given $n\geq 1$ , we define $\delta_{n}(\mathcal{J})=\{\,h\in H^{n}_{\tau,\pi}:\iota(h)\in\mathcal{J}\,\}$ , pick a representative of every class for the equivalence relation $\delta_{n}\circ\iota(h)$ , and define $\rho_{n}:\delta_{n}(\mathcal{J})\rightarrow\delta_{n}(\mathcal{J})$ , which maps every $h\in\delta_{n}(\mathcal{J})$ to its representative.

Let $t=(m,\mu^{n}_{\tau,\pi},\psi)$ and $t^{\prime}=(m^{\prime},\mu^{n}_{\tau,\pi},\psi)$ be two tuples in $\mathcal{M}^{(n+1)}$ , and let $s$ and $s^{\prime}$ denote the witnesses built from them, with $s\approx^{n+1}_{\tau,\pi}s^{\prime}$ and $\varphi(s)\neq\varphi(s^{\prime})$ . In order to use $m$ in a slender proof, we modify the labeling of every $x\in ports(m)$ by replacing $\mu^{n}_{\tau,\pi}(x)$ with $\rho_{n}(\mu^{n}_{\tau,\pi}(x))$ . We then build from $m$ a forest $s_{\rho}$ by inserting at every port $x$ a copy of $s^{(n)}_{\rho_{n}(\mu^{n}_{\tau,\pi}(x)),\psi(x)}$ (instead of $s^{(n)}_{\mu^{n}_{\tau,\pi}(x),\psi(x)}$ ); the construction of $s_{\rho}$ uses a slender subset $\mathcal{S}_{\rho}^{(n)}$ of $\mathcal{S}^{(n)}$ , where $s^{(n)}_{h,j}\in\mathcal{S}_{\rho}^{(n)}$ iff $h=\rho_{n}(h)$ . By construction, we have $\varphi(s_{\rho})=\varphi(s)$ . We build $s_{\rho}^{\prime}$ from $m^{\prime}$ in the same way. Verifying that $s_{\rho}\approx^{n+1}_{\tau,\pi}s_{\rho}^{\prime}$ will lead us to conclude that $s_{\rho}$ and $s_{\rho}^{\prime}$ are witnesses for the same $\approx^{\mathbb{N}}_{\tau,\pi}$ -tight set as $s$ and $s^{\prime}$ . This is done by induction on $n$ , proving for every $i\leq n$ and every pair $x,x^{\prime}$ of nodes or ports in $m$ and $m^{\prime}$ that if $\Delta(s,x)\approx^{i}_{\tau,\pi}\Delta(s^{\prime},x^{\prime})$ and $\nabla(s,x)\approx^{i}_{\tau,\pi}\nabla(s^{\prime},x^{\prime})$ , then $\Delta(s_{\rho},x)\approx^{i}_{\tau,\pi}\Delta(s_{\rho}^{\prime},x^{\prime})$ and $\nabla(s_{\rho},x)\approx^{i}_{\tau,\pi}\nabla(s_{\rho}^{\prime},x^{\prime})$ hold as well. 
As a consequence, those nodes of $m$ and $m^{\prime}$ that carry the same label (element of $A^{n}$ ) in the relabeled versions of $s$ and $s^{\prime}$ will also carry the same label (but possibly not the original one) in the relabeled versions of $s_{\rho}$ and $s_{\rho}^{\prime}$ ; the tuples $t_{\rho}$ and $t_{\rho}^{\prime}$ built from them are equivalent under $\langle\star\mathcal{H}^{n}_{\tau,\pi}\rangle$ . Therefore, with this method one can start with a sequence $\mathcal{S}^{(n)}$ , $n\geq 1$ , of full sets of witnesses and build from it a slender proof. $\Box$

If $\mathcal{I}$ is a subset of $\mathcal{L}$ such that $\bigcup_{J\in\mathcal{I}}J$ is strongly connected and is maximal relative to this property, then the witnesses for $\mathcal{I}$ constitute by themselves a recursive proof. From now on, we can therefore restrict our work to proofs that involve sets $\mathcal{L}$ where $\bigcup_{J\in\mathcal{L}}J$ is strongly connected.

Let $\mathbf{NRC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ denote the negation of $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ ; by Lemma 4.5, it states that there do not exist witnesses for $\mathcal{G}$ relative to $\mathcal{H}^{n}_{\tau,\pi}$ . It is tempting to try to obtain from it a simpler formula, such as $\forall\,t,t^{\prime}:t\,\langle\star\mathcal{H}^{n}_{\tau,\pi}\rangle\,t^{\prime}\Rightarrow\varphi(t)=\varphi(t^{\prime})$ , which is an equational definition for a variety of forest algebras. This does not work, because $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ applies only within the strongly connected components of $G$ . For example, consider the language $L$ over $A=\{a,b\}$ defined by the condition “at least one node is an ancestor of exactly one node labelled $b$ ”, whose syntactic algebra belongs to $\mathbf{**}^{2}\langle\!\langle\mathbb{N}_{2,1}\rangle\!\rangle$ , and the tuples $t=\breve{\psi}(m)$ and $t^{\prime}=\breve{\psi}^{\prime}(m)$ built over the multicontext $m=a(a(x_{1}+x_{2})+a(x_{3}+x_{4}))$ , which has three interior nodes (labelled $a$ ) and four ports, where the mappings $\psi$ and $\psi^{\prime}$ are defined by $\psi(x_{1})=\psi(x_{3})=\psi^{\prime}(x_{1})=\psi^{\prime}(x_{2})=a$ and $\psi(x_{2})=\psi(x_{4})=\psi^{\prime}(x_{3})=\psi^{\prime}(x_{4})=b$ , so that $\breve{\psi}(m)\in L$ , $\breve{\psi}^{\prime}(m)\not\in L$ , and $t\,\langle\star\mathcal{H}^{2}_{2,1}\rangle\,t^{\prime}$ .
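The membership claims of this example can be checked mechanically. The Python sketch below fills the four ports of $m=a(a(x_{1}+x_{2})+a(x_{3}+x_{4}))$ with one-node trees according to $\psi$ and $\psi^{\prime}$ and tests the defining condition of $L$ . It assumes that “ancestor” means strict ancestor, which is the reading under which the stated memberships hold; the representation of a tree as a pair (label, list of children) and all function names are illustrative assumptions.

```python
def scan(tree):
    """Return (number of b-labelled nodes in the tree,
    whether some node has exactly one b-labelled strict descendant)."""
    label, children = tree
    below = 0          # b-labelled nodes strictly below the root
    found = False      # condition already met inside a subtree
    for child in children:
        count, ok = scan(child)
        below += count
        found = found or ok
    found = found or below == 1
    return below + (1 if label == "b" else 0), found


def in_L(forest):
    """Membership in L for a forest given as a list of trees."""
    return any(scan(tree)[1] for tree in forest)


def fill(psi):
    """Insert one-node trees at the ports x1..x4 of m."""
    x1, x2, x3, x4 = ((psi[i], []) for i in (1, 2, 3, 4))
    return [("a", [("a", [x1, x2]), ("a", [x3, x4])])]


psi = {1: "a", 2: "b", 3: "a", 4: "b"}
psi_prime = {1: "a", 2: "a", 3: "b", 4: "b"}
```

With $\psi$ , each of the two lower interior nodes has exactly one descendant labelled $b$ , so the forest is in $L$ ; with $\psi^{\prime}$ , every node has zero or two such descendants, so it is not.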

4.3 Uniform proofs

Recall that $\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}$ denotes the equivalence-under-pumping congruence in $\mathbb{F}_{A,B}$ , and $\mathcal{J}^{\sigma,\rho}=(J^{\sigma,\rho},U^{\sigma,\rho})$ the quotient algebra. We use the same notations for the corresponding objects defined over $\mathbb{F}_{A}$ . Given a tuple ${t}=(m,\nu,\psi)$ , for every combination of $v\in U^{\sigma,\rho}$ and $(J,j)\in\mathcal{D}$ we define the counter

[TABLE]

From this, we define a relation in $\mathbb{M}_{A,D}$ :

[TABLE]

In a circuit $\mathcal{M}^{(n+1)}$ found in a proof-by-pumping, every subcircuit $T^{(n+1)}_{J}$ is diagonal and closed under $\langle\star\mathcal{J}^{\sigma,\rho}\rangle$ : we denote by $\mathbf{RC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,\rho})$ the proposition that states the existence of such a circuit $\mathcal{M}^{(n+1)}$ . Since $\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}$ refines $\approx^{n+1}_{\tau,\pi}$ , the relation $\langle\star\mathcal{J}^{\sigma,\rho}\rangle$ refines $\langle\star\mathcal{H}^{n}_{\tau,\pi}\rangle$ , and therefore $\mathbf{RC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,\rho})$ implies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ .

Consider an algebra $\mathcal{G}\in{\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ and let $\sigma$ and $\rho$ be the threshold and period of $\mathcal{G}_{\%}$ . This means that $\mathcal{G}$ belongs to the variety ${\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle\!\wedge\!\langle\!\langle\mathcal{J}^{\sigma,\rho}\rangle\!\rangle$ , where $\langle\!\langle\mathcal{J}^{\sigma,\rho}\rangle\!\rangle$ is the variety of finite forest algebras generated by $\mathcal{J}^{\sigma,\rho}$ . We can repeat with the congruences $\left(\approx^{n}_{\tau,\pi}\cap\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}\right)$ , for every $n\geq 1$ , the work done for Proposition 2.2 and Theorem 4.7 and assert the existence of an algebra $\mathcal{K}^{n,\sigma,\rho}_{\tau,\pi}=\mathbb{F}_{A}/\left({\approx^{n}_{\tau,\pi}\cap\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}}\right)$ such that $\mathcal{G}\in{\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle\!\wedge\!\langle\!\langle\mathcal{J}^{\sigma,\rho}\rangle\!\rangle$ if, and only if $\mathbf{NRC}(\mathcal{G}{\star}\mathcal{K}^{n,\sigma,\rho}_{\tau,\pi})$ for some $n\geq 1$ . Since $\mathcal{K}^{n,\sigma,\rho}_{\tau,\pi}\prec\mathcal{J}^{\sigma,\rho}$ , this implies $\mathbf{NRC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,\rho})$ . We obtain the following.

Proposition 4.8

With $\sigma$ and $\rho$ the threshold and period of $\mathcal{G}_{\%}$ , if $\mathcal{G}\in{\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ , then $\mathbf{NRC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,\rho})$ .

The values of $\sigma$ and $\rho$ are computable in finite time; with them in hand, determining the existence of a circuit $\mathcal{T}$ that satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,\rho})$ is a recursively enumerable problem. If the corresponding algorithm stops, then $\mathcal{G}\not\in{\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ . However, it is not clear whether a whole proof can be built from $\mathcal{T}$ , that is, whether for every $n\geq 2$ , knowledge of $\mathcal{T}$ suffices to determine every set of witnesses $\mathcal{S}^{(n)}$ . If so, then this would be a uniform proof of non-membership. The results of Section 4.2 make no mention of uniformity. However, the existing Ehrenfeucht-Fraïssé games are uniform and are built along two construction mechanisms, which we call proof-by-copy and proof-by-pumping, where the former is a special case of the latter.

4.3.1 Proof-by-copy

In certain cases such as the Boolean algebra (see Section 5.1), condition $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ can be satisfied with a circuit $\mathcal{T}$ where each subcircuit $T_{J}$ is built over the same multicontext, that is, $m_{J,j}\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}m_{J,j^{\prime}}$ for all $j,j^{\prime}\in J$ . Then $\mathcal{T}$ actually satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ for every $n\geq 1$ and can be used in the construction of every set of witnesses $\mathcal{S}^{(n)}$ , in the way described in the proof of Lemma 4.5, and proving $\mathcal{G}\not\in{\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ is just a matter of creating copies of $\mathcal{T}$ and assembling them into witnesses (observe that the depth of this construction increases linearly with $n$ ), hence the name “proof-by-copy”. We verify that the existence of such a proof is a recursively enumerable problem.

Formally, we work with $\mathcal{G}=(G,W)=\varphi(\mathbb{F}_{A})$ , a connected component $F$ , the set $\mathcal{D}=\{(J,j):\emptyset\neq J\subseteq F,\,j\in J\}$ , and, to make things simpler, the assumption that either $F$ is the minimal ideal of $G$ , i.e. $wF\subseteq F$ for every $w\in W$ , or $F\cup\{\infty\}$ is an ideal, where $w\infty=\infty$ for every $w\in W$ . Given $J\subseteq F$ , we define $J_{\infty}=J\cup\{\infty\}$ . We use tuples $t=(m,\nu,\psi)\in\mathbb{M}_{A,D}$ with $(\nu,\psi):ports(m)\rightarrow\mathcal{D}$ . Since $\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}$ coincides with $=$ , up to horizontal permutation of siblings, we identify $\mathbb{F}_{A}$ with $\mathbb{F}_{A}/{\stackrel{{\scriptstyle\infty}}{{\leftrightarrow}}}$ and the canonical homomorphism with the identity mapping. The support of a tuple $t$ is

[TABLE]

its set of inputs is $inp(t)=inp(m)=\{J\subseteq F:\exists\,x\in ports(m),\,\nu(x)=J\}$ . Next, for every combination of $v\in\mathbb{V}_{A}$ and $(J,j)\in\mathcal{D}$ we define the counter

[TABLE]

with $P[\mathbb{F}_{A}](t)_{v,J,j}=0$ whenever $J\not\in inp(t)$ or $v\not\in sup(t)$ . Finally, we define a relation in $\mathbb{M}_{A,D}$ :

[TABLE]

A proof-by-copy for $\mathcal{G}$ , if it exists, can be found by an algorithm that traverses the set of all slender circuits $\mathcal{T}$ over subsets of $F$ , and stops when it has found one where every subcircuit $T_{J}$ is diagonal and closed under $\langle\star{}_{\tau,\pi}\rangle$ .

4.3.2 Proof-by-pumping

Other Ehrenfeucht-Fraïssé games are sequences of witnesses where each subcircuit of $\mathcal{T}^{(n+1)}$ is built over multicontexts that are equivalent under an “equivalence under pumping” congruence $\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}$ that refines $\approx^{n+1}_{\tau,\pi}$ ; hence the name “proof-by-pumping”. In the examples described in the next section, pumping is used in two different ways. In Section 5.3, the circuits consist of uniform multicontexts where all ports belong to the same set $Z$ and that are obtained from a single $m\in\mathbb{M}_{A,B}$ by pumping along $Z$ . In Section 5.4, work starts with a unique multicontext whose ports are partitioned into two sets $Y$ and $Z$ ; pumping $\theta$ times along $Z$ creates a multicontext denoted $p^{(1)}_{\theta}$ ; copies of $p^{(1)}_{\theta}$ and $p^{(1)}_{\theta+1}$ are then assembled, by insertion at the $Y$ -ports, into the actual components of the circuit used in the construction of the witnesses. In this section, we investigate the first of these two techniques; then we examine another situation where pumping is involved, but of which no example is known to the author.

An algorithm adapted from the one described at the end of Section 4.3.1 can verify the existence of a suitable $\mathcal{T}$ where every subcircuit $T_{J}$ is diagonal and closed under $\langle\star\mathcal{J}^{\sigma,\rho}\rangle$ , for a given pair $\sigma,\rho$ . The question then arises, whether it is sufficient to run this test for only one such pair, that is, whether given a suitable circuit $\mathcal{M}$ for some $\sigma$ , $\rho$ and $n$ , such that $\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}$ refines $\approx^{n+1}_{\tau,\pi}$ , one can for every sequence of pairs $\sigma_{i},\rho_{i}$ , $i\geq 1$ , for which $\stackrel{{\scriptstyle\sigma_{i},\rho_{i}}}{{\leftrightarrow}}$ refines $\approx^{n+i}_{\tau,\pi}$ , obtain by pumping the components of $\mathcal{M}$ a suitable circuit where every subcircuit is closed under $\langle\star\mathcal{J}^{\sigma_{i},\rho_{i}}\rangle$ .

We now develop tools that make it possible to explore pumping in some depth. Let $M\subset\mathbb{M}_{A,B}$ be a set closed under $\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}$ and let $Z\subset ports(M)$ be $\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}$ -stable, which means that pumping $M$ along $Z$ is possible. Let $\nu(x)=J$ for every $x\in Z$ and let $T$ be a set of tuples defined over $M$ , with for each $j\in J$ at least one element $t$ such that $\varphi(t)=j$ . We work with a tuple $\mathfrak{t}$ defined on $\hat{\mathfrak{m}}\in M^{(\theta,Z)}$ and built from $T$ through consistent insertions, where a tuple $t\in T$ can be inserted at a port $x$ only if $\varphi(t)=\psi(x)$ .

Within $\hat{\mathfrak{m}}$ we associate to every copy $m$ of an element of $M$ the parameters depth and height, as follows. The depth level $d=1$ in $\hat{\mathfrak{m}}$ is a singleton $\bar{m}_{1}$ containing a unique copy of a multicontext of $M$ . Then recursively, depth level $d+1$ is the set $\bar{m}_{d+1}$ of all copies of elements of $M$ inserted at the ports of $Z(\bar{m}_{d})$ , which we call the $Z$ -ports of $\bar{m}_{d}$ . Then $m$ is at depth $d$ iff $m\in\bar{m}_{d}$ . We define the height together with the equivalence class $[[\hat{\mathfrak{m}}]]_{\sigma,\rho}=[[M^{(\theta,Z)}]]_{\sigma,\rho}$ of $\hat{\mathfrak{m}}\in M^{(\theta,Z)}$ under $\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}$ , as follows. If $\theta<\sigma$ , then $[[\hat{\mathfrak{m}}]]_{\sigma,\rho}=M^{(\theta,Z)}$ , all $Z$ -ports of $\hat{\mathfrak{m}}$ are located at depth $\theta$ , and the height of $m\in\bar{m}_{d}$ is defined as the class of $h=\theta-d$ under the congruence $\equiv_{\sigma,\rho}$ . Otherwise $\theta\geq\sigma$ , and the class $[[\hat{\mathfrak{m}}]]_{\sigma,\rho}$ consists of the union of all sets $M^{(\theta^{\prime},Z)}$ for which $\theta^{\prime}\equiv_{\sigma,\rho}\theta$ plus, because $\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}$ is a congruence, every multicontext obtained from an element of $[[\hat{\mathfrak{m}}]]_{\sigma,\rho}$ by replacing a sub-multicontext $\hat{\mathfrak{n}}$ with some $\hat{\mathfrak{n}}^{\prime}$ that satisfies $\hat{\mathfrak{n}}^{\prime}\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}\hat{\mathfrak{n}}$ .

Let $\hat{\mathfrak{m}}^{\prime}$ be obtained from $\hat{\mathfrak{m}}$ by this substitution, and let $x\in ports(\bar{m}_{d})$ be the port in which $\hat{\mathfrak{n}}$ has been replaced with $\hat{\mathfrak{n}}^{\prime}$ . If $\hat{\mathfrak{n}}\in M^{(\delta,Z)}$ for some $\delta<\sigma$ , then $\hat{\mathfrak{n}}^{\prime}\in M^{(\delta,Z)}$ , and the height of $x$ and of the $m\in\bar{m}_{d}$ it belongs to, is unmodified. Otherwise, $\hat{\mathfrak{n}}\in[[M^{(\delta,Z)}]]_{\sigma,\rho}$ with $\delta\geq\sigma$ , and the height of $\bar{n}_{1}$ , the top depth level in $\hat{\mathfrak{n}}$ , is the class of $h(n)=\theta-1$ under the congruence $\equiv_{\sigma,\rho}$ . Making the induction hypothesis, that $\hat{\mathfrak{n}}^{\prime}\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}\hat{\mathfrak{n}}$ implies $h(n^{\prime})\equiv_{\sigma,\rho}h(n)$ , we conclude that the height of $x$ and of the $m\in\bar{m}_{d}$ it belongs to, is unmodified.

Let $x$ be located in $\hat{\mathfrak{m}}$ at height $h$ and depth $d$ : the subforest $\Delta(\hat{\mathfrak{m}},x)$ belongs to $[[M^{(h,Z)}]]_{\sigma,\rho}$ and the context $\nabla(\hat{\mathfrak{m}},x)$ can be seen as the result of taking a multicontext in $[[M^{(d,Z)}]]_{\sigma,\rho}$ consisting in the top $d$ depth levels of $\hat{\mathfrak{m}}$ , and inserting an element of $[[M^{(h,Z)}]]_{\sigma,\rho}$ at every $Z$ -port of $\bar{m}_{d}$ other than $x$ . Next, given another multicontext $\hat{\mathfrak{m}}^{\prime}\in[[M^{(\theta^{\prime},Z)}]]_{\sigma,\rho}$ , $d^{\prime}\leq\theta^{\prime}$ and $x^{\prime}\in Z(\bar{m}_{d^{\prime}}^{\prime})$ , we have

[TABLE]

We say that the $Z$ -ports of $\hat{\mathfrak{m}}$ are at the bottom and that the other ports are on the side. Then, given $y\in ports(\bar{m}_{d})$ and $y^{\prime}\in ports(\bar{m}^{\prime}_{d^{\prime}})$ located on the sides, with $m(y)$ and $m^{\prime}(y^{\prime})$ the copies of elements of $M$ they belong to, we have

[TABLE]

Let $t=(\hat{\mathfrak{m}},\nu,\psi)$ and $t^{\prime}=(\hat{\mathfrak{m}}^{\prime},\nu,\psi)$ satisfy ${t}\,\langle\star\mathcal{J}^{\sigma,\rho}\rangle\,{t}^{\prime}$ , so that $\hat{\mathfrak{m}}\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}\hat{\mathfrak{m}}^{\prime}$ , and assume that $\hat{\mathfrak{m}}\in[[M^{(\delta,Z)}]]_{\sigma,\rho}$ with $\delta\geq\sigma$ . Let $y\in ports(t)$ and $y^{\prime}\in ports(t^{\prime})$ be located on the sides, at depth levels $d$ and $d^{\prime}$ , and heights $h$ and $h^{\prime}$ , respectively. For every $u\in U^{\sigma,\rho}$ , they contribute to the counters $P[\mathcal{J}^{\sigma,\rho}](t)_{u,I,i}$ and $P[\mathcal{J}^{\sigma,\rho}](t^{\prime})_{u,I,i}$ , only if $y\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}y^{\prime}$ . Depending on $d$ we distinguish three cases: if $d<\sigma$ , i.e. $y$ is located in the top $\sigma-1$ levels of $\hat{\mathfrak{m}}$ , this implies $d^{\prime}=d$ ; else if $h<\sigma$ , i.e. $y$ is located at height less than $\sigma$ in $\hat{\mathfrak{m}}$ , this implies $h^{\prime}=h$ . Hence if ${t}\,\langle\star\mathcal{J}^{\sigma,\rho}\rangle\,{t}^{\prime}$ , then the components of $P[\mathcal{J}^{\sigma,\rho}](t)$ and $P[\mathcal{J}^{\sigma,\rho}](t^{\prime})$ corresponding to the top and bottom regions match on a level-by-level basis. Otherwise $y$ is located in the “middle region” of $\hat{\mathfrak{m}}$ , where $d\geq\sigma$ and $h\geq\sigma$ , and where the ports are partitioned according to $(d-\sigma)\,\mathsf{mod}\,\rho$ and $(h-\sigma)\,\mathsf{mod}\,\rho$ , and the ports within the same class contribute to the same counters. The contributions of all ports within the same class are added up.
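The three-way case distinction on side ports can be summarized by a small classifier. The Python sketch below assigns to a port at depth $d$ and height $h$ the region it falls in and, for the middle region, its class according to $(d-\sigma)\,\mathsf{mod}\,\rho$ and $(h-\sigma)\,\mathsf{mod}\,\rho$ ; it then adds up the contributions of ports in the same class. The function names, the tuple encoding of the classes, and the treatment of depth and height as plain integers are assumptions made for illustration only.

```python
from collections import Counter


def port_class(d, h, sigma, rho):
    """Classify a side port by its depth d and height h.

    Top region: d < sigma, ports are matched level by level (exact depth).
    Bottom region: h < sigma, ports are matched by exact height.
    Middle region: d >= sigma and h >= sigma, ports are grouped by
    (d - sigma) mod rho and (h - sigma) mod rho.
    """
    if d < sigma:
        return ("top", d)
    if h < sigma:
        return ("bottom", h)
    return ("middle", (d - sigma) % rho, (h - sigma) % rho)


def tally(ports, sigma, rho):
    """Add up the contributions of ports that fall in the same class.

    ports is an iterable of (depth, height) pairs.
    """
    return Counter(port_class(d, h, sigma, rho) for d, h in ports)
```

With $\sigma=2$ and $\rho=3$ , for example, ports at (depth, height) $(4,6)$ and $(7,3)$ land in the same middle class and are counted together, while a port at depth 1 is only ever compared with ports at depth 1.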

Until now, dealing with modular quantifiers and arithmetic was no more complex than working only on the variety ${\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,1}\rangle\!\rangle=\mathsf{FO}[\prec]$ , i.e. the case where $\pi=1$ . The situation changes here, as there will be places where modular arithmetic adds an extra layer of complexity to the work being done. From now on, therefore, we restrict ourselves to the case where $\pi=1$ .

We now look at a type of circuit where the multicontexts of $T_{J}$ are obtained by pumping at the $Z$ -ports of a set $M$ , i.e. $M_{J}\subset[[M^{(\theta,Z)}]]_{\sigma,\rho}$ for some $\theta\in\mathbb{N}$ , and the multicontexts of $T_{J}$ have $Z$ -ports that are available for further pumping. The set of tuples with this property is recursively enumerable. We show that for every $\chi>\sigma$ a set $U_{J}$ closed under $\langle\star\mathcal{J}^{\chi,1}\rangle$ can be built from $T_{J}$ by pumping along $Z$ . If every subcircuit of a circuit $\mathcal{T}$ that satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,1})$ falls into this case, then for every $\chi>\sigma$ one can obtain by pumping $\mathcal{T}$ a circuit $\mathcal{U}$ that satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{J}^{\chi,1})$ ; in other words, a uniform proof-by-pumping for $\mathcal{G}\not\in{\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,1}\rangle\!\rangle$ can always be built from $\mathcal{T}$ . Later in this section, we will look at another type of circuit and show that in some circumstances, $T_{J}$ cannot be used to build an entire proof.

Let the vector $P[\mathcal{J}^{\sigma,\rho}](t)_{Z}$ consist of the components of $P[\mathcal{J}^{\sigma,\rho}](t)$ associated to the $Z$ -ports, and $P[\mathcal{J}^{\sigma,\rho}](t)_{Y}$ consist in all the other components. With the notations $\bar{m}_{1}=\{m^{\mathsf{top}}\}$ and $\bar{m}^{\prime}_{1}=\{m^{\mathsf{top}}{}^{\prime}\}$ , we define tuples $t^{\mathsf{top}}=(m^{\mathsf{top}},\nu,\psi)$ and $t^{\mathsf{top}}{}^{\prime}=(m^{\mathsf{top}}{}^{\prime},\nu,\psi)$ from $t$ and $t^{\prime}$ ; we observe that if $t\,\langle\star\mathcal{J}^{\sigma,\rho}\rangle\,t^{\prime}$ , then $P[\mathcal{J}^{\sigma,\rho}](t^{\mathsf{top}})_{Y}\equiv_{\tau,\pi}P[\mathcal{J}^{\sigma,\rho}](t^{\mathsf{top}}{}^{\prime})_{Y}$ . Also, since $m^{\mathsf{top}}\stackrel{{\scriptstyle\sigma,\rho}}{{\leftrightarrow}}m^{\mathsf{top}}{}^{\prime}$ , we have $|Z(m^{\mathsf{top}})|\equiv_{\sigma,\rho}|Z(m^{\mathsf{top}}{}^{\prime})|$ . Given a subcircuit $T_{J}=\{t_{j}:j\in J\}$ of a circuit that satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,\rho})$ , we define in the same way the sets $M^{\mathsf{top}}=\{m^{\mathsf{top}}_{j}:j\in J\}$ of multicontexts and $T^{\mathsf{top}}=\{t^{\mathsf{top}}_{j}:j\in J\}$ of tuples, and for every $\theta\geq 2$ we build by consistent pumping the set $T^{\mathsf{top}}{}^{(\theta,Z)}=\{t^{(\theta,Z)}_{j}:j\in J\}$ . We associate to every $Z$ -port $x$ of $M_{J}$ its depth level $\theta_{x}$ ; it satisfies $\theta_{x}\geq\sigma$ . Then we build $U_{J}$ in three steps, where the first two are:

$i.$

build the set $T^{\mathsf{top}}{}^{(\chi,Z)}$ ;

$ii.$

for each $j\in J$ , build $\tilde{u}_{j}$ from $t^{(\chi,Z)}_{j}$ , by inserting at every $Z$ -port $x$ a copy of $t_{\psi(x)}$ .

Every $Z$ -port $x$ of $\tilde{u}_{j}$ is a copy of some $Z$ -port $z$ of $T_{J}$ and is located at depth $\theta_{x}=\chi+\theta_{z}$ . For all $j,j^{\prime}\in J$ , the properties of $T^{\mathsf{top}}$ imply

[TABLE]

which means that the counters corresponding to the top $\chi$ levels of $\tilde{u}_{j}$ and $\tilde{u}_{j^{\prime}}$ match (under $\equiv_{\tau,1}$ ). The third step is:

$iii.$

build $u_{j}$ by inserting, at every $Z$ -port $x$ of $\tilde{u}_{j}$ , a copy of $t_{\psi(x)}^{(\eta(x),Z)}$ .

To determine the integer $\eta(x)$ , we associate to $T^{\mathsf{top}}$ a $|J|\times|J|$ matrix $A$ with entries in $\mathbb{N}_{\tau,1}$ , where $A_{ij}=|\{x\in Z(t^{\mathsf{top}}_{i}):\psi(x)=j\}|$ . Then $(A^{\delta})_{ij}$ is the number of $Z$ -ports $x$ in $t_{i}^{(\delta,Z)}$ such that $\psi(x)=j$ . The powers of this matrix constitute a finite semigroup with idempotent element $A^{\omega}$ : if every $\eta(x)$ is a multiple of $\omega$ , then

[TABLE]

will hold for every pair $x,x^{\prime}$ of $Z$ -ports of $\tilde{u}_{j}$ that satisfy $\psi(x)=\psi(x^{\prime})$ . Since for all $j,j^{\prime}\in J$ and for all $t,t^{\prime}\in T_{J}$ ,

[TABLE]

we have $P[\mathcal{J}^{\chi,1}](\tilde{u}_{j})_{Z}\equiv_{\tau,1}P[\mathcal{J}^{\chi,1}](\tilde{u}_{j^{\prime}})_{Z}$ and from there $P[\mathcal{J}^{\chi,1}](u_{j})_{Z}\equiv_{\tau,1}P[\mathcal{J}^{\chi,1}](u_{j^{\prime}})_{Z}$ . We now claim that if every $\eta(x)$ satisfies $\eta(x)\geq\tau+\chi$ , then $P[\mathcal{J}^{\chi,1}](u_{j})_{Y}\equiv_{\tau,1}P[\mathcal{J}^{\chi,1}](u_{j^{\prime}})_{Y}$ . This already holds for the top $\chi$ levels in ${u}_{j}$ and ${u}_{j^{\prime}}$ . Meanwhile, the lower $\chi$ levels in $t_{\psi(x)}^{(\eta(x),Z)}$ consist of copies of elements of $T^{\mathsf{top}}$ ; their numbers in ${u}_{j}$ and ${u}_{j^{\prime}}$ match, thanks to Equation (1); this ensures a match in the bottom region. The middle region consists of the tuples inserted at step ii and of elements of $T^{\mathsf{top}}{}^{(\eta(x)-\chi,Z)}$ , namely the tuples inserted at step iii minus their lowermost $\chi$ levels. By Equation (1) and the fact that $\eta(x)-\chi\geq\tau$ for every $Z$ -port $x$ of ${u}_{j}$ and ${u}_{j^{\prime}}$ , the contributions of the tuples inserted at step iii match with one another. Concerning the tuples inserted at step ii, let $t,t^{\prime}\in T_{J}$ and consider two non- $Z$ -ports $y,y^{\prime}$ , located at depth levels $d,d^{\prime}$ and height levels $h,h^{\prime}$ in $t$ and $t^{\prime}$ , respectively, denoting by $m(y)$ and $m^{\prime}(y^{\prime})$ the copies of multicontexts from $M$ they belong to. If $y\stackrel{{\scriptstyle\sigma,1}}{{\leftrightarrow}}y^{\prime}$ , then $\nabla(m(y),y)\stackrel{{\scriptstyle\sigma,1}}{{\leftrightarrow}}\nabla(m^{\prime}(y^{\prime}),y^{\prime})$ . The reverse implication works in $t,t^{\prime}$ only if $d\equiv_{\sigma,1}d^{\prime}$ and $h\equiv_{\sigma,1}h^{\prime}$ . 
Let $t$ be inserted at step ii of the construction of $\tilde{u}_{j}$ : the same port $y$ is, in $\tilde{u}_{j}$ , at depth $d+\chi\geq\chi$ and height at least $\tau+\chi+h\geq\chi$ ; then if $t^{\prime}$ is in the same $\tilde{u}_{j}$ , we have

[TABLE]

Then from this, from $P[\mathcal{J}^{\sigma,1}](t)_{Y}\equiv_{\tau,1}P[\mathcal{J}^{\sigma,1}](t^{\prime})_{Y}$ , and from the fact that the numbers of occurrences in $\tilde{u}_{j}$ and $\tilde{u}_{j^{\prime}}$ of elements of $T_{J}$ match because $|Z(t^{(\chi,Z)}_{j})|\equiv_{\tau,1}|Z(t^{(\chi,Z)}_{j^{\prime}})|$ , we conclude that the counters corresponding to the middle regions of $u_{j}$ and $u_{j^{\prime}}$ balance.
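The choice of $\eta(x)$ in step iii relies on the idempotent power $A^{\omega}$ of the matrix $A$ associated to $T^{\mathsf{top}}$ . The Python sketch below shows one way such a power can be found: it multiplies matrices over $\mathbb{N}_{\tau,1}$ , represented here by the integers $0,\ldots,\tau$ with sums and products capped at $\tau$ , and iterates until a power equals its own square. The capped representation, the list-of-lists encoding, and the function names are assumptions for illustration; the loop terminates because the powers of $A$ form a finite semigroup, which always contains an idempotent.

```python
def mat_mul(A, B, tau):
    """Product of two square matrices over the threshold semiring:
    ordinary arithmetic, with every entry capped at tau."""
    n = len(A)
    return [
        [min(tau, sum(A[i][k] * B[k][j] for k in range(n))) for j in range(n)]
        for i in range(n)
    ]


def idempotent_power(A, tau):
    """Return (omega, A^omega) for the least omega >= 1 such that
    A^omega is idempotent in the threshold semiring."""
    A = [[min(tau, a) for a in row] for row in A]  # canonical entries
    power, omega = A, 1
    while mat_mul(power, power, tau) != power:
        power = mat_mul(power, A, tau)
        omega += 1
    return omega, power
```

For instance, with $\tau=2$ and $A$ having rows $(1,1)$ and $(0,1)$ , the square of $A$ has rows $(1,2)$ and $(0,1)$ and is already idempotent, so $\omega=2$ .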

We now look at a circuit where $T_{J}$ has been built by pumping along $Z$ but has no $Z$ -ports, and the construction of $T_{J}$ uses two sets of multicontexts $M$ and $R$ closed under $\stackrel{{\scriptstyle\sigma,1}}{{\leftrightarrow}}$ , and is done in two steps: pumping at the $Z$ -ports of a set $M$ to obtain a set $S_{J}=\{s_{j}:j\in J\}$ , then for each $j\in J$ , inserting at every $Z$ -port $x$ of $s_{j}$ a tuple $r(x)\in R$ , which builds $t_{j}$ . This is a situation where $P[\mathcal{J}^{\sigma,1}](s)\equiv_{\tau,1}P[\mathcal{J}^{\sigma,1}](s^{\prime})$ is not satisfied by some tuples $s,s^{\prime}\in S_{J}$ (otherwise a suitable circuit could be built solely with $S_{J}$ , without the tuples from $R$ ), but where $P[\mathcal{J}^{\sigma,1}](t)\equiv_{\tau,1}P[\mathcal{J}^{\sigma,1}](t^{\prime})$ holds for all $t,t^{\prime}\in T_{J}$ . From the latter equivalence we deduce that $P[\mathcal{J}^{\sigma,1}](s)_{Y}\equiv_{\tau,1}P[\mathcal{J}^{\sigma,1}](s^{\prime})_{Y}$ holds for all $s,s^{\prime}$ , and we have $P[\mathcal{J}^{\sigma,1}](s)_{Z}\not\equiv_{\tau,1}P[\mathcal{J}^{\sigma,1}](s^{\prime})_{Z}$ for some $s,s^{\prime}$ . This makes possible a situation where a different tuple from $R$ has to be inserted at every port of the $s_{j}$ ’s in order for the resulting $t_{j}$ ’s to be equivalent under $\langle\star\mathcal{J}^{\sigma,1}\rangle$ : this means that tuples from $R$ inserted at the antichains of ports $Z(s_{j})$ can be correlated in a potentially intricate way.

We assume that using $R$ was necessary, namely that no set of tuples $S=\{s_{j}:j\in J\}$ built by pumping at the $Z$ -ports of any set $M$ closed under $\stackrel{{\scriptstyle\sigma,1}}{{\leftrightarrow}}$ can simultaneously be diagonal and closed under $\langle\star\mathcal{J}^{\sigma,1}\rangle$ . Equivalently, for every vector of counters $\vec{p}$ , there is a subset $J^{\prime}\subset J$ such that, for every $s\in S$ , having $P[\mathcal{J}^{\sigma,1}](s)\equiv_{\tau,1}\vec{p}$ implies $\varphi(s)\not\in J^{\prime}$ . Moreover, the set $R$ of tuples to be inserted at the ports of $S$ does not satisfy this either, for otherwise some subset $\{r_{j}:j\in J\}$ would be suitable to replace $T_{J}$ in the construction of a circuit that satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,1})$ . Provided that $M$ is not itself expressible as $N^{(\pi,Z)}$ for $N\subset\mathbb{M}_{A,B}$ and $\pi\geq 2$ (this is the counterpart, in the world of trees and multicontexts, of the fact that the word language $\{w^{\theta}:\theta\geq\sigma\}$ is star-free, unless $w=x^{\pi}$ for some word $x$ and $\pi\geq 2$ ), one can prove by induction on the construction of $[[M^{(\sigma,Z)}]]_{\sigma,1}$ that this set is recognized by a forest algebra in ${\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,1}\rangle\!\rangle$ , and that the same holds for every intersection of the corresponding set of tuples $[[T^{(\sigma,Z)}]]_{\sigma,1}$ with an equivalence class for $\langle\star\mathcal{J}^{\sigma,1}\rangle$ (actually, one can prove that $[[T^{(\sigma,Z)}]]_{\sigma,1}$ is a union of a number of such classes); let $\mathcal{H}_{1}\in{\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,1}\rangle\!\rangle$ recognize every one of these intersections. Next, the constraint on $R$ implies that, for a large enough $n$ , no set of witnesses for $J$ and $\approx^{n}_{\tau,1}$ can be built from $R$ . 
Since $\stackrel{{\scriptstyle\sigma,1}}{{\leftrightarrow}}$ refines $\approx^{n}_{\tau,1}$ , this means that no equivalence class for $\langle\star\mathcal{J}^{\sigma,1}\rangle$ contains a diagonal subset of $R$ . Then in the very special case where each intersection of such a class with $R$ contains elements from at most one set $\varphi^{-1}(j)$ , $j\in J$ , and given the fact that $R$ is a set of tuples built over a set of multicontexts closed under $\stackrel{{\scriptstyle\sigma,1}}{{\leftrightarrow}}$ , there exists an algebra $\mathcal{H}_{2}\in{\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,1}\rangle\!\rangle$ that, for every $t\in T_{J}$ , determines the content of $P[\mathcal{J}^{\sigma,1}](s)_{Z}$ and the class of $s$ for $\langle\star\mathcal{J}^{\sigma,1}\rangle$ , and therefore the subset $J^{\prime}\subset J$ for which we have $\varphi(t)\not\in J^{\prime}$ . This means that for every $n^{\prime}\in\mathbb{N}$ such that both $\mathcal{H}_{1}\prec\mathcal{H}^{n^{\prime}}_{\tau,1}$ and $\mathcal{H}_{2}\prec\mathcal{H}^{n^{\prime}}_{\tau,1}$ hold, no set of tuples obtained by pumping $S$ and inserting elements of $R$ at the $Z$ -ports can constitute a subcircuit for $J$ in a circuit that satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n^{\prime}}_{\tau,1})$ . This special case is a situation where the existence of a circuit that satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,1})$ , where $\stackrel{{\scriptstyle\sigma,1}}{{\leftrightarrow}}$ refines $\approx^{n+1}_{\tau,1}$ , does not imply the existence of a uniform proof-by-pumping for $\mathcal{G}\not\in{\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,1}\rangle\!\rangle$ .

5 Examples of proofs

We show how the techniques and notations of this article work on examples, of which three are discussed in the literature under different formalisms and methods.

5.1 The Boolean algebra

The Ehrenfeucht-Fraïssé game concerning this algebra, described in [20, Theorem 4.2], is an example of a proof-by-copy.

In the Boolean forest algebra $\mathcal{B}$ , the horizontal monoid is the direct product $B=\{\mathsf{0},\mathsf{1}\}\times\{\mathsf{0},\mathsf{1}\}$ of the AND and OR monoids (in the first and second component, respectively), so that $\langle\mathsf{1},\mathsf{0}\rangle$ is the identity and $\langle\mathsf{0},\mathsf{1}\rangle$ is absorbing. Besides the elements of the form $\varepsilon+\langle\mathsf{a},\mathsf{b}\rangle$ , the vertical monoid is generated by $\varphi(\wedge)$ and $\varphi(\vee)$ , which work on each $\langle\mathsf{a},\mathsf{b}\rangle\in B$ as projections on the first and second component, respectively. With $A=\{\wedge,\vee\}$ , the Boolean algebra can be regarded as the image of $\mathbb{F}_{A}$ by the homomorphism $\varphi$ ; in the special case of one-node forests, this gives

[TABLE]
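The description of $\mathcal{B}$ above can be checked on small inputs with a short Python sketch. It is only an illustration: the encoding of trees as `(label, children)` pairs, the function names, and the reading of "projection on the first (second) component" as $\langle\mathsf{a},\mathsf{b}\rangle\mapsto\langle\mathsf{a},\mathsf{a}\rangle$ (respectively $\langle\mathsf{b},\mathsf{b}\rangle$) are assumptions made here, not notation from the article.

```python
from itertools import product

# Horizontal monoid B = {0,1} x {0,1}: AND on the first component, OR on the second.
B = list(product((0, 1), repeat=2))
IDENTITY = (1, 0)
ABSORBING = (0, 1)


def plus(g, h):
    """Horizontal sum: componentwise AND (first) and OR (second)."""
    return (g[0] & h[0], g[1] | h[1])


def act_and(g):
    """Assumed action of phi(AND): copy the first component."""
    return (g[0], g[0])


def act_or(g):
    """Assumed action of phi(OR): copy the second component."""
    return (g[1], g[1])


def evaluate(tree):
    """Evaluate a tree given as (label, children), label in {'and', 'or'}."""
    label, children = tree
    total = IDENTITY
    for child in children:
        total = plus(total, evaluate(child))
    return act_and(total) if label == "and" else act_or(total)


and_leaf = ("and", [])  # an AND with no argument evaluates to true
or_leaf = ("or", [])    # an OR with no argument evaluates to false
print(evaluate(("and", [and_leaf, or_leaf])))
print(evaluate(("or", [and_leaf, or_leaf])))
```

Under these assumptions, a one-node tree labelled $\wedge$ evaluates to $\langle\mathsf{1},\mathsf{1}\rangle$ and a one-node tree labelled $\vee$ to $\langle\mathsf{0},\mathsf{0}\rangle$, so every tree lands in the set $J$ used below.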

We denote by $L_{\mathsf{0}}$ and $L_{\mathsf{1}}$ the languages $\varphi^{-1}(\langle\mathsf{0},\mathsf{0}\rangle)$ and $\varphi^{-1}(\langle\mathsf{1},\mathsf{1}\rangle)$ , respectively. The proof described in Potthoff [20, Theorem 4.2] builds witnesses from a unique circuit that consists of two tuples $t_{\mathsf{0}}=(m,\nu_{\mathsf{0}},\psi_{\mathsf{0}})$ and $t_{\mathsf{1}}=(m,\nu_{\mathsf{1}},\psi_{\mathsf{1}})$ defined over the strongly connected set $J=\{\langle\mathsf{0},\mathsf{0}\rangle,\langle\mathsf{1},\mathsf{1}\rangle\}$ , where the mappings $\nu_{\mathsf{0}}$ and $\nu_{\mathsf{1}}$ are constant (they map every port to $J$ ) while $\psi_{\mathsf{0}}$ and $\psi_{\mathsf{1}}$ are depicted in Figure 1; this circuit satisfies $\mathbf{RC}(\mathcal{B}{\star}\mathcal{H}^{n}_{\tau,\pi})$ for every combination of $\tau$ and $\pi$ .

5.2 The duplex Boolean formulas

The “duplex” technique is a way of building examples of algebras that demand proofs where the witnesses have to be built simultaneously for two $\approx^{\mathbb{N}}$ -tight sets. We describe the duplex Boolean formulas as an example. The language consists of those trees that satisfy the following conditions.

$\bullet$ There is at least one interior node.

$\bullet$ If a node $x$ is a leaf, then its label $\lambda(x)$ belongs to $\{\mathsf{0},\mathsf{1}\}$; otherwise $\lambda(x)\in\{\cap,\cup,\sqcap,\sqcup\}$.

$\bullet$ The sons of a node with $\lambda(x)\in\{\cap,\sqcap\}$ are either two leaves, or four interior nodes $y,y^{\prime},z,z^{\prime}$ with $\lambda(y)=\lambda(y^{\prime})=\cup$ and $\lambda(z)=\lambda(z^{\prime})=\sqcup$.

$\bullet$ The sons of a node with $\lambda(x)\in\{\cup,\sqcup\}$ are either two leaves, or four interior nodes $y,y^{\prime},z,z^{\prime}$ with $\lambda(y)=\lambda(y^{\prime})=\cap$ and $\lambda(z)=\lambda(z^{\prime})=\sqcap$.

$\bullet$ Each node $x$ has a value $val(x)\in\{\mathsf{0},\mathsf{1},\bot\}$, defined according to the following rules.

  $\bullet$ A leaf evaluates to its label, e.g. if $\lambda(x)=\mathsf{1}$, then $val(x)=\mathsf{1}$.

  $\bullet$ An interior node $x$ with two sons $w,w^{\prime}$, which are leaves, evaluates to $val(w)\wedge val(w^{\prime})$ if $\lambda(x)\in\{\cap,\sqcap\}$, and to $val(w)\vee val(w^{\prime})$ if $\lambda(x)\in\{\cup,\sqcup\}$.

  $\bullet$ If $x$ has four sons and $\lambda(x)\in\{\cap,\sqcap\}$, then $val(x)\neq\bot$ iff none of its sons evaluates to $\bot$ and $val(y)\wedge val(y^{\prime})=val(z)\wedge val(z^{\prime})$, in which case $val(x)=val(y)\wedge val(y^{\prime})$; in other words, the AND of the subforests with roots in $\{\cap,\cup\}$ and $\{\sqcap,\sqcup\}$ are evaluated separately, and $val(x)\neq\bot$ iff the values coincide. When $x$ has four sons and $\lambda(x)\in\{\cup,\sqcup\}$, $val(x)$ is an OR, obtained in a similar way.
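The evaluation rules above can be sketched as a small recursive function. This is a minimal illustration under stated assumptions: trees are encoded as `(label, sons)` pairs, the labels $\cap,\cup,\sqcap,\sqcup$ are spelled `"cap"`, `"cup"`, `"sqcap"`, `"sqcup"`, the order of the four sons is ignored, and any tree that violates the structural conditions is simply given the value $\bot$ (encoded as `None`). None of these choices comes from the article.

```python
BOT = None  # stands for the undefined value "bottom"

AND_LABELS = {"cap", "sqcap"}
OR_LABELS = {"cup", "sqcup"}


def val(node):
    """Value in {0, 1, BOT} of a node encoded as (label, sons)."""
    label, sons = node
    if not sons:  # a leaf evaluates to its label
        return int(label) if label in ("0", "1") else BOT
    if label in AND_LABELS:
        op, round_lbl, square_lbl = (lambda a, b: a & b), "cup", "sqcup"
    elif label in OR_LABELS:
        op, round_lbl, square_lbl = (lambda a, b: a | b), "cap", "sqcap"
    else:
        return BOT
    values = [val(s) for s in sons]
    if BOT in values:
        return BOT
    if len(sons) == 2 and all(not s[1] for s in sons):  # two leaves
        return op(values[0], values[1])
    if len(sons) == 4 and all(s[1] for s in sons):  # four interior nodes
        round_vals = [v for s, v in zip(sons, values) if s[0] == round_lbl]
        square_vals = [v for s, v in zip(sons, values) if s[0] == square_lbl]
        if len(round_vals) == 2 and len(square_vals) == 2:
            r, q = op(*round_vals), op(*square_vals)
            return r if r == q else BOT  # both families must agree
    return BOT


def in_language(tree):
    """Root labelled 'cap' and evaluating to 1."""
    return tree[0] == "cap" and val(tree) == 1


leaf0, leaf1 = ("0", []), ("1", [])
cup1 = ("cup", [leaf0, leaf1])
sqcup1 = ("sqcup", [leaf1, leaf1])
sqcup0 = ("sqcup", [leaf0, leaf0])
print(val(("cap", [cup1, cup1, sqcup1, sqcup1])))
print(val(("cap", [cup1, cup1, sqcup1, sqcup0])))
```

The second call illustrates the role of $\bot$: the round and square families disagree, so the node has no Boolean value.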

The language of those trees where $\lambda(root)=\cap$ and $val(root)=\mathsf{1}$ is not first-order definable. Witnesses for this are built from an array of forests $\mathcal{T}$ consisting of the trees $t(\circleddash,\mathsf{0})$ , $t(\circleddash,\mathsf{1})$ , $t(\boxdot,\mathsf{0})$ and $t(\boxdot,\mathsf{1})$ , and from the circuit $\mathcal{M}$ , consisting of $M(\circleddash)=\{m(\circleddash,\mathsf{0}),\,m(\circleddash,\mathsf{1})\}$ and $M(\boxdot)=\{m(\boxdot,\mathsf{0}),\,m(\boxdot,\mathsf{1})\}$ , over the alphabets $A=\{\mathsf{0},\mathsf{1},\cap,\cup,\sqcap,\sqcup\}$ and $B=\{\circleddash,\boxdot\}$ . Each multicontext consists of one root, with label $\cap$ in those of $M(\circleddash)$ and $\sqcap$ in those of $M(\boxdot)$ , then four nodes $y,y^{\prime},z,z^{\prime}$ with $\lambda(y)=\lambda(y^{\prime})=\cup$ and $\lambda(z)=\lambda(z^{\prime})=\sqcup$ ; under each of them are four ports. In $m(\circleddash,\mathsf{0})$ and $m(\boxdot,\mathsf{0})$ , two of the ports below node $y$ carry the label $(\circleddash,\mathsf{0})$ and the other two the label $(\boxdot,\mathsf{0})$ ; the same holds for the ports located under $z$ . Under the node $y^{\prime}$ , two ports carry the label $(\circleddash,\mathsf{1})$ and the other two $(\boxdot,\mathsf{1})$ ; the same holds for the ports located under $z^{\prime}$ . Thus, under nodes $y$ and $y^{\prime}$ , the $\mathsf{0}$ ’s and $\mathsf{1}$ ’s of the four port labels in $\{\circleddash\}\times\{\mathsf{0},\mathsf{1}\}$ follow the same pattern as the four ports of the $m_{\mathsf{0}}$ multicontext defined in the previous section. The same observation holds for the port labels in $\{\boxdot\}\times\{\mathsf{0},\mathsf{1}\}$ , as well as for the ports under $z$ and $z^{\prime}$ . Similarly, the port labels in $m(\circleddash,\mathsf{1})$ and $m(\boxdot,\mathsf{1})$ follow the pattern of the ports of $m_{\mathsf{1}}$ .

Each array of witnesses $\mathcal{S}^{(n)}=\mathcal{M}^{n}{\cdot}\mathcal{T}$ consists of two subarrays $S^{(n)}(\circleddash)$ and $S^{(n)}(\boxdot)$ , whose elements can be told apart by looking at the label of their roots.

5.3 Forest algebras with aperiodic vertical monoid and uniform vertical confusion

We describe two examples of forest algebras where the vertical monoid is aperiodic and that have vertical confusion on uniform multicontexts.

The first example, discussed in [21], is the set $L_{\text{even}}$ of those binary trees over a trivial alphabet $A=\{{\bullet}\}$ where all leaves are located at a nonzero even depth. In its syntactic forest algebra, the horizontal monoid is $\{0,\mathbf{e},\mathbf{o},\mathbf{e}\mathbf{e},\mathbf{o}\mathbf{o},\infty\}$ , where $0$ is the identity, $\mathbf{e}+\mathbf{e}=\mathbf{e}\mathbf{e}$ , $\mathbf{o}+\mathbf{o}=\mathbf{o}\mathbf{o}$ , and everything else evaluates to the absorbing element $\infty$ . The one-port context ${\bullet}\square$ is mapped to the vertical monoid element that maps $0$ and $\mathbf{e}\mathbf{e}$ to $\mathbf{o}$ , $\mathbf{o}\mathbf{o}$ to $\mathbf{e}$ , and everything else to $\infty$ . The vertical monoid of its syntactic forest algebra$^{7}$ is aperiodic and has $19$ elements; the non-idempotent ${\bullet}\square$ constitutes a $\mathcal{J}$ -class by itself; the others constitute three regular $\mathcal{J}$ -classes analogous to those of the Brandt monoid $B_{4}$ . Vertical confusion is verified with the multicontext with one interior node that is the father of two ports, i.e. ${\bullet}(\square+\square)$ .

$^{7}$ In this example and the next ones, the action on [math] of the generators of the vertical monoid is not defined; the size varies depending on how this gap is filled. The other properties that matter here are not affected.
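The horizontal monoid and the action of ${\bullet}\square$ described above are small enough to be run by hand. The Python sketch below does so, assuming that the identity is written $0$ and that ${\bullet}\square$ sends the identity and $\mathbf{e}\mathbf{e}$ to $\mathbf{o}$ and $\mathbf{o}\mathbf{o}$ to $\mathbf{e}$; the encoding of a tree as the list of its subtrees and the helper `complete` are hypothetical names introduced for the illustration.

```python
ZERO, E, O, EE, OO, INF = "0", "e", "o", "ee", "oo", "inf"


def plus(g, h):
    """Horizontal sum: 0 is the identity, e+e=ee, o+o=oo, all else is inf."""
    if g == ZERO:
        return h
    if h == ZERO:
        return g
    if g == h == E:
        return EE
    if g == h == O:
        return OO
    return INF


def node(g):
    """Assumed action of the one-port context (a node above the forest g)."""
    if g in (ZERO, EE):
        return O
    if g == OO:
        return E
    return INF


def value(tree):
    """Value of a tree encoded as the list of its subtrees ([] is a leaf)."""
    total = ZERO
    for sub in tree:
        total = plus(total, value(sub))
    return node(total)


def complete(levels):
    """Complete binary tree with the given number of levels (hypothetical helper)."""
    return [] if levels == 1 else [complete(levels - 1), complete(levels - 1)]


for levels in range(1, 6):
    print(levels, value(complete(levels)))
print(value([[], [[], []]]))  # leaves at two different depths
```

Complete binary trees alternate between $\mathbf{o}$ and $\mathbf{e}$ as levels are added, while a tree whose leaves sit at different depths, or a node with a single son, collapses to $\infty$. Which of $\mathbf{e}$ and $\mathbf{o}$ is the accepting element depends on the convention used for the depth of the root.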

The Ehrenfeucht-Fraïssé game that proves that $L_{\text{even}}$ is not in $\mathsf{FO}[\prec]$ uses nontrivial uniform multicontexts whose interior nodes constitute the set $\{m_{d}:d\geq 1\}$ , where $m_{d}$ is the complete binary tree of depth $d$ over $A$ , and where each node at depth $d$ is the father of two ports. Given a labeling $\lambda:ports(m_{d})\rightarrow\{\mathbf{e},\mathbf{o},\mathbf{e}\mathbf{e},\mathbf{o}\mathbf{o}\}$ , we have $\breve{\varphi}(m_{d},\lambda)\neq\infty$ only if $\lambda$ is a constant mapping with image in $\{\mathbf{e},\mathbf{o}\}$ . A suitable set $\mathcal{S}^{1}$ consists of $s_{\mathbf{e}}$ and $s_{\mathbf{o}}$ , where $s_{\mathbf{e}}$ and $s_{\mathbf{o}}$ are built from $m_{2\tau}$ and $m_{2\tau+1}$ , respectively, by inserting $\mathbf{e}$ at every port. Let $\sigma(n)=2\tau(2^{n+1}+1)$ . The circuit $\mathcal{M}^{(n+1)}$ , which consists of the multicontexts $m_{\sigma(n)}$ and $m_{\sigma(n)+1}$ where all ports carry the same label, satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ and therefore can be used to build $\mathcal{S}^{(n+1)}$ .

Our second example is a language defined as follows:

(1) it consists of trees which are either one-node with label $\mathsf{0}$, $\mathsf{1}$, $\mathsf{2}$ or $\bot$, or built recursively through insertions in the four-port uniform multicontext $m=a(b(y_{0}+y_{1})+b(z_{0}+z_{1}))$; node labels are taken in $\{a,b\}$ and port labels are undefined;

(2) among them, exactly those which evaluate to $\mathsf{0}$ according to the two tables below belong to the language.

[TABLE]

Besides the identity $0$ and the absorbing element $\infty$ , the horizontal monoid has one element for each of the following: a tree that evaluates to $\mathsf{0}$ , to $\mathsf{1}$ and to $\mathsf{2}$ , and a sum of two such trees. The vertical monoid of its syntactic forest algebra contains three regular $\mathcal{J}$ -classes, analogous to those of the Brandt monoid $B_{6}$ . Vertical confusion is verified with the multicontext $m$ given above.

5.4 The Potthoff Algebra

This is the syntactic algebra of a language of trees over $A=\{\mathsf{0},\mathsf{1},\bot,\vartriangle,\doteqdot\}$ defined as follows (see [21, Definition 25]):

(1) it is included in a set which consists of three one-node trees with labels $\mathsf{0}$, $\mathsf{1}$ or $\bot$, and of all those trees built recursively through insertions in the three-port multicontext $m=\vartriangle(z+\doteqdot(y_{0}+y_{1}))$, depicted in Figure 3, where the ports are labelled with variable names;

(2) it consists of exactly those trees which evaluate to $\mathsf{1}$ according to the following two tables.

[TABLE]

Condition (1) can be specified with a first-order formula in $\mathsf{FO}[\prec]$ ; Condition (2) is specified with a formula in $\mathsf{FOMod_{2}}[\prec]$ which, at every ‘ $\vartriangle$ ’ node $x$ , considers three of the paths that start at $x$ and end at a leaf, namely the unique path that traverses only ‘ $\vartriangle$ ’ nodes, through copies of the port $z$ , which we call the a-path of $x$ , plus the a-path of each ‘ $\vartriangle$ ’ node that is a grandson of $x$ through its ‘ $\doteqdot$ ’ son. The formula verifies the parities of their lengths and the labels of the leaves located at their ends.

The reader can verify that the horizontal monoid $G$ of this algebra $\mathcal{G}=(G,W)$ consists of $10$ elements: an identity, an absorbing element, and eight mutually accessible elements, two of which can be interpreted as ‘a $\vartriangle$ node with output $\mathsf{0}$ ’ and ‘a $\vartriangle$ node with output $\mathsf{1}$ ’; we denote these elements $\hat{\mathtt{0}}$ and $\hat{\mathtt{1}}$ , respectively. Using Section 3.3, it can be verified that $\leftrightarrow_{3,1}$ refines the canonical congruence of $\mathcal{G}$ .

5.4.1 Verifying non-definability with the extended algebra

One thing that places the Potthoff algebra outside of ${\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,1}\rangle\!\rangle$ for any $\tau\geq 1$ is that the vertical monoid $W_{\%}$ of its extended algebra $\mathcal{G}_{\%}$ is non-aperiodic. To see this, we define $B=\{b,b^{\prime}\}$ , $C_{b}=\{c_{0}\}$ and $C_{b^{\prime}}=\{c_{0}^{\prime},c_{1}^{\prime}\}$ ; the extension of $\varphi$ to $C$ is given by $\varphi(c_{0})=\varphi(c_{0}^{\prime})=\mathsf{0}$ and $\varphi(c_{1}^{\prime})=\mathsf{1}$ . Consider a labeling where $\nu(z)=b$ and $\nu(y_{0})=\nu(y_{1})=b^{\prime}$ , and where we do not care about $\psi$ . We define a sequence of tuples $t_{i}$ , $i\geq 1$ , where $t_{1}=(m,\nu,\psi)$ and every $t_{i}$ is obtained by inserting $t_{i-1}$ at the $z$ port of $t_{1}$ . The corresponding elements of $W_{\%}$ satisfy $\varphi_{\%}(\nabla(t_{i},z))\neq\varphi_{\%}(\nabla(t_{i+1},z))$ and $\varphi_{\%}(\nabla(t_{i},z))=\varphi_{\%}(\nabla(t_{i+2},z))$ for every $i\geq 1$ ; in particular, $\varphi_{\%}(\nabla(t_{i},z))$ maps $\{\mathsf{0}\}$ to $\{\mathsf{0},\bot\}$ if $i$ is even and to $\{\mathsf{1},\bot\}$ if $i$ is odd. In other words, the group $\mathbb{Z}_{2}$ divides the vertical monoid of $\mathcal{G}_{\%}$ . The labeling used here, where $\nu(y_{0})=\nu(y_{1})=b^{\prime}$ means that the images by $\varphi$ of the lateral subtrees are not taken into account, enables $\mathcal{G}_{\%}$ to test explicitly the parity of the length of the path between the $z$ port and the root, i.e. the a-path of $t_{i}$ .
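The argument above shows that the powers of one element of $W_{\%}$ alternate with period $2$. As a generic illustration of what this means, the following Python sketch computes the index and period of a transformation of a finite set; an element is aperiodic exactly when its period is $1$. The two toy transformations are hypothetical stand-ins chosen for the example and are not elements of the actual monoid $W_{\%}$.

```python
def index_and_period(f, domain):
    """Return (index, period) of the transformation f, given as a dict.

    The index i and period p are the least positive integers
    such that f^(i+p) = f^i.
    """
    seen = {}
    power = tuple(f[x] for x in domain)  # images under f^1
    k = 1
    while power not in seen:
        seen[power] = k
        power = tuple(f[y] for y in power)  # images under f^(k+1)
        k += 1
    index = seen[power]
    return index, k - index


# Toy stand-in for the parity test: two states swapped at each step.
flip = {"even": "odd", "odd": "even"}
print(index_and_period(flip, ["even", "odd"]))

# An aperiodic transformation, for contrast.
collapse = {0: 1, 1: 2, 2: 2}
print(index_and_period(collapse, [0, 1, 2]))
```

A period larger than $1$ for some element means that a nontrivial cyclic group divides the monoid, which is the situation described above with $\mathbb{Z}_{2}$.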

5.4.2 Verifying non-definability with a recursive proof

Potthoff wrote his proof that $\mathcal{G}$ is not first-order in his doctoral thesis [19] but apparently did not publish it elsewhere. The proof described here is similar in its main lines; the differences come from the fact that Potthoff’s construction builds witnesses for the congruence defined by the first-order formulas of quantifier depth $n$ , whereas we are interested in witnesses for congruences of the form $\approx^{n}_{\tau,1}$ .

We recall that $A=\{\mathsf{0},\mathsf{1},\bot,\vartriangle,\doteqdot\}$ and we work with $B=\{b_{y},b_{z}\}$ , $J=\{\hat{\mathtt{0}},\hat{\mathtt{1}}\}$ , and a unique (up to labeling of the ports) multicontext $m$ , with $Y=\{y_{0},y_{1}\}=\mu^{-1}(b_{y})$ and $Z=\{z\}=\mu^{-1}(b_{z})$ . The proof consists in circuits that are pairs $\breve{\psi}(p_{\theta}^{(k)}),\breve{\psi}(p_{\theta+1}^{(k)})$ , for every $\sigma\geq\tau$ and appropriate integers $\theta$ and $k$ , such that $\breve{\psi}(p_{\theta}^{(k)})\,\langle\star\mathcal{J}^{\sigma,1}\rangle\,\breve{\psi}(p_{\theta+1}^{(k)})$ , $\breve{\varphi}(p_{\theta}^{(k)},\psi)=\hat{\mathtt{0}}$ if $\theta$ is odd, and $\breve{\varphi}(p_{\theta}^{(k)},\psi)=\hat{\mathtt{1}}$ if $\theta$ is even. The first step in the construction of $p_{\theta}^{(k)}$ and $p_{\theta+1}^{(k)}$ consists in pumping $m$ along $Z$ to create $m^{(\theta,Z)}$ and $m^{(\theta+1,Z)}$ , then inserting $\hat{\mathtt{1}}$ at the end of the a-paths, and denoting the resulting multicontexts $p_{\theta}^{(1)}$ and $p_{\theta+1}^{(1)}$ . An induction on $\theta$ shows that a suitable $\psi$ exists for every $\theta$ , for which $\varphi(p_{\theta}^{(1)},\psi)\neq\bot$ .

For every $k\geq 1$ , we build multicontexts $p_{\theta}^{(k+1)}$ and $p_{\theta+1}^{(k+1)}$ by inserting copies of $p_{\theta}^{(1)}$ and $p_{\theta+1}^{(1)}$ at the ports of $Y(p_{\theta}^{(k)})$ and $Y(p_{\theta+1}^{(k)})$ : if the port $y$ satisfies $\psi(y)=\hat{\mathtt{0}}$ , then $p_{\theta}^{(1)}$ is inserted if $\theta$ is odd (and $\varphi(p_{\theta}^{(1)},\psi)=\hat{\mathtt{0}}$ ), otherwise $p_{\theta+1}^{(1)}$ is inserted.

We define a coordinate system for every node and port of $p_{\theta}^{(k)}$ , $k\geq 1$ . With the convention that the root, a ‘ $\vartriangle$ ’ node, sits at depth level $1$ , we give to the ‘ $\vartriangle$ ’ node at depth level $d$ in $p_{\theta}^{(1)}$ and to its ‘ $\doteqdot$ ’ son the coordinates $a_{d}^{(\theta)}$ and $b_{d}^{(\theta)}$ , respectively. If we work on $p_{\theta}^{(1)}$ , i.e. no multicontext is inserted at the ports of $Y(p_{\theta}^{(1)})$ , then the sons of $b_{d}^{(\theta)}$ are the ports $y_{d0}^{(\theta)}$ and $y_{d1}^{(\theta)}$ . Otherwise, the two sons of $b_{d}^{(\theta)}$ are ‘ $\vartriangle$ ’ nodes; the one inserted at $y_{di}$ receives the coordinate $a_{di1}^{(\theta)}$ , its ‘ $\doteqdot$ ’ son $b_{di1}^{(\theta)}$ , and the two ports below it $y_{di10}^{(\theta)}$ and $y_{di11}^{(\theta)}$ . Along the a-path from $a_{di1}^{(\theta)}$ , the ‘ $\vartriangle$ ’ node at depth $e$ receives coordinate $a_{die}^{(\theta)}$ , its ‘ $\doteqdot$ ’ son the coordinate $b_{die}^{(\theta)}$ , and the ports below it $y_{die0}^{(\theta)}$ and $y_{die1}^{(\theta)}$ . In $p_{\theta}^{(3)}$ , these ports are replaced with ‘ $\vartriangle$ ’ nodes with coordinates $a_{die01}^{(\theta)}$ and $a_{die11}^{(\theta)}$ , etc.

We now look at two tuples $(p_{\theta}^{(k)},\psi)$ and $(p_{\theta+1}^{(k)},\psi)$ ; then we observe that at every depth $d\leq\theta$ along the a-paths of the roots, exactly one of $a_{d}^{(\theta)}$ and $a_{d}^{(\theta+1)}$ is an $a\mathsf{1}$ node; the other is an $a\mathsf{0}$ . When $a_{d}^{(\theta)}$ is an $a\mathsf{1}$ , then its son $b_{d}^{(\theta)}$ feeds it a $\hat{\mathtt{0}}$ , and this occurs exactly when $\psi(y_{d0}^{(\theta)})\neq\psi(y_{d1}^{(\theta)})$ if $k=1$ , and when only one of $a_{d01}^{(\theta)}$ and $a_{d11}^{(\theta)}$ is an $a\mathsf{1}$ node, if $k\geq 2$ . This is the pattern denoted $Q_{0}$ in Figure 5. Otherwise, $a_{d}^{(\theta)}$ is an $a\mathsf{0}$ node, $b_{d}^{(\theta)}$ feeds it a $\hat{\mathtt{1}}$ , and either $k=1$ and $y_{d0}^{(\theta)}$ and $y_{d1}^{(\theta)}$ carry the same label, or $k\geq 2$ and $a_{d01}^{(\theta)}$ and $a_{d11}^{(\theta)}$ are both $a\mathsf{0}$ or both $a\mathsf{1}$ nodes; these are the patterns $Q_{10}$ and $Q_{11}$ . Assume that $a_{d}^{(\theta)}$ is the root of a pattern $Q_{0}$ : we count those ports labelled with $\hat{\mathtt{0}}$ and those with $\hat{\mathtt{1}}$ , the result is a pair $(n_{d}^{(\theta,\mathsf{0})},n_{d}^{(\theta,\mathsf{1})})=(1,1)=(2^{0},2^{0})$ . The same work done on a $Q_{10}$ rooted at $a_{d}^{(\theta)}$ gives $(n_{d}^{(\theta+1,\mathsf{0})},n_{d}^{(\theta+1,\mathsf{1})})=(2,0)=(2^{0}+1,2^{0}-1)$ ; if this is a $Q_{11}$ , then $(n_{d}^{(\theta+1,\mathsf{0})},n_{d}^{(\theta+1,\mathsf{1})})=(0,2)=(2^{0}-1,2^{0}+1)$ . Observe that the numbers $n_{d}^{(\theta,\mathsf{0})}$ and $n_{d}^{(\theta+1,\mathsf{0})}$ , as well as $n_{d}^{(\theta,\mathsf{1})}$ and $n_{d}^{(\theta+1,\mathsf{1})}$ , differ by $1$ .

Assume that $a_{d}^{(\theta)}$ is an $a\mathsf{1}$ node; then for every depth $e$ , exactly one of $a_{d0e}^{(\theta)}$ and $a_{d1e}^{(\theta)}$ is an $a\mathsf{1}$ node and the root of a $Q_{0}$ pattern; the other is the root of a $Q_{10}$ or a $Q_{11}$ . When $k=2$ , the leaves of these patterns are the four ports $y_{d0e0}^{(\theta)}$ , $y_{d0e1}^{(\theta)}$ , $y_{d1e0}^{(\theta)}$ , and $y_{d1e1}^{(\theta)}$ ; their contexts within $p_{\theta}^{(2)}$ are equivalent under $\stackrel{{\scriptstyle\sigma,1}}{{\leftrightarrow}}$ , so that it makes sense to look at the pair of numbers obtained by counting the $\hat{\mathtt{0}}$ s and $\hat{\mathtt{1}}$ s at these ports; the result is

[TABLE]

Meanwhile, $a_{d}^{(\theta+1)}$ is an $a\mathsf{0}$ node, so that $\psi(y_{d0}^{(\theta+1)})=\psi(y_{d1}^{(\theta+1)})$ and at every depth $e$ , $a_{d0e}^{(\theta+1)}$ and $a_{d1e}^{(\theta+1)}$ are both $a\mathsf{0}$ nodes or both $a\mathsf{1}$ nodes. In the former case, the ports under their ‘ $\doteqdot$ ’ sons constitute two $Q_{0}$ patterns; otherwise, we can place a $Q_{10}$ under one of them and a $Q_{11}$ under the other. Either way, we obtain $n_{d0e}^{(\theta+1,2,\mathsf{0})}=n_{d1e}^{(\theta+1,2,\mathsf{1})}=(2,2)$ .
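The counts for $k=2$ can be rechecked with a few lines of arithmetic. The sketch below assumes that the pair attached to a group of four ports is the componentwise sum of the pairs of the two patterns it is made of, with $Q_{0}$, $Q_{10}$ and $Q_{11}$ contributing $(1,1)$, $(2,0)$ and $(0,2)$ as computed above; the variable names are introduced here for the illustration only.

```python
# (number of ports labelled 0-hat, number of ports labelled 1-hat) per pattern
Q0, Q10, Q11 = (1, 1), (2, 0), (0, 2)


def add(p, q):
    """Componentwise sum of two count pairs."""
    return (p[0] + q[0], p[1] + q[1])


# Below an a1 node: one Q0 pattern and one of Q10, Q11.
below_a1 = {add(Q0, Q10), add(Q0, Q11)}

# Below an a0 node: two Q0 patterns, or one Q10 and one Q11.
below_a0 = {add(Q0, Q0), add(Q10, Q11)}

k = 2
half = 2 ** (k - 1)
print(sorted(below_a1), sorted(below_a0))
print(below_a1 == {(half + 1, half - 1), (half - 1, half + 1)})
print(below_a0 == {(half, half)})
```

In both tuples the four ports therefore carry counts that differ by exactly $1$ in each component, as for $k=1$.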

Iterating the process and building multicontexts $p_{\theta}^{(k)}$ and $p_{\theta+1}^{(k)}$ , it is possible to redefine $\psi$ on $Y(p_{\theta}^{(k)})$ and $Y(p_{\theta+1}^{(k)})$ in such a way that for every set of $2^{k}$ ports whose contexts are equivalent under $\stackrel{{\scriptstyle\sigma,1}}{{\leftrightarrow}}$ , the corresponding pairs of numbers $(n_{d\ldots}^{(\theta,k,\mathsf{0})},n_{d\ldots}^{(\theta,k,\mathsf{1})})$ and $(n_{d\ldots}^{(\theta+1,k,\mathsf{0})},n_{d\ldots}^{(\theta+1,k,\mathsf{1})})$ are $(2^{k-1},2^{k-1})$ below the $a\mathsf{0}$ node, and one of $(2^{k-1}+1,2^{k-1}-1)$ and $(2^{k-1}-1,2^{k-1}+1)$ below the $a\mathsf{1}$ node.

Consider two ports $y_{di}^{(\theta)}$ within $p_{\theta}^{(1)}$ and $y_{d^{\prime}j}^{(\theta+1)}$ within $p_{\theta+1}^{(1)}$ , where $i,j\in\{0,1\}$ : we have

[TABLE]

If $\theta\geq 3\sigma$ , then for every pair $d,i$ , the equivalence class of $\nabla(p_{\theta}^{(1)},y_{di}^{(\theta)})$ contains the contexts of at least four ports of $p_{\theta}^{(1)}$ and four of $p_{\theta+1}^{(1)}$ , and at least $4\sigma$ in the case where $\sigma<d<\theta-\sigma$ . In every case, for each class the numbers of ports labelled $\hat{\mathtt{0}}$ and of ports labelled $\hat{\mathtt{1}}$ are both at least $0=2^{0}-1$ . Applying the same reasoning to $p_{\theta}^{(k)}$ and $p_{\theta+1}^{(k)}$ , we find that every equivalence class that contains the context of a port of $Y(p_{\theta}^{(k)})$ contains at least $2^{k}$ such contexts, and the numbers of $\hat{\mathtt{0}}$ s and $\hat{\mathtt{1}}$ s are both at least $2^{k-1}-1$ . For every threshold $\tau$ and iteration level $n$ , one can take integers $\sigma$ and $k$ such that $\leftrightarrow_{\sigma,1}$ refines $\approx^{n+1}_{\tau,1}$ and $k\geq\log_{2}\sigma$ , and build the multicontexts $p_{\theta}^{(k)}$ and $p_{\theta+1}^{(k)}$ : by the above reasoning, a labeling $\psi$ can be defined on them, such that $(p_{\theta}^{(k)},\psi)\,\langle\star\mathcal{J}^{\sigma,1}\rangle\,(p_{\theta+1}^{(k)},\psi)$ . These tuples constitute a circuit $\mathcal{M}^{({n+1})}$ that satisfies $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,1})$ . Therefore, the Potthoff algebra does not belong to any variety ${\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,1}\rangle\!\rangle$ .

6 Conclusion

This research started with the intention of understanding the mechanisms that underlie the Ehrenfeucht-Fraïssé games used to prove that a forest algebra lies outside of ${\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ . Multicontexts and tuples, where ports have multiple labelings, are at the center of this proof technique, and the need for a description of how a forest algebra works on these objects led to the definition of the algebras $\mathcal{G}_{\#}$ and $\mathcal{G}_{\%}$ . In parallel to this came the observation that Ehrenfeucht-Fraïssé games are recursive and uniform; while games with these two properties always exist on monoids, word languages and linear orderings, it is not clear whether the same holds in the world of trees and forests. We could prove the existence of a recursive proof for every algebra outside of ${\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ , while our investigation of proof-by-pumping has not solved the question of the existence of a uniform proof.

Given an algebra $\mathcal{G}=(G,W)$ , with $\sigma$ and $\rho$ the threshold and period of $\mathcal{G}_{\%}$ , Theorem 4.7 can be combined with Propositions 3.7 and 4.8 into this statement:

[TABLE]

or conversely,

[TABLE]

All examples of algebras outside of ${\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ known to the author satisfy the left hand side of (3), so that it makes sense to ask whether these implications are actually equivalences. If they are, then since satisfaction of $\mathbf{RC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,\rho})$ is a recursively enumerable problem, membership of $\mathcal{G}$ in ${\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ is decidable, by running in parallel a search of an $n\geq 1$ such that $\mathcal{G}\prec\mathcal{H}^{n}_{\tau,\pi}$ , and a test for non-membership that consists in verifying the left hand side of (3), knowing that one of them will stop. Proving that $\mathbf{RC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,\rho})$ is by itself decidable would of course yield a procedure more elegant than this.
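The parallel search mentioned above is the usual dovetailing of two semi-decision procedures. The Python sketch below shows only the control flow: each procedure is modelled as a generator that yields `None` while it has found nothing and a witness once it succeeds. The function `toy_search` and its predicates are hypothetical placeholders; they do not implement the search for an $n$ such that $\mathcal{G}\prec\mathcal{H}^{n}_{\tau,\pi}$ or the verification of the left hand side of (3).

```python
from itertools import count


def dovetail(member_search, nonmember_search):
    """Alternate single steps of two semi-decision procedures.

    Terminates only if at least one of the two searches eventually
    yields a witness, which is the hypothesis made in the text.
    """
    searches = (("member", member_search), ("non-member", nonmember_search))
    while True:
        for verdict, search in searches:
            witness = next(search)
            if witness is not None:
                return verdict, witness


def toy_search(predicate):
    """Placeholder search: try n = 1, 2, 3, ... and yield n when it works."""
    for n in count(1):
        yield n if predicate(n) else None


# One search succeeds at n = 8, the other one never does.
print(dovetail(toy_search(lambda n: n * n >= 50), toy_search(lambda n: False)))
```

The procedure is correct only if exactly one of the two searches can succeed on a given input, and it gives no bound on the running time, which is why a direct decision procedure for $\mathbf{RC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,\rho})$ would be preferable.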

Satisfaction of $\mathbf{RC}(\mathcal{G}{\star}\mathcal{J}^{\sigma,\rho})$ implies the existence of a recursive proof for ${\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ ; whether it also implies the existence of a uniform one is an open question. If there exist algebras outside of ${\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ that have only non-uniform proofs, then one would ask whether this property is decidable.

It is also worth noting that conversely, Theorem 4.7 suggests that if membership in ${\mathbf{**}}\langle\!\langle\mathbb{N}_{\tau,\pi}\rangle\!\rangle$ is undecidable, then this might be proved by showing that the existence of a recursive non-membership proof is undecidable.

Condition $\mathbf{RC}(\mathcal{G}{\star}\mathcal{H}^{n}_{\tau,\pi})$ gives a new viewpoint on the combinatorics of first-order languages; it involves counting under a threshold and a period, just as in the one developed in [10, 13], but in a nonlocal (and messy) way along an antichain of ports. Whether they can be deduced from one another remains to be seen.

The author thanks Howard Straubing, both for discussions about this research and for giving him access to unpublished material. Discussions with Andreas Krebs and Charles Paperman were also useful.

Bibliography

[1] M. Benedikt, L. Ségoufin, Regular tree languages definable in $\mathsf{FO}$ and in $\mathsf{FO}_{\mathsf{mod}}$, ACM Trans. Comput. Log. 11 (2009) Issue 1, Article 4.
[2] M. Bojańczyk, L. Ségoufin, Tree Languages Defined in First-Order Logic with One Quantifier Alternation, Proc. of the 35th International Colloquium on Automata, Languages and Programming, Lecture Notes in Comp. Sci. 5126, Springer-Verlag (2008), pp. 233-245.
[3] M. Bojańczyk, L. Ségoufin, H. Straubing, Piecewise testable tree languages, Proc. of the 23rd IEEE Symp. on Logics in Computer Science (2008), pp. 442-451.
[4] M. Bojańczyk, H. Straubing, I. Walukiewicz, Wreath Products of Forest Algebras, with Application to Tree Logics, Logical Methods in Computer Science 8 4:03 (2012), pp. 1-39.
[5] M. Bojańczyk, H. Straubing, I. Walukiewicz, Varieties of Forest Algebras, Unpublished manuscript (2012).
[6] M. Bojańczyk, I. Walukiewicz, Forest Algebras, in Logic and automata, History and Perspectives, E.J.Flum and T.Wilke eds, Amsterdam University Press, 2008.
[7] S. Burris, H. Sankappanavar, A Course in Universal Algebra, Millennium Edition (2000).
[8] Z. Ésik, P. Weil, Algebraic recognizability of regular tree languages, Theoretical Computer Science 340 (2005), pp. 291-321.
