Linear Bounded Composition of Tree-Walking Tree Transducers: Linear Size Increase and Complexity

Joost Engelfriet; Kazuhiro Inaba; Sebastian Maneth

arXiv:1904.09203·cs.FL·December 13, 2019

TL;DR

This paper proves that every composition of tree-walking tree transducers can be realized with intermediate results whose sizes are at most linear in the size of the output tree, and derives consequences for the expressiveness and computational complexity of such compositions.

Contribution

It introduces the concept of linear bounded composition for tree-walking transducers and analyzes the complexity and expressiveness of their compositions.

Findings

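01 Every composition can be realized with intermediate results whose sizes are at most linear in the size of the output tree.

02 A translation in the hierarchy that is a function of linear size increase can be realized by a single deterministic transducer.

03 Compositions of deterministic transducers are computable in linear time on a RAM and in linear space on a Turing machine, measured in the sum of the sizes of the input and output tree; compositions of nondeterministic ones are computable in simultaneous polynomial time and linear space on a nondeterministic Turing machine.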

Equations

T_{1} \circ (T_{2} * T_{3}) \subseteq (T_{1} \circ T_{2}) * T_{3} \text{\quad and \quad} (T_{1} * T_{2}) * T_{3} \subseteq T_{1} * (T_{2} \circ T_{3}).

T_{\Sigma}^{\bullet} = \{(t, u) \mid t \in T_{\Sigma},\ u \in N(t)\}.

T(L) = \{(t, u) \in T_{\Sigma}^{\bullet} \mid t|_{u} \in L\}

\begin{array}[]{ll}{\rm stay},&\\ {\rm up}&\text{provided }j\neq 0,\text{ and}\\ {\rm down}_{i}&\text{with }1\leq i\leq\operatorname{rank}_{\Sigma}(\sigma).\end{array}

\begin{array}[]{llllll}\langle d,\sigma,j^{\prime},T^{\mathrm{c}}\rangle&\to&\langle d,{\rm down}_{1}\rangle&\langle d,e,j\rangle&\to&\langle u_{j},{\rm up}\rangle\\[1.13809pt] \langle d,\sigma,j^{\prime},T\rangle&\to&\sigma(\langle p,{\rm stay}\rangle,\langle d,{\rm down}_{1}\rangle)&\langle d,e,0\rangle&\to&e\\[5.69054pt] \langle u_{1},\sigma,j^{\prime}\rangle&\to&\langle d,{\rm down}_{2}\rangle&\langle p,\tau,j\rangle&\to&\sigma(\langle p,{\rm up}\rangle,j)\\[1.13809pt] \langle u_{2},\sigma,j\rangle&\to&\langle u_{j},{\rm up}\rangle&\langle p,\tau,0\rangle&\to&e\\[1.13809pt] \langle u_{2},\sigma,0\rangle&\to&e&&&\end{array}

\langle p, \tau, j \rangle \to \sigma(\langle p, {\rm up} \rangle, \langle p^{\prime}, {\rm stay} \rangle) \text{\quad and \quad} \langle p^{\prime}, \tau, j \rangle \to j.

\begin{array}[]{lll}\langle d,\sigma,j^{\prime}\rangle&\to&\langle d,{\rm down}_{1}\rangle\\[1.13809pt] \langle d,e,j\rangle&\to&\sigma(\langle u_{j},{\rm up}\rangle,\langle u_{j},{\rm up}\rangle)\\[1.13809pt] \langle d,e,0\rangle&\to&\sigma(e,e)\end{array}

\langle p, \sigma, j, T \rangle \to \langle \sigma, T \rangle(\langle p, {\rm down}_{1} \rangle, \dots, \langle p, {\rm down}_{m} \rangle)

\langle q, \sigma, j, T \cap {\rm inv}_{q}(T^{\prime}) \rangle \to \langle \delta, T^{\prime} \rangle(\langle q_{1}, \alpha_{1} \rangle, \dots, \langle q_{k}, \alpha_{k} \rangle)

\mbox{\sf dTT${}_{\downarrow}$}\circ\mbox{\sf dTT${}^{\ell}_{\downarrow}$}\subseteq\mbox{\sf dTT${}_{\downarrow}$}\text{\quad and \quad}\mbox{\sf dTT${}_{\mathrm{pru}}$}\circ\mbox{\sf dTT${}^{\ell}_{\mathrm{pru}}$}\subseteq\mbox{\sf dTT${}_{\mathrm{pru}}$}.

\mbox{\sf TT${}_{\downarrow}$}\circ\mbox{\sf TT${}^{\ell}_{\mathrm{pru}}$}\subseteq\mbox{\sf TT${}_{\downarrow}$}\text{\quad and \quad}\mbox{\sf TT${}_{\mathrm{pru}}$}\circ\mbox{\sf TT${}^{\ell}_{\mathrm{pru}}$}\subseteq\mbox{\sf TT${}_{\mathrm{pru}}$}.

\langle p, \sigma, j \rangle \to \sigma(\langle p_{1}, {\rm down}_{1} \rangle, \dots, \langle p_{m}, {\rm down}_{m} \rangle)

\mbox{\sf dTT${}_{\mathrm{su}}$}\circ\mbox{\sf dTT}\subseteq\mbox{\sf dTT${}_{\mathrm{su}}$}\circ\mbox{\sf dTT${}^{\ell}_{\mathrm{rel}}$}\circ\mbox{\sf dTT${}^{\ell}$}.

\mbox{\sf mrMT${}_{\text{{\sc io}}}$}\subseteq\mbox{\sf dTT${}^{\ell}_{\downarrow}$}\circ\mbox{\sf f\,TT${}^{\ell}_{\downarrow}$}\circ\mbox{\sf dTT}\circ\mbox{\sf dTT${}_{\downarrow}$}

\begin{array}[]{lll}\mbox{\sf TT}\circ\mbox{\sf TT}^{k}&\subseteq&\mbox{\sf TT}\circ(\mbox{\sf TT${}_{\mathrm{pru}}$}\ast\mbox{\sf TT}^{k})\\ &\subseteq&(\mbox{\sf TT}\circ\mbox{\sf TT${}_{\mathrm{pru}}$})\ast\mbox{\sf TT}^{k}\\ &\subseteq&\mbox{\sf TT}\ast\mbox{\sf TT}^{k}\\ &\subseteq&(\mbox{\sf TT${}_{\mathrm{pru}}$}\ast\mbox{\sf TT})\ast\mbox{\sf TT}^{k}\\ &\subseteq&\mbox{\sf TT${}_{\mathrm{pru}}$}\ast(\mbox{\sf TT}\circ\mbox{\sf TT}^{k})\end{array}

\tau_{N} \circ \tau_{M^{\prime}} \subseteq \tau_{M} \subseteq \tau_{N} \circ \tau_{M^{\prime}}^{0}.

\tau_{N} \circ \tau_{M^{\prime}} \subseteq \tau_{M} \text{\quad and \quad} \tau_{M}^{0} \subseteq \tau_{N} \circ \tau_{M^{\prime}}^{01}.

\langle p, \sigma, j, T \rangle \to \langle \sigma, (i_{1}, \dots, i_{n}), \gamma \rangle(\langle p, {\rm down}_{i_{1}} \rangle, \dots, \langle p, {\rm down}_{i_{n}} \rangle)

\langle q_{1}, u_{1} \rangle \Rightarrow_{M, t} \langle q_{2}, u_{2} \rangle \Rightarrow_{M, t} \dots \Rightarrow_{M, t} \langle q_{m}, u_{m} \rangle,

\langle p_{j}, \sigma, j^{\prime}, T \rangle \to \langle \sigma, j, U, \gamma \rangle(\langle p_{1}, {\rm down}_{1} \rangle, \dots, \langle p_{m}, {\rm down}_{m} \rangle)

\langle q, \sigma_{1}, j_{1}, T_{1}, \dots, \sigma_{\ell}, j_{\ell}, T_{\ell}, E \rangle \to \langle q^{\prime}, \alpha_{1}, \dots, \alpha_{\ell} \rangle

\mbox{\sf TT}^{k}(\mbox{\sf REGT}) \subseteq \mbox{\sf DSPACE}(n) \text{\quad and \quad} \mbox{\sf MT}^{k}(\mbox{\sf REGT}) \subseteq \mbox{\sf DSPACE}(n).

y\mbox{\sf TT}^{k}(\mbox{\sf REGT}) \subseteq \mbox{\sf DSPACE}(n) \text{\quad and \quad} y\mbox{\sf MT}^{k}(\mbox{\sf REGT}) \subseteq \mbox{\sf DSPACE}(n)

\begin{array}[]{llllll}\langle q_{i\vee j},d\rangle&\to&\vee(\langle q_{i},\alpha\rangle,\langle q_{j},\alpha\rangle)&\langle q_{i},c\rangle&\to&\mathsf{v}(\langle q_{i},\alpha\rangle)\\[1.13809pt] \langle q_{i\wedge j},d\rangle&\to&\wedge(\langle q_{i},\alpha\rangle,\langle q_{j},\alpha\rangle)&\langle q_{i},j\rangle&\to&\mathsf{v}(\langle q_{i},\alpha\rangle)\\[1.13809pt] \langle q_{\neg\,i},d\rangle&\to&\neg(\langle q_{i},\alpha\rangle)&\langle q_{i},i\rangle&\to&\mathsf{e}\\[1.13809pt] \langle q_{i},d\rangle&\to&\langle q_{i},\alpha\rangle&&&\end{array}

\begin{array}[]{llllll}\langle q,a,0\rangle&\to&a(\langle q_{0},\alpha\rangle,\langle q_{1},\alpha\rangle)&\langle p,d,1\rangle&\to&d(\langle p,\alpha\rangle)\\[1.13809pt] \langle q_{i},b,1\rangle&\to&i(\langle q_{0},\alpha\rangle,\langle q_{1},\alpha\rangle)&\langle p,e,1\rangle&\to&e\\[1.13809pt] \langle q_{i},c,1\rangle&\to&c(\langle p,\alpha\rangle)&&&\end{array}

\begin{array}[]{lll}\langle q_{0},\gamma,j\rangle&\to&\gamma(\langle q_{1},{\rm down}_{2}\rangle,\dots,\langle q_{k},{\rm down}_{2}\rangle)\\[1.13809pt] \langle q_{1},\omega,2\rangle&\to&\langle q_{0},{\rm down}_{1}\rangle\\[1.13809pt] \langle q_{i},\omega,2\rangle&\to&\langle q_{i-1},{\rm down}_{2}\rangle\end{array}

\begin{array}[]{llllll}\langle d,@,j^{\prime}\rangle&\to&\langle d,{\rm down}_{1}\rangle&\langle d,e,j\rangle&\to&\langle u_{j},{\rm up}\rangle\\[1.13809pt] \langle d,\delta,j\rangle&\to&\delta(\langle d,{\rm down}_{1}\rangle,\langle u_{j},{\rm up}\rangle)&\langle d,e,0\rangle&\to&e\\[1.13809pt] \langle d,\delta,0\rangle&\to&\delta(\langle d,{\rm down}_{1}\rangle,e)&&&\\[5.69054pt] \langle u_{1},@,j^{\prime}\rangle&\to&\langle d,{\rm down}_{2}\rangle&\langle u_{1},\delta,j^{\prime}\rangle&\to&e\\[1.13809pt] \langle u_{2},@,j\rangle&\to&\langle u_{j},{\rm up}\rangle&&&\\[1.13809pt] \langle u_{2},@,0\rangle&\to&e&&&\end{array}

\mbox{\sf dFT}_{@}^{k}\cap\mbox{\sf LSIF}=\mbox{\sf enc}\circ\mbox{\sf dTT${}_{\mathrm{su}}$}\circ\mbox{\sf dec}\subseteq\mbox{\sf dFT}\subseteq\mbox{\sf dFT}_{@}.

\mbox{\sf dMFT}_{@}^{k}\cap\mbox{\sf LSIF}=\mbox{\sf enc}\circ\mbox{\sf dTT${}_{\mathrm{su}}$}\circ\mbox{\sf dec}\subseteq\mbox{\sf dFT}\subseteq\mbox{\sf dFT}_{@}\subseteq\mbox{\sf dMFT}_{@}.

\begin{array}[]{lll}\langle p,j,@\rangle&\to&@(\langle p,{\rm down}_{1}\rangle,\langle p,{\rm down}_{2}\rangle)\\[1.13809pt] \langle p,j,e\rangle&\to&\lambda\\[1.13809pt] \langle p,j,\delta\rangle&\to&\omega(\delta,[\,,\langle p,{\rm down}_{1}\rangle,]\,)\end{array}

Full text

Linear-Bounded Composition of Tree-Walking Tree Transducers: Linear Size Increase and Complexity

Footnote 1: Published at https://link.springer.com/article/10.1007/s00236-019-00360-8

Joost Engelfriet, LIACS, Leiden University, P.O. Box 9512, 2300 RA Leiden, the Netherlands

Kazuhiro Inaba, Google Japan G.K., Tokyo, Japan

Sebastian Maneth, Department of Mathematics and Informatics, Universität Bremen, P.O. Box 330 440, 28334 Bremen, Germany

Abstract

Compositions of tree-walking tree transducers form a hierarchy with respect to the number of transducers in the composition. As the main technical result, it is proved that any such composition can be realized as a linear-bounded composition, which means that the sizes of the intermediate results can be chosen to be at most linear in the size of the output tree. This has consequences for the expressiveness and complexity of the translations in the hierarchy. First, if the computed translation is a function of linear size increase, i.e., the size of the output tree is at most linear in the size of the input tree, then it can be realized by just one, deterministic, tree-walking tree transducer. For compositions of deterministic transducers it is decidable whether or not the translation is of linear size increase. Second, every composition of deterministic transducers can be computed in deterministic linear time on a RAM and in deterministic linear space on a Turing machine, measured in the sum of the sizes of the input and output tree. Similarly, every composition of nondeterministic transducers can be computed in simultaneous polynomial time and linear space on a nondeterministic Turing machine. Their output tree languages are deterministic context-sensitive, i.e., can be recognized in deterministic linear space on a Turing machine. The membership problem for compositions of nondeterministic translations is in nondeterministic polynomial time and in deterministic linear space. All the above results also hold for compositions of macro tree transducers. The membership problem for the composition of a nondeterministic and a deterministic tree-walking tree translation (for a nondeterministic IO macro tree translation) is log-space reducible to a context-free language, whereas the membership problem for the composition of a deterministic and a nondeterministic tree-walking tree translation (for a nondeterministic OI macro tree translation) is possibly NP-complete.

1 Introduction
2 Preliminaries
3 Tree-Walking Tree Transducers
4 Regular Look-Around
5 Composition
6 Macro and MSO
6.1 Macro Tree Transducers
6.2 MSO Tree Transducers
7 Functional Nondeterminism
8 Productivity
8.1 Nondeterministic Productivity
8.2 Deterministic Productivity
9 Linear Size Increase
10 Deterministic Complexity
11 Nondeterministic Complexity
12 Translation Complexity
13 Forest Transducers
14 Conclusion

1 Introduction

Tree transducers are used, e.g., in compiler theory or, more generally, the theory of syntax-directed semantics of context-free languages [39], and in the theory of XML queries and XML document transformation [69, 47]. One of the most basic types of tree transducer is the top-down tree transducer (in short tt↓). It is a finite-state device that walks top-down on the input tree, from parent to child, possibly branching into parallel copies of itself at each step (thus allowing the transducer to visit all children of the parent). During this process, the output tree is generated top-down. The tt↓ has been generalized in two different ways. By allowing it to walk also bottom-up, from child to parent, still possibly branching at every step and still generating the output tree top-down, one obtains the tree-walking tree transducer (in short, tt). On the other hand, restricting its walk to be top-down but allowing its states to have parameters of type output tree, one obtains the macro tree transducer (in short, mt). In general we consider nondeterministic transducers, with deterministic transducers as an important special case (abbreviated as dtt↓, dtt, and dmt).

Footnote 2: The name “tree-walking tree transducer” was introduced in [26]. The adjective “tree-walking” stands for the fact that the transducer walks on the input tree (just as the tree-walking automaton of [2]). The tt is the generalization to trees of the two-way finite-state string transducer, which walks on its input string in both directions and produces the output string one-way from left to right. Note that “tree-walking” and “two-way” alliterate.
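To make the walking behaviour concrete, the following is a minimal Python sketch of a deterministic tree-walking transducer without look-around. The encoding of trees as nested tuples, the names `Call`, `STAY`, `UP`, `run`, and `build`, and the example rule set `reverse_rules` are illustrative assumptions and not notation taken from the paper. A top-down transducer is the special case that never uses `UP`.

```python
from typing import NamedTuple

STAY, UP = 0, -1  # a positive move i means "go down to the i-th child"

class Call(NamedTuple):
    """A state/move pair <q, move> occurring in a right-hand side."""
    state: str
    move: int

def node_at(tree, addr):
    """Return the subtree at address addr (a tuple of child numbers)."""
    for i in addr:
        tree = tree[i]  # a tree is (label, child_1, ..., child_n)
    return tree

def run(rules, state, tree, addr=()):
    """Output generated when the transducer is in `state` at node `addr`."""
    label = node_at(tree, addr)[0]
    child_number = addr[-1] if addr else 0  # 0 marks the root
    return build(rules, rules(state, label, child_number), tree, addr)

def build(rules, rhs, tree, addr):
    """Instantiate a right-hand side: output nodes are copied, calls are run."""
    if isinstance(rhs, Call):
        if rhs.move == STAY:
            target = addr
        elif rhs.move == UP:
            target = addr[:-1]
        else:
            target = addr + (rhs.move,)
        return run(rules, rhs.state, tree, target)
    out_label, *args = rhs
    return (out_label, *(build(rules, a, tree, addr) for a in args))

def reverse_rules(state, label, j):
    """Hypothetical rules: walk down in state d, then emit labels walking up in state u."""
    if state == "d":
        if label != "e":
            return Call("d", 1)
        return Call("u", UP) if j != 0 else ("e",)
    # state "u": output the current label, continue upwards unless at the root
    return (label, Call("u", UP)) if j != 0 else (label, ("e",))

reversed_tree = run(reverse_rules, "d", ("a", ("b", ("c", ("e",)))))
```

On the monadic input a(b(c(e))) this sketch produces c(b(a(e))), a reversal that needs the upward moves. The sketch does not guard against non-terminating walks, which a general transducer of this kind can exhibit.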

To turn the tt↓ into a more flexible model of tree transformation, it was enhanced with the feature of regular look-ahead, which means that it can test whether or not the subtree at the current node of the input tree belongs to a given regular tree language. The mt already has the ability to implement regular look-ahead. Since both the enhanced tt↓ and the mt process the input tree top-down, they can also implement “regular look-around”, which means that they can test arbitrary regular properties of the current node of the input tree. More precisely, they can test whether the input tree, in which the current node is marked, belongs to a given regular tree language. Such regular look-around tests are also called MSO tests, because they can be expressed by formulas of monadic second-order logic with one free node variable. The tt, as defined in [63], does not have regular look-ahead or look-around. One of the drawbacks of this is that the tt cannot recognize all regular tree languages without branching [9]. Hence, from now on, we assume that the tt (and the tt↓) is enhanced with regular look-around, i.e., with regular tests of the current input node. The resulting tt formalism is a quite robust, flexible, and intuitive model of tree transformation.
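A regular look-ahead test can be pictured as running a deterministic bottom-up tree automaton on the subtree at the current node. The Python sketch below shows only this special case; a full look-around test would instead run an automaton on the whole input tree with the current node marked. The tuple encoding of trees, the function names, and the example language (trees containing at least one leaf b) are assumptions made for illustration.

```python
from itertools import product

def run_bottom_up(delta, tree):
    """State reached at the root by a deterministic bottom-up tree automaton."""
    label, *children = tree
    states = tuple(run_bottom_up(delta, child) for child in children)
    return delta[(label, states)]

def look_ahead(delta, accepting, tree, addr):
    """Test whether the subtree of `tree` at address `addr` is accepted."""
    node = tree
    for i in addr:
        node = node[i]  # children sit at positions 1..n of the tuple
    return run_bottom_up(delta, node) in accepting

# Hypothetical automaton over {s (binary), a, b (leaves)}:
# state "yes" means that the subtree contains a leaf labelled b.
delta = {("a", ()): "no", ("b", ()): "yes"}
for left, right in product(("no", "yes"), repeat=2):
    delta[("s", (left, right))] = "yes" if "yes" in (left, right) else "no"

t = ("s", ("a",), ("s", ("a",), ("b",)))
tests = [look_ahead(delta, {"yes"}, t, addr) for addr in [(), (1,), (2,)]]
```

Here the test succeeds at the root and at the second child, and fails at the first child, whose subtree is the single leaf a.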

The tt and mt, generalizations of the tt↓, are closely related, in particular in the deterministic case. In fact, every dtt can be simulated by a dmt, whereas every dmt can be simulated by a composition of two dtt’s. Thus, every composition of dtt’s can be realized by a composition of dmt’s, and vice versa. Compositions of dtt’s form a proper hierarchy, in an obvious way. A single dtt is at most of exponential size increase, which means that the size of the output tree is at most exponential in the size of the input tree. However, a composition of two dtt’s can be of double exponential size increase. In general, compositions of $k$ dtt’s are at most, and can be, of $k$-fold exponential size increase. Compositions of dmt’s form a proper hierarchy by a similar argument. For nondeterministic tt’s and mt’s the situation is similar but more complicated. Every mt can be simulated by a composition of two tt’s. However, as opposed to tt’s, mt’s are always finitary, which means that for every given input tree an mt computes finitely many output trees.
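Exponential size increase already occurs for a one-state deterministic top-down transducer that copies its input. The Python sketch below hard-codes one such hypothetical transducer, with rules q(a(x)) → s(q(x), q(x)) and q(e) → e, as a recursive function; the tuple encoding and the names `size`, `monadic`, and `full_binary` are assumptions for illustration.

```python
def size(tree):
    """Number of nodes of a tree encoded as (label, child_1, ..., child_n)."""
    return 1 + sum(size(child) for child in tree[1:])

def monadic(n):
    """The monadic tree a(a(...a(e)...)) with n occurrences of a."""
    tree = ("e",)
    for _ in range(n):
        tree = ("a", tree)
    return tree

def full_binary(tree):
    """Rules q(a(x)) -> s(q(x), q(x)) and q(e) -> e, written as recursion."""
    label, *children = tree
    if label == "e":
        return ("e",)
    copy = full_binary(children[0])  # both branches translate the same child
    return ("s", copy, copy)

sizes = [(size(monadic(n)), size(full_binary(monadic(n)))) for n in range(5)]
```

An input with n + 1 nodes is mapped to a full binary tree with 2^(n+1) − 1 nodes, so the output size is exponential in the input size.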

In this paper we investigate compositions of tt’s (and hence of mt’s) with respect to their expressivity and their complexity. Our main technical result is that every composition of tt’s can be realized by a linear-bounded composition of tt’s, which means that, when computing an output tree from an input tree, the intermediate results can be chosen in such a way that their sizes are at most linear in the size of the output tree. More precisely, a composition of two transducers (for simplicity) is linear-bounded if there is a constant $c$ such that for every pair $(t,s)$ of an input tree $t$ and output tree $s$ in the composed translation there is an intermediate tree $r$ (meaning that $(t,r)$ and $(r,s)$ are in the first and second translation, respectively) such that the size of $r$ is at most $c$ times the size of $s$. Intuitively, to compute $s$ from $t$ there is no need to consider intermediate results that are much larger than $s$. If both transducers are deterministic it means that for every input tree $t$ in the domain of the composed translation the size of the unique intermediate tree $r$ is at most linear in the size of the unique output tree $s$.
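The definition of linear-boundedness can be checked directly on finite fragments of two translations. The Python sketch below does this for relations given as finite sets of pairs of trees; the function name `is_linear_bounded` and the toy relations are assumptions for illustration, and real translations are of course infinite relations, for which no such enumeration is possible.

```python
def size(tree):
    """Number of nodes of a tree encoded as (label, child_1, ..., child_n)."""
    return 1 + sum(size(child) for child in tree[1:])

def is_linear_bounded(first, second, c):
    """Check: every composed pair (t, s) has a witness r with |r| <= c * |s|."""
    has_small_witness = {}
    for t, r in first:
        for r2, s in second:
            if r == r2:  # r is an intermediate tree for the pair (t, s)
                ok = size(r) <= c * size(s)
                has_small_witness[(t, s)] = has_small_witness.get((t, s), False) or ok
    return all(has_small_witness.values())

t = ("a", ("e",))
s = ("e",)
small = ("e",)                                   # 1 node
big = ("a", ("a", ("a", ("a", ("e",)))))         # 5 nodes

second = {(small, s), (big, s)}
with_small_witness = is_linear_bounded({(t, small), (t, big)}, second, c=1)
only_big_witness = is_linear_bounded({(t, big)}, second, c=1)
only_big_larger_c = is_linear_bounded({(t, big)}, second, c=5)
```

The first call succeeds because one small witness suffices even though a larger intermediate tree also exists; the second fails for c = 1 and succeeds once c is raised to 5.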

To prove that every composition of two tt’s can be realized by a linear-bounded composition of two tt’s, we first show that every tt can be decomposed into a tt↓ that “prunes” the input tree, followed by a tt that is “productive” on at least one of the intermediate trees generated by the tt↓, which means that it uses each leaf and each monadic node of that intermediate tree in order to generate the output tree. Productivity guarantees that the composition of these two transducers is linear-bounded. We also prove that the composition of an arbitrary tt with a “pruning” top-down tt can be realized by one tt. Thus, when two tt’s are composed, the second tt can split off the pruning tt↓ (to the left), which can be absorbed (to the right) by the first tt. The composition of the resulting two tt’s is then linear-bounded. This also holds for deterministic transducers, in which case the pruning tt is also deterministic. Similar results were presented for macro tree transducers in [58, Section 3] and [51, Section 4].
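One elementary counting step helps to see why leaves and monadic nodes are the nodes that matter for the size of an intermediate tree. The derivation below is a sketch; the symbols $\ell(r)$, $m(r)$, $b(r)$ and the constant $c'$ are notation assumed here, not taken from the paper.

```latex
% Assumed notation for a ranked tree r with node set N(r):
%   \ell(r) = number of leaves,
%   m(r)    = number of monadic nodes (exactly one child),
%   b(r)    = number of nodes with at least two children.
\begin{align*}
|r| &= \ell(r) + m(r) + b(r), \\
|r| - 1 &= \sum_{v \in N(r)} \operatorname{rank}(v) \;\ge\; m(r) + 2\,b(r), \\
\text{hence}\quad b(r) &\le \ell(r) - 1
\quad\text{and}\quad |r| \le 2\,\ell(r) + m(r) - 1 .
\end{align*}
```

Thus, if productivity yields a bound of the assumed form $\ell(r) + m(r) \le c' \cdot |s|$ for the output tree $s$, then $|r| \le 2c' \cdot |s|$, which is a linear bound on the intermediate tree.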

Thus, roughly speaking, our main technical result provides a method to implement compositions of tt’s in such a way that the generation of superfluous nodes, i.e., nodes on which a tt just walks around without producing any output, is avoided by pruning those superfluous parts from the intermediate trees. As such it can be viewed as a static garbage collection procedure, and leads, in principle, to algorithms for automatic compiler and XML query optimization. Since tt’s are essentially finite-state automata walking on trees, it is not really surprising that only a linearly bounded amount of intermediate information is useful to the final output. However, proving this rigorously requires quite some effort. In particular, the subcomputations of the tt during which it does not produce output will be determined by regular look-around.

The above method can be used to obtain results on both the expressivity and the complexity of compositions of tt’s, as discussed in the next paragraphs.

Expressivity. We have seen above that compositions of tt’s can be of $k$-fold exponential size increase. However, many real-world tree transformations are of linear size increase. We prove that the hierarchy of compositions of deterministic tt’s collapses when restricted to translations of linear size increase: every composition of dtt’s that is of linear size increase can be realized by just one dtt. We also show that it is decidable whether or not a composition of dtt’s is of linear size increase. This means that a compiler or XML query, no matter how inefficiently programmed in several phases, can be realized in one efficient phase, provided it is of linear size increase. In fact, as we will see below, that single phase can be executed in linear time. More theoretically, we additionally prove that a function that can be realized by a composition of nondeterministic tt’s can also be realized by a composition of deterministic tt’s, and hence by one deterministic tt if that function is of linear size increase. Thus, the only (functional) tree transformations that can be realized by a composition of tt’s but not by a single tt are tree transformations of superlinear size increase.

The proof of the collapse of the hierarchy of compositions of dtt’s is based on the known fact that every dtt of linear size increase can be realized by a dtt that is “single-use”, which means that it never visits a node of the input tree twice in the same state. In fact, it is proved in [29, 32] that even dmt’s of linear size increase can be realized by single-use dtt’s. Vice versa, it is obvious that every single-use dtt is of linear size increase. In [7] it is shown that single-use dtt’s have the same power as deterministic MSO tree transducers, which use formulas of monadic second-order logic to define the output tree in terms of the input tree (see [13, 14]).
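The single-use property and the linear bound it implies can be phrased as a small check. In the Python sketch below, a run is represented by its trace of visited configurations (state, node address); this representation and the names `is_single_use` and `output_size_bound` are assumptions for illustration. The bound reflects that each configuration is visited at most once and each visit applies one rule.

```python
def is_single_use(trace):
    """True if no (state, node) configuration occurs twice in the trace."""
    seen = set()
    for config in trace:
        if config in seen:
            return False
        seen.add(config)
    return True

def output_size_bound(num_states, input_size, max_rhs_nodes):
    """Upper bound on the output size of a single-use run.

    At most num_states * input_size configurations are visited, and each
    visit contributes at most max_rhs_nodes output nodes.
    """
    return num_states * input_size * max_rhs_nodes

walk = [("d", ()), ("d", (1,)), ("u", ()), ("u", (1,))]
revisiting_walk = walk + [("d", (1,))]
```

For a fixed transducer the number of states and the maximal size of a right-hand side are constants, so the bound is linear in the size of the input tree.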

By our main technical result, we may always assume that a composition of two dtt’s is linear-bounded. If the composition is of linear size increase, then the first dtt is obviously also of linear size increase, and can therefore be realized by a single-use dtt. We also prove that the composition of a single-use dtt with an arbitrary dtt can be realized by one dtt. Thus, altogether, if the composition of two dtt’s is of linear size increase, then it can be realized by a single-use dtt. This argument can easily be turned into an inductive proof for a composition of any number of dtt’s.

Complexity. We first consider deterministic tt’s. The translation realized by a deterministic tt can be computed on a RAM in linear time, in the sum of the sizes of the input and output tree. With respect to space, we prove that it can be computed on a deterministic Turing machine in linear space (again, in the sum of the sizes of the input and output tree). Since we may assume by our main technical result that the sizes of the intermediate results are at most linear in the size of the output tree, it should be clear that these facts also hold for compositions of dtt’s. We also consider output tree languages, i.e., the images of a regular tree language under a composition of dtt’s. Since the regular tree languages are closed under prunings, our technical decomposition result now implies that these output languages are in $\mbox{\sf DSPACE}(n)$, i.e., can be recognized by a Turing machine in deterministic linear space (or, in other words, are deterministic context-sensitive). Since the yield of a tree can be computed by a dtt (representing it by a monadic tree), the output string languages, which are the yields of the output tree languages, are also in $\mbox{\sf DSPACE}(n)$. The languages in the well-known IO-hierarchy are examples of such output languages. For compositions of top-down tree transducers (even nondeterministic ones) this result on output languages was proved in [4], using a technical result very similar to ours.
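The representation of a yield by a monadic tree can be illustrated as follows. This Python sketch computes the yield with ordinary recursion rather than with a transducer, so it shows only the input/output behaviour; the tuple encoding, the function names, and the end marker `e` are assumptions for illustration.

```python
def leaves(tree):
    """Labels of the leaves of a tree, from left to right."""
    label, *children = tree
    if not children:
        return [label]
    return [x for child in children for x in leaves(child)]

def yield_as_monadic(tree, end="e"):
    """Encode the yield a_1 ... a_n as the monadic tree a_1(...a_n(end)...)."""
    out = (end,)
    for label in reversed(leaves(tree)):
        out = (label, out)
    return out

t = ("s", ("a",), ("s", ("b",), ("c",)))
monadic_yield = yield_as_monadic(t)
```

For the tree s(a, s(b, c)) the yield abc is represented by the monadic tree a(b(c(e))).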

Our results on nondeterministic tt’s (and their proofs) are very similar to those for dtt’s. The translation realized by a composition of tt’s can be computed by a nondeterministic Turing machine in simultaneous polynomial time and linear space (in the sum of the sizes of the input and output tree). The corresponding output languages can be recognized by such a Turing machine and hence are in NPTIME. Using the results on the membership problem for compositions of tt’s discussed in the next paragraph, we generalize the result of [4] and prove that these output languages are even in $\mbox{\sf DSPACE}(n)$, which means that they are deterministic context-sensitive. The languages in the well-known OI-hierarchy are examples of such output languages.

Finally, we consider the membership problem for compositions of tt’s, which asks whether or not a given pair $(t,s)$ of input tree $t$ and output tree $s$ belongs to the composed translation. It follows easily from the above complexity results that for (non)deterministic tt’s the problem is decidable in (non)deterministic polynomial time and in (non)deterministic linear space. For the special case of the composition of a nondeterministic tt with a deterministic tt we prove that the problem is even in LOGCFL, i.e., log-space reducible to a context-free language, and hence in PTIME and $\mbox{\sf DSPACE}(\log^{2}n)$. From this we conclude that for nondeterministic tt’s the problem is even decidable in deterministic linear space. However, for the special case of the composition of a deterministic tt with a nondeterministic one, the problem can be NP-complete. From the two special cases we obtain that the membership problem for a (single) nondeterministic macro tree transducer is in LOGCFL for IO macro tree transducers (strengthening the result in [52] where it was shown to be in PTIME), whereas it can be NP-complete for OI macro tree transducers.
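A linear bound on intermediate trees makes the search space for the membership problem finite: only intermediate trees of size at most $c$ times the size of $s$ need to be considered. The Python sketch below is a naive exhaustive search that illustrates this point only; it takes exponential time and is not the algorithm of the paper. The two translations are passed as Boolean predicates, and all names and the toy relations are assumptions for illustration.

```python
from itertools import product

def size(tree):
    return 1 + sum(size(child) for child in tree[1:])

def compositions(total, parts):
    """All ways to write `total` as an ordered sum of `parts` positive integers."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first, *rest)

def trees_of_size(alphabet, n):
    """All trees with exactly n nodes over a ranked alphabet {label: rank}."""
    for label, rank in alphabet.items():
        if rank == 0:
            if n == 1:
                yield (label,)
            continue
        for sizes in compositions(n - 1, rank):
            options = [list(trees_of_size(alphabet, k)) for k in sizes]
            for kids in product(*options):
                yield (label, *kids)

def member(t, s, first, second, alphabet, c):
    """Is there an intermediate r with |r| <= c * |s|, first(t, r) and second(r, s)?"""
    bound = c * size(s)
    return any(first(t, r) and second(r, s)
               for n in range(1, bound + 1)
               for r in trees_of_size(alphabet, n))

alphabet = {"a": 1, "e": 0}
pad = lambda t, r: size(r) >= size(t)   # hypothetical first translation
identity = lambda r, s: r == s          # hypothetical second translation
```

With these toy relations, `member(("e",), ("a", ("e",)), pad, identity, alphabet, 1)` finds the witness a(e), while a larger input paired with the output e has no witness within the bound.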

Structure of the paper. The reader is assumed to be familiar with the basics of formal language theory, in particular tree language theory, and complexity theory. The only formalisms used are tree-walking tree transducers (tt’s, of course), top-down tree transducers (tt↓’s, as a special case of tt’s), context-free grammars, regular tree grammars, and finite-state tree automata. Results on macro tree transducers are taken from the literature.

The main results are proved in Sections 8 to 12. Section 2 contains a number of preliminary notions, in particular linear-bounded composition, linear size increase, and regular look-around. In Section 3 we define the tree-walking tree transducer (with regular look-around), together with some of its special cases such as top-down and single-use. A tt that does not use regular look-around tests is called “local”. A “pruning” tt is a tt↓ that, roughly speaking, removes or relabels each node of the input tree and possibly deletes several of its children (together with their descendants). After giving two examples we present the composition hierarchy of dtt’s and end the section with some elementary syntactic properties of tt’s. In Section 4 it is shown how to separate the regular look-around from a tt and incorporate it into another tt. For instance, every tt can be decomposed into a deterministic pruning tt↓ that just relabels the nodes of the input tree (and hence does not really “prune”), followed by a local tt. We also state the fact that the domain of a tt is a regular tree language. Consequently, it is possible to define the regular tests of a tt as domains of other tt’s, which is a convenient technical tool. Section 5 contains three composition results. We prove that the composition of a tt with a pruning tt↓ can be realized by a tt (such that determinism is preserved). Together with the above-mentioned decomposition, this implies for instance that in a composition of two tt’s, the second tt can always be assumed to be local: the second tt splits off a pruning tt↓ that is absorbed by the first tt. In the deterministic case, we even prove that the composition of a dtt with an arbitrary dtt↓ can be realized by a dtt, and we also prove that the composition of a single-use dtt with a dtt can be realized by a dtt.

Section 6 presents the known fact that every dtt of linear size increase can be realized by a single-use dtt, and discusses the relationship between tt’s, macro tree transducers, and MSO tree transducers. In Section 7 we show that a (partial) function that can be realized by a composition of nondeterministic tt’s can also be realized by a composition of deterministic tt’s. To prove this we first prove a lemma: for every tt↓ there is a deterministic tt↓ that realizes a “uniformizer” of the translation realized by the given tt↓, i.e., a function that is a subset of that translation, with the same domain. Section 8 contains our main technical result: every tt can be decomposed into a pruning tt and another tt such that the composition is linear-bounded. It implies (by splitting and absorbing) that a composition of tt’s can always be assumed to be linear-bounded. The “uniformizer” lemma of the previous section is applied to the pruning tt↓, proving the same result for deterministic tt’s. Section 9 presents the main results on linear size increase, and Sections 10 and 11 present the main results on the complexity of compositions of deterministic and nondeterministic tt’s, respectively. In Section 12 we prove the main results on the complexity of the membership problem for the composition of two tt’s. Finally, in Section 13 we show (in a straightforward way) that all main results also hold for transducers that transform unranked trees, or forests, which are a natural model of XML documents.

The reader who is interested only in complexity can disregard all results on single-use tt’s, and skip Sections 6.2 and 9. The reader who is interested only in expressivity can just skip Sections 10, 11, and 12.

Remarks on the literature. Top-down tree transducers were introduced in [66, 72]; regular look-ahead was added in [20]. Macro tree transducers were introduced in [15, 34]. Tree-walking tree transducers were introduced in [63] (where they are called 0-pebble tree transducers), and studied in, e.g., [31, 26, 62]. They were already mentioned in [24, Section 3(7)] (where they are called RT(Tree-walk) transducers). Regular look-around was added to tt’s in [14, Section 8.2] (where they are called ms tree-walking transducers); for tree-walking automata that was already done in [6]. However, formal models similar to the tt were introduced and studied before. The tree-walking automaton of [2] translates trees into strings. As explained in [24, Section 3(7)] and [31, Section 3.2], the tt is closely related to the attribute grammar [53], which is a well-known model of syntax-directed semantics (and a compiler construction tool). An attribute grammar translates derivation trees of an underlying context-free grammar into arbitrary values. Tree-valued attribute grammars were considered, e.g., in [27]. The attributed tree transducer, introduced in [38], is an operational version of the tree-valued attribute grammar, without underlying context-free grammar. Regular look-around was added to the attributed tree transducer in [7] (where it is called look-ahead). Attributed tree transducers are a special type of tt’s, of which the states are viewed as attributes of the nodes of the input tree. By definition a deterministic attributed tree transducer (like an attribute grammar) has to be noncircular, which means that it should generate an output tree whenever it is started in any state on any node of an input tree. Thus, it is total in a strong sense. This is natural from the point of view of syntax-directed semantics, but quite restrictive and inconvenient from the operational point of view of tree transformation. 
Several of the auxiliary results in Sections 3 to 5 are closely related to (and generalizations of) well-known results on attributed tree transducers (see, e.g., [39]). As an example, it is proved in [38, Theorem 4.3] that, for deterministic transducers, the composition of an attributed tree transducer with a top-down tree transducer can be realized by an attributed tree transducer. That does not immediately imply that the same is true for a dtt and a dtt↓, which we show in Section 5, because dtt’s are not necessarily total and they have regular look-around. Moreover, we wanted such results also to be understandable for readers unfamiliar with attribute grammars and attributed tree transducers.

The main results of this paper were first presented at FSTTCS ’02 [58] (on the complexity of compositions of deterministic mt’s), at FSTTCS ’03 [59] (on compositions of mt’s that realize functions of linear size increase), at FSTTCS ’08 [51] (on the complexity of compositions of nondeterministic mt’s), at PLAN-X ’09 [52] (on the complexity of the membership problem for mt’s), and in the Ph.D. Thesis of the second author [48] (on the last two subjects).

2 Preliminaries

Convention: All results stated and/or proved in this paper are effective.

Sets, strings, and relations. The set of natural numbers is ${\mathbb{N}}=\{0,1,2,\dots\}$ . For $m,n\in{\mathbb{N}}$ , we denote the interval $\{k\in{\mathbb{N}}\mid m\leq k\leq n\}$ by $[m,n]$ . The cardinality or size of a set $A$ is denoted by $\#(A)$ . The set of strings over $A$ is denoted by $A^{*}$ . It consists of all sequences $w=a_{1}\cdots a_{m}$ with $m\in{\mathbb{N}}$ and $a_{i}\in A$ for every $i\in[1,m]$ . The length $m$ of $w$ is denoted by $|w|$ . The empty string (of length $0$ ) is denoted by $\varepsilon$ . The concatenation of two strings $v$ and $w$ is denoted by $v\cdot w$ or just $vw$ . Moreover, $w^{0}=\varepsilon$ and $w^{k+1}=w\cdot w^{k}$ for $k\in{\mathbb{N}}$ .

The domain and range of a binary relation $R\subseteq A\times B$ are denoted by $\mathrm{dom}(R)$ and $\mathrm{ran}(R)$ , respectively. For $A^{\prime}\subseteq A$ , $R(A^{\prime})=\{b\in B\mid(a,b)\in R\text{ for some }a\in A^{\prime}\}$ . The composition of $R$ with a binary relation $S\subseteq B\times C$ is $R\circ S=\{(a,c)\mid\exists\,b\in B:(a,b)\in R,\,(b,c)\in S\}$ . The inverse of $R$ is $R^{-1}=\{(b,a)\mid(a,b)\in R\}$ . Note that $\mathrm{dom}(R\circ S)=R^{-1}(\mathrm{dom}(S))$ and $\mathrm{ran}(R\circ S)=S(\mathrm{ran}(R))$ . If $A=B$ then the transitive-reflexive closure of $R$ is $R^{*}=\bigcup_{k\in{\mathbb{N}}}R^{k}$ where $R^{0}=\{(a,a)\mid a\in A\}$ and $R^{k+1}=R\circ R^{k}$ . The composition of two classes of binary relations ${\cal R}$ and ${\cal S}$ is ${\cal R}\circ{\cal S}=\{R\circ S\mid R\in{\cal R},\,S\in{\cal S}\}$ . Moreover, ${\cal R}^{1}={\cal R}$ and ${\cal R}^{k+1}={\cal R}\circ{\cal R}^{k}$ for $k\geq 1$ . The relation $R$ is finitary if $R(a)$ is finite for every $a\in A$ , where $R(a)$ denotes $R(\{a\})$ . It is a (partial) function from $A$ to $B$ if $R(a)$ is empty or a singleton for every $a\in A$ , and it is a total function if, moreover, $\mathrm{dom}(R)=A$ .
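The operations on relations can be tried out on small finite examples. The following is a minimal sketch, assuming relations are represented as Python sets of pairs; the function names and the sample relations `R` and `S` are illustrative and not part of the paper.

```python
def compose(R, S):
    """R o S = {(a, c) | there is b with (a, b) in R and (b, c) in S}."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def inverse(R):
    """R^{-1} = {(b, a) | (a, b) in R}."""
    return {(b, a) for (a, b) in R}

def image(R, A):
    """R(A) = {b | (a, b) in R for some a in A}."""
    return {b for (a, b) in R if a in A}

def dom(R):
    return {a for (a, _) in R}

def ran(R):
    return {b for (_, b) in R}

# Hypothetical sample relations.
R = {(1, "x"), (2, "y"), (3, "z")}
S = {("x", "p"), ("z", "q")}
RS = compose(R, S)

# The two identities noted in the text, checked on this example.
assert dom(RS) == image(inverse(R), dom(S))
assert ran(RS) == image(S, ran(R))
```

The assertions only confirm the identities for this one pair of relations; they are a sanity check, not a proof.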

Trees. An alphabet is a finite set of symbols. A ranked alphabet $\Sigma$ is an alphabet together with a mapping $\operatorname{rank}_{\Sigma}:\Sigma\to{\mathbb{N}}$ (of which the subscript $\Sigma$ will be dropped when it is clear from the context). The maximal rank of elements of $\Sigma$ is denoted ${\mathit{m}x}_{\Sigma}$ . For every $m\in{\mathbb{N}}$ we denote by $\Sigma^{(m)}$ the elements of $\Sigma$ that have rank $m$ .

Trees over $\Sigma$ are recursively defined to be strings over $\Sigma$ , as follows. For every $m\in{\mathbb{N}}$ , if $\sigma\in\Sigma^{(m)}$ and $t_{1},\dots,t_{m}$ are trees over $\Sigma$ , then $\sigma\,t_{1}\cdots t_{m}$ is a tree over $\Sigma$ . For readability we also write the tree $\sigma\,t_{1}\cdots t_{m}$ as the term $\sigma(t_{1},\dots,t_{m})$ . The set of all trees over $\Sigma$ is denoted $T_{\Sigma}$ ; thus $T_{\Sigma}\subseteq\Sigma^{*}$ . For an arbitrary finite set $A$ , disjoint with $\Sigma$ , we denote by $T_{\Sigma}(A)$ the set $T_{\Sigma\cup A}$ , where each element of $A$ has rank 0.

As usual trees are viewed as directed labeled graphs. The nodes of a tree $t$ are indicated by Dewey notation, i.e., by elements of ${\mathbb{N}}^{*}$ , which are strings of natural numbers. The root of $t$ is indicated by the empty string $\varepsilon$ , but will also be denoted by $\mathrm{root}_{t}$ for readability. The $i$ -th child of a node $u$ of $t$ is indicated by $ui$ , and there is a directed edge from the parent $u$ to the child $ui$ . Formally, the set ${\cal N}(t)$ of nodes of a tree $t=\sigma\,t_{1}\cdots t_{m}$ over $\Sigma$ can be defined recursively by ${\cal N}(t)=\{\varepsilon\}\cup\{iu\mid i\in[1,m],\,u\in{\cal N}(t_{i})\}$ . Thus, ${\cal N}(t)\subseteq[1,{\mathit{m}x}_{\Sigma}]^{*}$ . The root of $t=\sigma t_{1}\cdots t_{m}$ has label $\sigma$ , and the node $iu$ of $t$ has the same label as the node $u$ of $t_{i}$ . The rank of node $u$ is the rank of its label, i.e., the number of its children. A leaf is a node of rank 0, and a monadic node is a node of rank 1. Every node of $t$ has a child number: each node $ui$ has child number $i$ , and the root $\varepsilon$ is given child number $0$ for technical convenience. For a node $u$ of $t$ the subtree of $t$ with root $u$ is denoted $t|_{u}$ ; thus, $t|_{\varepsilon}=t$ and $t|_{iu}=t_{i}|_{u}$ . A node $v$ of $t$ is a descendant of a node $u$ of $t$ , and $u$ is an ancestor of $v$ , if there exists $w\in{\mathbb{N}}^{*}$ such that $w\neq\varepsilon$ and $v=uw$ (thus, $u$ is not a descendant/ancestor of itself). The size of a tree $t$ is $|t|$ , i.e., its length as a string. Note that $|t|=\#({\cal N}(t))$ because the nodes of $t$ correspond one-to-one to the positions in the string $t$ , i.e., for every $\sigma\in\Sigma$ , each occurrence of $\sigma$ in $t$ corresponds to a node of $t$ with label $\sigma$ . The left-to-right linear order on ${\cal N}(t)$ according to this correspondence is called the pre-order of the nodes of $t$ .
The yield of $t$ is the string of labels of its leaves, in pre-order. The height of $t$ is the number of edges of a longest directed path from the root of $t$ to a leaf; thus, it is the maximal length of its nodes (which are strings over ${\mathbb{N}}$ ).
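The notions of node, subtree, size, height, and yield can be made concrete as follows. This is a minimal sketch, assuming a tree is represented as a pair `(label, children)` with `children` a list of trees, and a Dewey address as a tuple of integers (the empty tuple for the root); this representation and all function names are assumptions made for illustration.

```python
def nodes(t):
    """Dewey addresses of the nodes of t, listed in pre-order."""
    _, children = t
    result = [()]
    for i, child in enumerate(children, start=1):
        result.extend((i,) + u for u in nodes(child))
    return result

def subtree(t, u):
    """The subtree t|_u rooted at node u."""
    for i in u:
        t = t[1][i - 1]
    return t

def size(t):
    return len(nodes(t))

def height(t):
    """Maximal length of a node address, i.e., edges on a longest root-to-leaf path."""
    return max(len(u) for u in nodes(t))

def tree_yield(t):
    """Labels of the leaves, in pre-order."""
    return [subtree(t, u)[0] for u in nodes(t) if not subtree(t, u)[1]]

# The tree sigma(tau(tau(a)), a), i.e., the string "sigma tau tau a a".
t = ("sigma", [("tau", [("tau", [("a", [])])]), ("a", [])])

print(nodes(t))       # [(), (1,), (1, 1), (1, 1, 1), (2,)]
print(size(t))        # 5
print(height(t))      # 3
print(tree_yield(t))  # ['a', 'a']
```

The size 5 agrees with the length of the tree viewed as a string of five symbols.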

A tree language $L$ is a set of trees over $\Sigma$ , for some ranked alphabet $\Sigma$ , i.e., $L\subseteq T_{\Sigma}$ . A tree translation $\tau$ is a binary relation between trees over $\Sigma$ and trees over $\Delta$ , for some ranked alphabets $\Sigma$ and $\Delta$ , i.e., $\tau\subseteq T_{\Sigma}\times T_{\Delta}$ .

Linear-bounded composition. Let $\Sigma$ , $\Delta$ , and $\Gamma$ be ranked alphabets. For tree translations $\tau_{1}\subseteq T_{\Sigma}\times T_{\Delta}$ and $\tau_{2}\subseteq T_{\Delta}\times T_{\Gamma}$ , we say that the pair $(\tau_{1},\tau_{2})$ is linear-bounded if there is a constant $c\in{\mathbb{N}}$ such that for every $(t,s)\in\tau_{1}\circ\tau_{2}$ there exists $r\in T_{\Delta}$ such that $(t,r)\in\tau_{1}$ , $(r,s)\in\tau_{2}$ , and $|r|\leq c\cdot|s|$ . Thus, the intermediate result $r$ can be chosen such that its size is linear in the size of the output $s$ . Note that if $\tau_{1}$ and $\tau_{2}$ are functions, this means that $|r|\leq c\cdot|\tau_{2}(r)|$ for every $r\in\mathrm{ran}(\tau_{1})\cap\mathrm{dom}(\tau_{2})$ .

For classes ${\cal T}_{1}$ and ${\cal T}_{2}$ of tree translations, we define ${\cal T}_{1}\ast{\cal T}_{2}$ to consist of all translations $\tau_{1}\circ\tau_{2}$ such that $\tau_{1}\in{\cal T}_{1}$ , $\tau_{2}\in{\cal T}_{2}$ , and $(\tau_{1},\tau_{2})$ is linear-bounded.

Lemma 1

Let ${\cal T}_{1}$ , ${\cal T}_{2}$ , and ${\cal T}_{3}$ be classes of tree translations. Then

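$${\cal T}_{1}\circ({\cal T}_{2}\ast{\cal T}_{3})\subseteq({\cal T}_{1}\circ{\cal T}_{2})\ast{\cal T}_{3}\quad\text{and}\quad({\cal T}_{1}\ast{\cal T}_{2})\ast{\cal T}_{3}\subseteq{\cal T}_{1}\ast({\cal T}_{2}\circ{\cal T}_{3}).$$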

**Proof. **Let $\tau_{i}\in{\cal T}_{i}$ for $i\in\{1,2,3\}$ . If the pair $(\tau_{2},\tau_{3})$ is linear-bounded then so is the pair $(\tau_{1}\circ\tau_{2},\tau_{3})$ , with the same constant $c$ . If $(\tau_{1},\tau_{2})$ and $(\tau_{1}\circ\tau_{2},\tau_{3})$ are linear-bounded with constant $c_{1}$ and $c_{2}$ , respectively, then $(\tau_{1},\tau_{2}\circ\tau_{3})$ is linear-bounded with constant $c_{1}\cdot c_{2}$ . $\Box$
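The second claim of the proof can be unfolded as follows; the names $r_{1}$ and $r_{2}$ for the intermediate trees are introduced here only for illustration. For $(t,s)\in\tau_{1}\circ\tau_{2}\circ\tau_{3}$ , first choose $r_{2}$ using the linear bound on $(\tau_{1}\circ\tau_{2},\tau_{3})$ , and then choose $r_{1}$ using the linear bound on $(\tau_{1},\tau_{2})$ for the pair $(t,r_{2})$ :

$$(t,r_{2})\in\tau_{1}\circ\tau_{2},\quad(r_{2},s)\in\tau_{3},\quad|r_{2}|\leq c_{2}\cdot|s|,$$

$$(t,r_{1})\in\tau_{1},\quad(r_{1},r_{2})\in\tau_{2},\quad|r_{1}|\leq c_{1}\cdot|r_{2}|\leq c_{1}\cdot c_{2}\cdot|s|.$$

Since $(r_{1},s)\in\tau_{2}\circ\tau_{3}$ , the tree $r_{1}$ is an intermediate result of the required size for the pair $(\tau_{1},\tau_{2}\circ\tau_{3})$ .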

A function $\tau:T_{\Sigma}\to T_{\Delta}$ is of linear size increase if there is a constant $c\in{\mathbb{N}}$ such that $|\tau(t)|\leq c\cdot|t|$ for every $t\in\mathrm{dom}(\tau)$ . The class of functions of linear size increase will be denoted by LSIF.

Lemma 2

Let $\tau_{1}:T_{\Sigma}\to T_{\Gamma}$ and $\tau_{2}:T_{\Gamma}\to T_{\Delta}$ be functions such that $\mathrm{ran}(\tau_{1})\subseteq\mathrm{dom}(\tau_{2})$ . If $\tau_{1}\circ\tau_{2}\in\mbox{\sf LSIF}$ and $(\tau_{1},\tau_{2})$ is linear-bounded, then $\tau_{1}\in\mbox{\sf LSIF}$ .

**Proof. **It follows from $\mathrm{ran}(\tau_{1})\subseteq\mathrm{dom}(\tau_{2})$ that $\mathrm{dom}(\tau_{1}\circ\tau_{2})=\mathrm{dom}(\tau_{1})$ . Since $(\tau_{1},\tau_{2})$ is linear-bounded, there is a $c$ such that $|\tau_{1}(t)|\leq c\cdot|\tau_{2}(\tau_{1}(t))|$ for every $t\in\mathrm{dom}(\tau_{1})$ . Since $\tau_{1}\circ\tau_{2}\in\mbox{\sf LSIF}$ , there is a $c^{\prime}$ such that $|\tau_{2}(\tau_{1}(t))|\leq c^{\prime}\cdot|t|$ for every $t\in\mathrm{dom}(\tau_{1})$ . Hence $|\tau_{1}(t)|\leq c\cdot c^{\prime}\cdot|t|$ for every $t\in\mathrm{dom}(\tau_{1})$ , which means that $\tau_{1}\in\mbox{\sf LSIF}$ . $\Box$

Grammars and automata. Context-free grammars and, in particular, regular tree grammars will be used to define the computations of tree-walking tree transducers, and to define the “regular look-around” used by these transducers. A context-free grammar is specified as a tuple $G=(N,T,{\cal S},R)$ , where $N$ is the nonterminal alphabet, $T$ the terminal alphabet (disjoint with $N$ ), ${\cal S}\subseteq N$ the set of initial nonterminals, and $R$ the finite set of rules, where each rule is of the form $X\to\zeta$ with $X\in N$ and $\zeta\in(N\cup T)^{*}$ . A sentential form of $G$ is a string $v\in(N\cup T)^{*}$ such that $S\Rightarrow_{G}^{*}v$ for some $S\in{\cal S}$ , where $\Rightarrow_{G}$ is the usual derivation relation of $G$ : if $X\to\zeta$ is in $R$ , then $v_{1}Xv_{2}\Rightarrow_{G}v_{1}\zeta v_{2}$ for all $v_{1},v_{2}\in(N\cup T)^{*}$ . The language $L(G)$ generated by $G$ is the set of all terminal sentential forms, i.e., $L(G)=\{w\in T^{*}\mid\exists\,S\in{\cal S}:S\Rightarrow_{G}^{*}w\}$ . To formally define the derivation trees of $G$ as ranked trees, we need to subscript its nonterminals with ranks because $G$ can have rules $X\to\zeta_{1}$ and $X\to\zeta_{2}$ with $|\zeta_{1}|\neq|\zeta_{2}|$ . Let $\overline{N}$ be the ranked alphabet consisting of all symbols $X_{m}$ , of rank $m$ , such that $G$ has a rule $X\to\zeta$ with $|\zeta|=m$ . The terminal symbols in $T$ are given rank 0. Then the derivation trees of $G$ are generated by the context-free grammar $G^{\mathrm{der}}=(N^{\prime},\overline{N}\cup T,{\cal S}^{\prime},R^{\mathrm{der}})$ such that $N^{\prime}=\{X^{\prime}\mid X\in N\}$ , ${\cal S}^{\prime}=\{S^{\prime}\mid S\in{\cal S}\}$ , and if $R$ contains a rule $X\to\zeta$ , then $R^{\mathrm{der}}$ contains the rule $X^{\prime}\to X_{m}\zeta^{\prime}$ where $m=|\zeta|$ and $\zeta^{\prime}$ is obtained from $\zeta$ by changing every nonterminal $Y$ into $Y^{\prime}$ . 
Note that we only consider derivation trees that correspond to derivations $S\Rightarrow_{G}^{*}w$ with $S\in{\cal S}$ and $w\in T^{*}$ . Such a derivation tree has yield $w$ , because when taking the yield of a derivation tree we skip the leaves with label $X_{0}$ . Moreover, when considering a derivation tree of $G$ , we will disregard the subscripts of the nonterminals and we will say that a node has label $X$ rather than $X_{m}$ . As an example, if $G$ has the rules $S\to aXYb$ , $X\to aY$ , $Y\to ba$ , and $Y\to\varepsilon$ , then $G^{\mathrm{der}}$ has the rules $S^{\prime}\to S_{4}aX^{\prime}Y^{\prime}b$ , $X^{\prime}\to X_{2}aY^{\prime}$ , $Y^{\prime}\to Y_{2}ba$ , and $Y^{\prime}\to Y_{0}$ . The string $aabab$ is generated by $G$ , and the derivation tree $S_{4}aX_{2}aY_{0}Y_{2}bab=S_{4}(a,X_{2}(a,Y_{0}),Y_{2}(b,a),b)$ is generated by $G^{\mathrm{der}}$ ; the nodes of this tree are labeled by $S$ , $X$ , $Y$ , $a$ , and $b$ , and its yield is $aabab$ .
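The passage from $G$ to $G^{\mathrm{der}}$ is purely mechanical. The following is a minimal sketch, assuming a rule $X\to\zeta$ is represented as a pair `(X, zeta)` with `zeta` a list of symbol names, and writing the ranked symbol $X_{m}$ as the string `X_m`; this encoding and the function name are assumptions made for illustration.

```python
def derivation_grammar(rules, nonterminals):
    """Rules of G^der: X -> zeta becomes X' -> X_m zeta', with m = |zeta|."""
    der_rules = []
    for lhs, zeta in rules:
        m = len(zeta)
        # Prime every nonterminal of zeta; terminals are kept unchanged.
        primed = [s + "'" if s in nonterminals else s for s in zeta]
        der_rules.append((lhs + "'", [f"{lhs}_{m}"] + primed))
    return der_rules

# The example grammar from the text.
N = {"S", "X", "Y"}
R = [
    ("S", ["a", "X", "Y", "b"]),
    ("X", ["a", "Y"]),
    ("Y", ["b", "a"]),
    ("Y", []),              # the epsilon-rule Y -> epsilon
]

for lhs, rhs in derivation_grammar(R, N):
    print(lhs, "->", " ".join(rhs))
# S' -> S_4 a X' Y' b
# X' -> X_2 a Y'
# Y' -> Y_2 b a
# Y' -> Y_0
```

The printed rules coincide with the rules of $G^{\mathrm{der}}$ listed in the example.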

A context-free grammar is $\varepsilon$ -free if it does not have $\varepsilon$ -rules, i.e., rules $X\to\varepsilon$ . We will mainly deal with $\varepsilon$ -free context-free grammars.

A context-free grammar $G$ is finitary if $L(G)$ is finite. We need the following elementary lemma on finitary context-free grammars.

Lemma 3

Let $G=(N,T,{\cal S},R)$ be a finitary context-free grammar. For every string $w\in L(G)$ there exists a derivation tree $d\in L(G^{\mathrm{der}})$ such that the yield of $d$ is $w$ and the height of $d$ is at most $\#(N)$ .

**Proof. **Let $d$ be a derivation tree with yield $w$ and suppose that a node $u$ of $d$ and a descendant $v$ of $u$ have the same nonterminal label (disregarding the ranking subscripts). Then the tree $d$ can be pumped in the usual way. But since $L(G)$ is finite, the yield of the pumped tree remains the same. Hence we can remove the pumped part from $d$ . Repeating this, we obtain a derivation tree as required. $\Box$

A context-free grammar $G=(N,T,{\cal S},R)$ is forward deterministic if ${\cal S}$ is a singleton and distinct rules have distinct left-hand sides. (Footnote 3: This is as opposed to a “backward deterministic” context-free grammar in which distinct rules have distinct right-hand sides, see, e.g., [26]. A forward deterministic context-free grammar that generates a string is also called a “straight-line” context-free grammar.)

Such a grammar generates at most one string in $T^{*}$ and has at most one derivation tree. If $L(G^{\mathrm{der}})=\{d\}$ , then the height of $d$ is at most $\#(N)$ by Lemma 3.

A regular tree grammar is a context-free grammar $G=(N,\Sigma,{\cal S},R)$ such that $\Sigma$ is a ranked alphabet, and $\zeta\in T_{\Sigma}(N)$ for every rule $X\to\zeta$ in $R$ . A regular tree grammar generates trees over $\Sigma$ , i.e., $L(G)\subseteq T_{\Sigma}$ . Note that every regular tree grammar is $\varepsilon$ -free. Note also that for every context-free grammar $G$ , the grammar $G^{\mathrm{der}}$ is a regular tree grammar. If, in particular, $G$ is itself a regular tree grammar, as above, then it should be noted that the elements of $\Sigma$ all have rank 0 in $G^{\mathrm{der}}$ . As an example, if $G$ has the rules $S\to\sigma(X,Y)$ , $X\to\tau(Y)$ , $Y\to\tau(a)$ , and $Y\to a$ , where $\sigma$ , $\tau$ , and $a$ have ranks 2, 1 and 0, respectively, then $G^{\mathrm{der}}$ has the rules $S^{\prime}\to S_{3}(\sigma,X^{\prime},Y^{\prime})$ , $X^{\prime}\to X_{2}(\tau,Y^{\prime})$ , $Y^{\prime}\to Y_{2}(\tau,a)$ , and $Y^{\prime}\to Y_{1}(a)$ . The tree $\sigma(\tau(\tau(a)),a)$ is generated by $G$ , and the derivation tree $S_{3}(\sigma,X_{2}(\tau,Y_{2}(\tau,a)),Y_{1}(a))$ by $G^{\mathrm{der}}$ .

A (total deterministic) bottom-up finite-state tree automaton is specified as a tuple $A=(\Sigma,P,F,\delta)$ where $\Sigma$ is a ranked alphabet, $P$ is a finite set of states, $F\subseteq P$ is the set of final states, and $\delta$ is the state transition function such that $\delta(\sigma,p_{1},\dots,p_{m})\in P$ for every $\sigma\in\Sigma$ and $p_{1},\dots,p_{m}\in P$ , where $m$ is the rank of $\sigma$ . For every $t\in T_{\Sigma}$ , we define the state $\delta(t)$ in which $A$ arrives at the root of $t$ recursively by $\delta(\sigma\,t_{1}\cdots t_{m})=\delta(\sigma,\delta(t_{1}),\dots,\delta(t_{m}))$ . The tree language recognized by $A$ is $L(A)=\{t\in T_{\Sigma}\mid\delta(t)\in F\}$ .
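The recursive definition of $\delta(t)$ translates directly into code. The following is a minimal sketch, assuming trees are pairs `(label, children)` and the transition function is a dictionary keyed by `(symbol, p_1, ..., p_m)`; the example automaton, which accepts the trees with an even number of `tau`-labeled nodes, is hypothetical and chosen only for illustration.

```python
from itertools import product

def run(delta, t):
    """The state delta(t) in which the automaton arrives at the root of t."""
    label, children = t
    return delta[(label,) + tuple(run(delta, c) for c in children)]

# Hypothetical ranked alphabet and automaton: the state is the parity
# of the number of tau-labeled nodes in the subtree.
rank = {"sigma": 2, "tau": 1, "a": 0}
P = (0, 1)
F = {0}
delta = {}
for symbol, m in rank.items():
    for states in product(P, repeat=m):
        parity = sum(states) % 2
        if symbol == "tau":
            parity = 1 - parity
        delta[(symbol,) + states] = parity   # total and deterministic

def accepts(t):
    return run(delta, t) in F

t_even = ("sigma", [("tau", [("tau", [("a", [])])]), ("a", [])])
t_odd = ("sigma", [("tau", [("a", [])]), ("a", [])])
print(accepts(t_even), accepts(t_odd))  # True False
```

Because `delta` is defined for every symbol and every tuple of states of the right length, the run never gets stuck, which is the totality required in the definition.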

A regular tree language is a set of trees that can be generated by a regular tree grammar, or equivalently, recognized by a bottom-up finite-state tree automaton. The class of regular tree languages will be denoted by REGT. The basic properties of regular tree languages can be found in, e.g., [43, 44, 11, 19].

Regular look-around. Let $\Sigma$ be a ranked alphabet. A node test over $\Sigma$ is a set of trees over $\Sigma$ with a distinguished node, i.e., it is a subset of the set

$$T^{\bullet}_{\Sigma}=\{(t,u)\mid t\in T_{\Sigma},\,u\in{\cal N}(t)\}.$$

Intuitively it is a property of a node of a tree.

We introduce a new ranked alphabet $\Sigma\times\{0,1\}$ , such that the rank of $(\sigma,b)$ equals that of $\sigma$ in $\Sigma$ . For a tree $t$ over $\Sigma$ and a node $u$ of $t$ we define $\operatorname{mark}(t,u)$ to be the tree over $\Sigma\times\{0,1\}$ that is obtained from $t$ by changing the label $\sigma$ of $u$ into $(\sigma,1)$ and changing the label $\sigma$ of every other node into $(\sigma,0)$ . Thus, $\operatorname{mark}(t,u)$ is $t$ with one “marked” node $u$ . A regular (node) test over $\Sigma$ is a node test $T\subseteq T^{\bullet}_{\Sigma}$ such that its marked representation $\operatorname{mark}(T)=\{\operatorname{mark}(t,u)\mid(t,u)\in T\}$ is a regular tree language, i.e., $\operatorname{mark}(T)\in\mbox{\sf REGT}$ . Note that $\varnothing$ and $T^{\bullet}_{\Sigma}$ are regular tests, and that the class of regular tests over $\Sigma$ is closed under the boolean operations complement, intersection, and union, because REGT is closed under those operations. Hence every boolean combination of regular tests is again a regular test.
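To illustrate how a regular test can depend on the context of a node and not only on its subtree, the following minimal sketch marks a node and then evaluates a bottom-up automaton on the marked tree. It assumes trees are pairs `(label, children)` and nodes are tuples of integers; the test "some proper ancestor of the node has label `tau`" and all names are hypothetical, chosen only for illustration.

```python
def mark(t, u):
    """mark(t, u): relabel node u by (label, 1) and every other node by (label, 0).
    Inside subtrees that do not contain u, the address is passed as None."""
    label, children = t
    marked_children = [
        mark(c, u[1:] if u and u[0] == i else None)
        for i, c in enumerate(children, start=1)
    ]
    return ((label, 1 if u == () else 0), marked_children)

def state(t):
    """Bottom-up state on a marked tree:
    0 = no marked node in this subtree,
    1 = marked node found, no tau-labeled proper ancestor seen so far,
    2 = marked node found and it has a tau-labeled proper ancestor."""
    (label, bit), children = t
    child_states = [state(c) for c in children]
    if bit == 1:
        return 1
    if 2 in child_states:
        return 2
    if 1 in child_states:
        return 2 if label == "tau" else 1
    return 0

def has_tau_ancestor(t, u):
    """Membership of (t, u) in the node test, decided on mark(t, u)."""
    return state(mark(t, u)) == 2

t = ("sigma", [("tau", [("tau", [("a", [])])]), ("a", [])])
print(has_tau_ancestor(t, (1, 1)))  # True: the parent (1,) is labeled tau
print(has_tau_ancestor(t, (2,)))    # False: the only proper ancestor is the root
```

The function `state` uses three states and depends only on the label and the states of the children, so it describes a total deterministic bottom-up automaton over $\Sigma\times\{0,1\}$ with final state 2, under the assumption that exactly one node is marked.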

For a tree language $L\subseteq T_{\Sigma}$ we define the node test

$$T(L)=\{(t,u)\mid t|_{u}\in L\}$$

over $\Sigma$ . Intuitively it is a property of the distinguished node that only depends on the subtree at that node. Clearly, if $L$ is regular then $T(L)$ is regular. A regular test of the form $T(L)$ with $L\in\mbox{\sf REGT}$ will be called a regular sub-test. Note that $T(T_{\Sigma})=T^{\bullet}_{\Sigma}$ and $T(\varnothing)=\varnothing$ . Note also that for regular tree languages $L$ and $L^{\prime}$ over $\Sigma$ , $T(L)\cap T(L^{\prime})=T(L\cap L^{\prime})$ and $T^{\bullet}_{\Sigma}\setminus T(L)=T(T_{\Sigma}\setminus L)$ . This shows that the class of regular sub-tests over $\Sigma$ is also closed under the boolean operations complement, intersection, and union.

For a given node test $T$ over $\Sigma$ , we also wish to be able to apply $T$ to a node $v$ of a tree $\operatorname{mark}(t,u)$ , where $v$ need not be equal to $u$ . Thus, we define the node test $\mu(T)$ over $\Sigma\times\{0,1\}$ to consist of all $(\operatorname{mark}(t,u),v)$ such that $(t,v)\in T$ and $u\in{\cal N}(t)$ . The test $\mu(T)$ just disregards the marking of $t$ . It is easy to see that if $T$ is regular, then so is $\mu(T)$ .

The reader familiar with monadic second-order logic (abbreviated mso logic) should realize that it easily follows from the result of Doner, Thatcher and Wright [18, 73] that a node test is regular if and only if it is mso definable (see [6, Lemma 7]). A node test $T$ over $\Sigma$ is mso definable if there is an mso formula $\varphi(x)$ over $\Sigma$ , with one free variable $x$ , such that $T=\{(t,u)\mid t\models\varphi(u)\}$ , where $t\models\varphi(u)$ means that the formula $\varphi(x)$ holds in $t$ for the node $u$ as value of $x$ . The formulas of mso logic on trees over $\Sigma$ use the atomic formulas $\mathrm{lab}_{\sigma}(x)$ and ${\rm down}_{i}(x,y)$ , for every $i\in[1,{\mathit{m}x}_{\Sigma}]$ , meaning that node $x$ has label $\sigma\in\Sigma$ , and that $y$ is the $i$ -th child of $x$ , respectively. In the literature, regular tests are also called mso tests.

3 Tree-Walking Tree Transducers

In this section we define tree-walking tree transducers, with and without regular look-around, and discuss some of their properties.

A tree-walking tree transducer (with regular look-around), in short tt, is a finite state device with one reading head that walks from node to node over its input tree following the edges in either direction. In addition to testing the label and child number of the current node, it can even test any regular property of that node. The output tree is produced recursively, in a top-down fashion. When the transducer produces a node of the output tree, labeled by an output symbol of rank $k$ , it branches into $k$ copies of itself, which then proceed independently, in parallel, to produce the subtrees rooted at the children of that output node.

The tt is specified as a tuple $M=(\Sigma,\Delta,Q,Q_{0},R)$ , where $\Sigma$ and $\Delta$ are ranked alphabets of input and output symbols, $Q$ is a finite set of states, $Q_{0}\subseteq Q$ is the set of initial states, and $R$ is a finite set of rules. The rules are divided into move rules and output rules. Each move rule is of the form $\langle q,\sigma,j,T\rangle\to\langle q^{\prime},\alpha\rangle$ such that $q,q^{\prime}\in Q$ , $\sigma\in\Sigma$ , $j\in[0,{\mathit{m}x}_{\Sigma}]$ , $T$ is a regular test over $\Sigma$ (specified in some effective way), and $\alpha$ is one of the following instructions:

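(1) ${\rm stay}$ ,

(2) ${\rm up}$ , provided $j\neq 0$ , and

(3) ${\rm down}_{i}$ with $1\leq i\leq\operatorname{rank}(\sigma)$ .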

Each output rule is of the form $\langle q,\sigma,j,T\rangle\to\delta(\langle q_{1},\alpha_{1}\rangle,\dots,\langle q_{k},\alpha_{k}\rangle)$ such that the left-hand side is as above, $\delta\in\Delta^{(k)}$ , $q_{1},\dots,q_{k}\in Q$ , and $\alpha_{1},\dots,\alpha_{k}$ are instructions as above. A rule $\langle q,\sigma,j,T\rangle\to\zeta$ with $T=T^{\bullet}_{\Sigma}$ will be written $\langle q,\sigma,j\rangle\to\zeta$ . The tt $M$ is deterministic, in short a dtt, if $Q_{0}$ is a singleton, and $T\cap T^{\prime}=\varnothing$ for every two distinct rules $\langle q,\sigma,j,T\rangle\to\zeta$ and $\langle q,\sigma,j,T^{\prime}\rangle\to\zeta^{\prime}$ in $R$ . A dtt with initial state $q_{0}$ will be specified as $M=(\Sigma,\Delta,Q,q_{0},R)$ .

A configuration $\langle q,u\rangle$ of the tt $M$ on a tree $t$ over $\Sigma$ is given by the current state $q$ of $M$ and the current position $u$ of the head of $M$ on $t$ . Formally, $q\in Q$ and $u\in{\cal N}(t)$ . The set of all configurations of $M$ on $t$ is denoted $\operatorname{Con}(t)$ , i.e., $\operatorname{Con}(t)=Q\times{\cal N}(t)$ . A rule $\langle q,\sigma,j,T\rangle\to\zeta$ is applicable to a configuration $\langle q^{\prime},u\rangle$ of $M$ on $t$ if $q^{\prime}=q$ and $u$ satisfies the tests $\sigma$ , $j$ , and $T$ , i.e., $\sigma$ and $j$ are the label and child number of $u$ , and $(t,u)\in T$ . For a node $u$ of $t$ and an instruction $\alpha$ we define the node $\alpha(u)$ of $t$ as follows: if $\alpha$ is ${\rm stay}$ , ${\rm up}$ , or ${\rm down}_{i}$ , then $\alpha(u)$ equals $u$ , is the parent of $u$ , or is the $i$ -th child of $u$ , respectively.

For every input tree $t\in T_{\Sigma}$ we define the regular tree grammar $G_{M,t}=(N,\Delta,{\cal S},R_{M,t})$ where $N=\operatorname{Con}(t)$ , ${\cal S}=\{\langle q_{0},\mathrm{root}_{t}\rangle\mid q_{0}\in Q_{0}\}$ and $R_{M,t}$ is defined as follows. Let $\langle q,u\rangle$ be a configuration of $M$ on $t$ and let $\langle q,\sigma,j,T\rangle\to\zeta$ be a rule of $M$ that is applicable to $\langle q,u\rangle$ . If $\zeta=\langle q^{\prime},\alpha\rangle$ then $R_{M,t}$ contains the rule $\langle q,u\rangle\to\langle q^{\prime},\alpha(u)\rangle$ , and if $\zeta=\delta(\langle q_{1},\alpha_{1}\rangle,\dots,\langle q_{k},\alpha_{k}\rangle)$ then $R_{M,t}$ contains the rule $\langle q,u\rangle\to\delta(\langle q_{1},\alpha_{1}(u)\rangle,\dots,\langle q_{k},\alpha_{k}(u)\rangle)$ . The derivation relation $\Rightarrow_{G_{M,t}}$ will be written as $\Rightarrow_{M,t}$ . The translation realized by $M$ , denoted $\tau_{M}$ , is defined as $\tau_{M}=\{(t,s)\in T_{\Sigma}\times T_{\Delta}\mid s\in L(G_{M,t})\}$ . In other words, $\tau_{M}=\{(t,s)\in T_{\Sigma}\times T_{\Delta}\mid\exists\,q_{0}\in Q_{0}:\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M,t}s\}$ . Two tt’s $M$ and $N$ are equivalent if $\tau_{M}=\tau_{N}$ .

The domain of $M$ , denoted by $\mathrm{dom}(M)$ , is defined to be the domain of the translation $\tau_{M}$ , i.e., $\mathrm{dom}(M)=\mathrm{dom}(\tau_{M})=\{t\in T_{\Sigma}\mid\exists\,s\in T_{\Delta}:(t,s)\in\tau_{M}\}$ . The tt $M$ is total if $\mathrm{dom}(M)=T_{\Sigma}$ .

The tt $M$ is finitary if $\tau_{M}$ is finitary, which means that $\tau_{M}(t)$ is finite (or equivalently, that $G_{M,t}$ is finitary) for every input tree $t\in T_{\Sigma}$ . All classical top-down tree transducers (with or without regular look-ahead) and all macro tree transducers are finitary.

If $M$ is deterministic, then at most one rule of $M$ is applicable to a given configuration. Hence $G_{M,t}$ is forward deterministic and $L(G_{M,t})$ is either empty or a singleton. Thus, $\tau_{M}$ is a partial function from $T_{\Sigma}$ to $T_{\Delta}$ (and a total function if $M$ is total). For every $(t,s)\in\tau_{M}$ the context-free grammar $G_{M,t}$ has exactly one derivation tree, with root label $\langle q_{0},\mathrm{root}_{t}\rangle$ and yield $s$ .

Intuitively, the derivation relation $\Rightarrow_{M,t}$ of the grammar $G_{M,t}$ formalizes the computation steps of the tt $M$ on the input tree $t$ , the derivations of $G_{M,t}$ are the sequential computations of $M$ on $t$ , and the derivation trees of $G_{M,t}$ , generated by the regular tree grammar $G_{M,t}^{\mathrm{der}}$ , model the parallel computations of the independent copies of $M$ on $t$ . If $M$ is deterministic and $t\in\mathrm{dom}(M)$ , then $M$ has exactly one parallel computation on $t$ .

A sentential form of $G_{M,t}$ will be called an output form of $M$ on $t$ . It is a tree $s\in T_{\Delta}(\operatorname{Con}(t))$ such that $\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M,t}s$ for some $q_{0}\in Q_{0}$ . Intuitively, such an output form $s$ consists on the one hand of $\Delta$ -labeled nodes that were produced by $M$ previously in the computation, using output rules, and on the other hand of leaves that represent the independent copies of $M$ into which the computation has branched previously, due to those output rules, where each leaf is labeled by the current configuration of that copy. An output form is initial if it is the configuration $\langle q_{0},\mathrm{root}_{t}\rangle$ for some $q_{0}\in Q_{0}$ , where $\mathrm{root}_{t}$ is the root of $t$ , and it is final if it is in $T_{\Delta}$ , which means that all copies of $M$ have disappeared.

Intuitively, the computation steps of $M$ lead from one output form to another, as follows. Let $s$ be an output form and let $v$ be a leaf of $s$ with label $\langle q,u\rangle\in\operatorname{Con}(t)$ . If $\langle q,u\rangle\to\langle q^{\prime},\alpha(u)\rangle$ is a rule of $G_{M,t}$ , resulting from a move rule $\langle q,\sigma,j,T\rangle\to\langle q^{\prime},\alpha\rangle$ of $M$ that is applicable to configuration $\langle q,u\rangle$ , as defined above, then $s\Rightarrow_{M,t}s^{\prime}$ where $s^{\prime}$ is obtained from $s$ by changing the label of $v$ into $\langle q^{\prime},\alpha(u)\rangle$ . Thus, this copy of $M$ just changes its configuration. Moreover, if $\langle q,u\rangle\to\delta(\langle q_{1},\alpha_{1}(u)\rangle,\dots,\langle q_{k},\alpha_{k}(u)\rangle)$ is a rule of $G_{M,t}$ , resulting from an output rule $\langle q,\sigma,j,T\rangle\to\delta(\langle q_{1},\alpha_{1}\rangle,\dots,\langle q_{k},\alpha_{k}\rangle)$ of $M$ , as defined above, then $s\Rightarrow_{M,t}s^{\prime}$ where $s^{\prime}$ is obtained from $s$ by changing the label of $v$ into $\delta$ and adding children $v1,\dots,vk$ with labels $\langle q_{1},\alpha_{1}(u)\rangle,\dots,\langle q_{k},\alpha_{k}(u)\rangle$ , respectively. Thus, $M$ outputs $\delta$ , and for each child $vi$ it branches into a new process, a copy of itself started in state $q_{i}$ at the node $\alpha_{i}(u)$ . In the particular case that $k=0$ , $s^{\prime}$ is obtained from $s$ by changing the label of $v$ into $\delta$ ; thus, the copy of $M$ corresponding to the node $v$ of $s$ disappears. The translation $\tau_{M}$ realized by $M$ consists of all pairs of trees $t$ over $\Sigma$ and $s$ over $\Delta$ such that $M$ has a sequential computation on $t$ that starts with an initial output form and ends with the final output form $s$ .
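The computation steps described above can be simulated directly for a deterministic local tt. The following is a minimal sketch under several assumptions that are not part of the paper: trees are pairs `(label, children)`, nodes are tuples of integers, a rule $\langle q,\sigma,j\rangle\to\zeta$ is a dictionary entry keyed by `(q, sigma, j)`, a move rule has value `("move", q2, alpha)`, and an output rule has value `("out", delta, [(q_1, alpha_1), ...])`. The example transducer, which reverses a monadic tree by walking down to the leaf and producing output on the way back up, is hypothetical.

```python
def translate(rules, q0, t):
    """Return tau_M(t) for a deterministic local tt, or None if t is not in dom(M)."""
    def label(u):
        node = t
        for i in u:
            node = node[1][i - 1]
        return node[0]

    def step(alpha, u):
        if alpha == "stay":
            return u
        if alpha == "up":
            return u[:-1]
        return u + (alpha[1],)          # alpha == ("down", i)

    def run(q, u, path):
        # A configuration repeated on one branch of the derivation tree
        # means that no finite derivation exists.
        if (q, u) in path:
            raise LookupError("circular computation")
        child_number = u[-1] if u else 0
        rhs = rules[(q, label(u), child_number)]   # KeyError: no applicable rule
        path = path | {(q, u)}
        if rhs[0] == "move":
            return run(rhs[1], step(rhs[2], u), path)
        _, delta, calls = rhs
        # Output rule: one independent copy per child of the output node.
        return (delta, [run(qi, step(ai, u), path) for qi, ai in calls])

    try:
        return run(q0, (), frozenset())
    except LookupError:                 # KeyError is a subclass of LookupError
        return None

# Hypothetical dtt over monadic trees with symbols a, b (rank 1) and e (rank 0).
rules = {("q0", "e", 1): ("move", "q1", "up"),
         ("q0", "e", 0): ("out", "e", [])}
for x in ("a", "b"):
    for j in (0, 1):
        rules[("q0", x, j)] = ("move", "q0", ("down", 1))  # walk down to the leaf
    rules[("q1", x, 1)] = ("out", x, [("q1", "up")])       # output while walking up
    rules[("q1", x, 0)] = ("out", x, [("q2", "stay")])     # root reached
    rules[("q2", x, 0)] = ("out", "e", [])                 # close the output

print(translate(rules, "q0", ("a", [("b", [("e", [])])])))
# ('b', [('a', [('e', [])])])
```

The sketch follows the unique applicable rule at each configuration, which is exactly the forward determinism of $G_{M,t}$ ; it assumes well-formed rules, i.e., that up is never used at the root and that ${\rm down}_{i}$ is used only at nodes with at least $i$ children.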

Before giving an example of a tree-walking tree transducer, we define six properties of tt’s that will be used throughout this paper.

The tt $M$ is sub-testing, abbreviated tt$^{\mathrm{s}}$ , if the regular tests used by $M$ are regular sub-tests, i.e., only test the subtree at the current node. Formally, for every rule $\langle q,\sigma,j,T\rangle\to\zeta$ there is a regular tree language $L$ over $\Sigma$ such that $T=T(L)$ . Recall that $T(L)=\{(t,u)\mid t|_{u}\in L\}$ . Thus, informally, $M$ is sub-testing if it uses regular look-ahead rather than the more general regular look-around.

The tt $M$ is local, abbreviated tt$^{\ell}$, if it does not use regular tests, i.e., $T=T^{\bullet}_{\Sigma}$ ($=\{(t,u)\mid t\in T_{\Sigma},u\in{\cal N}(t)\}$) for every rule $\langle q,\sigma,j,T\rangle\to\zeta$. So all its rules are written $\langle q,\sigma,j\rangle\to\zeta$. Recall that $T^{\bullet}_{\Sigma}=T(T_{\Sigma})$; thus, every local tt is sub-testing. Note that in the formalism of the (non-local) tt the tests on $\sigma$ and $j$ could be dropped from a rule $\langle q,\sigma,j,T\rangle\to\zeta$, because they can be incorporated in the regular test $T$.

The tt $M$ is top-down, abbreviated tt$_{\downarrow}$, if it does not use the up-instruction in the right-hand sides of its rules. Due to the use of stay-instructions, a tt$_{\downarrow}$ need not be finitary. It is straightforward to show that the finitary (deterministic) tt$^{\mathrm{s}}_{\downarrow}$ and tt$^{\ell}_{\downarrow}$ are equivalent to the classical nondeterministic (deterministic) top-down tree transducer, with and without regular look-ahead, respectively; see the end of this section. Note that in the rules of a tt$^{\mathrm{s}}_{\downarrow}$ or tt$^{\ell}_{\downarrow}$ the test on the child number $j$ could be dropped, because $j$ can be stored in the finite state if necessary.

The tt $M$ is single-use, abbreviated tt$_{\mathrm{su}}$, if it is deterministic and never visits a node of the input tree twice in the same state. Formally, it should satisfy the following property: for every $t\in T_{\Sigma}$, $s^{\prime}\in T_{\Delta}(\operatorname{Con}(t))$, $s\in T_{\Delta}$, and $\langle q,u\rangle\in\operatorname{Con}(t)$, if $\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M,t}s^{\prime}\Rightarrow^{*}_{M,t}s$ then $\langle q,u\rangle$ occurs at most once in $s^{\prime}$. In other words, for every $t\in\mathrm{dom}(M)$, no nonterminal occurs twice in the (unique) derivation tree $d$ of the context-free grammar $G=G_{M,t}$. Note that, as discussed in the proof of Lemma 3 (and the paragraph following it), the configuration $\langle q,u\rangle$ cannot occur at two distinct nodes on a path from the root of $d$ to a leaf. The single-use property also forbids $\langle q,u\rangle$ from occurring at two independent nodes of $d$. It was introduced for attribute grammars in [40, 41, 45].

The tt $M$ is pruning, abbreviated tt$_{\mathrm{pru}}$, if it is a top-down tt of which each move rule is of the form $\langle q,\sigma,j,T\rangle\to\langle q^{\prime},{\rm down}_{i}\rangle$, and each output rule is of the form $\langle q,\sigma,j,T\rangle\to\delta(\langle q_{1},{\rm down}_{i_{1}}\rangle,\dots,\langle q_{k},{\rm down}_{i_{k}}\rangle)$ such that $1\leq i_{1}<\cdots<i_{k}\leq\operatorname{rank}(\sigma)$. Intuitively, a pruning tt is a tt$_{\downarrow}$ without stay-instructions that, when arriving at an input node $u$, either removes $u$ and all its children except one (together with the descendants of those children), or relabels $u$ and possibly removes some of its children (together with their descendants). Since a tt$_{\mathrm{pru}}$ does not use the stay-instruction, it is finitary (and single-use if it is deterministic). Every tt$^{\mathrm{s}}_{\mathrm{pru}}$ and tt$^{\ell}_{\mathrm{pru}}$ is equivalent to a classical linear top-down tree transducer, with and without regular look-ahead, respectively, but not vice versa because the latter transducer can generate an arbitrary finite number of output nodes at each computation step, rather than zero or one.

The tt $M$ is relabeling, abbreviated tt$_{\mathrm{rel}}$, if every rule of $M$ is an output rule of the form $\langle q,\sigma,j,T\rangle\to\delta(\langle q_{1},{\rm down}_{1}\rangle,\dots,\langle q_{m},{\rm down}_{m}\rangle)$ where $m=\operatorname{rank}_{\Sigma}(\sigma)=\operatorname{rank}_{\Delta}(\delta)$. Thus, the label $\sigma$ is replaced by the label $\delta$. Obviously, every relabeling tt is pruning.

We use the notation TT for the class of translations realized by tree-walking tree transducers, and f TT and dTT for the subclasses realized by finitary and deterministic tt’s, respectively. Thus, $\mbox{\sf dTT}\subseteq\mbox{\sf f\,TT}\subseteq\mbox{\sf TT}$. The subclasses of TT, f TT, and dTT realized by tt’s with the above six properties (and their combinations) are indicated by the superscripts ‘s’ and ‘$\ell$’, and the subscripts ‘$\downarrow$’, ‘su’, ‘pru’, and ‘rel’, as above. For instance, $\mbox{\sf dTT}^{\mathrm{s}}_{\downarrow}$ denotes the class of translations realized by deterministic tree-walking tree transducers that are both sub-testing and top-down. Note that $\mbox{\sf TT}^{\ell}$ is a proper subclass of $\mbox{\sf TT}^{\mathrm{s}}$, because a local tt of which all output symbols have rank 0 can be viewed as a tree-walking automaton, which cannot recognize all regular tree languages by the result of [9].

By [14, Section 8.4], the tt is equivalent to the ms tree-walking transducer of [14, Section 8.2]. As discussed in the Introduction, the tt$^{\ell}$ generalizes the attributed tree transducer of [38], which is required to be noncircular and hence finitary; the deterministic attributed tree transducer is also required to be total. (Footnote 4: The tt $M$ is circular if there exist $t\in T_{\Sigma}$, $u\in{\cal N}(t)$, $q\in Q$, and $s\in T_{\Delta}(\operatorname{Con}(t))$ such that $\langle q,u\rangle\Rightarrow^{*}_{M,t}s$ and $\langle q,u\rangle$ occurs in $s$. Thus, $M$ is noncircular if and only if $G_{M,t}$ is nonrecursive for every $t\in T_{\Sigma}$, which implies that $L(G_{M,t})$ is finite. Note that a total deterministic tt is noncircular if and only if for every $t\in T_{\Sigma}$, $u\in{\cal N}(t)$, and $q\in Q$ there exists $s\in T_{\Delta}$ such that $\langle q,u\rangle\Rightarrow^{*}_{M,t}s$. It can be shown that for every finitary tt there is an equivalent noncircular tt, but that will not be needed in this paper.)

In the same way the deterministic tt generalizes the (deterministic) attributed tree transducer with look-ahead of [7]. In [26] all tree-walking tree transducers are local.

Example 4

Let $\Sigma=\{\sigma,e\}$ with $\operatorname{rank}_{\Sigma}(\sigma)=2$ and $\operatorname{rank}_{\Sigma}(e)=0$, and let $\Delta=\{\sigma,e\}\cup[1,\mathit{mx}_{\Sigma}]$ with $\operatorname{rank}_{\Delta}(\sigma)=2$, $\operatorname{rank}_{\Delta}(e)=0$, and $\operatorname{rank}_{\Delta}(j)=0$ for every $j\in[1,\mathit{mx}_{\Sigma}]=\{1,2\}$. Moreover, let $T$ be an arbitrary regular node test over $\Sigma$. For simplicity we assume that $T$ is not satisfied at the leaves of $t$, i.e., if $(t,u)\in T$ then $u$ is not a leaf of $t$. For instance, $T$ consists of all $(t,u)\in T^{\bullet}_{\Sigma}$ such that $u$ has at least one ancestor that has exactly one child that is a leaf, and at least one descendant with that same property. We consider a total deterministic tt $M=(\Sigma,\Delta,Q,q_{0},R)$ that performs $T$ as a query, i.e., for every input tree $t$ it outputs all nodes of $t$ that satisfy $T$, in pre-order. More precisely, if $u_{1},\dots,u_{n}$ are the nodes $u$ of $t$ such that $(t,u)\in T$, in pre-order, then $M$ outputs the tree $s=\sigma(s_{1},\sigma(s_{2},\dots\sigma(s_{n},e)\cdots))$ where $s_{i}=\sigma(\cdots\sigma(\sigma(e,j_{1}),j_{2})\dots,j_{k})$ if $u_{i}=j_{1}j_{2}\cdots j_{k}$ with $j_{1},j_{2},\dots,j_{k}\in[1,\mathit{mx}_{\Sigma}]$. Note that the yield of $s$ is $eu_{1}eu_{2}\cdots eu_{n}e$. The transducer $M$ performs a left-to-right depth-first traversal of the input tree $t$ and applies the test $T$ to every node of $t$, in pre-order. Whenever $M$ finds a node $u_{i}$ that satisfies the test, it branches into two copies. The first copy outputs the tree $s_{i}$ with yield $eu_{i}$, walking from $u_{i}$ to the root, and the second copy continues the traversal.

Formally, $M$ has the set of states $Q=\{d,u_{1},u_{2},p,p^{\prime}\}$ and initial state $q_{0}=d$. Intuitively, $d$ stands for ‘down’, $u_{j}$ for ‘up from the $j$-th child’, and $p$ for ‘print’. It has the following rules, where $j^{\prime}\in[0,\mathit{mx}_{\Sigma}]$, $j\in[1,\mathit{mx}_{\Sigma}]$, $T^{\mathrm{c}}=T^{\bullet}_{\Sigma}\setminus T$, and $\tau\in\Sigma$:

[TABLE]

where the rule $\langle p,\tau,j\rangle\to\sigma(\langle p,{\rm up}\rangle,j)$ abbreviates the two rules

[TABLE]

The tt $M$ does not have any of the six properties defined above. Note that $M$ is not single-use because it pays $n$ visits to the root of $t$ in state $p$ . For the example test $T$ it is not clear whether there is a local tt equivalent to $M$ , but that does not seem likely. $\Box$

Example 5

Let $\Sigma=\{\sigma,e\}$ as in Example 4. We consider a total deterministic local tt $M_{\mathrm{exp}}$ that translates each tree $t$ with $n$ leaves into the full binary tree of height $n$ with $2^{n}$ leaves. As in Example 4, it performs a depth-first left-to-right traversal of $t$, and branches into two copies whenever it visits a leaf of $t$. Formally, $M_{\mathrm{exp}}=(\Sigma,\Sigma,Q,q_{0},R)$ with $Q=\{d,u_{1},u_{2},q\}$ and $q_{0}=d$. Its rules are similar to those of $M$ in Example 4. In particular, the three rules for states $u_{1}$ and $u_{2}$ are the same. The rules for state $d$ are the following, with $j^{\prime}\in[0,\mathit{mx}_{\Sigma}]$ and $j\in[1,\mathit{mx}_{\Sigma}]$:

[TABLE]

where the last rule abbreviates the two rules $\langle d,e,0\rangle\to\sigma(\langle q,{\rm stay}\rangle,\langle q,{\rm stay}\rangle)$ and $\langle q,e,0\rangle\to e$ . $\Box$

An elementary property of the translation realized by a deterministic tt is that it is of “linear size-height increase”, as stated in the next lemma. Since the size of a tree is at most exponential in its height, this implies that it is of exponential size increase. This is well known for attributed tree transducers [38, Lemma 4.1] (see also [39, Lemma 5.40]) and for local tt’s [31, Lemma 7], and obviously also holds for tt’s. If, moreover, the tt is single-use, then it is of linear size increase.

Lemma 6

For every $\tau\in\mbox{\sf dTT}$ there is a constant $c$ such that for every $(t,s)\in\tau$ the height of $s$ is at most $c\cdot|t|$. Moreover, $\mbox{\sf dTT}_{\mathrm{su}}\subseteq\mbox{\sf LSIF}$.

**Proof.** Let $M=(\Sigma,\Delta,Q,q_{0},R)$ be a dtt and let $(t,s)\in\tau_{M}$. Let $d$ be the unique derivation tree generated by $G^{\mathrm{der}}_{M,t}$. Clearly, since each rule of $M$ outputs at most one node of $s$, the height of $s$ is at most the height of $d$. By Lemma 3 the height of $d$ is at most $\#(\operatorname{Con}(t))$, which equals $\#(Q)\cdot|t|$. Thus, we can take $c=\#(Q)$.

It should also be clear that the size of $s$ is at most the number of nodes of $d$ that are labeled by a configuration. If $M$ is single-use, then no configuration occurs twice in $d$ . Hence $|s|\leq\#(Q)\cdot|t|$ , i.e., the function $\tau_{M}$ is of linear size increase. $\Box$
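The two bounds in this proof can be summarized as chains of inequalities. This is only a restatement of the argument above in display form; the notation $\mathrm{height}(\cdot)$ is introduced here for readability and is not the paper's.

```latex
\begin{align*}
\mathrm{height}(s) &\le \mathrm{height}(d) \le \#(\operatorname{Con}(t)) = \#(Q)\cdot|t|
  && \text{(every dtt } M\text{)}\\
|s| &\le \#\{\text{nodes of } d \text{ labeled by a configuration}\}
     \le \#(\operatorname{Con}(t)) = \#(Q)\cdot|t|
  && \text{(}M \text{ single-use)}
\end{align*}
```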

Example 5 and Lemma 6 imply that compositions of deterministic tt’s form a proper hierarchy. This was proved for attributed tree transducers in [38, Corollary 4.1] (see also [39, Theorem 5.45]), and the proof for tt’s is exactly the same.

Proposition 7

For every $k\geq 1$ , $\mbox{\sf dTT}^{k}\subsetneq\mbox{\sf dTT}^{k+1}$ .

**Proof.** Let $\tau_{\mathrm{exp}}$ be the translation realized by the dtt $M_{\mathrm{exp}}$ of Example 5. Then $\tau_{\mathrm{exp}}\circ\tau_{\mathrm{exp}}$ translates each tree $t$ with $n$ leaves into the full binary tree of height $2^{n}$ with $2^{2^{n}}$ leaves. Since $|t|=2n-1$, it follows from Lemma 6 that $\tau_{\mathrm{exp}}\circ\tau_{\mathrm{exp}}$ is not in dTT. Hence $\mbox{\sf dTT}\subsetneq\mbox{\sf dTT}^{2}$. In a similar way it can be shown that $\tau_{\mathrm{exp}}^{k+1}$ is not in $\mbox{\sf dTT}^{k}$. Since the size of a tree is at most exponential in its height, it follows from Lemma 6 that for every $\tau\in\mbox{\sf dTT}^{2}$ there is a constant $c$ such that for every $(t,s)\in\tau$ the height of $s$ is at most $2^{c\cdot|t|}$. Similarly for $\tau\in\mbox{\sf dTT}^{k}$, the height of $s$ is at most ($k-1$)-fold exponential in $|t|$. $\Box$
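The growth used in this proof can be checked numerically on a small input. The sketch below implements $\tau_{\mathrm{exp}}$ directly from its specification in Example 5 (a tree with $n$ leaves is mapped to the full binary tree of height $n$), not by simulating the transducer $M_{\mathrm{exp}}$; the function names and the tuple representation of trees are assumptions made for illustration.

```python
def full_binary(h):
    """Full binary tree of height h over {sigma, e}; it has 2**h leaves."""
    t = ("e",)
    for _ in range(h):
        t = ("sigma", t, t)           # both children share one tuple object
    return t

def leaves(t):
    return 1 if len(t) == 1 else leaves(t[1]) + leaves(t[2])

def height(t):
    return 0 if len(t) == 1 else 1 + max(height(t[1]), height(t[2]))

def size(t):
    return 1 if len(t) == 1 else 1 + size(t[1]) + size(t[2])

def tau_exp(t):
    """Specification of Example 5: n leaves -> full binary tree of height n."""
    return full_binary(leaves(t))

t = ("sigma", ("e",), ("sigma", ("e",), ("e",)))   # n = 3 leaves, |t| = 2n - 1 = 5
s1 = tau_exp(t)                                     # height n = 3, 2**3 leaves
s2 = tau_exp(s1)                                    # height 2**n = 8, 2**8 leaves
print(size(t), height(s1), leaves(s1), height(s2), leaves(s2))   # 5 3 8 8 256
```

Already for $n=3$ the height of the second output is $2^{n}=8$, while $|t|=5$; since $2^{n}$ eventually exceeds $c\cdot(2n-1)$ for any fixed $c$, no single dtt can satisfy the bound of Lemma 6 on this translation.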

Thus, in terms of size increase, a composition of $k$ dtt’s can create at most a $k$ -fold exponentially large output tree, whereas a composition of $k+1$ dtt’s can naturally create an output tree of $(k+1)$ -fold exponential size. In Section 7 we will prove that compositions of nondeterministic tt’s also form a hierarchy, with the same counter-examples. One of our aims is to show that these hierarchies collapse for functions of linear size increase, i.e., that $\mbox{\sf TT}^{k}\cap\mbox{\sf LSIF}\subseteq\mbox{\sf dTT}$ for every $k\geq 1$ .

We end this section by discussing some syntactic properties of tt’s. First, for an arbitrary tt it may always be assumed that its output rules only use the stay-instruction: an output rule $\langle q,\sigma,j,T\rangle\to\delta(\langle q_{1},\alpha_{1}\rangle,\dots,\langle q_{k},\alpha_{k}\rangle)$ can be replaced by the output rule $\langle q,\sigma,j,T\rangle\to\delta(\langle p_{1},{\rm stay}\rangle,\dots,\langle p_{k},{\rm stay}\rangle)$ and the move rules $\langle p_{i},\sigma,j,T\rangle\to\langle q_{i},\alpha_{i}\rangle$ for every $i\in[1,k]$ , where $p_{1},\dots,p_{k}$ are new states. This replacement preserves determinism and the sub-testing, local, top-down, and single-use properties (but not pruning or relabeling).

Second, we may always assume that the regular tests of a tt are disjoint. For a tt $M$ , let ${\cal T}_{M}$ be the set of regular tests in the left-hand sides of the rules of $M$ .

Lemma 8

For every tt $M$ there is an equivalent tt $M^{\prime}$ such that the tests in ${\cal T}_{M^{\prime}}$ are mutually disjoint. The construction preserves determinism and the sub-testing, local, top-down, single-use, pruning, and relabeling properties.

**Proof.** If $T,T^{\prime}\in{\cal T}_{M}$ and $T\cap T^{\prime}\neq\varnothing$, then every rule $\langle q,\sigma,j,T\rangle\to\zeta$ can be replaced by the two rules $\langle q,\sigma,j,T\cap T^{\prime}\rangle\to\zeta$ and $\langle q,\sigma,j,T\setminus T^{\prime}\rangle\to\zeta$. The transducer $M^{\prime}$ is obtained by repeating this procedure. $\Box$
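The repeated splitting in this proof amounts to refining the family of tests into mutually disjoint blocks. The sketch below illustrates this on finite sets, which is a simplifying assumption: actual node tests are (in general infinite) regular sets of pairs $(t,u)$, and the intersections and differences would be computed on automata. The names `refine` and `blocks_of` are hypothetical.

```python
def refine(tests):
    """Refine a family of finite sets into mutually disjoint nonempty blocks."""
    blocks = []
    for T in tests:
        rest = set(T)                      # part of T not covered by any block yet
        new_blocks = []
        for B in blocks:
            # split B into B ∩ T and B \ T, as in the proof of Lemma 8
            new_blocks += [X for X in (B & rest, B - rest) if X]
            rest -= B
        if rest:
            new_blocks.append(rest)
        blocks = new_blocks
    return blocks

def blocks_of(T, blocks):
    """Blocks by which a rule with test T is replaced (one rule copy per block)."""
    return [B for B in blocks if B <= set(T)]

tests = [{1, 2, 3}, {2, 3, 4}, {5}]
blocks = refine(tests)
print(sorted(sorted(B) for B in blocks))   # [[1], [2, 3], [4], [5]]
```

Each original test is the union of the blocks it contains, so replacing a rule by one copy per block does not change the translation.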

Third, we can extend the definition of a tt $M=(\Sigma,\Delta,Q,q_{0},R)$ by allowing “general rules”, which can generate any finite number of output nodes, cf. [31, Lemma 2]. Simple examples of general rules are $\langle p,\tau,j\rangle\to\sigma(\langle p,{\rm up}\rangle,j)$ in Example 4 and $\langle d,e,0\rangle\to\sigma(e,e)$ in Example 5. Formally, a general rule is of the form $\langle q,\sigma,j,T\rangle\to\zeta$ such that $\zeta$ is a tree in $T_{\Delta}(Q\times I_{\sigma,j})$ , where $I_{\sigma,j}$ is the usual set of instructions: ${\rm stay}$ , ${\rm up}$ (provided $j\neq 0$ ), and ${\rm down}_{i}$ with $i\in[1,\operatorname{rank}(\sigma)]$ . If this rule is applicable to a configuration $\langle q,u\rangle$ of $M$ on $t\in T_{\Sigma}$ , then $G_{M,t}$ has the rule $\langle q,u\rangle\to\zeta_{u}$ , where $\zeta_{u}$ is obtained from $\zeta$ by changing every label $\langle q^{\prime},\alpha\rangle$ into $\langle q^{\prime},\alpha(u)\rangle$ . It is easy to see that a general rule can be replaced by the set of ordinary rules defined as follows. Let $p_{u}$ be a new state for every $u\in{\cal N}(\zeta)$ . Then the rules are $\langle q,\sigma,j,T\rangle\to\langle p_{\varepsilon},{\rm stay}\rangle$ , where $\varepsilon$ is the root of $\zeta$ , and all rules $\langle p_{u},\sigma,j,T\rangle\to\lambda(\langle p_{u1},{\rm stay}\rangle,\dots,\langle p_{uk},{\rm stay}\rangle)$ where $\lambda$ is the label of $u$ in $\zeta$ and $k$ is its rank. The first rule is a move rule that just changes state, and the latter rules output the $\Delta$ -labeled nodes of $\zeta$ one by one ( $\lambda\in\Delta$ ), and then make the required moves ( $\lambda\in Q\times I_{\sigma,j}$ ). This construction preserves determinism and the sub-testing, local, top-down, and single-use properties. Note that the classical top-down tree transducer has general rules.
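The replacement of a general rule by ordinary rules can be sketched as follows. The representation is an assumption made for illustration: $\zeta$ is a nested tuple `(label, child_1, ..., child_k)` whose leaves from $Q\times I_{\sigma,j}$ are `Call(state, instr)` values, the new states $p_{u}$ are pairs `(tag, u)` where `tag` is a hypothetical name that must be unique per general rule, and rules are encoded as `(lhs, rhs)` pairs.

```python
from collections import namedtuple

Call = namedtuple("Call", "state instr")     # a leaf <q', alpha> of zeta

def expand_general_rule(lhs, zeta, tag):
    """Return ordinary rules that simulate the general rule lhs -> zeta."""
    q, sigma, j, T = lhs
    # move rule <q,sigma,j,T> -> <p_eps, stay>: just a change of state
    rules = [((q, sigma, j, T), ("move", (tag, ()), "stay"))]

    def visit(node, u):
        p_u = (tag, u)                       # new state for node u of zeta
        if isinstance(node, Call):           # lambda in Q x I: make the move
            rules.append(((p_u, sigma, j, T), ("move", node.state, node.instr)))
            return
        label, kids = node[0], node[1:]      # lambda in Delta: output one node
        branches = [((tag, u + (i,)), "stay") for i in range(1, len(kids) + 1)]
        rules.append(((p_u, sigma, j, T), ("out", label, branches)))
        for i, kid in enumerate(kids, start=1):
            visit(kid, u + (i,))

    visit(zeta, ())
    return rules

# The general rule <p,sigma,1> -> sigma(<p,up>, 1), an instance of the one in Example 4.
zeta = ("sigma", Call("p", "up"), ("1",))
for rule in expand_general_rule(("p", "sigma", 1, None), zeta, "r"):
    print(rule)
```

The construction yields one initial move rule plus one rule per node of $\zeta$, so four rules for this instance; all new rules use the stay-instruction except the moves copied from the leaves of $\zeta$.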

If we allow general rules, then the stay-instruction is not needed any more in finitary tt’s. Let us say that a tt is stay-free if it does not use the stay-instruction in its rules. For every tt $M$ (with general rules) we can construct an equivalent stay-free tt $M_{\mathrm{sf}}$ with general rules, with possibly infinitely many rules but such that the right-hand sides of rules with the same left-hand side form a regular tree language. If $M$ is finitary, then we can transform $M_{\mathrm{sf}}$ into an equivalent stay-free tt with finitely many rules. The construction is as follows, where we may assume that the node tests in ${\cal T}_{M}$ are mutually disjoint, by (the proof of) Lemma 8.

For every left-hand side $\langle q,\sigma,j,T\rangle$ of a rule of $M=(\Sigma,\Delta,Q,Q_{0},R)$ we define a regular tree grammar $G_{q,\sigma,j,T}$ that simulates the computations of $M$ , starting in a configuration $\langle q,u\rangle$ to which $\langle q,\sigma,j,T\rangle$ is applicable, without leaving the current node $u$ , i.e., executing stay-instructions only. Its set of nonterminals is $\{\langle q^{\prime},{\rm stay}\rangle\mid q^{\prime}\in Q\}$ with initial nonterminal $\langle q,{\rm stay}\rangle$ . Its set of terminals is $\Delta\cup D_{\sigma,j}$ , where $D_{\sigma,j}=Q\times(I_{\sigma,j}\setminus\{{\rm stay}\})$ each element of which has rank 0. Finally, if $\langle q^{\prime},\sigma,j,T\rangle\to\zeta$ is a rule of $M$ (with $q^{\prime}\in Q$ and the same $\sigma$ , $j$ , and $T$ ), then $G_{q,\sigma,j,T}$ has the rule $\langle q^{\prime},{\rm stay}\rangle\to\zeta$ .

We now define $M_{\mathrm{sf}}=(\Sigma,\Delta,Q,Q_{0},R_{\mathrm{sf}})$ where $R_{\mathrm{sf}}$ consists of all general rules $\langle q,\sigma,j,T\rangle\to\zeta$ such that $\zeta\in L(G_{q,\sigma,j,T})$ , for every left-hand side $\langle q,\sigma,j,T\rangle$ of a rule of $M$ . Even if $M_{\mathrm{sf}}$ has infinitely many rules, it should be clear that (with all the definitions as in the finite case) $M_{\mathrm{sf}}$ is equivalent to $M$ .

Note that if $M$ is deterministic, then so is $M_{\mathrm{sf}}$ , because $G_{q,\sigma,j,T}$ is forward deterministic and hence $L(G_{q,\sigma,j,T})$ is empty or a singleton. Thus, $M_{\mathrm{sf}}$ has finitely many rules.

Assume now that $M$, and hence $M_{\mathrm{sf}}$, is finitary. Let $\langle q,\sigma,j,T\rangle$ be the left-hand side of a rule of $M$, and let $D\subseteq D_{\sigma,j}$. If $M_{\mathrm{sf}}$ has infinitely many rules $\langle q,\sigma,j,T\rangle\to\zeta$ with $\zeta\in T_{\Delta}(D)$, then we remove those rules from $R_{\mathrm{sf}}$. In fact, if $M_{\mathrm{sf}}$ had a computation $\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M_{\mathrm{sf}},t}s$ with $q_{0}\in Q_{0}$ in which one of those rules is applied, then it would have a similar computation (with the same $q_{0}$ and $t$, but, in general, another $s$) in which any other of those rules is applied. Since $s$ contains at least as many occurrences of symbols in $\Delta$ as $\zeta$, that would contradict the finitariness of $M_{\mathrm{sf}}$. Removing all these rules, for every $D\subseteq D_{\sigma,j}$, we are left with an equivalent version of $M_{\mathrm{sf}}$ with finitely many rules. The construction is effective because $L(G_{q,\sigma,j,T})\cap T_{\Delta}(D)$ is a regular tree language and hence its finiteness can be decided.

The above constructions also preserve the sub-testing, local, top-down, and single-use properties. Note that if $M$ is a finitary tt$^{\mathrm{s}}_{\downarrow}$ or tt$^{\ell}_{\downarrow}$, then $M_{\mathrm{sf}}$ is a classical top-down tree transducer (after incorporating the child number in its finite state), with or without regular look-ahead, respectively.

4 Regular Look-Around

In this section we discuss some basic properties of tt’s with respect to the feature of regular look-around. We start with the simple fact that the domain of a tt can always be restricted to a regular tree language, except when the tt is local.

Lemma 9

For every tt $M$ and every $L\in\mbox{\sf REGT}$ there is a tt $M^{\prime}$ such that $\tau_{M^{\prime}}=\{(t,s)\in\tau_{M}\mid t\in L\}$ . The construction preserves determinism and the sub-testing, top-down, single-use, pruning, and relabeling properties.

**Proof.** The tt $M^{\prime}$ simulates $M$, but additionally verifies that the input tree $t$ is in $L$, by using the regular sub-test $T(L)$ at the root of $t$. Formally, $M^{\prime}$ is obtained from $M$ by changing every rule $\langle q_{0},\sigma,0,T\rangle\to\zeta$ into $\langle q_{0},\sigma,0,T\cap T(L)\rangle\to\zeta$, for every initial state $q_{0}$. $\Box$

In the remainder of this section we show how to separate the regular look-around from a tt, by incorporating it into another tt. We first prove that every tt $M$ can be decomposed into a deterministic relabeling tt $N$ and a local tt $M^{\prime}$ . The relabeling tt $N$ preprocesses the input tree $t$ by adding to the label of each node $u$ of $t$ the truth values of the regular tests of $M$ at that node. This allows $M^{\prime}$ , during its simulation of $M$ , to inspect the new label of $u$ instead of testing $u$ . The idea is similar to that of removing regular look-ahead in [20, Theorem 2.6]. The translation realized by $N$ is called an mso relabeling in [7, 14] and [29, Section 4].

Lemma 10

$\mbox{\sf TT}\subseteq\mbox{\sf dTT}_{\mathrm{rel}}\circ\mbox{\sf TT}^{\ell}$, i.e., for every tt $M$ there are a deterministic relabeling tt $N$ and a local tt $M^{\prime}$ such that $\tau_{N}\circ\tau_{M^{\prime}}=\tau_{M}$. The construction preserves determinism, the top-down property, and the pruning property.

**Proof.** Let $M=(\Sigma,\Delta,Q,Q_{0},R)$ be a tt, and let ${\cal T}$ be the set of regular tests in the left-hand sides of the rules in $R$. By Lemma 8 we may assume that the tests in ${\cal T}$ are mutually disjoint. Now let ${\cal T}_{\bot}={\cal T}\cup\{\bot\}$ where $\bot$ is the intersection of the complements of the tests in ${\cal T}$. Thus, for every $t\in T_{\Sigma}$ and $u\in{\cal N}(t)$, $(t,u)$ belongs to a unique node test in ${\cal T}_{\bot}$. Let $\Sigma\times{\cal T}_{\bot}$ be the ranked alphabet such that $\langle\sigma,T\rangle$ has the same rank as $\sigma$.

We define the relabeling tt $N=(\Sigma,\Sigma\times{\cal T}_{\bot},\{p\},p,R_{N})$ such that for every $\sigma\in\Sigma$, $j\in[0,\mathit{mx}_{\Sigma}]$, and $T\in{\cal T}_{\bot}$, the output rule

[TABLE]

is in $R_{N}$ , where $m$ is the rank of $\sigma$ . Additionally we define the local tt $M^{\prime}=(\Sigma\times{\cal T}_{\bot},\Delta,Q,Q_{0},R^{\prime})$ with the following rules. If $\langle q,\sigma,j,T\rangle\to\zeta$ is a rule in $R$ , then $R^{\prime}$ contains the rule $\langle q,\langle\sigma,T\rangle,j\rangle\to\zeta$ . Note that $N$ is total and deterministic. Also, if $M$ is deterministic, then so is $M^{\prime}$ . It should be clear that $\tau_{M^{\prime}}(\tau_{N}(t))=\tau_{M}(t)$ for every $t\in T_{\Sigma}$ , i.e., $\tau_{N}\circ\tau_{M^{\prime}}=\tau_{M}$ . $\Box$
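The division of labor between $N$ and $M^{\prime}$ can be sketched in a few lines. This is an illustration under assumptions, not the construction itself: node tests are modeled as Python predicates on pairs $(t,u)$ rather than as regular sets, trees are nested tuples, the tag `"bot"` stands for $\bot$, and the names `relabel`, `local_rule_key`, and the sample test `"T"` are hypothetical.

```python
def subtree(t, u):
    for i in u:
        t = t[i]                    # t[0] is the label, children are t[1..k]
    return t

def relabel(t, tests, u=()):
    """Role of N: pair each label with the name of the unique test holding at u."""
    here = subtree(t, u)
    hits = [name for name, holds in tests.items() if holds(t, u)]
    assert len(hits) <= 1, "tests are assumed mutually disjoint (Lemma 8)"
    tag = hits[0] if hits else "bot"
    kids = tuple(relabel(t, tests, u + (i,)) for i in range(1, len(here)))
    return ((here[0], tag),) + kids

def local_rule_key(q, relabeled, u):
    """Role of M': select a rule from state, new label <sigma,T>, child number."""
    label = subtree(relabeled, u)[0]
    return (q, label, u[-1] if u else 0)

# Sample test (an assumption): the node has at least one child that is a leaf.
tests = {"T": lambda t, u: any(len(c) == 1 for c in subtree(t, u)[1:])}

t = ("sigma", ("sigma", ("e",), ("e",)), ("sigma", ("e",), ("e",)))
r = relabel(t, tests)
print(r[0], r[1][0], r[1][1][0])
print(local_rule_key("q", r, (1,)))
```

After relabeling, the second transducer never evaluates a test: the outcome is read off the label, which is why $M^{\prime}$ can be local.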

We will also need a variant of this lemma, for nondeterministic tt’s only.

Lemma 11

$\mbox{\sf TT}^{\mathrm{s}}\subseteq\mbox{\sf TT}^{\ell}_{\mathrm{rel}}\circ\mbox{\sf TT}^{\ell}$ and $\mbox{\sf TT}^{\mathrm{s}}_{\mathrm{pru}}\subseteq\mbox{\sf TT}^{\ell}_{\mathrm{rel}}\circ\mbox{\sf TT}^{\ell}_{\mathrm{pru}}$.

**Proof.** Let $M=(\Sigma,\Delta,Q,Q_{0},R)$ be a sub-testing tt, and let ${\cal T}$ be the set of regular tests in the left-hand sides of the rules in $R$. As in the proof of Lemma 10 we may assume that the tests in ${\cal T}$ are mutually disjoint (by Lemma 8), and we define ${\cal T}_{\bot}={\cal T}\cup\{\bot\}$ as in that proof. Let ${\cal T}_{\bot}=\{T(L_{1}),\dots,T(L_{n})\}$ where $L_{1},\dots,L_{n}$ are regular tree languages. Clearly, there is a bottom-up finite-state tree automaton $A=(\Sigma,P,F,\delta)$ (where $F$ is irrelevant) and a partition $\{F_{1},\dots,F_{n}\}$ of $P$ such that for every $t\in T_{\Sigma}$ and $i\in[1,n]$, $t\in L_{i}$ if and only if $\delta(t)\in F_{i}$. We define the local relabeling tt $N=(\Sigma,\Sigma\times{\cal T}_{\bot},P,P,R_{N})$ such that it nondeterministically simulates $A$ top-down. For every $\sigma\in\Sigma$ of rank $m$, every sequence of states $p_{1},\dots,p_{m}\in P$, and every $j\in[0,\mathit{mx}_{\Sigma}]$, if $\delta(\sigma,p_{1},\dots,p_{m})=p\in F_{i}$, then $R_{N}$ contains the rule $\langle p,\sigma,j\rangle\to\langle\sigma,T(L_{i})\rangle(\langle p_{1},{\rm down}_{1}\rangle,\dots,\langle p_{m},{\rm down}_{m}\rangle)$. The local tt $M^{\prime}$ is defined as in the proof of Lemma 10. $\Box$

The next lemma is based on the folklore technique of computing the states of a bottom-up finite-state tree automaton that are “successful” at the current node (see, e.g., the proofs of [7, Theorem 10] and [6, Theorem 8]). The lemma shows that every top-down tt is equivalent to one that is sub-testing, and hence to a classical top-down tree transducer with regular look-ahead if it is finitary. It is a slight generalization of the fact that every mso relabeling can be computed by a top-down tree transducer with regular look-ahead, as shown in [7, Theorem 10] and [31, Theorem 4.4].

Lemma 12

$\mbox{\sf TT}_{\downarrow}=\mbox{\sf TT}^{\mathrm{s}}_{\downarrow}$. The construction preserves determinism, pruning, and relabeling.

**Proof.** Let $M=(\Sigma,\Delta,Q,Q_{0},R)$ be a tt$_{\downarrow}$ that uses a regular test $T$ over $\Sigma$ in its rules. For simplicity we first assume that $M$ uses $T$ in each of its rules. Let $A=(\Sigma\times\{0,1\},P,F,\delta)$ be a bottom-up finite-state tree automaton that recognizes $\operatorname{mark}(T)$. We identify the symbols $(\sigma,0)$ and $\sigma$; thus, $A$ can also handle trees over $\Sigma$. For every tree $t\in T_{\Sigma}$ and every node $u\in{\cal N}(t)$, we define the set $\mathrm{succ}_{t}(u)$ of successful states of $A$ at $u$ to consist of all states $p\in P$ such that $A$ recognizes $t$ when started at $u$ in state $p$. To be precise, $\mathrm{succ}_{t}(\mathrm{root}_{t})=F$ and if $u$ has label $\sigma\in\Sigma^{(m)}$ and $i\in[1,m]$, then $\mathrm{succ}_{t}(ui)$ is the set of all states $p\in P$ such that $\delta(\sigma,p_{1},\dots,p_{i-1},p,p_{i+1},\dots,p_{m})\in\mathrm{succ}_{t}(u)$, where $p_{j}=\delta(t|_{uj})$, i.e., $p_{j}$ is the state in which $A$ arrives at the $j$-th child of $u$, for every $j\in[1,m]\setminus\{i\}$. Obviously, $\operatorname{mark}(t,u)$ is recognized by $A$ if and only if $\delta((\sigma,1),\delta(t|_{u1}),\dots,\delta(t|_{um}))\in\mathrm{succ}_{t}(u)$.

For every $\sigma\in\Sigma^{(m)}$ and every sequence of states $p_{1},\dots,p_{m}\in P$ let $L_{\sigma,p_{1},\dots,p_{m}}$ be the regular tree language consisting of all trees $\sigma(t_{1},\dots,t_{m})\in T_{\Sigma}$ such that $\delta(t_{i})=p_{i}$ for every $i\in[1,m]$ . Thus, the regular sub-test $T(L_{\sigma,p_{1},\dots,p_{m}})$ verifies that $A$ arrives at the $i$ -th child of the current node in state $p_{i}$ for every $i\in[1,m]$ .

We construct a sub-testing tt$_{\downarrow}$ $M^{\prime}=(\Sigma,\Delta,Q^{\prime},Q^{\prime}_{0},R^{\prime})$ that is equivalent to $M$. It keeps track of $\mathrm{succ}_{t}(u)$ in its finite state. Its set of states is $Q^{\prime}=Q\times\{S\mid S\subseteq P\}$ with set of initial states $Q^{\prime}_{0}=\{(q_{0},F)\mid q_{0}\in Q_{0}\}$. The set of rules $R^{\prime}$ is defined as follows. Let $\langle q,\sigma,j,T\rangle\to\zeta$ be a rule in $R$, let $S\subseteq P$, and let $p_{1},\dots,p_{m}\in P$ such that $\delta((\sigma,1),p_{1},\dots,p_{m})\in S$ where $m=\operatorname{rank}_{\Sigma}(\sigma)$. Then $R^{\prime}$ contains the rule $\langle(q,S),\sigma,j,T(L_{\sigma,p_{1},\dots,p_{m}})\rangle\to\zeta^{\prime}$ where $\zeta^{\prime}$ is obtained from $\zeta$ by changing every $\langle q^{\prime},{\rm stay}\rangle$ into $\langle(q^{\prime},S),{\rm stay}\rangle$ and every $\langle q^{\prime},{\rm down}_{i}\rangle$ into $\langle(q^{\prime},S_{i}),{\rm down}_{i}\rangle$ with $S_{i}=\{p\in P\mid\delta(\sigma,p_{1},\dots,p_{i-1},p,p_{i+1},\dots,p_{m})\in S\}$.

In the general case where $M$ uses regular tests $T_{1},\dots,T_{n}$ , the transducer $M^{\prime}$ must keep track of $\mathrm{succ}_{t}(u)$ for each of the corresponding bottom-up finite-state tree automata $A_{1},\dots,A_{n}$ . $\Box$
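The sets $\mathrm{succ}_{t}(u)$ that $M^{\prime}$ carries in its state can be computed top-down exactly as in the definition above. The following sketch assumes a deterministic bottom-up automaton given as a dictionary `delta` from `(label, state_1, ..., state_m)` to a state, trees as nested tuples, and an unmarked toy automaton (parity of the number of leaves) chosen only for illustration; the marking component $\{0,1\}$ of the alphabet is omitted, and the names are hypothetical.

```python
def state(delta, t):
    """delta(t): the state in which the automaton arrives at the root of t."""
    return delta[(t[0],) + tuple(state(delta, c) for c in t[1:])]

def succ_sets(delta, states, final, t):
    """Map every node u of t (a tuple of child numbers) to succ_t(u)."""
    result = {}

    def walk(s, u, S):
        result[u] = S
        kid_states = tuple(state(delta, c) for c in s[1:])   # p_1, ..., p_m
        for i, c in enumerate(s[1:]):
            # S_i = { p | delta(sigma, p_1..p_{i-1}, p, p_{i+1}..p_m) in S }
            S_i = frozenset(
                p for p in states
                if delta.get((s[0],) + kid_states[:i] + (p,) + kid_states[i + 1:]) in S
            )
            walk(c, u + (i + 1,), S_i)

    walk(t, (), frozenset(final))     # succ_t(root) = F
    return result

# Toy automaton: the state is the parity of the number of leaves; accept even.
P = {0, 1}
delta = {("e",): 1}
for a in P:
    for b in P:
        delta[("sigma", a, b)] = (a + b) % 2

t = ("sigma", ("e",), ("sigma", ("e",), ("e",)))
succ = succ_sets(delta, P, {0}, t)
for u in sorted(succ):
    print(u, sorted(succ[u]))
```

Recomputing `state` at every node keeps the sketch short but makes it quadratic; in $M^{\prime}$ the child states are instead guessed and verified by the regular sub-tests $T(L_{\sigma,p_{1},\dots,p_{m}})$.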

The proof of Lemma 12 also shows that in a rule $\langle q,\sigma,j,T(L)\rangle\to\zeta$ of a sub-testing tt$_{\downarrow}$ we may assume that $L$ is of the form $L=\sigma(L_{1},\dots,L_{m})=\{\sigma(t_{1},\dots,t_{m})\mid t_{1}\in L_{1},\dots,t_{m}\in L_{m}\}$ for regular tree languages $L_{1},\dots,L_{m}$ (where $m=\operatorname{rank}(\sigma)$). This is how regular look-ahead is usually defined for classical top-down tree transducers.

By Lemmas 10 and 12, $\mbox{\sf dTT}\subseteq\mbox{\sf dTT}^{\mathrm{s}}_{\downarrow}\circ\mbox{\sf dTT}^{\ell}$. It is proved in [28, Lemmas 49 and 50] that even $\mbox{\sf dTT}\subseteq\mbox{\sf dTT}^{\ell}_{\downarrow}\circ\mbox{\sf dTT}^{\ell}$, but this will not be needed in what follows. (Footnote 5: In [28], $\mbox{\sf dTT}$ and $\mbox{\sf dTT}^{\ell}$ are denoted by $\mbox{\sf dTT}^{\text{\sc mso}}$ and $\mbox{\sf dTT}$, respectively.)

Using Lemmas 10 and 12 we can now prove three essential properties of tt’s, based on well-known results from the literature.

Lemma 13

The regular tree languages are closed under inverses of tt translations, i.e., if $L\in\mbox{\sf REGT}$ and $\tau\in\mbox{\sf TT}$ , then $\tau^{-1}(L)\in\mbox{\sf REGT}$ .

**Proof.** Since the inverse of a composition is the composition of the inverses, it suffices to show this for $\mbox{\sf dTT}^{\mathrm{s}}_{\mathrm{rel}}$ and $\mbox{\sf TT}^{\ell}$ by Lemmas 10 and 12. For $\mbox{\sf dTT}^{\mathrm{s}}_{\mathrm{rel}}$ it follows from [20, Theorem 2.6 and Lemma 1.2], and for $\mbox{\sf TT}^{\ell}$ it is proved in [26, Lemma 3]. (Footnote 6: We note that an alternative proof is by Lemma 26 (in Section 6) and [34, Theorem 7.4] (see also [65, Section 5]). For the reader familiar with mso translations, see [14], we note that it is proved in [29, Section 4] that $\mbox{\sf dTT}^{\mathrm{s}}_{\mathrm{rel}}$ is the class of mso (tree) relabelings, and that REGT, which is the class of mso definable tree languages, is closed under inverse mso (tree) transductions by [14, Corollary 7.12].)

$\Box$

Corollary 14

The domain of a tt $M$ is regular, i.e., $\mathrm{dom}(M)\in\mbox{\sf REGT}$ . More generally, for every $k\geq 1$ , if $\tau\in\mbox{\sf TT}^{k}$ then $\mathrm{dom}(\tau)\in\mbox{\sf REGT}$ .

Corollary 14 was proved for (nondeterministic) attributed tree transducers in [5], from which it is easy to conclude that Lemma 13 holds for attributed tree transducers, as explained in [26, Lemma 3].

Lemma 15

The regular tree languages are closed under pruning tt translations, i.e., if $L\in\mbox{\sf REGT}$ and $\tau\in\mbox{\sf TT}_{\mathrm{pru}}$, then $\tau(L)\in\mbox{\sf REGT}$.

**Proof.** By Lemma 12, $\mbox{\sf TT}_{\mathrm{pru}}=\mbox{\sf TT}^{\mathrm{s}}_{\mathrm{pru}}$. As observed before, every $\tau\in\mbox{\sf TT}^{\mathrm{s}}_{\mathrm{pru}}$ can be realized by a classical linear top-down tree transducer with regular look-ahead. It is well known that, due to linearity, REGT is closed under such translations, see, e.g., [43, Corollary IV.6.7]. $\Box$

Lemma 13, Corollary 14 and Lemma 15 are powerful technical tools because they allow us to show that certain node tests of a tt $M$ are regular by defining them in terms of, e.g., the domains of other tt’s or of variants of $M$ itself. In other words, a tt can use tt’s “to look around”. For instance, Lemma 13 is used for this purpose in the proof of Lemma 16 below, where we show the following.

In a composition of a dtt with a sub-testing tt the second transducer can even be assumed to be local, because the first transducer can determine the truth values of the regular sub-tests of the output tree by executing appropriate regular tests on its input tree.

Lemma 16

$\mbox{\sf dTT}\circ\mbox{\sf TT$ {}^{\hskip 1.13791pt\mathrm{s}} $}\subseteq\mbox{\sf dTT}\circ\mbox{\sf TT$ {}^{\ell} $}$ . The construction preserves determinism (of the second transducer) and the top-down, single-use, pruning, and relabeling properties of both transducers.

**Proof.** Let $M_{1}=(\Sigma,\Delta,Q,q_{0},R)$ be a dtt and let $M_{2}$ be a sub-testing tt with input alphabet $\Delta$ . We will construct a dtt $M^{\prime}_{1}$ and a local tt $M^{\prime}_{2}$ that simulate the composition of $M_{1}$ and $M_{2}$ . The construction preserves the top-down, single-use, pruning, and relabeling property of each transducer, i.e., if $M_{1}$ has one of these properties, then so has $M^{\prime}_{1}$ , and similarly for $M_{2}$ and $M^{\prime}_{2}$ . Moreover, if $M_{2}$ is deterministic, then so is $M^{\prime}_{2}$ .

Let $(t,s)\in\tau_{M_{1}}$ . The dtt $M^{\prime}_{1}$ simulates $M_{1}$ on the input tree $t$ . Simultaneously it executes the sub-tests of $M_{2}$ at every node $v$ of the output tree $s$ and preprocesses $s$ by adding to the label of $v$ the truth values of these sub-tests at $v$ , cf. the text before Lemma 10. This allows $M^{\prime}_{2}$ , during its simulation of $M_{2}$ on $s$ , to inspect the new label of $v$ instead of sub-testing $v$ .
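The effect of this preprocessing on the output tree $s$ can be illustrated by a minimal Python sketch. It only shows the relabeled tree that $M^{\prime}_{2}$ reads, not how $M^{\prime}_{1}$ computes it from $t$. The encoding is an assumption made for illustration: a tree is a pair `(label, children)`, each sub-test $T(L)$ is given as a Python predicate deciding membership of a subtree in $L$, and the names `annotate`, `size`, `"even"`, and `"bot"` are hypothetical.

```python
# Sketch (assumed encoding): a tree is (label, children), children a tuple.
# Each sub-test T(L) is modelled by a predicate on the subtree s|v; the
# tests are assumed mutually disjoint, and "bot" plays the role of the
# extra element added when none of them holds.

def size(tree):
    """Number of nodes of a tree."""
    label, children = tree
    return 1 + sum(size(c) for c in children)

def annotate(tree, tests):
    """Relabel every node v by (label, name of the sub-test holding at v)."""
    label, children = tree
    holding = [name for name, member in tests.items() if member(tree)]
    assert len(holding) <= 1, "tests are assumed mutually disjoint"
    tag = holding[0] if holding else "bot"
    return ((label, tag), tuple(annotate(c, tests) for c in children))

# Example: one sub-test, "the subtree has an even number of nodes"
# (a regular tree language).
tests = {"even": lambda s: size(s) % 2 == 0}
s = ("f", (("a", ()), ("g", (("a", ()),))))
annotated = annotate(s, tests)
```

On the annotated tree a local transducer can replace each sub-test at a node by a look at the second component of that node's label.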

Every node of $s$ is produced by an output rule of $M_{1}$ during its computation on $t$ . Let $\bar{s}$ be an output form of $M_{1}$ on $t$ , and let $v$ be a leaf of $\bar{s}$ with label $\langle q,u\rangle$ . It should be clear that $\langle q,u\rangle\Rightarrow^{*}_{M_{1},t}s|_{v}$ . Now let $L$ be a regular tree language over $\Delta$ such that $M_{2}$ uses the sub-test $T^{\prime}=T(L)$ . We claim that, in configuration $\langle q,u\rangle$ , $M^{\prime}_{1}$ can test whether $(s,v)\in T^{\prime}$ by a regular test $\operatorname{inv}_{q}(T^{\prime})$ . Note that $(s,v)\in T(L)$ if and only if $s|_{v}\in L$ . Thus, $\operatorname{inv}_{q}(T^{\prime})$ should test whether the output tree generated by the configuration $\langle q,u\rangle$ is in $L$ . To prove that $\operatorname{mark}(\operatorname{inv}_{q}(T^{\prime}))$ is regular, we define a dtt $N_{q}$ such that $\operatorname{mark}(\operatorname{inv}_{q}(T^{\prime}))=\tau_{N_{q}}^{-1}(L)$ and we use Lemma 13. The transducer $N_{q}$ first uses a regular test at the root to verify that the input tree is of the form $\operatorname{mark}(t,u)$ (to be precise, the regular sub-test $T(\operatorname{mark}(T^{\bullet}_{\Sigma}))$ ). After that it walks to the (unique) marked node $u$ , using move rules to execute a depth-first search of the input tree, and then simulates $M_{1}$ starting in state $q$ at $u$ , producing the output tree $s|_{v}$ . During that simulation it treats each symbol $(\sigma,0)$ or $(\sigma,1)$ as $\sigma$ , and for each regular test $T$ of $M_{1}$ it instead uses the test $\mu(T)$ , which is the set of all $(\operatorname{mark}(t,u),v)$ such that $(t,v)\in T$ and $u\in{\cal N}(t)$ , see Section 2.

The construction of $M^{\prime}_{1}$ and $M^{\prime}_{2}$ is similar to the construction of $N$ and $M^{\prime}$ in the proof of Lemma 10. Let ${\cal T}$ be the set of regular tests in the left-hand sides of the rules of $M_{2}$ . As in the proof of Lemma 10 we may assume that the tests in ${\cal T}$ are mutually disjoint (by Lemma 8), and we define ${\cal T}_{\bot}={\cal T}\cup\{\bot\}$ as in that proof. Note that the elements of ${\cal T}_{\bot}$ are still regular sub-tests. Note also that for every $q\in Q$ , $t\in\mathrm{dom}(M_{1})$ and $u\in{\cal N}(t)$ , $(t,u)$ belongs to a unique regular test in $\{\operatorname{inv}_{q}(T^{\prime})\mid T^{\prime}\in{\cal T}_{\bot}\}$ .

We define the dtt $M^{\prime}_{1}=(\Sigma,\Delta\times{\cal T}_{\bot},Q,q_{0},R^{\prime})$ such that $R^{\prime}$ contains all move rules in $R$ , and moreover, if $\langle q,\sigma,j,T\rangle\to\delta(\langle q_{1},\alpha_{1}\rangle,\dots,\langle q_{k},\alpha_{k}\rangle)$ is an output rule in $R$ , then $R^{\prime}$ contains the rule

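$$\langle q,\sigma,j,T\cap\operatorname{inv}_{q}(T^{\prime})\rangle\to\langle\delta,T^{\prime}\rangle(\langle q_{1},\alpha_{1}\rangle,\dots,\langle q_{k},\alpha_{k}\rangle)$$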

for every $T^{\prime}\in{\cal T}_{\bot}$ . We define the local tt $M^{\prime}_{2}$ with input alphabet $\Delta\times{\cal T}_{\bot}$ and the following rules. If $\langle q,\delta,j,T^{\prime}\rangle\to\zeta$ is a rule of $M_{2}$ , then $M^{\prime}_{2}$ has the rule $\langle q,\langle\delta,T^{\prime}\rangle,j\rangle\to\zeta$ . It should now be clear that $\tau_{M_{2}^{\prime}}(\tau_{M_{1}^{\prime}}(t))=\tau_{M_{2}}(\tau_{M_{1}}(t))$ for every $t\in T_{\Sigma}$ , i.e., $\tau_{M^{\prime}_{1}}\circ\tau_{M^{\prime}_{2}}=\tau_{M_{1}}\circ\tau_{M_{2}}$ . If $M_{1}$ is single-use, then $M^{\prime}_{1}$ is also single-use, because $M^{\prime}_{1}$ visits the nodes of the input tree in the same states as $M_{1}$ ; the same is true for $M_{2}$ and $M^{\prime}_{2}$ . Preservation of the other properties easily follows from the construction of $M^{\prime}_{1}$ and $M^{\prime}_{2}$ . $\Box$

5 Composition

In this section we prove three composition results for tt’s. Our first aim is to prove that dtt’s are closed under right-composition with top-down dtt’s, and hence in particular with pruning dtt’s. As already mentioned at the end of the Introduction, this generalizes the result of [38, Theorem 4.3] for attributed tree transducers, because dtt’s need not be total and they have regular look-around. By Lemma 12 we may assume that the top-down tt is sub-testing. It may even be assumed to be local by Lemma 16.

Lemma 17

$\mbox{\sf dTT}\circ\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\subseteq\mbox{\sf dTT}$ . In particular

[TABLE]

**Proof.** Since the domain of a tt can always be restricted to $\mathrm{dom}(M_{1})$ by Lemma 9 and Corollary 14, it suffices to show that for every dtt $M_{1}$ and every local top-down dtt $M_{2}$ , a dtt $M$ can be constructed such that $\tau_{M}(t)=\tau_{M_{2}}(\tau_{M_{1}}(t))$ for every input tree $t\in\mathrm{dom}(M_{1})$ . For the case where $M_{1}$ is also local this construction was presented in the proof of [28, Theorem 55], which can easily be adapted to the general case. We repeat it here for the sake of completeness, and because the proofs of the other two composition closure results will be based on it.

The transducer $M$ is obtained by a straightforward product construction. For every $(t,s)\in\tau_{M_{1}}$ , $M$ simulates $M_{1}$ on the input tree $t$ until $M_{1}$ uses an output rule that generates a node $v$ of $s$ . Then $M$ switches to the simulation of $M_{2}$ on $v$ , as long as $M_{2}$ executes stay-instructions. When $M_{2}$ executes a ${\rm down}_{i}$ -instruction, $M$ switches again to the simulation of $M_{1}$ in order to generate the $i$ -th child of $v$ .

Formally, let $M_{1}=(\Sigma,\Delta,P,p_{0},R_{1})$ and $M_{2}=(\Delta,\Gamma,Q,q_{0},R_{2})$ . To simplify the construction of $M$ we assume that $M_{1}$ keeps track in its finite state of the child number of the output node to be generated. To be precise, we assume that there is a mapping $\chi:P\to[0,{\mathit{m}x}_{\Delta}]$ such that for every output form $s^{\prime}$ and every leaf $v$ of $s^{\prime}$ that is labeled by a configuration $\langle p,u\rangle$ , the child number of $v$ in $s^{\prime}$ is $\chi(p)$ . That is possible because the output tree is generated top-down. If $M_{1}$ does not satisfy this assumption, then we change $M_{1}$ as follows. The new set of states is $P\times[0,{\mathit{m}x}_{\Delta}]$ , and we define $\chi(p,i)=i$ . The new initial state is $(p_{0},0)$ , because $M_{1}$ starts by generating the root of the output tree. Each move rule $\langle p,\sigma,j,T\rangle\to\langle p^{\prime},\alpha\rangle$ of $M_{1}$ is changed into the rules $\langle(p,i),\sigma,j,T\rangle\to\langle(p^{\prime},i),\alpha\rangle$ and each output rule $\langle p,\sigma,j,T\rangle\to\delta(\langle p_{1},\alpha_{1}\rangle,\dots,\langle p_{k},\alpha_{k}\rangle)$ into $\langle(p,i),\sigma,j,T\rangle\to\delta(\langle(p_{1},1),\alpha_{1}\rangle,\dots,\langle(p_{k},k),\alpha_{k}\rangle)$ , for every $i\in[0,{\mathit{m}x}_{\Delta}]$ . For the sake of the proof of Lemma 22 we note that this transformation of $M_{1}$ preserves the single-use property, because we have only added information to the states of $M_{1}$ .
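The state transformation just described can be sketched in Python. This is a minimal illustration under an assumed encoding, not part of the construction itself: regular tests $T$ are omitted, a rule is stored as `(state, symbol, child_number) -> rhs`, a move rule has right-hand side `("move", p', alpha)`, and an output rule has right-hand side `("out", delta, ((p_1, alpha_1), ..., (p_k, alpha_k)))`. The names `track_child_number` and `chi` are hypothetical.

```python
# Sketch (assumed encoding, regular tests omitted): add to every state of
# M1 the child number of the output node that is to be generated next.

def track_child_number(rules, p0, max_rank):
    """Return the new rule set over states (p, i) and the new initial state."""
    new_rules = {}
    for (p, sigma, j), rhs in rules.items():
        for i in range(max_rank + 1):
            if rhs[0] == "move":
                # A move rule keeps the child number i unchanged.
                _, p_next, alpha = rhs
                new_rules[((p, i), sigma, j)] = ("move", (p_next, i), alpha)
            else:
                # An output rule starts the k-th child in state (p_k, k).
                _, delta, confs = rhs
                new_rules[((p, i), sigma, j)] = (
                    "out",
                    delta,
                    tuple(((pk, k), ak) for k, (pk, ak) in enumerate(confs, start=1)),
                )
    # The root of the output tree has child number 0.
    return new_rules, (p0, 0)

def chi(state):
    """The mapping chi: second component of a transformed state."""
    return state[1]

# Example with one output rule and one move rule.
rules = {
    ("p", "f", 0): ("out", "f", (("p", ("down", 1)), ("p", ("down", 2)))),
    ("p", "h", 0): ("move", "p", ("down", 1)),
}
new_rules, init = track_child_number(rules, "p", 2)
```

Only information is added to the states, so each computation of the transformed transducer corresponds step by step to one of the original.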

The dtt $M$ has input alphabet $\Sigma$ and output alphabet $\Gamma$ . Its states are of the form $(p,q)$ or $(\rho,q)$ , where $p\in P$ , $q\in Q$ , and $\rho$ is an output rule of $M_{1}$ , i.e., a rule of the form $\langle p,\sigma,j,T\rangle\to\delta(\langle p_{1},\alpha_{1}\rangle,\dots,\langle p_{k},\alpha_{k}\rangle)$ . Its initial state is $(p_{0},q_{0})$ . A state $(p,q)$ is used by $M$ to simulate the computation of $M_{1}$ that generates the next current node of $M_{2}$ when $M_{2}$ moves down (keeping the state $q$ of $M_{2}$ in memory). Initially $M$ simulates the computation of $M_{1}$ that generates the root of the output tree. A state $(\rho,q)$ is used by $M$ to simulate the computation of $M_{2}$ on the node that $M_{1}$ has generated with rule $\rho$ . The rules of $M$ are defined as follows.

First, rules that simulate $M_{1}$ . Let $\rho:\langle p,\sigma,j,T\rangle\to\zeta$ be a rule in $R_{1}$ . If $\zeta=\langle p^{\prime},\alpha\rangle$ , then $M$ has the rules $\langle(p,q),\sigma,j,T\rangle\to\langle(p^{\prime},q),\alpha\rangle$ for every $q\in Q$ . If $\rho$ is an output rule, then $M$ has the rules $\langle(p,q),\sigma,j,T\rangle\to\langle(\rho,q),{\rm stay}\rangle$ for every $q\in Q$ .

Second, rules that simulate $M_{2}$ . Let $\langle q,\delta,i\rangle\to\zeta$ be a rule in $R_{2}$ and let $\rho\colon\langle p,\sigma,j,T\rangle\to\delta(\langle p_{1},\alpha_{1}\rangle,\dots,\langle p_{k},\alpha_{k}\rangle)$ be an output rule in $R_{1}$ , with the same $\delta$ and with $\chi(p)=i$ . Then $M$ has the rule $\langle(\rho,q),\sigma,j,T\rangle\to\zeta^{\prime}$ where $\zeta^{\prime}$ is obtained from $\zeta$ by changing every $\langle q^{\prime},{\rm stay}\rangle$ into $\langle(\rho,q^{\prime}),{\rm stay}\rangle$ , and every $\langle q^{\prime},{\rm down}_{\ell}\rangle$ into $\langle(p_{\ell},q^{\prime}),\alpha_{\ell}\rangle$ . Note that the test on $\sigma$ , $j$ , and $T$ is actually superfluous, because that was already tested when $M$ included $\rho$ in its state.

It is easy to see that $\tau_{M}(t)=\tau_{M_{2}}(\tau_{M_{1}}(t))$ for every input tree $t\in\mathrm{dom}(M_{1})$ . If the rules of $M_{2}$ do not contain stay-instructions, then $M$ does not need the states $(\rho,q)$ . Its rules can then be simplified as follows. Let $\langle p,\sigma,j,T\rangle\to\zeta$ be a rule in $R_{1}$ . As above, if $\zeta=\langle p^{\prime},\alpha\rangle$ , then $M$ has the rules $\langle(p,q),\sigma,j,T\rangle\to\langle(p^{\prime},q),\alpha\rangle$ for every $q\in Q$ . If $\zeta=\delta(\langle p_{1},\alpha_{1}\rangle,\dots,\langle p_{k},\alpha_{k}\rangle)$ and $\langle q,\delta,i\rangle\to\zeta^{\prime}$ is a rule in $R_{2}$ , with the same $\delta$ and with $\chi(p)=i$ , then $M$ has the rule $\langle(p,q),\sigma,j,T\rangle\to\zeta^{\prime\prime}$ where $\zeta^{\prime\prime}$ is obtained from $\zeta^{\prime}$ by changing every $\langle q^{\prime},{\rm down}_{\ell}\rangle$ into $\langle(p_{\ell},q^{\prime}),\alpha_{\ell}\rangle$ . This shows that if both $M_{1}$ and $M_{2}$ are pruning, then $M$ is pruning too. $\Box$
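The simplified product construction (for $M_{2}$ without stay-instructions) can be sketched in Python as follows. The encoding is an assumption for illustration only: regular tests are omitted, rules are stored as `(state, symbol, child_number) -> rhs` with right-hand sides `("move", state, instr)` or `("out", symbol, configurations)`, and the states of $M_{1}$ are pairs whose second component is $\chi(p)$. The helper names `run`, `step`, `subtree`, and `compose` are hypothetical.

```python
# Sketch (assumed encoding, regular tests omitted) of the product
# construction for a dtt M1 and a local top-down dtt M2 without stay.

def subtree(tree, path):
    for i in path:
        tree = tree[1][i - 1]
    return tree

def step(path, instr):
    if instr == "stay":
        return path
    if instr == "up":
        return path[:-1]
    return path + (instr[1],)          # instr == ("down", i)

def run(rules, state, tree, path=()):
    """Deterministic computation from configuration <state, path>."""
    label = subtree(tree, path)[0]
    child_no = path[-1] if path else 0
    rhs = rules[(state, label, child_no)]
    if rhs[0] == "move":
        return run(rules, rhs[1], tree, step(path, rhs[2]))
    return (rhs[1], tuple(run(rules, s, tree, step(path, a)) for s, a in rhs[2]))

def compose(r1, r2, states2, chi):
    r = {}
    for (p, sigma, j), rhs in r1.items():
        if rhs[0] == "move":
            for q in states2:          # M1 moves, the state q of M2 is kept
                r[((p, q), sigma, j)] = ("move", (rhs[1], q), rhs[2])
            continue
        _, delta, confs = rhs
        for (q, d, i), rhs2 in r2.items():
            if d != delta or i != chi(p):
                continue
            def lift(q2, instr):       # <q', down_l> becomes <(p_l, q'), alpha_l>
                p_l, alpha_l = confs[instr[1] - 1]
                return ((p_l, q2), alpha_l)
            if rhs2[0] == "move":
                r[((p, q), sigma, j)] = ("move",) + lift(rhs2[1], rhs2[2])
            else:
                r[((p, q), sigma, j)] = ("out", rhs2[1], tuple(lift(*c) for c in rhs2[2]))
    return r

# Example: M1 mirrors f-nodes and skips h-nodes; M2 mirrors again and relabels.
R1, R2 = {}, {}
for i in range(3):
    R2[("q", "f", i)] = ("out", "g", (("q", ("down", 2)), ("q", ("down", 1))))
    R2[("q", "a", i)] = ("out", "b", ())
    for j in range(3):
        R1[(("m", i), "f", j)] = ("out", "f", ((("m", 1), ("down", 2)), (("m", 2), ("down", 1))))
        R1[(("m", i), "a", j)] = ("out", "a", ())
        R1[(("m", i), "h", j)] = ("move", ("m", i), ("down", 1))
R = compose(R1, R2, ["q"], lambda p: p[1])
t = ("h", (("f", (("f", (("a", ()), ("a", ()))), ("h", (("a", ()),)))),))
```

In this example `run(R, (("m", 0), "q"), t)` computes the same tree as running `R2` on the output of `R1`, without building the intermediate tree.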

We obtain our first composition closure result from Lemmas 12, 16, and 17. Note that the closure under composition of dTT ${}_{\downarrow}$ already follows from Lemma 12 and [20, Theorem 2.11(2)].

Theorem 18

$\mbox{\sf dTT}\circ\mbox{\sf dTT$ {}_{\downarrow} $}\subseteq\mbox{\sf dTT}$ . In particular, dTT ${}_{\downarrow}$ and dTT ${}_{\mathrm{pru}}$ are closed under composition.

Theorem 18 can be used to show that in a composition of two dtt’s we may always assume that the second one is local (thus strengthening Lemma 16): by Lemma 10 the second tt can be decomposed into a top-down tt and a local tt, and then (by Theorem 18), the top-down one can be absorbed by the first tt. Hence $\mbox{\sf dTT}\circ\mbox{\sf dTT}\subseteq\mbox{\sf dTT}\circ\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT$ {}^{\ell} $}\subseteq\mbox{\sf dTT}\circ\mbox{\sf dTT$ {}^{\ell} $}$ . This was already proved in [28, Theorem 53] by means of pebble tree transducers.

Our second composition result generalizes Theorem 18 to nondeterministic tt’s, restricted to right-composition with pruning tt’s. The proof of the next lemma is similar to that of Lemma 17.

Lemma 19

$\mbox{\sf TT}\circ\mbox{\sf TT$ {}^{\ell}_{\mathrm{pru}} $}\subseteq\mbox{\sf TT}$ . In particular

[TABLE]

**Proof.** Let $M_{1}=(\Sigma,\Delta,P,P_{0},R_{1})$ be a tt and $M_{2}=(\Delta,\Gamma,Q,Q_{0},R_{2})$ a local pruning tt. The construction of the transducer $M$ such that $\tau_{M}=\tau_{M_{1}}\circ\tau_{M_{2}}$ is a straightforward variant of the one in the last paragraph of the proof of Lemma 17. This time, we do not verify at the start that the input tree is in the domain of $M_{1}$ , because it has to be checked at each step of $M$ that $M_{1}$ can produce an output tree, in particular when $M_{2}$ deletes part of that output tree (cf. the proof of [20, Lemma 2.9]).

We define $M=(\Sigma,\Gamma,P\times Q,P_{0}\times Q_{0},R)$ as follows. As in the proof of Lemma 17 we assume that $M_{1}$ keeps track in its finite state of the child number of the output node to be generated, through a mapping $\chi:P\to[0,{\mathit{m}x}_{\Delta}]$ . Let $\langle p,\sigma,j,T\rangle\to\zeta$ be a rule in $R_{1}$ . As before, if $\zeta=\langle p^{\prime},\alpha\rangle$ , then $M$ has the rules $\langle(p,q),\sigma,j,T\rangle\to\langle(p^{\prime},q),\alpha\rangle$ for every $q\in Q$ . If $\zeta=\delta(\langle p_{1},\alpha_{1}\rangle,\dots,\langle p_{k},\alpha_{k}\rangle)$ and $\langle q,\delta,i\rangle\to\zeta^{\prime}$ is a rule in $R_{2}$ , with the same $\delta$ and with $\chi(p)=i$ , then $M$ has the rule $\langle(p,q),\sigma,j,T\cap T^{\prime}\rangle\to\zeta^{\prime\prime}$ where $\zeta^{\prime\prime}$ is obtained (as before) from $\zeta^{\prime}$ by changing every $\langle q^{\prime},{\rm down}_{\ell}\rangle$ into $\langle(p_{\ell},q^{\prime}),\alpha_{\ell}\rangle$ , and the node test $T^{\prime}$ consists of all $(t,u)$ such that for every $\ell\in[1,k]$ there exists a computation $\langle p_{\ell},\alpha_{\ell}(u)\rangle\Rightarrow^{*}_{M_{1},t}s_{\ell}$ for some $s_{\ell}\in T_{\Delta}$ . Thus, the only difference with the proof of Lemma 17 is the additional test $T^{\prime}$ . In fact, it suffices that $T^{\prime}$ tests every $\ell\in[1,k]$ for which ${\rm down}_{\ell}$ does not occur in $\zeta^{\prime}$ . That guarantees the existence of an output tree of $M_{1}$ on which $M_{2}$ is simulated by $M$ . It should be clear that $T^{\prime}$ is regular by Corollary 14: it can be written as $\bigcap_{\ell\in[1,k]}T^{\prime}_{\ell}$ where $\operatorname{mark}(T^{\prime}_{\ell})$ is the domain of a tt that walks to node $\alpha_{\ell}(u)$ and then simulates $M_{1}$ starting in state $p_{\ell}$ .

We note that this construction does not work for an arbitrary top-down $M_{2}$ without stay-instructions. If some ${\rm down}_{\ell}$ occurs twice in $\zeta^{\prime}$ , then there are two occurrences $\langle(p_{\ell},q^{\prime}),\alpha_{\ell}\rangle$ and $\langle(p_{\ell},q^{\prime\prime}),\alpha_{\ell}\rangle$ in $\zeta^{\prime\prime}$ and it is not guaranteed (as it should) that from both occurrences the same output subtree of $M_{1}$ is generated by $M$ . We finally note that, as in the proof of Lemma 17, if both $M_{1}$ and $M_{2}$ are pruning, then so is $M$ . $\Box$

We obtain our second composition result from Lemma 12, the second inclusion of Lemma 11, and two applications of Lemma 19 (taking into account that $\mbox{\sf TT$ {}^{\ell}_{\mathrm{rel}} $}\subseteq\mbox{\sf TT$ {}^{\ell}_{\mathrm{pru}} $}$ ).

Theorem 20

$\mbox{\sf TT}\circ\mbox{\sf TT$ {}_{\mathrm{pru}} $}\subseteq\mbox{\sf TT}$ . In particular $\mbox{\sf TT$ {}_{\downarrow} $}\circ\mbox{\sf TT$ {}_{\mathrm{pru}} $}\subseteq\mbox{\sf TT$ {}_{\downarrow} $}$ , and TT ${}_{\mathrm{pru}}$ is closed under composition.

Hence, also in a composition of two nondeterministic tt’s we may always assume that the second one is local: $\mbox{\sf TT}\circ\mbox{\sf TT}\subseteq\mbox{\sf TT}\circ\mbox{\sf dTT$ {}_{\mathrm{rel}} $}\circ\mbox{\sf TT$ {}^{\ell} $}\subseteq\mbox{\sf TT}\circ\mbox{\sf TT$ {}^{\ell} $}$ by Lemma 10 and Theorem 20, respectively.

The range of a deterministic tt $M$ can be restricted to a regular tree language $L$ by restricting its domain to $\tau_{M}^{-1}(L)$ , using Lemmas 9 and 13. For a nondeterministic tt we can use the next corollary.

Corollary 21

The translation $\tau^{\prime}=\{(t,s)\in\tau\mid s\in L\}$ is in TT for every $\tau\in\mbox{\sf TT}$ and $L\in\mbox{\sf REGT}$ . If $\tau$ is in TT ${}_{\downarrow}$ or TT ${}_{\mathrm{pru}}$ , then so is $\tau^{\prime}$ .

**Proof.** Let $\Sigma$ be the output alphabet of $\tau$ and let $A=(\Sigma,P,F,\delta)$ be a bottom-up finite-state tree automaton such that $L(A)=L$ . Obviously $\tau^{\prime}=\tau\circ\tau_{L}$ where $\tau_{L}$ is the identity on $L$ , and obviously $\tau_{L}\in\mbox{\sf TT$ {}^{\ell}_{\mathrm{rel}} $}$ : it is realized by the local relabeling tt $(\Sigma,\Sigma,P,F,R)$ where $R$ consists of all rules

[TABLE]

such that $\delta(\sigma,p_{1},\dots,p_{m})=p$ . By Theorem 20, $\tau^{\prime}$ satisfies the requirements. $\Box$
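The specification $\tau^{\prime}=\tau\circ\tau_{L}$ can be made concrete with a small Python sketch. It does not implement the relabeling tt of the proof; it only evaluates a deterministic bottom-up automaton on each output tree and keeps the pairs whose output is accepted. The encoding is assumed for illustration: trees are pairs `(label, children)`, the transition function is a dictionary, and a translation is a finite set of pairs. The names `run_bottom_up` and `restrict_range` are hypothetical.

```python
# Sketch (assumed encoding): restrict the range of a finite relation tau
# to the tree language L(A) of a bottom-up finite-state tree automaton A.

def run_bottom_up(delta, tree):
    """State reached by A at the root of tree, or None if A gets stuck."""
    label, children = tree
    states = tuple(run_bottom_up(delta, c) for c in children)
    return delta.get((label, states))

def restrict_range(tau, delta, final):
    """The relation {(t, s) in tau | s in L(A)}."""
    return {(t, s) for (t, s) in tau if run_bottom_up(delta, s) in final}

# Example automaton: accepts trees with an even number of a-leaves.
delta = {("a", ()): 1, ("b", ()): 0}
for x in (0, 1):
    for y in (0, 1):
        delta[("f", (x, y))] = (x + y) % 2
final = {0}

t = ("c", ())
s1 = ("f", (("a", ()), ("a", ())))   # two a-leaves: accepted
s2 = ("f", (("a", ()), ("b", ())))   # one a-leaf: rejected
tau = {(t, s1), (t, s2)}
restricted = restrict_range(tau, delta, final)
```

Here the nondeterministic choice between `s1` and `s2` for the same input `t` is resolved by the range restriction, which is the situation Corollary 21 addresses.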

Our third composition result is that deterministic tt’s are closed under left-composition with (deterministic) single-use tt’s. This is a variant of one of the main results of [40, 41, 45] for (a variant of) attribute grammars, cf. the last paragraph of [7]. It is proved for attributed tree transducers in [56, Theorem 3] (see also [55, Satz 6.5]).

Lemma 22

$\mbox{\sf dTT$ {}_{\mathrm{su}} $}\circ\mbox{\sf dTT$ {}^{\ell} $}\subseteq\mbox{\sf dTT}$ .

**Proof.** Let $M_{1}=(\Sigma,\Delta,P,p_{0},R_{1})$ and $M_{2}=(\Delta,\Gamma,Q,q_{0},R_{2})$ be a single-use dtt and a local dtt, respectively. We extend the proof of Lemma 17 to the case that $M_{2}$ is an arbitrary local dtt. Thus, we have to deal with the fact that now $M_{2}$ can also move up on the output tree of $M_{1}$ . Let $(t,s)\in\tau_{M_{1}}$ , and let $d$ be the derivation tree of the computation $\langle p_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M_{1},t}s$ . Since $M_{1}$ is single-use, we can identify each node of $d$ that is labeled by a configuration with that configuration, because a configuration $\langle p,u\rangle$ of $M_{1}$ occurs at most once in $d$ . Suppose that $M_{1}$ , in configuration $\langle p,u\rangle$ on $t$ , has generated a node $v$ of $s$ . When $M_{2}$ executes an up-instruction at node $v$ , the new transducer $M$ has to backtrack on the computation of $M_{1}$ , back to the moment that the parent of $v$ in $s$ was generated by $M_{1}$ . Thus, starting with the configuration $\langle p,u\rangle$ of $M_{1}$ , $M$ has to determine the ancestors of $\langle p,u\rangle$ in $d$ , and stop at the first ancestor that is a configuration generating an output node. Since $M_{1}$ is single-use, each configuration $\langle p,u\rangle$ has a unique parent configuration $\langle p^{\prime},u^{\prime}\rangle$ in $d$ . That allows us to find $\langle p^{\prime},u^{\prime}\rangle$ by a regular test, as follows.

For every $p,p^{\prime}\in P$ and every instruction $\alpha$ of $M_{1}$ , we will define a regular test $T_{p,p^{\prime},\alpha}$ such that for every $t\in\mathrm{dom}(M_{1})$ and $u\in{\cal N}(t)$ , $(t,u)\in T_{p,p^{\prime},\alpha}$ if and only if $\langle p^{\prime},\alpha(u)\rangle$ is the parent of $\langle p,u\rangle$ in the derivation tree of the computation $\langle p_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M_{1},t}\tau_{M_{1}}(t)$ (for the definition of $\alpha(u)$ see Section 3).

We will construct a tt $N$ and define $T_{p,p^{\prime},\alpha}=\{(t,u)\mid\operatorname{mark}(t,u)\in\mathrm{dom}(N)\}$ . Then $T_{p,p^{\prime},\alpha}$ is regular by Corollary 14. To be able to describe $N$ , we change notation and consider the node test $T_{\bar{p},\bar{p}^{\prime},\bar{\alpha}}$ for $\bar{p},\bar{p}^{\prime}\in P$ and instruction $\bar{\alpha}$ .

Let $M^{\prime}_{1}=(\Sigma,\varnothing,P,\{p_{0}\},R^{\prime}_{1})$ be the nondeterministic tt obtained from $M_{1}$ by changing every output rule $\langle p,\sigma,j,T\rangle\to\delta(\langle p_{1},\alpha_{1}\rangle,\dots,\langle p_{k},\alpha_{k}\rangle)$ into the move rules $\langle p,\sigma,j,T\rangle\to\langle p_{i},\alpha_{i}\rangle$ for every $i\in[1,k]$ . Intuitively, for an input tree $t\in\mathrm{dom}(M_{1})$ , the tree-walking automaton $M^{\prime}_{1}$ follows an arbitrary path in the unique derivation tree $d\in L(G^{\mathrm{der}}_{M_{1},t})$ , from the root of $d$ down to the leaves. Whenever $M_{1}$ branches, $M^{\prime}_{1}$ nondeterministically follows one of those branches. The transducer $N$ , which is a variant of $M^{\prime}_{1}$ , has states $(p,p^{\prime},\alpha)$ with $p,p^{\prime},\alpha$ as above. The initial state is $(p_{0},-,-)$ , with the second and third component fixed, but irrelevant (e.g., $(p_{0},p_{0},{\rm stay})$ ). On a tree $\operatorname{mark}(t,u)$ , $N$ uses the state $(p,p^{\prime},\alpha)$ to simulate the computations of $M^{\prime}_{1}$ in state $p$ on $t$ , but additionally keeps the previous configuration of $M^{\prime}_{1}$ in its finite state, as the pair $(p^{\prime},\alpha)$ . When it arrives at the marked node $u$ in state $(\bar{p},\bar{p}^{\prime},\bar{\alpha})$ , it outputs a symbol of rank 0. Formally, let $\langle p,\sigma,j,T\rangle\to\zeta$ be a rule in $R^{\prime}_{1}$ , let $p^{\prime}\in P$ , let $\alpha$ be an instruction, and let $b\in\{0,1\}$ . 
Then $N$ has the rule $\langle(p,p^{\prime},\alpha),(\sigma,b),j,\mu(T)\rangle\to\zeta^{\prime}$ where $\langle\tilde{p},{\rm down}_{i}\rangle^{\prime}=\langle(\tilde{p},p,{\rm up}),{\rm down}_{i}\rangle$ , $\langle\tilde{p},{\rm up}\rangle^{\prime}=\langle(\tilde{p},p,{\rm down}_{j}),{\rm up}\rangle$ , and $\langle\tilde{p},{\rm stay}\rangle^{\prime}=\langle(\tilde{p},p,{\rm stay}),{\rm stay}\rangle$ for every $\tilde{p}\in P$ and $i\in[1,\operatorname{rank}(\sigma)]$ . Additionally, $N$ has the rule $\langle(\bar{p},\bar{p}^{\prime},\bar{\alpha}),(\sigma,1),j,\mu(T)\rangle\to\top$ , where $\top$ is its unique output symbol, of rank 0. Thus, if the tree-walking automaton $N$ arrives in state $(\bar{p},\bar{p}^{\prime},\bar{\alpha})$ at the marked node $u$ , it can accept $\operatorname{mark}(t,u)$ . Hence, for every $t\in\mathrm{dom}(M_{1})$ , $N$ accepts $\operatorname{mark}(t,u)$ if and only if $\langle\bar{p}^{\prime},\bar{\alpha}(u)\rangle$ is the parent of $\langle\bar{p},u\rangle$ in the derivation tree of the computation $\langle p_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M_{1},t}\tau_{M_{1}}(t)$ .

The transducer $M$ is an extension of the one in the proof of Lemma 17. It additionally has states $\mathrm{back}^{1}_{p,q}$ and $\mathrm{back}^{*}_{p,q}$ to simulate the first and the following backward steps of the computation of $M_{1}$ . Its rules are obtained as follows. First, it has the same rules that simulate (the forward computation of) $M_{1}$ . Second, the rules of $M$ that simulate $M_{2}$ are extended in such a way that, to obtain $\zeta^{\prime}$ from $\zeta$ , one has to change additionally every $\langle q^{\prime},{\rm up}\rangle$ into $\langle\mathrm{back}^{1}_{p,q^{\prime}},{\rm stay}\rangle$ . Third, $M$ additionally has rules that simulate the backward computation of $M_{1}$ . For each state $\mathrm{back}^{1}_{p,q}$ it has all rules $\langle\mathrm{back}^{1}_{p,q},\sigma,j,T_{p,p^{\prime},\alpha}\rangle\to\langle\mathrm{back}^{*}_{p^{\prime},q},\alpha\rangle$ (where the tests on $\sigma$ and $j$ are irrelevant, because $M$ arrived in state $\mathrm{back}^{1}_{p,q}$ by a stay-instruction). For each state $\mathrm{back}^{*}_{p,q}$ it has the following rules. Let $\rho:\langle p,\sigma,j,T\rangle\to\zeta$ be a rule of $M_{1}$ . If $\rho$ is a move rule, then $M$ has all rules $\langle\mathrm{back}^{*}_{p,q},\sigma,j,T\cap T_{p,p^{\prime},\alpha}\rangle\to\langle\mathrm{back}^{*}_{p^{\prime},q},\alpha\rangle$ . If $\rho$ is an output rule, then $M$ has the rule $\langle\mathrm{back}^{*}_{p,q},\sigma,j,T\rangle\to\langle(\rho,q),{\rm stay}\rangle$ . $\Box$

Theorem 23

$\mbox{\sf dTT$ {}_{\mathrm{su}} $}\circ\mbox{\sf dTT}\subseteq\mbox{\sf dTT}$ .

**Proof.** It follows from Lemmas 10, 12, and 16 that

[TABLE]

Thus, by Lemma 22, it suffices to show that $\mbox{\sf dTT$ {}_{\mathrm{su}} $}\circ\mbox{\sf dTT$ {}^{\ell}_{\mathrm{rel}} $}\subseteq\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ . For a single-use dtt $M_{1}$ and a local relabeling dtt $M_{2}$ , consider the construction of the dtt $M$ in the last paragraph of the proof of Lemma 17. It should be clear that $M$ is single-use: if $M_{1}$ visits an input node in state $p$ , then $M$ visits that node in state $(p,q)$ for some $q$ . $\Box$

It can be proved that dTT ${}_{\mathrm{su}}$ is closed under composition, which also follows from Proposition 29 in the next section. The inclusion $\mbox{\sf dTT$ {}_{\mathrm{su}} $}\circ\mbox{\sf dTT$ {}^{\ell}_{\mathrm{rel}} $}\subseteq\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ in the previous proof is a special case of that.

6 Macro and MSO

In this section we collect some results on the connection between tt’s, macro tree transducers (in short mt’s) and mso tree transducers. They are taken from the literature or can easily be proved using results from the literature. This section can be skipped on first reading, except that the reader interested in linear size increase should glance at Corollaries 32 and 33.

6.1 Macro Tree Transducers

Let MT denote the class of translations realized by mt’s, with unrestricted or outside-in (oi) derivation mode, let dMT denote the subclass realized by deterministic mt’s, and let dtMT denote the class of total translations in dMT (see [34] where they are denoted by MT ${}_{\text{{\sc oi}}}$ , DMT ${}_{\text{{\sc oi}}}$ , and DtMT, respectively). We first consider the relationship between deterministic tt’s and mt’s.

It is proved in [28, Lemma 49 and Corollary 51] that $\mbox{\sf dTT}\subseteq\mbox{\sf dMT}$ , and in [14, Theorem 8.22] (see also [28, Corollary 51]) that $\mbox{\sf dMT}=\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf dTT$ {}^{\ell} $}$ . Here we prove the following variant.

Lemma 24

$\mbox{\sf dTT}\subseteq\mbox{\sf dMT}=\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}$ .

**Proof.** We first show that $\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dMT}\subseteq\mbox{\sf dMT}$ . By Lemma 12 it suffices to show that $\mbox{\sf dTT$ {}^{\hskip 1.13791pt\mathrm{s}}_{\downarrow} $}\circ\mbox{\sf dMT}\subseteq\mbox{\sf dMT}$ . The inclusion $\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf dMT}\subseteq\mbox{\sf dMT}$ is proved in [34, Theorem 7.6(3)]. As also argued before [32, Theorem 7.5], this implies the inclusion $\mbox{\sf dTT$ {}^{\hskip 1.13791pt\mathrm{s}}_{\downarrow} $}\circ\mbox{\sf dMT}\subseteq\mbox{\sf dMT}$ as follows. By [20, Theorem 2.6] $\mbox{\sf dTT$ {}^{\hskip 1.13791pt\mathrm{s}}_{\downarrow} $}\subseteq\mbox{\sf DBQREL}\circ\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}$ , where DBQREL is the class of deterministic bottom-up finite-state relabelings. Hence $\mbox{\sf dTT$ {}^{\hskip 1.13791pt\mathrm{s}}_{\downarrow} $}\circ\mbox{\sf dMT}\subseteq\mbox{\sf DBQREL}\circ\mbox{\sf dMT}$ . Since dMT is closed under regular look-ahead by [34, Theorem 6.15], it is straightforward to prove that $\mbox{\sf DBQREL}\circ\mbox{\sf dMT}\subseteq\mbox{\sf dMT}$ , similar to the proof of [34, Lemma 6.17].

By Lemma 10, $\mbox{\sf dTT}\subseteq\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT$ {}^{\ell} $}$ . It is proved in [31, Theorem 35 for $n=0$ ] that $\mbox{\sf dTT$ {}^{\ell} $}\subseteq\mbox{\sf dMT}$ . (By mistake, [31, Theorem 35] is stated for $n\geq 1$ only; it also holds for $n=0$ by [31, Lemma 34 and Theorem 31].) Hence $\mbox{\sf dTT}\subseteq\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT$ {}^{\ell} $}\subseteq\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dMT}\subseteq\mbox{\sf dMT}$ , which implies that $\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}\subseteq\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dMT}\subseteq\mbox{\sf dMT}$ . It now remains to show that $\mbox{\sf dMT}\subseteq\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}$ . It is proved in [31, Section 5.5] that $\mbox{\sf d$ {}_{\mathrm{t}} $MT}\subseteq\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf dTT$ {}^{\ell} $}$ . As shown in [34, Theorem 6.18], every translation $\tau\in\mbox{\sf dMT}$ is the restriction to a regular tree language $L$ of a translation $\tau^{\prime}\in\mbox{\sf d$ {}_{\mathrm{t}} $MT}$ . Hence $\tau^{\prime}\in\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf dTT$ {}^{\ell} $}$ and so $\tau\in\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT$ {}^{\ell} $}$ , because the first tt can start by verifying that the input tree is in $L$ with a regular test at the root of $t$ , by Lemma 9. $\Box$

From Lemma 24, together with Theorem 18, we obtain the following corollary on compositions.

Corollary 25

For every $k\geq 1$ , $\mbox{\sf dTT}^{k}\subseteq\mbox{\sf dMT}^{k}=\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}^{k}\subseteq\mbox{\sf dTT}^{k+1}$ .

The above two inclusions are proper, cf. [39, Lemma 6.54] and [34, Theorem 4.16]. In fact, the macro tree transducer is, and can be, of exponential height increase [34, Theorem 3.24]. Hence $\tau_{\mathrm{exp}}^{k+1}$ is not in $\mbox{\sf dMT}^{k}$ , cf. the proof of Proposition 7. Also, $\tau_{M}^{k}$ is not in $\mbox{\sf dTT}^{k}$ where $M$ is an mt that translates $\tau^{n}a$ into $\tau^{2^{n}}a$ (with $\tau$ of rank 1 and $a$ of rank 0).
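For the reader's convenience, the statement of Corollary 25 can be unfolded as follows. This is a sketch of the routine argument, using only Lemma 24, Theorem 18, and the obvious inclusion $\mbox{\sf dTT$ {}_{\downarrow} $}\subseteq\mbox{\sf dTT}$ ; recall that $\tau_{1}\circ\tau_{2}$ means that $\tau_{1}$ is applied first.

$$
\begin{aligned}
\mbox{\sf dTT}^{k} &\subseteq \mbox{\sf dMT}^{k} && \text{since } \mbox{\sf dTT}\subseteq\mbox{\sf dMT} \text{ (Lemma 24)},\\
\mbox{\sf dMT}^{k} &= (\mbox{\sf dTT}_{\downarrow}\circ\mbox{\sf dTT})^{k} \subseteq \mbox{\sf dTT}_{\downarrow}\circ\mbox{\sf dTT}^{k} && \text{by } k-1 \text{ applications of } \mbox{\sf dTT}\circ\mbox{\sf dTT}_{\downarrow}\subseteq\mbox{\sf dTT} \text{ (Theorem 18)},\\
\mbox{\sf dTT}_{\downarrow}\circ\mbox{\sf dTT}^{k} &= \mbox{\sf dMT}\circ\mbox{\sf dTT}^{k-1} \subseteq \mbox{\sf dMT}^{k} && \text{by Lemma 24, twice},\\
\mbox{\sf dTT}_{\downarrow}\circ\mbox{\sf dTT}^{k} &\subseteq \mbox{\sf dTT}^{k+1} && \text{since } \mbox{\sf dTT}_{\downarrow}\subseteq\mbox{\sf dTT}.
\end{aligned}
$$

The second and third lines together give the equality $\mbox{\sf dMT}^{k}=\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}^{k}$ .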

The relationship between nondeterministic tt’s and mt’s is less straightforward. On the one hand, even TT ${}_{\downarrow}$ is not included in MT because all macro tree translations are finitary. But we can express every tt as a composition of two top-down tt’s and an mt.

Lemma 26

$\mbox{\sf TT}\subseteq\mbox{\sf TT$ {}_{\downarrow} $}\circ\mbox{\sf TT$ {}_{\downarrow} $}\circ\mbox{\sf MT}$ .

**Proof.** By Lemma 10, $\mbox{\sf TT}\subseteq\mbox{\sf TT$ {}_{\downarrow} $}\circ\mbox{\sf TT$ {}^{\ell} $}$ . It follows from [31, Lemmas 34 and 27] that $\mbox{\sf TT$ {}^{\ell} $}\subseteq\mbox{\sf MON}\circ\mbox{\sf MT}$ , where MON is a specific simple subclass of TT ${}^{\ell}_{\downarrow}$ defined before [31, Lemma 27].

We note that by Lemma 10, $\mbox{\sf TT}\subseteq\mbox{\sf dTT$ {}_{\mathrm{rel}} $}\circ\mbox{\sf TT$ {}^{\ell} $}$ and that it is easy to prove that $\mbox{\sf dTT$ {}_{\mathrm{rel}} $}\circ\mbox{\sf TT$ {}^{\ell}_{\downarrow} $}\subseteq\mbox{\sf TT$ {}_{\downarrow} $}$ . Hence we even obtain that $\mbox{\sf TT}\subseteq\mbox{\sf TT$ {}_{\downarrow} $}\circ\mbox{\sf MT}$ . $\Box$

On the other hand, every mt can still be realized by a composition of two (finitary) tt’s.

Lemma 27

$\mbox{\sf MT}\subseteq\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}\circ\mbox{\sf TT$ {}_{\mathrm{pru}} $}\subseteq\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf f\,TT}$ .

**Proof.** By [34, Theorem 6.10], $\mbox{\sf MT}=\mbox{\sf d$ {}_{\mathrm{t}} $MT}\circ\mbox{\sf SET}$ , and by the proof of [34, Theorem 6.10], $\mbox{\sf SET}\subseteq\mbox{\sf TT$ {}^{\ell}_{\mathrm{pru}} $}$ . Hence $\mbox{\sf MT}\subseteq\mbox{\sf d$ {}_{\mathrm{t}} $MT}\circ\mbox{\sf TT$ {}^{\ell}_{\mathrm{pru}} $}\subseteq\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}\circ\mbox{\sf TT$ {}_{\mathrm{pru}} $}$ by Lemma 24. That is included in $\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf f\,TT}$ by Theorem 20. $\Box$

It can be shown that $\mbox{\sf f\,TT}\subseteq\mbox{\sf MT}=\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf f\,TT}$ , thus generalizing Lemma 24 to the finitary case, but that will not be needed in what follows.

Finally, let MT ${}_{\text{{\sc io}}}$ denote the class of translations realized by mt’s with inside-out (io) derivation mode (see [34]), and let mrMT ${}_{\text{{\sc io}}}$ denote the class of translations realized by the multi-return macro tree transducers of [49, 50], which generalize io macro tree transducers.

Lemma 28

$\mbox{\sf MT}_{\text{{\sc io}}}\subseteq\mbox{\sf mrMT$ {}_{\text{{\sc io}}} $}\subseteq\mbox{\sf f\,TT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}$ .

**Proof.** It is shown in [34, Lemma 5.5] that $\mbox{\sf MT$ {}_{\text{{\sc io}}} $}\subseteq\mbox{\sf f\,TT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf YIELD}$ , and in [31, Lemma 36] that $\mbox{\sf YIELD}\subseteq\mbox{\sf dTT$ {}^{\ell} $}$ , and so $\mbox{\sf MT$ {}_{\text{{\sc io}}} $}\subseteq\mbox{\sf f\,TT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf dTT}$ . It follows from [50, Lemma 4] that $\mbox{\sf mrMT$ {}_{\text{{\sc io}}} $}\subseteq\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf MT$ {}_{\text{{\sc io}}} $}\circ\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}$ . Hence

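$$\mbox{\sf mrMT$ {}_{\text{{\sc io}}} $}\subseteq\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf f\,TT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf dTT}\circ\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}$$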

which is included in $\mbox{\sf f\,TT$ {}^{\hskip 1.13791pt\mathrm{s}}_{\downarrow} $}\circ\mbox{\sf dTT}$ by [20, Theorem 2.11(2)] and Theorem 18. $\Box$

6.2 MSO Tree Transducers

Let dMSOT denote the class of deterministic mso tree translations (see [14, Chapter 8], where it is denoted DMSOT, and where mso tree translations are called ms-transductions of terms). The next result is a variant of the main result of [7], which concerns attributed tree transducers with look-ahead instead of tt’s. In its present form it is proved in [14, Theorems 8.6 and 8.7].

Proposition 29

$\mbox{\sf dMSOT}=\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ .

The next proposition is the main result of [32].

Proposition 30

$\mbox{\sf d$ {}_{\mathrm{t}} $MT}\cap\mbox{\sf LSIF}\subseteq\mbox{\sf dMSOT}$ .

This can be extended to arbitrary deterministic oi macro tree translations as follows.

Lemma 31

$\mbox{\sf dMT}\cap\mbox{\sf LSIF}\subseteq\mbox{\sf dMSOT}$ .

**Proof.** Since the domain $L$ of any mt $M$ is regular ([34, Theorem 7.4]), and dMT is closed under regular look-ahead ([34, Theorem 6.15]), for every deterministic mt $M$ there is a total deterministic mt $M^{\prime}$ that extends $M$ by the identity on the complement of $L$ . Clearly, $\tau_{M^{\prime}}$ is of linear size increase if and only if $\tau_{M}$ is. Hence, by Propositions 29 and 30, if $\tau_{M}$ is of linear size increase, then $\tau_{M^{\prime}}$ is in $\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ . And so $\tau_{M}$ , which is the restriction of $\tau_{M^{\prime}}$ to the regular tree language $L$ , is also in $\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ by Lemma 9. $\Box$

From Lemma 24, Lemma 31, Proposition 29, and Lemma 6 we obtain the following corollary.

Corollary 32

$\mbox{\sf dTT}\cap\mbox{\sf LSIF}=\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ .

It is also shown in [32] that it is decidable for a total deterministic mt whether or not it is of linear size increase. That also holds for arbitrary deterministic mt’s by the proof of Lemma 31, and hence also for dtt’s by Lemma 24.

Corollary 33

It is decidable for a deterministic tt whether or not it is of linear size increase.

Note that since Corollary 32 is effective, if the dtt is indeed of linear size increase, then an equivalent ttsu can be constructed. One of our aims is to extend Corollaries 32 and 33 to arbitrary compositions of dtt’s.

7 Functional Nondeterminism

In this section we prove that for every nondeterministic top-down tt $M$ a deterministic top-down tt $M^{\prime}$ can be constructed that realizes a “uniformizer” of $\tau_{M}$ , i.e., a subset of $\tau_{M}$ with the same domain. This is a generalization of [21, Lemma], where it is proved for classical nondeterministic top-down tree transducers. Note that, as opposed to the deterministic case, the nondeterministic top-down tt is more powerful than the classical nondeterministic top-down tree transducer with regular look-ahead, because, due to the stay-instructions, it may not be finitary, i.e., it possibly translates one input tree into infinitely many output trees.

A uniformizer of a tree translation $\tau$ is a function $f$ such that $f\subseteq\tau$ and $\mathrm{dom}(f)=\mathrm{dom}(\tau)$ . Intuitively, $f$ selects for every input tree $t\in\mathrm{dom}(\tau)$ one of the elements of $\tau(t)$ .

Lemma 34

Every $\tau\in\mbox{\sf TT$ {}_{\downarrow} $}$ has a uniformizer $\tau^{\prime}\in\mbox{\sf dTT$ {}_{\downarrow} $}$ . If $\tau\in\mbox{\sf TT$ {}_{\mathrm{pru}} $}$ , then $\tau^{\prime}\in\mbox{\sf dTT$ {}_{\mathrm{pru}} $}$ .

**Proof.** Let $M=(\Sigma,\Delta,Q,Q_{0},R)$ be a nondeterministic tt↓. Without loss of generality we assume that $M$ has exactly one initial state $q_{0}$ , i.e., $Q_{0}=\{q_{0}\}$ . We have to construct a deterministic tt↓ $M^{\prime}$ that computes one possible output tree in $\tau_{M}(t)$ for every $t\in\mathrm{dom}(M)$ . The idea of the proof of [21, Lemma] is to pick, at the current node of $t$ , one of the rules that lead to the generation of an output tree (which can be checked by a regular test). However, that idea does not work here, because $M$ may have an infinite computation on $t$ (see [24, New Observation 5.10]). Thus, we have to be more careful. Note that an infinite computation is entirely due to the stay-instructions in the rules of $M$ .

The stay-instructions can be removed from $M$ by constructing the equivalent stay-free tt $M_{\mathrm{sf}}=(\Sigma,\Delta,Q,\{q_{0}\},R_{\mathrm{sf}})$ , with general rules, as we did at the end of Section 3. Recall that we assume that the regular tests in ${\cal T}_{M}$ are mutually disjoint, and that the set $R_{\mathrm{sf}}$ consists of all general rules $\langle q,\sigma,j,T\rangle\to\zeta$ such that $\zeta\in L(G_{q,\sigma,j,T})$ , for every left-hand side $\langle q,\sigma,j,T\rangle$ of a rule of $M$ . In this case $M_{\mathrm{sf}}$ is a top-down tt, with possibly infinitely many rules. Since its rules do not contain stay-instructions any more, it does not have infinite computations on the trees in its domain. Thus, the idea above can be applied to $M_{\mathrm{sf}}$ , which means that for every $q$ , $\sigma$ , $j$ , and $T$ we have to pick one general rule $\langle q,\sigma,j,T\rangle\to\zeta$ from $R_{\mathrm{sf}}$ , under the condition that its application leads to the generation of an output tree. This condition can be checked by a regular sub-test, as follows. Note that $\zeta\in T_{\Delta}(D_{\sigma})$ where $D_{\sigma}=\{\langle q^{\prime},{\rm down}_{i}\rangle\mid q^{\prime}\in Q,\,i\in[1,\operatorname{rank}_{\Sigma}(\sigma)]\}$ .

For every $\sigma\in\Sigma$ , $q^{\prime}\in Q$ , and $i\in[1,\operatorname{rank}(\sigma)]$ , let $T_{\sigma,q^{\prime},i}$ be the node test over $\Sigma$ consisting of all $(t,u)$ such that $u$ has label $\sigma$ in $t$ and there is a computation $\langle q^{\prime},ui\rangle\Rightarrow^{*}_{M,t}s$ for some $s\in T_{\Delta}$ . This node test is regular by Corollary 14 because $\operatorname{mark}(T_{\sigma,q^{\prime},i})$ is the domain of a tt $M_{q^{\prime},i}$ that on input $\operatorname{mark}(t,u)$ walks to the marked node $u$ , checks that its label is $\sigma$ , moves to the $i$ -th child of $u$ , and then simulates $M$ on $t$ , starting in state $q^{\prime}$ . For every $\sigma\in\Sigma$ and $D\subseteq D_{\sigma}$ , let $T_{\sigma,D}$ be the regular node test that is the intersection of all $T_{\sigma,q^{\prime},i}$ such that $\langle q^{\prime},{\rm down}_{i}\rangle\in D$ and all $T_{\Sigma}^{\bullet}\setminus T_{\sigma,q^{\prime},i}$ such that $\langle q^{\prime},{\rm down}_{i}\rangle\notin D$ . Obviously the node tests $T_{\sigma,D}$ are mutually disjoint.

We now define the deterministic tt↓ $M^{\prime}=(\Sigma,\Delta,Q,q_{0},R^{\prime})$ , where $R^{\prime}$ consists of the following general rules. For every left-hand side $\langle q,\sigma,j,T\rangle$ of a rule of $M$ and every $D\subseteq D_{\sigma}$ , if $L(G_{q,\sigma,j,T})\cap T_{\Delta}(D)\neq\varnothing$ , then $R^{\prime}$ contains the general rule $\langle q,\sigma,j,T\cap T_{\sigma,D}\rangle\to\zeta$ where $\zeta$ is a fixed element of $L(G_{q,\sigma,j,T})\cap T_{\Delta}(D)$ .

It should be clear that $M^{\prime}$ satisfies the requirements, i.e., it has the same domain as $M_{\mathrm{sf}}$ and it realizes a subset of $\tau_{M_{\mathrm{sf}}}$ . Note that $M^{\prime}$ can be constructed effectively, because $L(G_{q,\sigma,j,T})\cap T_{\Delta}(D)$ is a regular tree language, and hence its nonemptiness can be decided and, if so, an element can be computed. Finally, the general rules of $M^{\prime}$ can be replaced by ordinary rules, as discussed after Lemma 8. $\Box$

At the end of this section we prove that any function that is realized by a composition of nondeterministic tt’s can also be realized by a composition of deterministic tt’s. That will (only) be used to show that the results of Section 9 also hold for nondeterministic tt’s and mt’s. Let ${\cal F}$ be the class of all partial functions from trees to trees.

Theorem 35

For every $k\geq 1$ , $(\mbox{\sf TT$ {}_{\downarrow} $}\circ\mbox{\sf TT}^{k})\cap{\cal F}\subseteq\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}^{k}$ .

**Proof.** By Lemmas 26 and 27, $\mbox{\sf TT}\subseteq\mbox{\sf TT$ {}_{\downarrow} $}\circ\mbox{\sf TT$ {}_{\downarrow} $}\circ\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}\circ\mbox{\sf TT$ {}_{\downarrow} $}$ . Now let $\tau\in(\mbox{\sf TT$ {}_{\downarrow} $}\circ\mbox{\sf TT}^{k})\cap{\cal F}$ . Then $\tau=\tau_{1}\circ\cdots\circ\tau_{m}$ where $m=5k+1$ , $\tau_{\,5j}\in\mbox{\sf dTT}$ for every $j\in[1,k]$ , and $\tau_{i}\in\mbox{\sf TT$ {}_{\downarrow} $}$ for every $i\in[1,m]\setminus\{5j\mid j\in[1,k]\}$ . By Corollary 14, the domain of a translation in TT is regular. Hence, we may assume that $\mathrm{ran}(\tau_{i})\subseteq\mathrm{dom}(\tau_{i+1})$ for every $i\in[1,m-1]$ . If not, then we change $\tau_{i}$ into $\bar{\tau}_{i}$ for $i=m,\dots,1$ inductively as follows. First, $\bar{\tau}_{m}=\tau_{m}$ . Second, for $i<m$ we obtain $\bar{\tau}_{i}$ from $\tau_{i}$ by restricting its range to $\mathrm{dom}(\bar{\tau}_{i+1})$ , see Corollary 21 and the paragraph preceding it.

Since $\tau$ is a function, it should be clear that $\tau=\tau^{\prime}_{1}\circ\cdots\circ\tau^{\prime}_{m}$ where $\tau^{\prime}_{i}\in\mbox{\sf dTT$ {}_{\downarrow} $}$ is the uniformizer of $\tau_{i}$ that exists by Lemma 34 if $\tau_{i}\in\mbox{\sf TT$ {}_{\downarrow} $}$ , and $\tau^{\prime}_{i}=\tau_{i}$ if $\tau_{i}\in\mbox{\sf dTT}$ . Thus, $\tau\in\mbox{\sf dTT$ {}_{\downarrow} $}\circ(\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}\circ\mbox{\sf dTT$ {}_{\downarrow} $})^{k}$ and so, by Theorem 18, $\tau\in\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}^{k}$ . $\Box$
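The role of the range restriction in this argument can be illustrated on finite relations. The following Python sketch is only an illustration under simplifying assumptions: translations are finite sets of pairs over arbitrary values rather than tree translations realized by tt's, and the helper names (`compose`, `restrict_range`, `uniformizer`, `determinize_chain`) are hypothetical and do not occur in the paper. It restricts the ranges from right to left, picks a uniformizer of each factor, and compares the result with the naive choice of uniformizers.

```python
from functools import reduce

def dom(rel):
    """Domain of a finite relation given as a set of pairs."""
    return {x for x, _ in rel}

def compose(r1, r2):
    """Composition in the order used in the text: first r1, then r2."""
    return {(x, z) for x, y in r1 for y2, z in r2 if y == y2}

def restrict_range(rel, allowed):
    """Keep only the pairs whose second component lies in `allowed`."""
    return {(x, y) for x, y in rel if y in allowed}

def uniformizer(rel):
    """A function f with f <= rel and dom(f) = dom(rel); min() fixes the choice."""
    return {(x, min(y for x2, y in rel if x2 == x)) for x in dom(rel)}

def determinize_chain(factors):
    """Restrict ranges for i = m-1, ..., 1, then uniformize every factor."""
    bars = list(factors)
    for i in range(len(bars) - 2, -1, -1):
        bars[i] = restrict_range(bars[i], dom(bars[i + 1]))
    return [uniformizer(r) for r in bars]

# Toy factors (assumed data): tau1 and tau2 are nondeterministic.
tau1 = {("a", 0), ("a", 1)}
tau2 = {(0, "p"), (1, "q"), (1, "r")}
tau3 = {("q", "x"), ("r", "x")}
factors = [tau1, tau2, tau3]

tau = reduce(compose, factors)                         # the function a -> x
careful = reduce(compose, determinize_chain(factors))  # with range restriction
naive = reduce(compose, [uniformizer(r) for r in factors])

print(tau == careful)  # the composed uniformizers realize tau
print(naive)           # without the restriction the input "a" is lost
```

In this toy instance the naive uniformizer of `tau1` picks the intermediate value `0`, which leads to a dead end, so the restriction of the ranges is what guarantees that the domains are preserved.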

Corollary 36

For every $k\geq 1$ , $\mbox{\sf MT}^{k}\cap{\cal F}\subseteq\mbox{\sf dMT}^{k}$ .

**Proof.** By the same argument as in the proof of Theorem 35, using Lemma 27 only, we obtain that $\mbox{\sf MT}^{k}\cap{\cal F}\subseteq(\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}\circ\mbox{\sf dTT$ {}_{\downarrow} $})^{k}$ . By Theorem 18 that is included in $\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf dTT}^{k}$ , which equals $\mbox{\sf dMT}^{k}$ by Corollary 25. $\Box$

Since the inclusions in Corollary 25 are proper, as discussed after that corollary, Theorem 35 and Corollary 36 imply that $\mbox{\sf TT}^{k}$ and $\mbox{\sf MT}^{k}$ are also proper hierarchies, i.e., $\mbox{\sf TT}^{k}\subsetneq\mbox{\sf TT}^{k+1}$ and $\mbox{\sf MT}^{k}\subsetneq\mbox{\sf MT}^{k+1}$ for every $k\geq 1$ .

8 Productivity

In this section we prove that every tt can be decomposed into a pruning tt and another tt such that the composition is linear-bounded. It implies that we may always assume that a composition of two tt’s is linear-bounded. Recall from Section 2 that the composition of tree translations $\tau_{1}\subseteq T_{\Sigma}\times T_{\Delta}$ and $\tau_{2}\subseteq T_{\Delta}\times T_{\Gamma}$ is linear-bounded if there is a constant $c\in{\mathbb{N}}$ such that for every $(t,s)\in\tau_{1}\circ\tau_{2}$ there exists $r\in T_{\Delta}$ such that $(t,r)\in\tau_{1}$ , $(r,s)\in\tau_{2}$ , and $|r|\leq c\cdot|s|$ . Formally we say that the pair $(\tau_{1},\tau_{2})$ is linear-bounded. Recall also that for classes ${\cal T}_{1}$ and ${\cal T}_{2}$ of tree translations, the class ${\cal T}_{1}\ast{\cal T}_{2}$ consists of all translations $\tau_{1}\circ\tau_{2}$ such that $\tau_{1}\in{\cal T}_{1}$ , $\tau_{2}\in{\cal T}_{2}$ , and $(\tau_{1},\tau_{2})$ is linear-bounded. Two elementary properties of this class operation were stated in Lemma 1. We will prove the following theorem.

Theorem 37

$\mbox{\sf TT}\subseteq\mbox{\sf TT$ {}_{\mathrm{pru}} $}\ast\mbox{\sf TT}$ and $\mbox{\sf dTT}\subseteq\mbox{\sf dTT$ {}_{\mathrm{pru}} $}\ast\mbox{\sf dTT}$ .
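The definition of a linear-bounded pair recalled above can be made concrete on finite relations. The sketch below is an illustration only, with hypothetical names (`size`, `is_linear_bounded`, `chain`) and with trees encoded as nested tuples `(label, child, ..., child)`. For a finite relation some constant always works, so the check is only meaningful for a fixed constant $c$ that is passed as a parameter.

```python
def size(t):
    """Number of nodes of a tree written as (label, child, ..., child)."""
    return 1 + sum(size(c) for c in t[1:])

def is_linear_bounded(tau1, tau2, c):
    """For every (t, s) in tau1 ∘ tau2 look for an r with |r| <= c * |s|."""
    composition = {(t, s) for t, r in tau1 for r2, s in tau2 if r == r2}
    for t, s in composition:
        witnesses = [r for t1, r in tau1 if t1 == t and (r, s) in tau2]
        if not any(size(r) <= c * size(s) for r in witnesses):
            return False
    return True

def chain(n):
    """The monadic tree g(g(...g(a)...)) with n occurrences of g."""
    t = ("a",)
    for _ in range(n):
        t = ("g", t)
    return t

# tau1 guesses a chain of length 1..5, tau2 maps every such chain to the leaf b.
tau1 = {(("a",), chain(n)) for n in range(1, 6)}
tau2 = {(chain(n), ("b",)) for n in range(1, 6)}
# A variant of tau1 that only produces the longest chain.
tau1_long = {(("a",), chain(5))}

print(is_linear_bounded(tau1, tau2, 2))
print(is_linear_bounded(tau1_long, tau2, 2))
```

The two calls differ only in the available intermediate trees: with `tau1` a small intermediate tree exists for the output `b`, whereas `tau1_long` forces an intermediate tree of size 6.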

Since pruning tt’s can be absorbed to the right by arbitrary tt’s (by Theorems 20 and 18), Theorem 37 can be generalized to compositions of tt’s. It implies that we may always assume that a composition of a tt with any number of tt’s is linear-bounded.

Corollary 38

Let $k\geq 1$ .

$(1)$ $\mbox{\sf TT}^{k}\subseteq\mbox{\sf TT$ {}_{\mathrm{pru}} $}\ast\mbox{\sf TT}^{k}$ and $\mbox{\sf TT}\circ\mbox{\sf TT}^{k}=\mbox{\sf TT}\ast\mbox{\sf TT}^{k}$ , and

$(2)$ $\mbox{\sf dTT}^{k}\subseteq\mbox{\sf dTT$ {}_{\mathrm{pru}} $}\ast\mbox{\sf dTT}^{k}$ and $\mbox{\sf dTT}\circ\mbox{\sf dTT}^{k}=\mbox{\sf dTT}\ast\mbox{\sf dTT}^{k}$ .

**Proof.** (1) The proof of the inclusion is by induction on $k$ . For $k=1$ it is Theorem 37. The induction step is proved as follows:

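$$\mbox{\sf TT}^{k+1}=\mbox{\sf TT}\circ\mbox{\sf TT}^{k}\subseteq\mbox{\sf TT}\circ(\mbox{\sf TT$ {}_{\mathrm{pru}} $}\ast\mbox{\sf TT}^{k})\subseteq(\mbox{\sf TT}\circ\mbox{\sf TT$ {}_{\mathrm{pru}} $})\ast\mbox{\sf TT}^{k}\subseteq\mbox{\sf TT}\ast\mbox{\sf TT}^{k}\subseteq(\mbox{\sf TT$ {}_{\mathrm{pru}} $}\ast\mbox{\sf TT})\ast\mbox{\sf TT}^{k}\subseteq\mbox{\sf TT$ {}_{\mathrm{pru}} $}\ast(\mbox{\sf TT}\circ\mbox{\sf TT}^{k})=\mbox{\sf TT$ {}_{\mathrm{pru}} $}\ast\mbox{\sf TT}^{k+1}$$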

where the first inclusion is by the induction hypothesis and the remaining inclusions are by Lemma 1, Theorem 20 (which says that $\mbox{\sf TT}\circ\mbox{\sf TT$ {}_{\mathrm{pru}} $}\subseteq\mbox{\sf TT}$ ), Theorem 37, and Lemma 1 again. The equation now follows from the inclusions above.

(2) The proof is exactly the same as in (1), using Theorem 18 instead of Theorem 20. $\Box$

The remainder of this section is devoted to the proof of Theorem 37. It is essentially a variant of the proof of [25, Lemma 4.1], which is the key lemma of [25] and concerns the removal of “superfluous computations” in attribute grammars. In its turn, that proof generalized the proof of [4, Lemma 1] where this was done for top-down tree transducers (and strangely enough, the author of [25] did not mention that).

To prove Theorem 37 it suffices, by Lemma 10, Lemma 1, and Theorems 20 and 18, to consider local tt’s, i.e., to prove that $\mbox{\sf TT$ {}^{\ell} $}\subseteq\mbox{\sf TT$ {}_{\mathrm{pru}} $}\ast\mbox{\sf TT}$ and that $\mbox{\sf dTT$ {}^{\ell} $}\subseteq\mbox{\sf dTT$ {}_{\mathrm{pru}} $}\ast\mbox{\sf dTT}$ . We prove the first and second inclusion in a first and second subsection, respectively. In the first subsection we additionally take care that the construction preserves the determinism of the given tt.

8.1 Nondeterministic Productivity

Let $M=(\Sigma,\Delta,Q,Q_{0},R)$ be a tt. For a pair $(t,s)\in\tau_{M}$ and a computation $\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M,t}s$ with $q_{0}\in Q_{0}$ , we say that a node $u$ of $t$ is productive (in that computation) if there is a $q\in Q$ such that an output rule is applied to the configuration $\langle q,u\rangle$ in the computation. Obviously, the size of $s$ is at least the number of productive nodes of $t$ . For $i\in\{0,1\}$ we define the computation to be $i$ -productive if all nodes of $t$ of rank $i$ are productive. (Recall from Section 2 that the rank of a node is the rank of its label, i.e., the number of its children.) Moreover, the computation is productive if it is both $0$ -productive and $1$ -productive, i.e., all leaves and monadic nodes of $t$ are productive. Finally, we define $\tau^{0}_{M}$ to consist of all $(t,s)\in\tau_{M}$ for which there is a 0-productive computation $\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M,t}s$ for some $q_{0}\in Q_{0}$ , and we define $\tau^{01}_{M}$ to consist of all $(t,s)\in\tau_{M}$ for which there is a productive computation of that form. Since the size of $t$ is at most twice the number of leaves plus the number of monadic nodes of $t$ (to be precise, $|t|\leq(2\cdot|t|_{0}-1)+|t|_{1}$ where $|t|_{0}$ and $|t|_{1}$ are the number of leaves and monadic nodes of $t$ , respectively), it follows that $|t|\leq 2\cdot|s|$ for every $(t,s)\in\tau^{01}_{M}$ .
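The counting bound used here, $|t|\leq(2\cdot|t|_{0}-1)+|t|_{1}$ , can be checked mechanically on small trees. The following sketch is only an illustration: trees are encoded as nested tuples `(label, child, ..., child)`, and the names `counts` and `within_bound` are hypothetical.

```python
def counts(t):
    """Return (|t|, |t|_0, |t|_1) for a tree written as (label, child, ..., child)."""
    rank = len(t) - 1
    total, leaves, monadic = 1, int(rank == 0), int(rank == 1)
    for child in t[1:]:
        n, n0, n1 = counts(child)
        total, leaves, monadic = total + n, leaves + n0, monadic + n1
    return total, leaves, monadic

def within_bound(t):
    """Check |t| <= (2 * |t|_0 - 1) + |t|_1."""
    total, leaves, monadic = counts(t)
    return total <= (2 * leaves - 1) + monadic

a = ("a",)
examples = [
    a,                          # a single leaf
    ("f", a, a),                # one binary node
    ("h", a, a, a),             # one ternary node
    ("g", ("f", a, ("g", a))),  # mixed ranks
]

print([counts(t) for t in examples])
print(all(within_bound(t) for t in examples))
```

The bound holds because a tree with $|t|_{0}$ leaves has at most $|t|_{0}-1$ nodes of rank at least 2; it is tight when all such nodes are binary, as in the first, second, and fourth example.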

To prove that $\mbox{\sf TT$ {}^{\ell} $}\subseteq\mbox{\sf TT$ {}_{\mathrm{pru}} $}\ast\mbox{\sf TT}$ , our goal is to construct, for a given ttℓ $M$ , a pruning tt $N$ and a ttℓ $M^{\prime}$ in such a way that $\tau_{N}\circ\tau_{M^{\prime}}\subseteq\tau_{M}$ and $\tau_{M}\subseteq\tau_{N}\circ\tau^{01}_{M^{\prime}}$ . This obviously implies that $\tau_{N}\circ\tau_{M^{\prime}}=\tau_{M}$ . The second inclusion says that for every $(t,s)\in\tau_{M}$ there exists a tree $t^{\prime}$ such that $(t,t^{\prime})\in\tau_{N}$ and $(t^{\prime},s)\in\tau^{01}_{M^{\prime}}$ . Thus, as observed above, $|t^{\prime}|\leq 2\cdot|s|$ , and hence $(\tau_{N},\tau_{M^{\prime}})$ is linear-bounded (for the constant $c=2$ ).

To this aim, $N$ will remove sufficiently many unproductive nodes from the input tree, and add state transition information of $M$ to the labels of the remaining nodes, thus allowing $M^{\prime}$ to simulate $M$ without having to visit those unproductive nodes. Since productivity of a node of the input tree $t$ depends on the computation of $M$ on $t$ , $N$ nondeterministically guesses which nodes to remove, and uses its regular tests to determine the possible behaviour of $M$ on the remaining nodes. To reduce the technical complexity of the proof, the construction of $N$ and $M^{\prime}$ will be done in two steps, removing unproductive leaves and monadic nodes in the first and second step, respectively.

Lemma 39

For every ttℓ $M$ there are a ttpru $N$ and a ttℓ $M^{\prime}$ such that

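$$\tau_{N}\circ\tau_{M^{\prime}}\subseteq\tau_{M}\quad\text{and}\quad\tau_{M}\subseteq\tau_{N}\circ\tau^{0}_{M^{\prime}}.$$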

If $M$ is deterministic, then so is $M^{\prime}$ .

Lemma 40

For every ttℓ $M$ there are a ttpru $N$ and a ttℓ $M^{\prime}$ such that

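$$\tau_{N}\circ\tau_{M^{\prime}}\subseteq\tau_{M}\quad\text{and}\quad\tau^{0}_{M}\subseteq\tau_{N}\circ\tau^{01}_{M^{\prime}}.$$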

If $M$ is deterministic, then so is $M^{\prime}$ .

It is easy to see that, by applying these lemmas one after the other, we obtain the goal above; note that pruning tt’s are closed under composition by Theorem 20. It remains to prove the two lemmas. The constructions in their proofs are similar to the removal of $\varepsilon$ -rules and chain rules from a context-free grammar, respectively. As is well known, one should not remove these rules in the reverse order, because the removal of $\varepsilon$ -rules can create new chain rules. Similarly in our case, we should remove unproductive leaves and monadic nodes in that order, because the removal of unproductive leaves can create new unproductive monadic nodes. Note also that removing $\varepsilon$ -rules and chain rules in one construction is technically more complex.
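The remark on the order of removal can be checked on a small grammar. The sketch below uses the textbook removal of $\varepsilon$ -rules (compute the nullable nonterminals, then drop them from right-hand sides in all possible ways); the grammar and the function names are assumptions made for this illustration and are unrelated to the constructions in the proofs.

```python
from itertools import product

def nullable_symbols(rules):
    """Nonterminals that derive the empty string; rules are (lhs, rhs-tuple) pairs."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in nullable and all(x in nullable for x in rhs):
                nullable.add(lhs)
                changed = True
    return nullable

def remove_epsilon_rules(rules):
    """Drop nullable symbols from right-hand sides in all ways; discard empty ones."""
    nullable = nullable_symbols(rules)
    result = set()
    for lhs, rhs in rules:
        # each nullable symbol may be kept or dropped, all others must be kept
        options = [((x,), ()) if x in nullable else ((x,),) for x in rhs]
        for choice in product(*options):
            new_rhs = tuple(s for part in choice for s in part)
            if new_rhs:
                result.add((lhs, new_rhs))
    return result

def chain_rules(rules, nonterminals):
    """Rules whose right-hand side is a single nonterminal."""
    return {(l, r) for l, r in rules if len(r) == 1 and r[0] in nonterminals}

nonterminals = {"S", "A", "B"}
grammar = {("S", ("A", "B")), ("A", ("a",)), ("B", ("b",)), ("B", ())}

before = chain_rules(grammar, nonterminals)
after = chain_rules(remove_epsilon_rules(grammar), nonterminals)
print(before, after)
```

The original grammar has no chain rule, but dropping the nullable `B` from `S -> A B` creates the chain rule `S -> A`, which is why chain rules are removed second.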

Proof of Lemma 39. Let $M=(\Sigma,\Delta,Q,Q_{0},R)$ be a ttℓ. As discussed in the second paragraph after Proposition 7 (in Section 3), we may assume that the output rules of $M$ only use the stay-instruction. Let us consider $(t,s)\in\tau_{M}$ and a computation $\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M,t}s$ with $q_{0}\in Q_{0}$ . The idea of the construction of the ttpru $N$ and ttℓ $M^{\prime}$ is that $N$ (nondeterministically) preprocesses $t$ by removing the maximal subtrees of $t$ that consist of unproductive nodes only, and that $M^{\prime}$ simulates $M$ on the rest of $t$ . Let us say that a node $u$ of $t$ is superfluous (in this computation) if it is unproductive and all its descendants are unproductive. Note that the root of $t$ is not superfluous. Thus, $N$ changes $t$ into $t^{\prime}$ by pruning all superfluous nodes of $t$ . Moreover, it adds state transition information of $M$ to the labels of the remaining nodes to allow $M^{\prime}$ on $t^{\prime}$ to simulate the above computation of $M$ on $t$ . In the resulting computation of $M^{\prime}$ on $t^{\prime}$ , the input tree $t^{\prime}$ of $M^{\prime}$ has no superfluous nodes, which means in particular that all its leaves are productive. Note that, due to the removal of the superfluous nodes, each remaining node loses its superfluous children. Since the pruning tt $N$ does not know which nodes are going to be superfluous in $M$ ’s computation, it just nondeterministically removes subtrees of the input tree $t$ and adds to the label of each remaining node all possible state transitions of $M$ in computations on the removed subtrees that use move rules only. Whereas $N$ just guesses the superfluous nodes, it uses its regular tests to determine the state transitions of $M$ on those nodes.

As intermediate alphabet we use the ranked alphabet $\Gamma$ consisting of all symbols $\langle\sigma,(i_{1},\dots,i_{n}),\gamma\rangle$ such that $\sigma\in\Sigma$ , $n\in[0,\operatorname{rank}(\sigma)]$ , $1\leq i_{1}<i_{2}<\cdots<i_{n}\leq\operatorname{rank}(\sigma)$ , and $\gamma\subseteq Q\times Q$ . The rank of $\langle\sigma,(i_{1},\dots,i_{n}),\gamma\rangle$ is $n$ . In the case where $M$ is deterministic we require $\gamma$ to be a partial function from $Q$ to $Q$ . Intuitively, a node $u$ of $t$ with label $\sigma$ that is not removed by $N$ , will be relabeled by $\langle\sigma,(i_{1},\dots,i_{n}),\gamma\rangle$ such that the subtrees at its children $ui$ with $i\notin\{i_{1},\dots,i_{n}\}$ are removed by $N$ and $\gamma$ is the set of all $(q,\bar{q})$ such that $M$ has a computation from $\langle q,u\rangle$ to $\langle\bar{q},u\rangle$ (using move rules only) that visits one of the removed subtrees.

Formally, we define $N=(\Sigma,\Gamma,\{p\},\{p\},R_{N})$ with one state $p$ . For every symbol $\langle\sigma,(i_{1},\dots,i_{n}),\gamma\rangle$ in $\Gamma$ and every $j\in[0,{\mathit{m}x}_{\Sigma}]$ , it has the rule

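$$\langle p,\sigma,j,T\rangle\to\langle\sigma,(i_{1},\dots,i_{n}),\gamma\rangle(\langle p,{\rm down}_{i_{1}}\rangle,\dots,\langle p,{\rm down}_{i_{n}}\rangle)$$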

where $T$ is defined as follows. Let $t\in T_{\Sigma}$ and let $u\in{\cal N}(t)$ . The state transition relation $\gamma$ is uniquely determined by $(i_{1},\dots,i_{n})$ , and is expressed by $T$ . Let us say that a node $v\in{\cal N}(t)$ is a ghost if $v=uiw$ for some $i\notin\{i_{1},\dots,i_{n}\}$ and $w\in{\mathbb{N}}^{*}$ . Moreover, let us say that a computation

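$$\langle q_{1},u_{1}\rangle\Rightarrow_{M,t}\langle q_{2},u_{2}\rangle\Rightarrow_{M,t}\cdots\Rightarrow_{M,t}\langle q_{m},u_{m}\rangle,$$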

$m\geq 3$ , is a ghost computation from $\langle q_{1},u_{1}\rangle$ to $\langle q_{m},u_{m}\rangle$ if $u_{j}$ is a ghost for every $j\in[2,m-1]$ . Note that such a computation is due to move rules only, that it visits at least one ghost, and that the ghosts $u_{2},\dots,u_{m-1}$ all belong to a subtree at the same child $ui$ . Finally, for states $q,\bar{q}\in Q$ we will write $q\hookrightarrow\bar{q}$ if there is a ghost computation from $\langle q,u\rangle$ to $\langle\bar{q},u\rangle$ . We now define $T$ to consist of all $(t,u)$ such that $\gamma=\{(q,\bar{q})\in Q\times Q\mid q\hookrightarrow\bar{q}\}$ . Note that $\gamma$ is indeed a partial function if $M$ is deterministic. The test $T$ is regular because it is a boolean combination of tests $T_{q,\bar{q}}=\{(t,u)\mid q\hookrightarrow\bar{q}\}$ , which are regular because the tree language $\{\operatorname{mark}(t,u)\mid q\hookrightarrow\bar{q}\}$ is regular for every $(q,\bar{q})\in Q\times Q$ by Corollary 14: it is the domain of a tt that first walks to $u$ , then simulates a ghost computation of $M$ on $t$ from $\langle q,u\rangle$ to $\langle\bar{q},u\rangle$ , and finally outputs a symbol of rank 0.
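The relation $\gamma$ can be viewed as a reachability question on configurations. The following sketch is a simplified illustration and not the regular test $T$ itself: the one-step relation of $M$ on a fixed tree is assumed to be given explicitly as a dictionary `step` from configurations `(state, node)` to sets of successor configurations (move rules only), nodes are tuples of child indices, and the names `is_ghost` and `ghost_transitions` are hypothetical.

```python
def is_ghost(v, u, kept):
    """v = u·i·w for some child index i of u that is not among the kept ones."""
    return len(v) > len(u) and v[:len(u)] == u and v[len(u)] not in kept

def ghost_transitions(step, states, u, kept):
    """All (q, q_bar) such that there is a ghost computation from <q,u> to <q_bar,u>."""
    gamma = set()
    for q in states:
        # the first step must enter a ghost node
        frontier = [c for c in step.get((q, u), ()) if is_ghost(c[1], u, kept)]
        seen = set(frontier)
        while frontier:
            state, node = frontier.pop()
            for nxt in step.get((state, node), ()):
                if nxt[1] == u:
                    gamma.add((q, nxt[0]))      # back at u: record the transition
                elif is_ghost(nxt[1], u, kept) and nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return gamma

# Assumed toy run: the root () has two children; only child 1 is kept.
u = ()
step = {
    ("q", ()): {("q1", (2,))},
    ("q1", (2,)): {("q2", (2, 1))},
    ("q2", (2, 1)): {("q3", (2,))},
    ("q3", (2,)): {("qb", ())},
    ("p", ()): {("p1", (1,))},     # excursion into the kept child, not a ghost trip
    ("p1", (1,)): {("p2", ())},
}

print(ghost_transitions(step, {"q", "p"}, u, kept={1}))
print(ghost_transitions(step, {"q", "p"}, u, kept={1, 2}))
```

Only the excursion through the removed second child contributes to $\gamma$ ; if both children are kept there are no ghosts and $\gamma$ is empty.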

We define $M^{\prime}=(\Gamma,\Delta,Q,Q_{0},R^{\prime})$ with the following rules. Let $\rho:\langle q,\sigma,j\rangle\to\zeta$ be a rule in $R$ , and let $\langle\sigma,(i_{1},\dots,i_{n}),\gamma\rangle$ be an element of $\Gamma$ (with the same $\sigma$ ). If $\rho$ is an output rule or $\zeta=\langle q^{\prime},\alpha\rangle$ with $\alpha\in\{{\rm up},{\rm stay}\}$ , then $R^{\prime}$ contains the rule $\langle q,\langle\sigma,(i_{1},\dots,i_{n}),\gamma\rangle,j\rangle\to\zeta$ . If $\zeta=\langle q^{\prime},{\rm down}_{i_{k}}\rangle$ with $k\in[1,n]$ , then $R^{\prime}$ contains the rule $\langle q,\langle\sigma,(i_{1},\dots,i_{n}),\gamma\rangle,j\rangle\to\langle q^{\prime},{\rm down}_{k}\rangle$ . Otherwise (i.e., $\zeta=\langle q^{\prime},{\rm down}_{i}\rangle$ with $i\notin\{i_{1},\dots,i_{n}\}$ ), $R^{\prime}$ contains the rule $\langle q,\langle\sigma,(i_{1},\dots,i_{n}),\gamma\rangle,j\rangle\to\langle\bar{q},{\rm stay}\rangle$ for every $(q,\bar{q})\in\gamma$ . Note that if $M$ is deterministic, then so is $M^{\prime}$ .

It should be clear that $\tau_{N}\circ\tau_{M^{\prime}}\subseteq\tau_{M}$ , because for every $t^{\prime}\in\tau_{N}(t)$ the computations of $M^{\prime}$ on $t^{\prime}$ simulate computations of $M$ on $t$ .

To understand that $\tau_{M}\subseteq\tau_{N}\circ\tau^{0}_{M^{\prime}}$ , consider a computation $\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M,t}s$ with $q_{0}\in Q_{0}$ , and let $t^{\prime}\in\tau_{N}(t)$ be such that all superfluous nodes of $t$ (in this computation) are removed. Then it should be clear that the computation of $M$ on $t$ can be simulated by a computation $\langle q_{0},\mathrm{root}_{t^{\prime}}\rangle\Rightarrow^{*}_{M^{\prime},t^{\prime}}s$ of $M^{\prime}$ on $t^{\prime}$ . In fact, if $M$ visits a superfluous child of the current (non-superfluous) node $u$ of $t$ , then $M^{\prime}$ just stays in the node $v$ corresponding to $u$ in $t^{\prime}$ and changes its state to the one in which $M$ returns to $u$ . For a completely formal correctness proof one would have to formalize the obvious bijective correspondence $f$ between the non-superfluous nodes of $t$ and the nodes of $t^{\prime}$ . In fact, $f(\varepsilon)=\varepsilon$ , and if $u$ is non-superfluous and $ui_{1},\dots,ui_{n}$ are all the non-superfluous children of $u$ , then $f(ui_{k})=f(u)k$ for every $k\in[1,n]$ . Note that $u$ and $f(u)$ have the same child number. However, the correctness of the construction should be clear without such a proof. The configurations $\langle q,u\rangle$ of $M$ on $t$ , for every non-superfluous node $u$ , are simulated by the configurations $\langle q,f(u)\rangle$ of $M^{\prime}$ on $t^{\prime}$ . Finally, the above computation of $M^{\prime}$ on $t^{\prime}$ is 0-productive, because each leaf $f(u)$ of $t^{\prime}$ corresponds to a non-superfluous node $u$ of $t$ of which all descendants are superfluous, i.e., to a productive node. Since $M^{\prime}$ simulates $M$ , it follows that $f(u)$ is a productive node of $t^{\prime}$ . This ends the proof of Lemma 39.
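The effect of $N$ on the shape of the input tree can be sketched as follows. This is a simplified illustration under stated assumptions: the set of productive nodes of one fixed computation is given explicitly (whereas $N$ guesses it), the state transition component $\gamma$ of the labels is omitted, trees are nested tuples `(label, child, ..., child)`, nodes are tuples of child indices, and the names `prune_superfluous` and `leaf_origins` are hypothetical.

```python
def prune_superfluous(t, productive, addr=()):
    """Pruned and relabeled subtree at addr, or None if that node is superfluous."""
    label, children = t[0], t[1:]
    kept = []  # pairs (i, pruned subtree) for the non-superfluous children
    for i, child in enumerate(children, start=1):
        pruned = prune_superfluous(child, productive, addr + (i,))
        if pruned is not None:
            kept.append((i, pruned))
    # A node is superfluous if it is unproductive and so are all its descendants.
    # The root is always kept (it is never superfluous in an actual computation).
    if not kept and addr not in productive and addr != ():
        return None
    indices = tuple(i for i, _ in kept)
    return ((label, indices),) + tuple(sub for _, sub in kept)

def leaf_origins(pruned, addr=()):
    """Addresses in the original tree of the leaves of the pruned tree."""
    (label, indices), children = pruned[0], pruned[1:]
    if not children:
        return [addr]
    return [a for i, c in zip(indices, children) for a in leaf_origins(c, addr + (i,))]

# Assumed example: t = f(g(a), h(b, c)), productive nodes are the root and b.
t = ("f", ("g", ("a",)), ("h", ("b",), ("c",)))
productive = {(), (2, 1)}

t_pruned = prune_superfluous(t, productive)
print(t_pruned)
print(leaf_origins(t_pruned))
```

In this example the subtree `g(a)` and the leaf `c` are removed, the node `h` is kept although it is unproductive, and the only leaf of the pruned tree stems from the productive node `b`, in line with 0-productivity.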

Proof of Lemma 40. This proof is similar to the previous one. Let $M=(\Sigma,\Delta,Q,Q_{0},R)$ be a ttℓ. Again, we assume that the output rules of $M$ only use the stay-instruction. And again, let us consider $(t,s)\in\tau_{M}$ and a computation $\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M,t}s$ with $q_{0}\in Q_{0}$ . This time we define a node of $t$ to be superfluous if it is unproductive (in this computation) and has rank 1. As before, $N$ changes $t$ into $t^{\prime}$ by pruning all superfluous nodes of $t$ , and adds information to the labels of the remaining nodes to allow $M^{\prime}$ on $t^{\prime}$ to simulate the above computation of $M$ on $t$ . Whereas in the previous case, $M^{\prime}$ had to shortcut the subcomputations of $M$ on maximal subtrees of superfluous nodes, in the present case $M^{\prime}$ has to shortcut the subcomputations of $M$ on maximal sequences $u_{1},\dots,u_{n}$ of superfluous nodes ( $n\geq 1$ ), where $u_{i+1}$ is the unique child of $u_{i}$ for every $i\in[1,n-1]$ . For such a sequence, the unique child $u_{n+1}$ of $u_{n}$ is non-superfluous, and either $u_{1}$ is the root of $t$ , or the parent $u_{0}$ of $u_{1}$ is non-superfluous. In the second case, a subcomputation of $M$ on $u_{1},\dots,u_{n}$ is as follows. When it moves from $u_{0}$ down to $u_{1}$ , it either returns to $u_{0}$ , or it walks to $u_{n+1}$ . And when it moves from $u_{n+1}$ up to $u_{n}$ , it either returns to $u_{n+1}$ , or it walks to $u_{0}$ . In the first case, $M$ can only move from $u_{n+1}$ up to $u_{n}$ and return to $u_{n+1}$ . Thus, to the label of every non-superfluous node $u$ of $t$ we have to add information both on trips to superfluous nodes above $u$ and trips to superfluous nodes below $u$ . In the first case, $u_{n+1}$ will be the root of $t^{\prime}$ . In the second case, $u_{n+1}$ will be the $i$ -th child of $u_{0}$ in $t^{\prime}$ , where $i$ is the child number of $u_{1}$ in $t$ . 
Thus, the child number of $u_{n+1}$ changes from 1 to 0, or from 1 to $i$ , respectively.

As in the previous proof, the pruning tt $N$ does not know in advance which nodes are going to be superfluous in $M$ ’s computation. Thus, it just nondeterministically removes monadic nodes of the input tree $t$ and adds to the label of each remaining node all possible state transitions of $M$ in subcomputations on the removed nodes that use move rules only. Rather than constructing $N$ directly, it is more convenient to realize this pruning of $t$ by two consecutive pruning tt’s $N_{1}$ and $N_{2}$ , and use Theorem 20. The local relabeling tt $N_{1}$ nondeterministically marks monadic nodes of $t$ , by possibly changing the label $\sigma$ of a monadic node into $\widehat{\sigma}$ . The (deterministic) tt $N_{2}$ then removes the marked nodes, and relabels the unmarked nodes, adding the appropriate state transitions of $M$ (determined by regular tests). Since it is easy to construct $N_{1}$ , we only discuss $N_{2}$ .

The intermediate alphabet $\Gamma$ now consists of all symbols $\langle\sigma,j,U,\gamma\rangle$ such that $\sigma\in\Sigma$ , $j\in[0,{\mathit{m}x}_{\Sigma}]$ , $U\subseteq\{{\rm up}\}\cup\{{\rm down}_{i}\mid i\in[1,\operatorname{rank}(\sigma)]\}$ , and $\gamma\subseteq Q\times(Q\times I)$ , where $I$ is the set of all possible instructions. The rank of $\langle\sigma,j,U,\gamma\rangle$ is the rank of $\sigma$ . As before, in the case where $M$ is deterministic we require $\gamma$ to be a partial function from $Q$ to $Q\times I$ . Intuitively, a node $u$ of $t$ with label $\sigma$ that is not marked by $N_{1}$ , will be relabeled by $\langle\sigma,j,U,\gamma\rangle$ such that $j$ is its child number in $t$ , $\alpha\in U$ if and only if $\alpha(u)$ is marked by $N_{1}$ , and $\gamma$ is the set of all $(q,\langle\bar{q},\beta\rangle)$ such that the following holds: $M$ has a computation from $\langle q,u\rangle$ to $\langle\bar{q},\bar{u}\rangle$ (using move rules only) that visits a maximal sequence of marked nodes, for some unmarked node $\bar{u}$ such that $\beta(v)=\bar{v}$ , where $v$ and $\bar{v}$ are the nodes corresponding to $u$ and $\bar{u}$ in the tree $t^{\prime}$ .

We define $N_{2}=(\Sigma\cup\widehat{\Sigma},\Gamma,P,p_{0},R_{2})$ , where $\widehat{\Sigma}=\{\widehat{\sigma}\mid\sigma\in\Sigma^{(1)}\}$ , $P=\{p_{j}\mid j\in[0,{\mathit{m}x}_{\Sigma}]\}$ , and $R_{2}$ is defined as follows. For every $\sigma\in\Sigma^{(1)}$ and $j,j^{\prime}\in[0,{\mathit{m}x}_{\Sigma}]$ the transducer $N_{2}$ has the rule $\langle p_{j},\widehat{\sigma},j^{\prime}\rangle\to\langle p_{j},{\rm down}_{1}\rangle$ . Moreover, for every $\langle\sigma,j,U,\gamma\rangle\in\Gamma$ and $j^{\prime}\in[0,{\mathit{m}x}_{\Sigma}]$ it has the rule

[TABLE]

where $m=\operatorname{rank}(\sigma)$ and $T$ is defined as follows. Let $\hat{t}$ be a tree over $\Sigma\cup\widehat{\Sigma}$ and let $u\in{\cal N}(\hat{t}\,)$ . We define $\pi(\hat{t}\,)$ to be the tree over $\Sigma$ that is obtained from $\hat{t}$ by changing every label $\widehat{\sigma}$ into $\sigma$ . Both $U$ and $\gamma$ are uniquely determined, and they are expressed by $T$ . Let us say that a node $v\in{\cal N}(\hat{t}\,)$ is a ghost if its label is in $\widehat{\Sigma}$ . A ghost computation is defined as in the previous proof, for $t=\pi(\hat{t}\,)$ ; note that ${\cal N}(t)={\cal N}(\hat{t}\,)$ . And let us write $\langle q,u\rangle\hookrightarrow\langle\bar{q},\bar{u}\rangle$ if there is a ghost computation from $\langle q,u\rangle$ to $\langle\bar{q},\bar{u}\rangle$ . We now define $T$ to consist of all $(\hat{t},u)$ such that

• ${\rm up}\in U$ if and only if $u$ has a parent and that parent is a ghost,

• ${\rm down}_{i}\in U$ if and only if $ui$ is a ghost,

• $(q,\langle\bar{q},{\rm stay}\rangle)\in\gamma$ if and only if $\langle q,u\rangle\hookrightarrow\langle\bar{q},u\rangle$ ,

• $(q,\langle\bar{q},{\rm up}\rangle)\in\gamma$ if and only if $\langle q,u\rangle\hookrightarrow\langle\bar{q},\bar{u}\rangle$ for some ancestor $\bar{u}$ of $u$ ,

• $(q,\langle\bar{q},{\rm down}_{i}\rangle)\in\gamma$ if and only if $\langle q,u\rangle\hookrightarrow\langle\bar{q},\bar{u}\rangle$ for some descendant $\bar{u}$ of $ui$ .

As before, if $M$ is deterministic, then $\gamma$ is indeed a partial function. It is straightforward to prove, using Corollary 14, that $T$ is regular; we leave that to the reader.

We define $M^{\prime}=(\Gamma,\Delta,Q,Q_{0},R^{\prime})$ with the following rules. Let $\rho:\langle q,\sigma,j\rangle\to\zeta$ be a rule of $M$ , and let $\langle\sigma,j,U,\gamma\rangle$ be in $\Gamma$ (with the same $\sigma$ and $j$ ). If $\rho$ is an output rule or $\zeta=\langle q^{\prime},\alpha\rangle$ with $\alpha\notin U$ , then $R^{\prime}$ contains the rule $\langle q,\langle\sigma,j,U,\gamma\rangle,j^{\prime}\rangle\to\zeta$ for every $j^{\prime}\in[0,{\mathit{m}x}_{\Sigma}]$ (except $j^{\prime}=0$ when $\alpha={\rm up}$ ). If $\zeta=\langle q^{\prime},\alpha\rangle$ with $\alpha\in U$ , then $R^{\prime}$ contains the rule $\langle q,\langle\sigma,j,U,\gamma\rangle,j^{\prime}\rangle\to\langle\bar{q},\beta\rangle$ for every $(q,\langle\bar{q},\beta\rangle)\in\gamma$ and every $j^{\prime}\in[0,{\mathit{m}x}_{\Sigma}]$ (except $j^{\prime}=0$ when $\beta={\rm up}$ ).

Let $\tau=\tau_{N_{1}}\circ\tau_{N_{2}}$ . It should be clear that $\tau\circ\tau_{M^{\prime}}\subseteq\tau_{M}$ , as in the previous proof. To understand that $\tau^{0}_{M}\subseteq\tau\circ\tau^{01}_{M^{\prime}}$ , consider a 0-productive computation $\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M,t}s$ with $q_{0}\in Q_{0}$ , and let $t^{\prime}\in\tau(t)$ be obtained from $t$ by removing all superfluous nodes of $t$ . As in the previous proof, there is an obvious bijective correspondence $f$ between the non-superfluous nodes of $t$ and the nodes of $t^{\prime}$ . For a node $u$ of $t$ we define $g(u)=u$ if $u$ is non-superfluous, and $g(u)$ is the first (i.e., shortest) non-superfluous descendant of $u$ otherwise. Then $f(g(\varepsilon))=\varepsilon$ , and if $u$ is non-superfluous and $ui$ is a child of $u$ , then $f(g(ui))=f(u)i$ . And as before, there is a computation $\langle q_{0},\mathrm{root}_{t^{\prime}}\rangle\Rightarrow^{*}_{M^{\prime},t^{\prime}}s$ of $M^{\prime}$ on $t^{\prime}$ that simulates the computation of $M$ on $t$ , such that the configurations $\langle q,u\rangle$ of $M$ , for every non-superfluous node $u$ of $t$ , are simulated by the configurations $\langle q,f(u)\rangle$ of $M^{\prime}$ . Since $\tau$ does not remove leaves of $t$ , the computation of $M^{\prime}$ is still 0-productive. Moreover, it is also 1-productive because all unproductive monadic nodes were removed by $\tau$ . This ends the proof of Lemma 40.

Remark 41

In the Introduction we observed that our main technical result can be viewed as a static garbage collection procedure, which leads, in principle, to algorithms for automatic compiler and XML query optimization. For practical applicability our proof of this result is, however, of restricted value because the sizes of the involved transducers are blown up exponentially. This is due to the fact that, in the proof of Lemmas 39 and 40, the pruning tt $N$ uses regular tests to determine the relevant state transition information $\gamma\subseteq Q\times Q$ (or $\gamma\subseteq Q\times(Q\times I)$ ) of the given tt $M$ , due to its ghost computations. These regular tests are constructed through Corollary 14, applied to variants of $M$ . Naturally, the number of states of the finite-state tree automaton recognizing the domain of such a variant is exponential in the number $\#(Q)$ of states of $M$ , cf. the proof of [26, Lemma 1]. If one now considers the proof of $\mbox{\sf TT}\circ\mbox{\sf TT}\subseteq\mbox{\sf TT}\ast\mbox{\sf TT}$ in Corollary 38 (in which the pruning tt $N$ for the second tt $M$ is incorporated in the first tt by Theorem 20), it can be seen that the number of states of the first constructed tt is $2$ -fold exponential in the number of states of $M$ . The additional exponential jump is due to Lemma 12, which turns the pruning tt $N$ into one that is sub-testing. This implies that in the construction for the inclusion $\mbox{\sf TT}\circ\mbox{\sf TT}^{k}\subseteq\mbox{\sf TT}\ast\mbox{\sf TT}^{k}$ of Corollary 38, the size of the first constructed tt can be $2(k-1)$ -fold exponential in the size of the last given tt. This will also hold for the deterministic version. $\Box$

8.2 Deterministic Productivity

Let $M=(\Sigma,\Delta,Q,q_{0},R)$ be a deterministic tt. For $t\in\mathrm{dom}(M)$ we say that a node $u$ of $t$ is productive if it is productive in the computation $\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M,t}\tau_{M}(t)$ , and we say that $t$ is productive (for $M$ ) if that computation is productive, i.e., if all leaves and monadic nodes of $t$ are productive. [Footnote 12: There are several such computations, but they all have the same unique derivation tree in $L(G^{\mathrm{der}}_{M,t})$ . The definition of productivity clearly does not depend on the particular choice of the derivation.]

We define $L_{M,\text{prod}}$ to be the set of all productive trees $t\in\mathrm{dom}(M)$ . Note that $\tau^{01}_{M}$ is the restriction of $\tau_{M}$ to $L_{M,\text{prod}}$ . The next lemma shows that the set of productive input trees is a regular tree language.

Lemma 42

Let $M=(\Sigma,\Delta,Q,q_{0},R)$ be a deterministic tt.

$(1)$ There is a regular test $T_{M,\mathrm{prod}}$ over $\Sigma$ such that for every $t\in\mathrm{dom}(M)$ and $u\in{\cal N}(t)$ , $(t,u)\in T_{M,\mathrm{prod}}$ if and only if $u$ is productive.

$(2)$ $L_{M,\text{prod}}$ is a regular tree language over $\Sigma$ .

**Proof.** (1) Let $M^{\prime}=(\Sigma\times\{0,1\},\{\top\},Q,\{q_{0}\},R^{\prime})$ be the nondeterministic tt such that $\top$ has rank 0, and $R^{\prime}$ is defined as follows. If $\langle q,\sigma,j,T\rangle\to\langle q^{\prime},\alpha\rangle$ is a move rule in $R$ , then $\langle q,(\sigma,b),j,\mu(T)\rangle\to\langle q^{\prime},\alpha\rangle$ is a rule in $R^{\prime}$ for every $b\in\{0,1\}$ . If $\langle q,\sigma,j,T\rangle\to\delta(\langle q_{1},\alpha_{1}\rangle,\dots,\langle q_{k},\alpha_{k}\rangle)$ is an output rule in $R$ , then $R^{\prime}$ contains the rules $\langle q,(\sigma,0),j,\mu(T)\rangle\to\langle q_{i},\alpha_{i}\rangle$ for every $i\in[k]$ and it also contains the rule $\langle q,(\sigma,1),j,\mu(T)\rangle\to\top$ . Intuitively, for an input tree $\operatorname{mark}(t,u)$ with $t\in\mathrm{dom}(M)$ , the tree-walking automaton $M^{\prime}$ follows an arbitrary path in the unique derivation tree $d\in L(G^{\mathrm{der}}_{M,t})$ , from the root of $d$ down to the leaves (cf. $M^{\prime}_{1}$ and $N$ in the proof of Lemma 22). Whenever $M$ branches at an unmarked node, $M^{\prime}$ nondeterministically follows one of those branches. It accepts $\operatorname{mark}(t,u)$ when an output rule is applied to the marked node $u$ . It should be clear that $T_{M,\mathrm{prod}}=\operatorname{mark}^{-1}(\mathrm{dom}(M^{\prime}))$ satisfies the requirements. It is regular by Corollary 14.

(2) Let $M^{\prime\prime}$ be a dtt that performs a depth-first left-to-right traversal of the input tree $t\in T_{\Sigma}$ and verifies that $(t,u)\in T_{M,\mathrm{prod}}$ for every leaf and monadic node $u$ of $t$ . Then $L_{M,\text{prod}}=\mathrm{dom}(M)\cap\mathrm{dom}(M^{\prime\prime})$ , which is regular by Corollary 14. $\Box$

For a given deterministic tt $M$ there are a nondeterministic pruning tt $N$ and a deterministic tt$^{\ell}$ $M^{\prime}$ such that $\tau_{N}\circ\tau_{M^{\prime}}=\tau_{M}$ and $\tau_{M}\subseteq\tau_{N}\circ\tau^{01}_{M^{\prime}}$ , by Lemmas 39 and 40. Our aim is to transform $N$ and $M^{\prime}$ in such a way that $N$ becomes deterministic. We basically do this by applying Lemma 34 to $\tau_{N}$ , replacing it by one of its uniformizers. But to preserve the above two properties we first restrict the domain of $M^{\prime}$ to productive input trees and then restrict the range of $N$ to the new domain, as follows.

By Lemma 42, the tree language $L_{M^{\prime},\text{prod}}$ is regular. Let $M^{\prime\prime}$ be the dtt that is obtained from $M^{\prime}$ by restricting its domain to $L_{M^{\prime},\text{prod}}$ , see Lemma 9. Hence, $\tau_{M^{\prime\prime}}=\tau^{01}_{M^{\prime}}$ and so $\tau_{N}\circ\tau_{M^{\prime\prime}}=\tau_{M}$ . Since $M^{\prime\prime}$ behaves in the same way as $M^{\prime}$ , every tree $t^{\prime}\in\mathrm{dom}(M^{\prime\prime})$ is productive (for $M^{\prime\prime}$ ). Next, we change $N$ into the nondeterministic pruning tt $N^{\prime}$ by restricting its range to $\mathrm{dom}(M^{\prime\prime})$ , by Corollary 21. Now $\tau_{N^{\prime}}\circ\tau_{M^{\prime\prime}}=\tau_{M}$ and $\mathrm{ran}(\tau_{N^{\prime}})\subseteq\mathrm{dom}(\tau_{M^{\prime\prime}})$ . Finally, we define $\tau\in\mbox{\sf dTT$ {}_{\mathrm{pru}} $}$ to be the uniformizer of $\tau_{N^{\prime}}$ according to Lemma 34. Then $\tau\circ\tau_{M^{\prime\prime}}=\tau_{M}$ . Now consider $(t,s)\in\tau_{M}$ . Then $s=\tau_{M^{\prime\prime}}(r)$ for $r=\tau(t)$ . Since $r$ is productive for $M^{\prime\prime}$ , it follows that $|r|\leq 2\cdot|s|$ as observed at the end of the second paragraph of Section 8.1. Hence $(\tau,\tau_{M^{\prime\prime}})$ is linear-bounded, which shows that $\tau_{M}\in\mbox{\sf dTT$ {}_{\mathrm{pru}} $}\ast\mbox{\sf dTT}$ .
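The factor 2 in the bound $|r|\leq 2\cdot|s|$ can be unpacked by a short counting argument. The following is a sketch rather than a quotation of Section 8.1. It assumes that a node is productive when an output rule is applied at it, and that each node of $s$ is generated by a rule application at exactly one node of $r$ . The symbols $n_{0}$ , $n_{1}$ and $n_{\geq 2}$ are introduced here only for illustration; they denote the numbers of leaves, monadic nodes, and nodes of rank at least 2 of $r$ . Productivity of all leaves and monadic nodes gives the first inequality, and the second holds in every tree.

$$n_{0}+n_{1}\leq|s|,\qquad n_{\geq 2}\leq n_{0}-1,\qquad\text{hence}\qquad |r|=n_{0}+n_{1}+n_{\geq 2}\leq 2n_{0}+n_{1}-1<2(n_{0}+n_{1})\leq 2\cdot|s|.$$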

9 Linear Size Increase

In this section we show our first main result: the hierarchy of tt’s collapses for functions of linear size increase.

Theorem 43

For every $k\geq 1$ , $\mbox{\sf dTT}^{k}\cap\mbox{\sf LSIF}=\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ .

**Proof.** The proof is by induction on $k$ . For $k=1$ it is Corollary 32. To prove that $\mbox{\sf dTT}^{k+1}\cap\mbox{\sf LSIF}\subseteq\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ , let $\tau\in\mbox{\sf dTT}^{k}$ and let $M$ be a dtt such that $\tau_{M}\circ\tau\in\mbox{\sf LSIF}$ . By Corollary 38(2) we may assume that $(\tau_{M},\tau)$ is linear-bounded. Moreover, by restricting the domain of $M$ to $\mathrm{dom}(\tau_{M}\circ\tau)$ we may assume that $\mathrm{ran}(\tau_{M})\subseteq\mathrm{dom}(\tau)$ , see Lemma 9 and Corollary 14. Hence $\tau_{M}\in\mbox{\sf LSIF}$ by Lemma 2 and so $\tau_{M}\in\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ by Corollary 32. Then $\tau_{M}\circ\tau\in\mbox{\sf dTT}^{k}$ by Theorem 23. Hence $\tau_{M}\circ\tau\in\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ by induction. $\Box$

Theorem 44

It is decidable for a composition of deterministic tt’s whether or not it is of linear size increase.

**Proof.** The proof is, again, by induction on $k$ , the number of dtt’s in the composition. It goes along the lines of the proof of Theorem 43, using Corollary 33 instead of Corollary 32 for the case $k=1$ . Assuming that we have an algorithm ${\cal A}_{k}$ for a composition of $k$ dtt’s, we construct ${\cal A}_{k+1}$ as follows. Let $M,M_{1},\dots,M_{k}$ be dtt’s, $k\geq 1$ , and let $\tau=\tau_{M_{1}}\circ\cdots\circ\tau_{M_{k}}$ . Since all our results are effective, we may assume as in the proof of Theorem 43 that $(\tau_{M},\tau)$ is linear-bounded and $\mathrm{ran}(\tau_{M})\subseteq\mathrm{dom}(\tau)$ . To decide whether or not $\tau_{M}\circ\tau$ is of linear size increase, we first decide whether or not $\tau_{M}$ is of linear size increase by Corollary 33. If not, then $\tau_{M}\circ\tau$ is not of linear size increase, by Lemma 2. If so, then a dtt $M^{\prime}_{1}$ that realizes $\tau_{M}\circ\tau_{M_{1}}$ can be constructed by Corollary 32 and Theorem 23, and we apply ${\cal A}_{k}$ to $M^{\prime}_{1},M_{2},\dots,M_{k}$ . $\Box$

Together with Lemma 24 and Proposition 29 in Section 6, Theorems 43 and 44 imply the following two corollaries on macro tree transducers.

Corollary 45

For every $k\geq 1$ , $\mbox{\sf dMT}^{k}\cap\mbox{\sf LSIF}=\mbox{\sf dMSOT}=\mbox{\sf dTT$ {}_{\mathrm{su}} $}\subseteq\mbox{\sf dMT}$ .

Corollary 46

It is decidable for a composition of deterministic mt’s whether or not it is of linear size increase.

For the class $\mbox{\sf dMT}_{\text{{\sc io}}}$ of translations realized by deterministic macro tree transducers with inside-out (io) derivation mode, we obtain that $\mbox{\sf dMT}_{\text{{\sc io}}}^{k}\cap\mbox{\sf LSIF}\subseteq\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ for every $k\geq 1$ , for the simple reason that $\mbox{\sf dMT}_{\text{{\sc io}}}$ is a (proper) subclass of $\mbox{\sf dMT}$ by [34, Theorem 7.1(1)]. For the same reason Corollary 46 is also valid for those transducers. However, $\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ is not included in $\mbox{\sf dMT}_{\text{{\sc io}}}$ , because not every regular tree language is the domain of a deterministic io macro tree transducer (see [34, Corollary 5.6]).

Since $\mbox{\sf LSIF}\subseteq{\cal F}$ , it follows from Theorems 43 and 35 that Theorem 43 also holds for nondeterministic tt’s, i.e., $\mbox{\sf TT}^{k}\cap\mbox{\sf LSIF}=\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ for every $k\geq 1$ . [Footnote 13: We do not know whether Theorem 44 holds for nondeterministic tt’s, i.e., whether it is decidable for a composition of nondeterministic tt’s whether or not it realizes a translation in LSIF.]

Similarly, it follows from Corollaries 45 and 36 that Corollary 45 also holds for nondeterministic mt’s, i.e., $\mbox{\sf MT}^{k}\cap\mbox{\sf LSIF}=\mbox{\sf dMSOT}=\mbox{\sf dTT$ {}_{\mathrm{su}} $}\subseteq\mbox{\sf dMT}$ for every $k\geq 1$ . This even holds for the so-called stay-macro tree transducers that can use stay-instructions, introduced in [31, Section 5.3], because it is shown in [31, Lemma 37] that the stay-macro tree translations are in $\mbox{\sf TT}^{4}$ . For the class $\mbox{\sf MT}_{\text{{\sc io}}}$ of nondeterministic io macro tree translations we also obtain that $\mbox{\sf MT}_{\text{{\sc io}}}^{k}\cap\mbox{\sf LSIF}\subseteq\mbox{\sf dTT$ {}_{\mathrm{su}} $}$ for every $k\geq 1$ , because $\mbox{\sf MT}_{\text{{\sc io}}}\subseteq\mbox{\sf TT}^{2}$ by Lemma 28; the same is true for multi-return macro tree transducers.

The $k$ -pebble tree transducer was introduced in [63] as a model of XML document transformation. It is a tt that additionally can use $k$ distinct pebbles to drop on, and lift from, the nodes of the input tree. The life times of these pebbles must be nested. The tt is the 0-pebble tree transducer. It is shown in [31, Theorem 10] that every (deterministic) $k$ -pebble tree translation can be realized by a composition of $k+1$ (deterministic) tt’s. Hence Theorems 43 and 44 also hold for deterministic $k$ -pebble tree transducers, while Theorem 43 additionally holds for the nondeterministic case. In [28, Theorems 5 and 55] this is extended to $k$ -pebble tree transducers that, in addition to the $k$ distinct “visible” pebbles, can use an arbitrary number of “invisible” pebbles, still with nested life times: they can be realized by a composition of $k+2$ tt’s. Thus, Theorems 43 and 44 also hold for such transducers, cf. [28, Theorem 57]. [Footnote 14: A “visible” pebble can be observed by the transducer during its entire life time (as usual for pebbles), whereas an “invisible” pebble $p$ cannot be observed during the life time of a pebble $p^{\prime}$ of which the life time is nested within the one of $p$ ; thus, such a pebble $p^{\prime}$ “hides” the pebble $p$ .]

The high-level tree transducer was introduced in [35] as a generalization of both the top-down tree transducer and the macro tree transducer. It is proved in [35, Theorem 8.1(b)] that nondeterministic high-level tree transducers can be simulated by compositions of nondeterministic mt’s. Since every deterministic high-level tree transducer realizes a partial function (as should be clear from the proof of [35, Lemma 5.7]), it follows from Corollary 36 that, similarly, deterministic high-level tree transducers can be simulated by compositions of deterministic mt’s. Consequently, Corollaries 45 and 46 also hold for deterministic high-level tree transducers, and Corollary 45 additionally for the nondeterministic case.

10 Deterministic Complexity

Our first main complexity result says that a composition of deterministic tt’s can be computed by a RAM program in linear time, more precisely in time $O(n)$ where $n$ is the sum of the sizes of the input and the output tree.

Theorem 47

For every $k\geq 1$ and every $\tau\in\mbox{\sf dTT}^{k}$ there is an algorithm that computes, given an input $t$ , the output $s=\tau(t)$ in time $O(|t|+|s|)$ .

**Proof.** The proof is by induction on $k$ . We first prove the case $k=1$ , which is a slight generalization of the well-known fact for attribute grammars that the attribute evaluation of an input tree takes linear time (see, e.g., [17, 23]). Let $\tau\in\mbox{\sf dTT}$ and let $t$ be an input tree of $\tau$ . By Corollary 14, $\mathrm{dom}(\tau)$ is regular and hence can be recognized by a bottom-up finite-state tree automaton. Thus, we can decide whether or not $t\in\mathrm{dom}(\tau)$ in time $O(|t|)$ by running that automaton on $t$ . By Lemmas 10 and 12, $\tau=\tau_{1}\circ\tau_{2}$ with $\tau_{1}\in\mbox{\sf dTT$ {}^{\hskip 1.13791pt\mathrm{s}}_{\mathrm{rel}} $}$ and $\tau_{2}\in\mbox{\sf dTT$ {}^{\ell} $}$ . As observed in Section 3, $\tau_{1}$ can be realized by a classical linear deterministic top-down tree transducer with regular look-ahead. Thus, by (the proof of) [20, Theorem 2.6], it can be realized by a deterministic bottom-up finite-state relabeling (DBQREL) and a local relabeling tt. To run these two relabelings on $t\in\mathrm{dom}(\tau)$ obviously takes time $O(|t|)$ . Thus, it remains to consider the case that $\tau\in\mbox{\sf dTT$ {}^{\ell} $}$ . Let $M$ be a local dtt that realizes $\tau$ . To compute $\tau_{M}(t)$ , we first construct the regular tree grammar $G_{M,t}$ in time $O(|t|)$ , since the number of configurations of $M$ on $t$ is $\#(Q)\cdot|t|$ . Then we remove the chain rules from the context-free grammar $G_{M,t}$ , i.e., the rules $\langle q,u\rangle\to\langle q^{\prime},u^{\prime}\rangle$ resulting from the move rules of $M$ . Since $G_{M,t}$ is forward deterministic, this can also be done in time $O(|t|)$ , as follows. Viewing the chain rules as edges of a directed graph with configurations as nodes, we compute an evaluation order of the graph by topological sorting, in time $O(|t|)$ . Then we compute the new rules by traversing this order from right to left, again in time $O(|t|)$ .
For an edge $\langle q,u\rangle\to\langle q^{\prime},u^{\prime}\rangle$ , if the (old or new) rule for $\langle q^{\prime},u^{\prime}\rangle$ is $\langle q^{\prime},u^{\prime}\rangle\to\delta(\langle q_{1},u_{1}\rangle,\dots,\langle q_{k},u_{k}\rangle)$ , then the new rule for $\langle q,u\rangle$ is $\langle q,u\rangle\to\delta(\langle q_{1},u_{1}\rangle,\dots,\langle q_{k},u_{k}\rangle)$ . Finally, we use this new regular tree grammar, equivalent to $G_{M,t}$ , to generate $s=\tau_{M}(t)$ , which takes time $O(|s|)$ because each rule generates a node of $s$ .

Now let $\tau=\tau_{1}\circ\tau_{2}$ such that $\tau_{1}\in\mbox{\sf dTT}$ and $\tau_{2}\in\mbox{\sf dTT}^{k}$ , $k\geq 1$ . By Corollary 38(2) we may assume that $(\tau_{1},\tau_{2})$ is linear-bounded. Let $t$ be an input tree of $\tau$ . Since $\mathrm{dom}(\tau)$ is regular by Corollary 14, we can check that $t\in\mathrm{dom}(\tau)$ in linear time, as above. By the case $k=1$ , the intermediate tree $r=\tau_{1}(t)$ can be computed in time $O(|t|+|r|)$ , and by induction the output tree $s=\tau(t)=\tau_{2}(r)$ can be computed in time $O(|r|+|s|)$ . Since $(\tau_{1},\tau_{2})$ is linear-bounded, there is a constant $c\in{\mathbb{N}}$ such that $|r|\leq c\cdot|s|$ , i.e., $|r|=O(|s|)$ . Hence the total time is $O(|t|+|r|)+O(|r|+|s|)=O(|t|+|s|)$ . $\Box$
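The chain-rule elimination used in the case $k=1$ of the proof above can be illustrated by a small Python sketch. It is not the authors' implementation: instead of an explicit topological sort it follows each chain once and memoizes the result, which visits the chain graph in an equivalent order because every nonterminal has at most one rule. The dictionary encoding of rules and the function names `eliminate_chain_rules` and `generate` are assumptions made for this illustration.

```python
def eliminate_chain_rules(rules):
    """Replace every chain rule X -> Y by the output rule that Y leads to.

    `rules` maps each nonterminal to at most one rule (forward determinism):
      ("chain", Y)                    for X -> Y
      ("out", symbol, [Y1, ..., Yk])  for X -> symbol(Y1, ..., Yk)
    """
    # Output rules are already in their final form.
    resolved = {x: r for x, r in rules.items() if r[0] == "out"}
    for start in rules:
        path, seen = [], set()
        x = start
        # Follow chain edges until a resolved nonterminal, a dead end, or a cycle.
        while x in rules and x not in resolved and x not in seen:
            path.append(x)
            seen.add(x)
            x = rules[x][1]
        result = resolved.get(x)  # None means the chain generates nothing
        for y in path:            # every nonterminal enters a path at most once
            resolved[y] = result
    return {x: r for x, r in resolved.items() if r is not None}


def generate(new_rules, start):
    """Generate the derived tree; each rule application produces one node."""
    _, symbol, children = new_rules[start]
    return (symbol, [generate(new_rules, c) for c in children])
```

Because each nonterminal is appended to a path at most once, the total work is linear in the number of rules. The recursive `generate` is kept short for readability; an iterative version would be needed for very deep output trees.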

It should be noted that the constant in the time complexity $O(|t|+|s|)$ can be large in terms of the size of the given transducers due to the use of linear-boundedness, cf. Remark 41.

Since deterministic macro tree transducers, pebble tree transducers, and high-level tree transducers can be realized as compositions of deterministic tt’s (see Section 9), Theorem 47 also holds for such transducers. For $k$ -pebble tree transducers this improves the result of [63, Proposition 3.5], where the time bound is $O(|t|^{k}+|s|)$ .

Before we proceed, we need an elementary lemma on leftmost derivations of context-free grammars. For a context-free grammar $G=(N,T,{\cal S},R)$ , a leftmost sentential form is a string $v\in(N\cup T)^{*}$ such that $S\Rightarrow^{*}_{G,\mathrm{lm}}v$ for some $S\in{\cal S}$ , where $\Rightarrow_{G,\mathrm{lm}}$ is the usual leftmost derivation relation of $G$ : if $X\to\zeta$ is in $R$ , then $v_{1}Xv_{2}\Rightarrow_{G,\mathrm{lm}}v_{1}\zeta v_{2}$ for all $v_{1}\in T^{*}$ and $v_{2}\in(N\cup T)^{*}$ .

Lemma 48

Let $G=(N,T,{\cal S},R)$ be an $\varepsilon$ -free context-free grammar, and let $G^{\prime}=(N^{\prime},T,{\cal S},R^{\prime})$ be the equivalent context-free grammar such that $N^{\prime}=N\cup\{Z\}$ and $R^{\prime}=\{X\to\zeta Z\mid X\to\zeta\in R\}\cup\{Z\to\varepsilon\}$ , where $Z$ is a new nonterminal. Let $v$ be a leftmost sentential form of $G^{\prime}$ , and let $S\Rightarrow^{*}_{G^{\prime},\mathrm{lm}}v\Rightarrow^{*}_{G^{\prime},\mathrm{lm}}w$ be a leftmost derivation of $G^{\prime}$ with $S\in{\cal S}$ and $w\in L(G)$ . Moreover, let $d$ be the derivation tree corresponding to that derivation. Then the number of occurrences of $Z$ in $v$ is at most the height of $d$ . [Footnote 15: Note that there is a straightforward one-to-one correspondence between the leftmost derivations of $G$ and $G^{\prime}$ , and between their derivation trees. Since $G$ is $\varepsilon$ -free, the derivation trees have the same height.]

**Proof.** Each occurrence of a nonterminal $Y\in N^{\prime}$ in $v$ corresponds to a node of $d$ with label $Y$ in a well-known way. Let $u$ be the node of $d$ corresponding to the leftmost occurrence of $Z$ in $v$ . Clearly the number of occurrences of $Z$ in $v$ is equal to the number of edges on the path from $u$ to the root of $d$ . $\Box$
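Lemma 48 can be checked on small examples with the following Python sketch. It replays the leftmost derivation of $G^{\prime}$ that corresponds to a given derivation tree of $G$ and records the largest number of occurrences of $Z$ seen in any sentential form. The encoding is an assumption made for this illustration: a nonterminal node is a pair `(name, children)`, a terminal is a plain string, and the height is counted in edges.

```python
Z = object()  # stands for the fresh nonterminal Z of G'


def height(d):
    """Height of a derivation tree: edges on a longest root-to-leaf path."""
    if isinstance(d, str):       # a terminal leaf
        return 0
    _, children = d
    return 1 + max(height(c) for c in children)


def leftmost_derivation(d):
    """Replay the leftmost derivation of G' that corresponds to d.

    Returns the generated word and the maximal number of occurrences
    of Z in any sentential form along the way.
    """
    form = [d]                   # unread part of the form, leftmost item last
    word, z_count, max_z = [], 0, 0
    while form:
        item = form.pop()
        if item is Z:            # apply Z -> epsilon
            z_count -= 1
        elif isinstance(item, str):
            word.append(item)    # a terminal joins the terminal prefix
        else:                    # apply X -> zeta Z
            _, children = item
            form.append(Z)
            form.extend(reversed(children))
            z_count += 1
            max_z = max(max_z, z_count)
    return "".join(word), max_z
```

For the tree `("S", [("A", ["a", ("A", ["a"])]), "b"])` the derivation yields the word `aab`, and the count of $Z$ never exceeds the height of the tree, in line with the lemma.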

By [64, Theorem 2.5] it follows from Theorem 47 that a composition of deterministic tt’s can be computed by a deterministic Turing machine in cubic time, more precisely in time $O(n^{3})$ where $n$ is the sum of the sizes of the input and the output tree. Our second complexity result says that a composition of deterministic tt’s can be computed by a deterministic multi-tape Turing machine $N$ in linear space (in the sum of the sizes of the input and output tree). On a work tape of $N$ we will represent the input tree $t$ over $\Sigma$ by the string $\varphi(t)$ over $\Sigma\cup\{(,)\}$ , where $\{(,)\}$ is the set consisting of the left- and right-parenthesis, defined such that if $\varphi(t_{1})=t^{\prime}_{1},\dots,\varphi(t_{m})=t^{\prime}_{m}$ then $\varphi(\sigma t_{1}\cdots t_{m})=\sigma(t^{\prime}_{1}\cdots t^{\prime}_{m})$ . In other words, we formally insert the parentheses (but not the commas) that are always used informally to denote trees. The parentheses allow $N$ to walk on the tree $t$ , from node to node, because it can recognize a subtree of $t$ by checking that the numbers of left- and right-parentheses in the corresponding substring of $\varphi(t)$ are equal. In particular, it can determine the child number of a node of $t$ by counting the number of its younger siblings. Obviously, the mapping $\varphi$ is injective, and can be computed in linear space (simulating a one-way push-down transducer). In what follows we identify $t$ and $\varphi(t)$ .
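The encoding $\varphi$ and the parenthesis-counting navigation described above can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the Turing machine itself: labels are single characters, a tree is a pair `(label, children)`, a node is identified with the string position of its label, and the helper names are hypothetical.

```python
def phi(t):
    """Encode a tree as a string: label, '(', encoded children, ')'."""
    label, children = t
    return label + "(" + "".join(phi(c) for c in children) + ")"


def subtree_end(s, i):
    """Index just past the subtree whose label is at s[i], found by
    balancing left and right parentheses."""
    depth, j = 0, i + 1          # s[i + 1] is the opening parenthesis
    while True:
        if s[j] == "(":
            depth += 1
        elif s[j] == ")":
            depth -= 1
            if depth == 0:
                return j + 1
        j += 1


def children(s, i):
    """Positions of the children of the node at position i, left to right."""
    positions, j = [], i + 2     # first character after the "("
    while s[j] != ")":
        positions.append(j)
        j = subtree_end(s, j)
    return positions


def parent_and_child_number(s, i):
    """Scan to the left: return (parent position, child number),
    or (None, 0) for the root."""
    if i == 0:
        return None, 0
    depth, j, number = 0, i - 1, 1
    while True:
        if s[j] == ")":
            depth += 1
        elif s[j] == "(":
            if depth == 0:       # the parenthesis opened by the parent
                return j - 1, number
            depth -= 1
            if depth == 0:       # skipped one complete sibling to the left
                number += 1
        j -= 1
```

Each helper only scans the string and keeps a counter, which mirrors how the machine $N$ can move between nodes without storing node addresses.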

Theorem 49

For every $k\geq 1$ and every $\tau\in\mbox{\sf dTT}^{k}$ there is a deterministic Turing machine that computes, given an input $t$ , the output $s=\tau(t)$ in space $O(|t|+|s|)$ .

**Proof.** Again, we first show this for $k=1$ . Let $M=(\Sigma,\Delta,Q,q_{0},R)$ be a dtt, and let $t\in T_{\Sigma}$ be an input tree. As usual we assume that the output rules of $M$ only contain stay-instructions. We describe a deterministic multi-tape Turing machine $N$ that computes $\tau_{M}$ in linear space. By Corollary 14, $\mathrm{dom}(M)$ is a regular tree language and hence a context-free language, which can be recognized in deterministic linear space. Thus, $N$ starts by deciding whether or not $t\in\mathrm{dom}(M)$ . Now assume that $t\in\mathrm{dom}(M)$ . To compute $s=\tau_{M}(t)$ , the machine $N$ simulates the (unique) leftmost derivation of the forward deterministic context-free grammar $G_{M,t}$ . Every leftmost sentential form of $G_{M,t}$ is of the form $w\langle q_{1},u_{1}\rangle\cdots\langle q_{n},u_{n}\rangle$ with $w\in\Delta^{*}$ and $\langle q_{i},u_{i}\rangle\in\operatorname{Con}(t)$ . If one views the states of $M$ as recursive procedures with one parameter of type ‘node of $t$ ’, then $\langle q_{1},u_{1}\rangle\cdots\langle q_{n},u_{n}\rangle$ corresponds to the contents of the stack in the usual implementation of recursive procedures: each configuration $\langle q_{i},u_{i}\rangle$ is a call of procedure $q_{i}$ with actual parameter $u_{i}$ . The machine $N$ uses a one-way output tape on which it prints $w$ (which will finally be $s$ ), a work tape with the input tree $t$ (or rather $\varphi(t)$ ), and a work tape that contains a stack representing $\langle q_{1},u_{1}\rangle\cdots\langle q_{n},u_{n}\rangle$ , with the top of the stack to the left. At each moment of time, a reading head of $N$ is at node $u_{1}$ of $t$ , and another reading head is at the top of the stack. Note that $n\leq|s|$ because every configuration $\langle q_{i},u_{i}\rangle$ will generate at least one symbol of $s$ .
If $N$ represented the parameters $u_{2},\dots,u_{n}$ by their Dewey notation, the size of the stack could be $|s|\cdot|t|$ , which is too much. Thus, we need a more compact representation of the nodes $u_{2},\dots,u_{n}$ . In a rule of $G_{M,t}$ with left-hand side $\langle q,u\rangle$ , every node $u^{\prime}$ in the right-hand side is a neighbour of $u$ , or $u$ itself, and so, the “difference” between $u$ and $u^{\prime}$ can be expressed by an instruction in $I=\{{\rm up},{\rm stay}\}\cup\{{\rm down}_{i}\mid i\in[1,{\mathit{m}x}_{\Sigma}]\}$ . This allows us to represent $\langle q_{1},u_{1}\rangle\cdots\langle q_{n},u_{n}\rangle$ by the node $u_{1}$ and a stack of the form $q_{1}\gamma_{1}q_{2}\gamma_{2}\cdots q_{n}\gamma_{n}$ where $\gamma_{i}\in I^{*}$ is a sequence of instructions that leads from $u_{i}$ to $u_{i+1}$ (with $u_{n+1}=\mathrm{root}_{t}$ ). Let us now consider in detail how $N$ simulates the leftmost derivation of $G_{M,t}$ .

At each moment of time, the current node of $t$ and the current contents of the output tape and the stack tape represent a leftmost sentential form of $G_{M,t}$ , which is an element of $\Delta^{*}\cdot\operatorname{Con}(t)^{*}$ . The stack tape contains a string in $(Q\cup I)^{*}\bot$ , where $\bot$ is the bottom stack symbol and $I$ is as above. The current node $u$ of $t$ and the current contents $w\in\Delta^{*}$ and $\xi\in(Q\cup I)^{*}\bot$ of the output tape and stack tape, respectively, represent the leftmost sentential form $w\cdot\mu(u,\xi)$ , where the string $\mu(u,\xi)\in\operatorname{Con}(t)^{*}$ is defined as follows (for every $q\in Q$ and $\beta\in I$ ): $\mu(u,q\xi)=\langle q,u\rangle\cdot\mu(u,\xi)$ , $\mu(u,\beta\xi)=\mu(\beta(u),\xi)$ , and $\mu(u,\bot)=\varepsilon$ . Initially, $N$ starts at the root of $t$ , with empty output tape and with stack tape $q_{0}\bot$ , representing the initial output form $\langle q_{0},\mathrm{root}_{t}\rangle$ . If the top symbol of the stack is $\bot$ , then $N$ halts. Otherwise, to compute the next leftmost sentential form, $N$ first pops the top symbol off the stack. If that symbol was $q\in Q$ , and the current node $u$ of $t$ has label $\sigma$ and child number $j$ , then $N$ selects the unique rule $\langle q,\sigma,j,T\rangle\to\zeta$ that is applicable to $\langle q,u\rangle$ . Note that it can test in linear space whether or not $(t,u)\in T$ , because $\operatorname{mark}(T)$ is a context-free language. If $\zeta=\langle q^{\prime},\alpha\rangle$ , then $N$ moves to node $\alpha(u)$ of $t$ and pushes the string $q^{\prime}\beta$ on the stack where $\beta$ is defined as follows: if $\alpha$ is ${\rm up}$ , ${\rm stay}$ , or ${\rm down}_{i}$ , then $\beta$ is ${\rm down}_{j}$ , ${\rm stay}$ , or ${\rm up}$ , respectively. If $\zeta=\delta(\langle q_{1},{\rm stay}\rangle,\dots,\langle q_{k},{\rm stay}\rangle)$ , then $N$ outputs $\delta$ , and pushes $q_{1}\cdots q_{k}$ on the stack (if $k>0$ ).
It is easy to check that in both these cases the resulting configuration of $N$ represents the next leftmost sentential form of $G_{M,t}$ . If the top symbol of the stack was $\beta\in I$ , the machine $N$ moves to node $\beta(u)$ of $t$ . This does not change the represented leftmost sentential form. Thus, after applying a rule $\langle q,\sigma,j,T\rangle\to\delta$ (with $\delta$ of rank $0$ ), $N$ removes instructions from the stack (and moves its reading head on $t$ accordingly) until the top of the stack is a state again. When $N$ halts, the output tape contains $s$ .

It remains to show that the length of the stack is linear in $|t|+|s|$ . As mentioned above, since every configuration $\langle q,u\rangle$ will generate at least one symbol of $s$ , the number of occurrences of states in the stack is at most $|s|$ . To estimate the number of occurrences of instructions in the stack, we use Lemma 48. In the above case where $q$ is the top stack symbol and $\langle q,\sigma,j,T\rangle\to\langle q^{\prime},\alpha\rangle$ is the rule applicable to $\langle q,u\rangle$ , the machine $N$ does not apply the rule $\langle q,u\rangle\to\langle q^{\prime},\alpha(u)\rangle$ of $G_{M,t}$ , but rather the rule $\langle q,u\rangle\to\langle q^{\prime},\alpha(u)\rangle\beta$ where $\beta$ is defined as above. Moreover, when $\beta$ is the top stack symbol, $N$ applies the rule $\beta\to\varepsilon$ . From this it should be clear that, by Lemma 48 and footnote 15, the number of occurrences of instructions in the stack is at most the height of the derivation tree corresponding to the derivation $\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M,t}s$ of $G_{M,t}$ . As observed in Section 2 after Lemma 3, that height is at most $\#(\operatorname{Con}(t))$ , i.e., $\#(Q)\cdot|t|$ . Thus, the length of the stack is indeed $O(|s|+|t|)$ .

The induction step can be proved in exactly the same way as in the proof of Theorem 47, with ‘time’ replaced by ‘space’. $\Box$
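The stack discipline of the machine $N$ in the case $k=1$ can be simulated by a short Python sketch. It makes several simplifying assumptions that are not part of the proof: the dtt is local (no regular tests), rules are stored in a dictionary keyed by (state, label, child number), nodes are Dewey paths, the integer `i` stands for ${\rm down}_{i}$ , and an empty Python list plays the role of the bottom symbol $\bot$ . The example transducer `reversal_rules`, which reverses a monadic tree over the labels a and b with leaf e, is hypothetical.

```python
def node_label(tree, u):
    """Label of the node with Dewey path u in a tree (label, [children])."""
    for i in u:
        tree = tree[1][i - 1]
    return tree[0]


def step(u, instr):
    """Apply an instruction: "up", "stay", or an integer i for down_i."""
    if instr == "up":
        return u[:-1]
    if instr == "stay":
        return u
    return u + (instr,)


def run(tree, rules, q0):
    """Return the output in prefix order and the maximal stack length."""
    u = ()                          # current node: the root
    stack = [("state", q0)]         # the top of the stack is the end of the list
    output, max_len = [], 1
    while stack:                    # an empty stack means only the bottom symbol is left
        kind, value = stack.pop()
        if kind == "instr":         # return to the node of the next pending call
            u = step(u, value)
            continue
        j = u[-1] if u else 0       # child number, 0 at the root
        rule = rules[(value, node_label(tree, u), j)]
        if rule[0] == "move":
            _, q_next, alpha = rule
            # beta undoes alpha: up -> down_j, stay -> stay, down_i -> up
            beta = j if alpha == "up" else "stay" if alpha == "stay" else "up"
            u = step(u, alpha)
            stack.append(("instr", beta))
            stack.append(("state", q_next))
        else:                       # output rule with stay-instructions only
            _, delta, states = rule
            output.append(delta)
            stack.extend(("state", q) for q in reversed(states))
        max_len = max(max_len, len(stack))
    return output, max_len


def reversal_rules():
    """Hypothetical dtt: walk down to the leaf, then emit labels on the way up."""
    rules = {("d", "e", 0): ("out", "e", []), ("d", "e", 1): ("move", "r", "up")}
    for x in "ab":
        for j in (0, 1):
            rules[("d", x, j)] = ("move", "d", 1)
        rules[("r", x, 1)] = ("out", x, ["m"])
        rules[("m", x, 1)] = ("move", "r", "up")
        rules[("r", x, 0)] = ("out", x, ["f"])
        rules[("f", x, 0)] = ("out", "e", [])
    return rules
```

On the input tree a(b(e)) the sketch prints the symbols b, a, e in this order, that is, the prefix notation of b(a(e)). The stack stores one relative instruction per pending move instead of full node addresses, which is the point of the representation $q_{1}\gamma_{1}\cdots q_{n}\gamma_{n}$ .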

For a class ${\cal T}$ of tree translations and a class ${\cal L}$ of tree languages, we denote by ${\cal T}({\cal L})$ the class of tree languages $\tau(L)$ with $\tau\in{\cal T}$ and $L\in{\cal L}$ . The elements of ${\cal T}(\mbox{\sf REGT})$ are called the output tree languages (or surface languages) of ${\cal T}$ . Since $\mbox{\sf dTT}\subseteq\mbox{\sf dMT}$ by Lemma 24, it follows from the proof of [34, Theorem 7.5] that the output tree languages of $\mbox{\sf dTT}^{k}$ are recursive. From Theorem 49 we now obtain that they are in $\mbox{\sf DSPACE}(n)$ , i.e., can be recognized by a Turing machine in deterministic linear space. This was shown for classical top-down tree transducers in [4].

Theorem 50

For every $k\geq 1$ , $\mbox{\sf dTT}^{k}(\mbox{\sf REGT})\subseteq\mbox{\sf DSPACE}(n)$ .

**Proof.** Let $L\in\mbox{\sf REGT}$ and $\tau\in\mbox{\sf dTT}^{k}$ . By Corollary 38(2), $\tau=\tau_{1}\circ\tau_{2}$ such that $\tau_{1}\in\mbox{\sf dTT$ {}_{\mathrm{pru}} $}$ , $\tau_{2}\in\mbox{\sf dTT}^{k}$ , and $(\tau_{1},\tau_{2})$ is linear-bounded for some constant $c$ . Let $L^{\prime}=\tau_{1}(L)$ , and note that $\tau(L)=\tau_{2}(L^{\prime})$ and that $L^{\prime}\in\mbox{\sf REGT}$ by Lemma 15. It is straightforward to show that for every $s\in\tau(L)$ there exists $t\in L^{\prime}$ such that $(t,s)\in\tau_{2}$ and $|t|\leq c\cdot|s|$ . To check whether a given tree $s$ is in $\tau(L)$ , a deterministic Turing machine systematically enumerates all input trees $t$ (of $\tau_{2}$ ) such that $|t|\leq c\cdot|s|$ . For each such $t$ it first checks that $t\in L^{\prime}$ in space $O(|t|)$ . Then it uses the algorithm of Theorem 49 to compute $\tau_{2}(t)$ in space $c^{\prime}\cdot(|t|+|\tau_{2}(t)|)$ , but rejects $t$ as soon as the computation takes more than space $c^{\prime}\cdot(|t|+|s|)$ ; thus, the space used is $O(|t|+|s|)=O(|s|)$ . Clearly, $s\in\tau(L)$ if and only if $\tau_{2}(t)=s$ for some such $t$ . $\Box$
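The bounded search in the proof above can be sketched in Python. The sketch shows only the enumeration of candidate trees $t$ with $|t|\leq c\cdot|s|$ ; it does not model the space cutoff or the linear-space bound. The callables `in_l_prime` and `tau2` are hypothetical stand-ins for the membership test for $L^{\prime}$ and for the translation $\tau_{2}$ (returning `None` where it is undefined), and trees are encoded as pairs `(symbol, tuple_of_children)`.

```python
def size(t):
    """Number of nodes of a tree (symbol, tuple_of_children)."""
    return 1 + sum(size(c) for c in t[1])


def trees(ranked, n):
    """Yield all trees with exactly n nodes over a ranked alphabet,
    given as a dict from symbols to ranks."""
    for symbol, rank in ranked.items():
        for children in forests(ranked, rank, n - 1):
            yield (symbol, children)


def forests(ranked, k, n):
    """Yield all k-tuples of trees with n nodes in total."""
    if k == 0:
        if n == 0:
            yield ()
        return
    for first in range(1, n - k + 2):   # leave at least one node per remaining tree
        for t in trees(ranked, first):
            for rest in forests(ranked, k - 1, n - first):
                yield (t,) + rest


def in_image(s, c, ranked, in_l_prime, tau2):
    """Decide whether s = tau2(t) for some t in L' with |t| <= c * |s|."""
    for n in range(1, c * size(s) + 1):
        for t in trees(ranked, n):
            if in_l_prime(t) and tau2(t) == s:
                return True
    return False
```

The generators produce one candidate at a time, so only the current candidate is kept in memory, although the recursion itself is not tuned for space.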

For a tree $t$ we denote its yield by $yt$ , for a tree language $L$ we define $yL=\{yt\mid t\in L\}$ , and for a class ${\cal L}$ of tree languages we define $y{\cal L}=\{yL\mid L\in{\cal L}\}$ . For a class ${\cal T}$ of tree translations, the languages in $y{\cal T}(\mbox{\sf REGT})$ are called the output string languages (or target languages) of ${\cal T}$ .

Corollary 51

For every $k\geq 1$ , $y\mbox{\sf dTT}^{k}(\mbox{\sf REGT})\subseteq\mbox{\sf DSPACE}(n)$ .

**Proof.** For an alphabet $\Delta$, let $\Gamma=\Delta\cup\{e\}$ be the ranked alphabet such that $e$ has rank 0 and every element of $\Delta$ has rank 1. For a string $w$ over $\Delta$ we define $\operatorname{mon}(w)=we\in T_{\Gamma}$. It is easy to see that for every ranked alphabet $\Sigma$ there is a dtt$^{\ell}$ $M$ such that $\tau_{M}(t)=\operatorname{mon}(yt)$ for every $t\in T_{\Sigma}$. From this and Theorem 50 the result follows. $\Box$
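As a concrete reading of $\operatorname{mon}(yt)$, the following sketch computes the yield of a tree and the corresponding monadic tree directly (not by a tree-walking transducer). The tuple encoding of trees and the function names are assumptions made for illustration.

```python
def yield_of(t):
    """Left-to-right list of leaf labels of a tree (label, child, ..., child)."""
    if len(t) == 1:
        return [t[0]]
    return [leaf for child in t[1:] for leaf in yield_of(child)]

def mon(w, end="e"):
    """Monadic tree of the string w: every letter has rank 1, `end` has rank 0."""
    tree = (end,)
    for letter in reversed(w):
        tree = (letter, tree)
    return tree

# The tree g(a, f(b)) has yield "ab", and mon("ab") is the monadic tree a(b(e)).
example = mon(yield_of(("g", ("a",), ("f", ("b",)))))
```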

We observe here, for $k=1$ , that $\mbox{\sf dTT}(\mbox{\sf REGT})$ and $y\mbox{\sf dTT}(\mbox{\sf REGT})$ are included in LOGCFL, the class of languages that are log-space reducible to a context-free language. This will be proved in Corollaries 64 and 65. Note that $\mbox{\sf LOGCFL}\subseteq\mbox{\sf DSPACE}(\log^{2}n)$ .

We also observe that Theorem 50 and Corollary 51 hold for nondeterministic tt’s too, as will be proved in Theorem 67 (and as was proved for classical top-down tree transducers in [4]).

As before, Theorems 49 and 50 and Corollary 51 also hold for deterministic macro tree transducers, pebble tree transducers, and high-level tree transducers. It is proved in [30, Theorem 23] that composition of deterministic mt’s yields a proper hierarchy of output string languages (called the $y\mbox{\sf dMT}$ -hierarchy), i.e., that $y\mbox{\sf dMT}^{k}(\mbox{\sf REGT})\subsetneq y\mbox{\sf dMT}^{k+1}(\mbox{\sf REGT})$ for every $k\geq 1$ . The io-hierarchy consists of the classes of string languages $\mbox{\sf IO}(k)$ generated by level- $k$ grammars, with the inside-out (io) derivation mode (see, e.g., [16]). By [33, Theorem 7.5] the io-hierarchy can be defined as output string languages of tree transformations: $\mbox{\sf IO}(k)=y\mbox{\sf YIELD}^{k}(\mbox{\sf REGT})$ . Since $\mbox{\sf YIELD}\subseteq\mbox{\sf dTT}$ by [31, Lemma 36], we obtain that $\mbox{\sf IO}(k)\subseteq y\mbox{\sf dTT}^{k}(\mbox{\sf REGT})$ . Thus, the next corollary is immediate from Corollary 51. Note that it was already proved in [37, Theorem 3.3.8] that the io languages (i.e., the languages in $\mbox{\sf IO}(1)$ ) are in $\mbox{\sf NSPACE}(n)$ ; in [3] this was improved to LOGCFL. It was proved in [16, Corollary 8.12] that the languages in the io-hierarchy are recursive.

Corollary 52

For every $k\geq 1$ , $\mbox{\sf IO}(k)\subseteq\mbox{\sf DSPACE}(n)$ .

Note that by [30, Theorem 36] the EDTOL control hierarchy is included in the io-hierarchy.

By Corollary 25, $y\mbox{\sf dTT}^{k}(\mbox{\sf REGT})\subseteq y\mbox{\sf dMT}^{k}(\mbox{\sf REGT})\subseteq y\mbox{\sf dTT}^{k+1}(\mbox{\sf REGT})$ . It is proved in [30, Theorem 32] that there exists a language in $\mbox{\sf IO}(k+1)$ that is not in $y\mbox{\sf dMT}^{k}(\mbox{\sf REGT})$ . Since $\mbox{\sf IO}(k+1)\subseteq y\mbox{\sf dTT}^{k+1}(\mbox{\sf REGT})$ , that implies the following stronger version of Proposition 7.

Corollary 53

For every $k\geq 1$ , $y\mbox{\sf dTT}^{k}(\mbox{\sf REGT})\subsetneq y\mbox{\sf dTT}^{k+1}(\mbox{\sf REGT})$ .

11 Nondeterministic Complexity

We now turn to the complexity of compositions of nondeterministic tt’s. We first consider the case where all the transducers in the composition are finitary. The next lemma shows that Theorem 37 and Corollary 38 also hold for f TT.

Lemma 54

$\mbox{\sf f\,TT}^{k}\subseteq\mbox{\sf TT}_{\mathrm{pru}}\ast\mbox{\sf f\,TT}^{k}$ and $\mbox{\sf f\,TT}\circ\mbox{\sf f\,TT}^{k}=\mbox{\sf f\,TT}\ast\mbox{\sf f\,TT}^{k}$ for every $k\geq 1$.

**Proof.** To show that $\mbox{\sf f\,TT}\subseteq\mbox{\sf TT}_{\mathrm{pru}}\ast\mbox{\sf f\,TT}$, let $\tau\in\mbox{\sf f\,TT}$. By Theorem 37, $\tau=\tau_{1}\circ\tau_{2}$ such that $\tau_{1}\in\mbox{\sf TT}_{\mathrm{pru}}$, $\tau_{2}\in\mbox{\sf TT}$, and $(\tau_{1},\tau_{2})$ is linear-bounded. Since $\mathrm{ran}(\tau_{1})\in\mbox{\sf REGT}$ by Lemma 15, we may assume that $\mathrm{dom}(\tau_{2})\subseteq\mathrm{ran}(\tau_{1})$ by Lemma 9. Then $\tau_{2}$ is finitary too, because $\tau_{2}(r)\subseteq\tau(t)$ for every $(t,r)\in\tau_{1}$.

Theorem 20 implies that $\mbox{\sf f\,TT}\circ\mbox{\sf TT}_{\mathrm{pru}}\subseteq\mbox{\sf f\,TT}$, because the composition of two finitary translations is finitary. The remainder of the proof is now entirely similar to the one of Corollary 38. $\Box$

We will prove that a composition of tt’s can be computed by a nondeterministic Turing machine in linear space and polynomial time (in the sum of the sizes of the input and output tree), which generalizes Theorem 49. In the next lemma we consider the case where all tt’s are finitary.

Lemma 55

For every $k\geq 1$ and every $\tau\in\mbox{\sf f\,TT}^{k}$ there is a nondeterministic Turing machine that computes, given an input $t$ , any output $s\in\tau(t)$ in space $O(|t|+|s|)$ and in time polynomial in $|t|+|s|$ .

**Proof.** For the case $k=1$ the proof is exactly the same as that of Theorem 49 except, of course, that the Turing machine $N$ nondeterministically simulates any leftmost derivation of $G_{M,t}$, selecting nondeterministically a rule of $M$ to compute a next leftmost sentential form. It follows from Lemmas 48 and 3 that the number $n$ of occurrences of instruction symbols in the stack is $O(|t|)$. In fact, since $M$ is finitary, it suffices by Lemma 3 to simulate leftmost derivations of $G_{M,t}$ for which the corresponding derivation tree in $L(G^{\mathrm{der}}_{M,t})$ has height at most $\#(Q)\cdot|t|$. As in the proof of Theorem 49, Lemma 48 implies that $n$ is at most that height, i.e., at most $\#(Q)\cdot|t|$. Thus, $N$ works in space $O(|t|+|s|)$. Moreover, it works in time $O(|t|^{2}\cdot|s|)$, because the size of such a derivation tree (and hence the length of the leftmost derivation) is at most $\#(Q)\cdot|t|\cdot|s|$, and each step in the leftmost derivation takes time $O(|t|)$. Note that regular tree languages (which are context-free languages) can be recognized in nondeterministic linear time.

Now let $\tau=\tau_{1}\circ\tau_{2}$ such that $\tau_{1}\in\mbox{\sf f\,TT}$ and $\tau_{2}\in\mbox{\sf f\,TT}^{k}$ , $k\geq 1$ . We may assume by Lemma 54 that $(\tau_{1},\tau_{2})$ is linear-bounded. So, there is a constant $c\in{\mathbb{N}}$ such that for every $(t,s)\in\tau$ there exists a tree $r$ such that $(t,r)\in\tau_{1}$ , $(r,s)\in\tau_{2}$ , and $|r|\leq c\cdot|s|$ . By the case $k=1$ , the intermediate tree $r$ can be computed from $t$ in nondeterministic space $O(|t|+|r|)$ , and by induction, the output tree $s$ can be computed from $r$ in nondeterministic space $O(|r|+|s|)$ . Hence, since $|r|=O(|s|)$ , $s$ can be computed from $t$ in nondeterministic space $O(|t|+|s|)$ . The time is polynomial in $|t|+|r|$ and $|r|+|s|$ , and hence polynomial in $|t|+|s|$ . $\Box$
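The final estimate of this proof can be spelled out as follows. This is only an unpacking of the argument above, in which $p_{1}$ and $p_{2}$ are hypothetical names (not used in the paper) for nondecreasing polynomials that bound the running times of the machines for $\tau_{1}$ and $\tau_{2}$, and $|r|\leq c\cdot|s|$ is the linear bound on the intermediate tree.

$$
\begin{aligned}
\text{space:}\quad & O(|t|+|r|)+O(|r|+|s|)=O(|t|+2|r|+|s|)\subseteq O(|t|+(2c+1)\cdot|s|)=O(|t|+|s|),\\
\text{time:}\quad & p_{1}(|t|+|r|)+p_{2}(|r|+|s|)\leq p_{1}(|t|+c\cdot|s|)+p_{2}((c+1)\cdot|s|).
\end{aligned}
$$

Both right-hand sides depend only on $|t|+|s|$, which is why the intermediate tree does not appear in the statement of the lemma.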

By Lemma 27, $\mbox{\sf MT}\subseteq\mbox{\sf f\,TT}^{2}$ . Consequently Lemma 55 also holds for every $\tau\in\mbox{\sf MT}^{k}$ .

We now turn to the output languages of $\mbox{\sf f\,TT}^{k}$ . By $\mbox{\sf NSPACE}(n)\wedge\mbox{\sf NPTIME}$ we will denote the class of languages that can be recognized by a nondeterministic Turing machine in simultaneous linear space and polynomial time. Trivially, $\mbox{\sf NSPACE}(n)\wedge\mbox{\sf NPTIME}$ is included in both $\mbox{\sf NSPACE}(n)$ and NPTIME.

Lemma 56

For every $k\geq 1$ , $\mbox{\sf f\,TT}^{k}(\mbox{\sf REGT})\subseteq\mbox{\sf NSPACE}(n)\wedge\mbox{\sf NPTIME}$ .

**Proof.** The proof is similar to the one of Theorem 50. Let $L\in\mbox{\sf REGT}$ and $\tau\in\mbox{\sf f\,TT}^{k}$. By Lemma 54, $\tau=\tau_{1}\circ\tau_{2}$ where $\tau_{1}\in\mbox{\sf TT}_{\mathrm{pru}}$, $\tau_{2}\in\mbox{\sf f\,TT}^{k}$, and $(\tau_{1},\tau_{2})$ is linear-bounded for some constant $c$. Let $L^{\prime}=\tau_{1}(L)$. Then $s\in\tau(L)$ if and only if there exists $t\in L^{\prime}$ such that $(t,s)\in\tau_{2}$ and $|t|\leq c\cdot|s|$. To check whether a given tree $s$ is in $\tau(L)$, a nondeterministic Turing machine guesses an input tree $t$ such that $|t|\leq c\cdot|s|$, checks that $t\in L^{\prime}$ in time and space $O(|t|)$ (because $L^{\prime}$ is a context-free language), and then computes any $s^{\prime}\in\tau_{2}(t)$ with $|s^{\prime}|\leq|s|$ in space $O(|t|+|s^{\prime}|)$ and time polynomial in $|t|+|s^{\prime}|$, by Lemma 55. Finally it checks that $s^{\prime}=s$ in time and space $O(|s|)$. Thus the space used is $O(|t|+|s|)=O(|s|)$, and the time is polynomial in $|s|$. $\Box$

Although mt’s are finitary, whereas tt’s need not be finitary, it is proved in [31, Theorem 38 and Corollary 39] that compositions of mt’s have the same output languages as compositions of (local) tt’s. This implies that Lemmas 56 and 55 also hold for $\mbox{\sf TT}^{k}$ .

Theorem 57

For every $k\geq 1$ , $\mbox{\sf TT}^{k}(\mbox{\sf REGT})\subseteq\mbox{\sf NSPACE}(n)\wedge\mbox{\sf NPTIME}$ , and moreover, $\mbox{\sf MT}^{k}(\mbox{\sf REGT})\subseteq\mbox{\sf NSPACE}(n)\wedge\mbox{\sf NPTIME}$ .

**Proof.** By Lemma 27, $\mbox{\sf MT}\subseteq\mbox{\sf f\,TT}^{2}$. Thus, by Lemma 56, $\mbox{\sf MT}^{k}(\mbox{\sf REGT})\subseteq\mbox{\sf NSPACE}(n)\wedge\mbox{\sf NPTIME}$. From Lemma 10 and Theorem 20 it follows (by induction on $k$) that $\mbox{\sf TT}^{k}\subseteq\mbox{\sf dTT}_{\mathrm{rel}}\circ(\mbox{\sf TT}^{\ell})^{k}$ and hence $\mbox{\sf TT}^{k}(\mbox{\sf REGT})\subseteq(\mbox{\sf TT}^{\ell})^{k}(\mbox{\sf REGT})$ by Lemma 15. Finally, by [31, Theorem 38 and Corollary 39], $(\mbox{\sf TT}^{\ell})^{k}(\mbox{\sf REGT})\subseteq\mbox{\sf MT}^{m}(\mbox{\sf REGT})$ for some $m\geq 1$. Hence $\mbox{\sf TT}^{k}(\mbox{\sf REGT})\subseteq\mbox{\sf NSPACE}(n)\wedge\mbox{\sf NPTIME}$, by the above. $\Box$

As observed already after Corollary 51, the space part of Theorem 57 will be strengthened to $\mbox{\sf DSPACE}(n)$ in Theorem 67.

Theorem 58

For every $k\geq 1$ and every $\tau\in\mbox{\sf TT}^{k}$ there is a nondeterministic Turing machine that computes, given an input $t$ , any output $s\in\tau(t)$ in space $O(|t|+|s|)$ and in time polynomial in $|t|+|s|$ . The same holds for $\tau\in\mbox{\sf MT}^{k}$ .

**Proof.** For $\tau\in\mbox{\sf MT}^{k}$ this was already observed after Lemma 55. Now let $\tau\in\mbox{\sf TT}^{k}$ with input alphabet $\Sigma$. Let $\bar{\Sigma}=\{\bar{\sigma}\mid\sigma\in\Sigma\}$ with $\operatorname{rank}(\bar{\sigma})=\operatorname{rank}(\sigma)$ be a set of new symbols, and let $\bar{t}\in T_{\bar{\Sigma}}$ be obtained from $t\in T_{\Sigma}$ by changing each label $\sigma$ into $\bar{\sigma}$. Finally, let $\#$ be a new symbol of rank 2. It is easy to show that the tree language $L_{\tau}=\{\#(\bar{t},s)\mid(t,s)\in\tau\}$ is in $\mbox{\sf TT}^{k}(\mbox{\sf REGT})$: the first transducer additionally copies the input to the output (with bars), and each other transducer copies the first subtree of the input to the output. By Theorem 57, there is a nondeterministic Turing machine $N$ that recognizes $L_{\tau}$ in linear space and polynomial time. We construct the nondeterministic Turing machine $N^{\prime}$ that, on input $t$, guesses a possible output tree $s$, writing $\#(\bar{t},s)$ on a worktape, uses $N$ as a subroutine to verify that $(t,s)\in\tau$, and outputs $s$. Clearly, $N^{\prime}$ satisfies the requirements. $\Box$
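The encoding $\#(\bar{t},s)$ used in this proof can be made concrete with a short sketch. The tuple encoding of trees and the convention of writing $\bar{\sigma}$ as the label `sigma` followed by `~` are assumptions made for illustration only.

```python
def bar(t):
    """Replace every label sigma of t by a barred copy, written here as 'sigma~'."""
    return (t[0] + "~", *(bar(c) for c in t[1:]))

def unbar(t):
    """Inverse of bar: strip the trailing '~' from every label."""
    return (t[0][:-1], *(unbar(c) for c in t[1:]))

def encode(t, s):
    """Build the tree #(bar(t), s) that pairs an input tree with an output tree."""
    return ("#", bar(t), s)

def decode(tree):
    """Recover the pair (t, s) from a tree of the form #(bar(t), s)."""
    root, t_bar, s = tree
    if root != "#":
        raise ValueError("root must be labelled #")
    return unbar(t_bar), s
```

Because the barred alphabet is disjoint from the output alphabet, the two components of the pair can be told apart even when the input and output alphabets overlap.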

Since io (multi-return) macro tree translations, pebble tree translations, and high-level tree translations can be realized by compositions of tt’s (see Section 9), Theorems 57 and 58 also hold for those translations.

By the proof of Corollary 51, we additionally obtain from Theorem 57 that $y\mbox{\sf TT}^{k}(\mbox{\sf REGT})\subseteq\mbox{\sf NSPACE}(n)\wedge\mbox{\sf NPTIME}$ for every $k\geq 1$ , and the same is true for $y\mbox{\sf MT}^{k}(\mbox{\sf REGT})$ . The oi-hierarchy consists of the classes of string languages $\mbox{\sf OI}(k)$ generated by level- $k$ grammars, with the outside-in (oi) derivation mode (see, e.g., [16, 33]). It was shown in [37, Theorem 4.2.8] that $\mbox{\sf OI}(1)$ equals the class of indexed languages of [1], and hence that $\mbox{\sf OI}(1)\subseteq\mbox{\sf NSPACE}(n)$ by [1, Theorem 5.1]. Moreover, it was shown in [67, Proposition 2] that $\mbox{\sf OI}(1)\subseteq\mbox{\sf NPTIME}$ . In [16, Corollary 7.26] it was proved that the languages in the oi-hierarchy are recursive. As observed in the last paragraph of [35], $\mbox{\sf OI}(k)$ is included in $y\mbox{\sf MT}^{m}(\mbox{\sf REGT})$ for some $m$ .

Corollary 59

For every $k\geq 1$ , $\mbox{\sf OI}(k)\subseteq\mbox{\sf NSPACE}(n)\wedge\mbox{\sf NPTIME}$ .

It is shown in [67] that there is an NP-complete language in both $\mbox{\sf OI}(1)$ and $y\mbox{\sf f\,TT}_{\downarrow}(\mbox{\sf REGT})$, and it is shown in [74] that there even is one in the class ETOL, which is a subclass of both $\mbox{\sf OI}(1)$ and $y\mbox{\sf f\,TT}_{\downarrow}(\mbox{\sf REGT})$. Note that by [75, Theorem 14] the ETOL control hierarchy is included in the oi-hierarchy.

It will be shown in Corollary 68 that $\mbox{\sf OI}(k)\subseteq\mbox{\sf DSPACE}(n)$ .

12 Translation Complexity

In this section we study the time and space complexity of the membership problem of the tree translations in $\mbox{\sf TT}^{k}$ , i.e., for a fixed tree translation $\tau\subseteq T_{\Sigma}\times T_{\Delta}$ we want to know, for given trees $t\in T_{\Sigma}$ and $s\in T_{\Delta}$ , how hard it is to decide whether or not $(t,s)\in\tau$ . To formalize this, we denote by $L_{\tau}$ the string language $\{\#ts\mid(t,s)\in\tau\}$ , where $\#$ is a new symbol. For simplicity, and without loss of generality, we assume that $\Sigma\cap\Delta=\varnothing$ . Otherwise, we replace $\Sigma$ by $\bar{\Sigma}=\{\bar{\sigma}\mid\sigma\in\Sigma\}$ as in the proof of Theorem 58. So, $L_{\tau}$ is a tree language over $\Sigma\cup\Delta\cup\{\#\}$ , where $\#$ has rank 2. For a class ${\cal T}$ of tree translations and a complexity class ${\cal C}$ , we will write ${\cal T}\subseteq{\cal C}$ to mean that $L_{\tau}\in{\cal C}$ for every $\tau\in{\cal T}$ . As usual, we denote the class of languages that are accepted by a deterministic Turing machine in polynomial time by PTIME, and the class of languages that are log-space reducible to a context-free language by LOGCFL. Note that every regular tree language is a context-free language and hence is in LOGCFL. Note also that $\mbox{\sf LOGCFL}\subseteq\mbox{\sf PTIME}$ (see [68]) and $\mbox{\sf LOGCFL}\subseteq\mbox{\sf DSPACE}(\log^{2}n)$ (see [57, 68] and [46, Theorem 12.7.4]).

If $\tau\in\mbox{\sf dTT}^{k}$ then, on input $\#(t,s)$ , we can compute $\tau(t)$ according to Theorems 47 and 49 (rejecting the input when the computation takes more than time or space $c\cdot(|t|+|s|)$ for the given constant $c$ ) and then verify that $\tau(t)=s$ , cf. the proof of Theorem 50. Thus, $L_{\tau}$ can be accepted by a RAM program in linear time and by a deterministic Turing machine in linear space. This means that $\mbox{\sf dTT}^{k}\subseteq\mbox{\sf PTIME}$ and $\mbox{\sf dTT}^{k}\subseteq\mbox{\sf DSPACE}(n)$ . If $\tau\in\mbox{\sf TT}^{k}$ , then, as mentioned in the proof of Theorem 58, the tree language $L_{\tau}$ is in the class of output languages $\mbox{\sf TT}^{k}(\mbox{\sf REGT})$ , and hence in $\mbox{\sf NSPACE}(n)\wedge\mbox{\sf NPTIME}$ by Theorem 57. This means that $\mbox{\sf TT}^{k}\subseteq\mbox{\sf NPTIME}$ and $\mbox{\sf TT}^{k}\subseteq\mbox{\sf NSPACE}(n)$ . Due to the presence of both the input tree and the output tree in $L_{\tau}$ , one would expect that better upper bounds can be shown. Indeed, we will prove that $\mbox{\sf TT}^{k}\subseteq\mbox{\sf DSPACE}(n)$ .

Our main aim in this section is to prove that $\mbox{\sf TT}\circ\mbox{\sf dTT}\subseteq\mbox{\sf LOGCFL}$ . We follow the approach of [25], using multi-head automata.

A multi-head tree-walking tree transducer $M=(\Sigma,\Delta,Q,Q_{0},R)$ (in short, mhtt) is defined in the same way as a tt, but has an arbitrary, fixed number of reading heads. Each of these heads can walk on the input tree, independent of the other heads. It can test the label and child number of the node that it is currently reading, and additionally apply a regular test to that node. Moreover, we assume that the heads are “sensing”, which means that $M$ can test which heads are currently scanning the same node. Thus, if $M$ has $\ell$ heads, then its move rules are of the form

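$$\langle q,\sigma_{1},j_{1},T_{1},\dots,\sigma_{\ell},j_{\ell},T_{\ell},E\rangle\to\langle q^{\prime},\alpha_{1},\dots,\alpha_{\ell}\rangle$$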

where $E\subseteq[1,\ell]\times[1,\ell]$ is an equivalence relation. A configuration of $M$ on input tree $t$ is of the form $\langle q,u_{1},\dots,u_{\ell}\rangle$ , to which the rule is applicable if $M$ is in state $q$ , each $u_{i}$ satisfies the tests $\sigma_{i}$ , $j_{i}$ , and $T_{i}$ , and $u_{i}=u_{j}$ for every $(i,j)\in E$ . After application the new configuration is $\langle q^{\prime},\alpha_{1}(u_{1}),\dots,\alpha_{\ell}(u_{\ell})\rangle$ . The output rules are defined in a similar way. Initially all reading heads are at the root of the input tree. This is all similar to how multi-head automata on strings are defined.
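The applicability condition and the effect of a move rule can be sketched as follows. This is a simplified illustration under several assumptions that are not from the paper: trees are nested tuples, nodes are tuples of 1-based child indices, heads are numbered from 0, regular tests $T_{i}$ are omitted, and a rule is a dictionary with the hypothetical keys `state`, `tests`, `E`, `target`, and `moves`. The relation $E$ is read literally as in the text: every pair in $E$ must consist of two heads that scan the same node.

```python
def subtree(t, u):
    """Subtree of t at node u, where u is a tuple of 1-based child indices."""
    for i in u:
        t = t[i]
    return t

def child_number(u):
    """Child number of node u; the root has child number 0."""
    return u[-1] if u else 0

def move(u, alpha):
    """Apply a move instruction: 'stay', 'up', or ('down', i)."""
    if alpha == "stay":
        return u
    if alpha == "up":
        return u[:-1]  # the child-number test of the rule should exclude the root
    return u + (alpha[1],)

def applicable(t, rule, config):
    """Check state, label and child-number tests, and the sensing condition E."""
    state, heads = config
    if state != rule["state"]:
        return False
    for u, (sigma, j) in zip(heads, rule["tests"]):
        if subtree(t, u)[0] != sigma or child_number(u) != j:
            return False
    return all(heads[i] == heads[j] for (i, j) in rule["E"])

def step(t, rule, config):
    """Return the successor configuration, or None if the rule is not applicable."""
    if not applicable(t, rule, config):
        return None
    _, heads = config
    return (rule["target"], tuple(move(u, a) for u, a in zip(heads, rule["moves"])))
```

For instance, on the tree `("g", ("a",), ("b",))` with both heads at the root, a rule with `moves = [("down", 1), ("down", 2)]` sends the two heads to the two children, after which a rule whose `E` relates head 0 to head 1 is no longer applicable.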

We will use the mhtt $M$ as an acceptor of its domain. We will say that it accepts $\mathrm{dom}(M)$ in polynomial time if there is a polynomial $p(n)$ such that for every $t\in\mathrm{dom}(M)$ there is a computation $\langle q_{0},\mathrm{root}_{t}\rangle\Rightarrow^{*}_{M,t}s$ of length at most $p(|t|)$ for some $q_{0}\in Q_{0}$ and $s\in T_{\Delta}$ . Note that we consider nondeterministic mhtt’s only.

Lemma 60

For every multi-head tt $M$ , $\mathrm{dom}(M)\in\mbox{\sf PTIME}$ . Moreover, if $M$ accepts $\mathrm{dom}(M)$ in polynomial time, then $\mathrm{dom}(M)\in\mbox{\sf LOGCFL}$ .

**Proof.** After this paragraph we will show that the domain of a multi-head tt can be accepted by an alternating multi-head finite automaton (in short, amfa), in a straightforward way. Moreover, we will show that if the mhtt accepts in polynomial time, then the corresponding amfa accepts in polynomial tree-size. That proves the lemma because PTIME is the class of languages accepted by amfa’s (see [10, 12]) and LOGCFL is the class of languages accepted by amfa’s in polynomial tree-size (see [68, 71]).

It is well known that the domain of a classical local tt can be accepted by an alternating (one-head) tree-walking automaton, see, e.g., [70], [24, Section 4], and [63, Section 4], and the same is true for the multi-head case. Let $M=(\Sigma,\Delta,Q,Q_{0},R)$ be an mhtt. The amfa $M^{\prime}$ that accepts $\mathrm{dom}(M)$ simulates $M$ on the input $t\in T_{\Sigma}$ , without producing output. The reading heads of $M$ are simulated by reading heads of $M^{\prime}$ in the obvious way. Every (initial) state $q$ of $M$ is simulated by the existential (initial) state $q$ of $M^{\prime}$ , and a move rule of $M$ is simulated by a transition of $M^{\prime}$ in an obvious way. If $M$ applies an output rule in state $q$ , then $M^{\prime}$ first goes into a universal state $q^{\prime}$ and then branches in the same way as $M$ , going into existential states. A regular test $T$ of $M$ is simulated by $M^{\prime}$ in a side branch, using an amfa subroutine that accepts the context-free language $\operatorname{mark}(T)$ , with additional reading heads. Note that since the heads are sensing, the node to be tested is “marked” by a reading head. Similarly, to move a head $h$ from a parent $u$ to its $i$ -th child $ui$ , $M^{\prime}$ first moves an auxiliary head $h^{\prime}$ nondeterministically to a position to the right of $u$ , then checks in a side branch that the string between $h$ and $h^{\prime}$ belongs to the context-free language $T_{\Sigma}^{i-1}$ , and finally moves $h$ to $h^{\prime}$ . In a similar way $M^{\prime}$ can move from $ui$ to $u$ , and can determine the child number of $u$ .

If $M$ accepts $t$ in time $m$ , then the size of the corresponding computation tree of $M^{\prime}$ is polynomial in $m$ , because each computation step of $M$ takes polynomial tree-size. Thus, if $M$ accepts in polynomial time, then $M^{\prime}$ accepts in polynomial tree-size.

Note that if we assume that the simulation of a step of $M$ takes constant tree-size, and we assume moreover that $M$ only uses output rules (by eventually replacing the right-hand side $\zeta$ of each move rule by $\delta(\zeta)$ , where $\delta$ has rank 1), then the output tree of $M$ can be viewed both as the derivation tree of the computation of $M$ and as the computation tree of $M^{\prime}$ , roughly speaking. $\Box$
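The step of the proof that moves a head from a node to its $i$-th child relies on the tree being a string in prefix notation: the $i$-th child starts right after $i-1$ complete subtrees. The following sketch computes that position deterministically with a counter, whereas the amfa in the proof guesses the position and verifies it in a side branch; the function names and the explicit `rank` dictionary are assumptions made for illustration.

```python
def subtree_end(w, pos, rank):
    """Index just past the subtree of w (in prefix notation) that starts at pos."""
    pending = 1  # number of subtrees that still have to be read
    while pending:
        pending += rank[w[pos]] - 1
        pos += 1
    return pos

def child_position(w, pos, i, rank):
    """Position in w of the i-th child of the node at position pos."""
    if not 1 <= i <= rank[w[pos]]:
        raise ValueError("node has no such child")
    child = pos + 1
    for _ in range(i - 1):
        # Skip one complete subtree, i.e., one factor of the language T^(i-1).
        child = subtree_end(w, child, rank)
    return child

rank = {"g": 2, "f": 1, "a": 0, "b": 0}
w = "gfab"  # the tree g(f(a), b)
second = child_position(w, 0, 2, rank)  # skips the subtree "fa"
```

Here `w[second]` is `"b"`, the second child of the root, because exactly one subtree (`"fa"`) lies between the root and that position.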

Thus, to prove that $\mbox{\sf TT}\circ\mbox{\sf dTT}\subseteq\mbox{\sf LOGCFL}$ it suffices to show, for every $\tau=\tau_{1}\circ\tau_{2}$ with $\tau_{1}\in\mbox{\sf TT}$ and $\tau_{2}\in\mbox{\sf dTT}$ , that $L_{\tau}$ can be accepted by a multi-head tt $M$ in polynomial time. Let $M_{1}$ and $M_{2}$ be tt’s that realize $\tau_{1}$ and $\tau_{2}$ . For an input tree $t$ and an output tree $s$ of $\tau$ , $M$ will simulate $M_{1}$ on $t$ , generating an intermediate tree $r$ , and verify that $M_{2}$ translates $r$ into $s$ . Since $M$ cannot store its output tree $r$ , it must verify the translation of $r$ into $s$ on the fly, i.e., while generating $r$ . That can be done because the context-free grammar $G_{M_{2},r}$ is forward deterministic, and hence its reduced version has a unique fixed point: during the generation of the nodes $v$ of $r$ , $M$ can guess the values of the nonterminals $\langle q,v\rangle$ of $G_{M_{2},r}$ (which are subtrees of $s$ ) and check the fixed point equations for them. However, since $G_{M_{2},r}$ need not be reduced, we have to be more careful.

Let $G=(N,\Delta,\{S\},R)$ be a forward deterministic context-free grammar, and let $\#$ be a symbol not in $N\cup\Delta$ (which stands for ‘undefined’). A string homomorphism $h:N\to\Delta^{*}\cup\{\#\}$ is a fixed point of $G$ if (1) $h(S)\neq\#$ , (2) $h(X)$ is a substring of $h(S)$ for every $X\in N$ such that $h(X)\neq\#$ , and (3) $h(X)=h(\zeta)$ for every rule $X\to\zeta$ in $R$ such that $h(X)\neq\#$ , where $h$ is extended to $\Delta$ by defining $h(a)=a$ for every $a\in\Delta$ . In the special case that $G$ is a regular tree grammar, a tree fixed point of $G$ is a fixed point $h$ of $G$ such that $h(X)\in T_{\Delta}\cup\{\#\}$ for every $X\in N$ and $h(X)$ is a subtree of $h(S)$ for every $X\in N$ such that $h(X)\neq\#$ .

Lemma 61

Let $G=(N,\Delta,\{S\},R)$ be a forward deterministic context-free grammar such that $L(G)\neq\varnothing$ . For every $w\in\Delta^{*}$ , $L(G)=\{w\}$ if and only if there is a fixed point $h$ of $G$ such that $h(S)=w$ . If $G$ is a regular tree grammar, then the same statement holds for $w\in T_{\Delta}$ and $h$ a tree fixed point.

**Proof.** Let $L(G)=\{w\}$, and define $h_{G}(X)$ to be the unique string generated by $X$, if that exists and is a substring of $w$, and otherwise $h_{G}(X)=\#$. It is easy to see that $h=h_{G}$ satisfies the requirements.

Let $h$ be a fixed point of $G$ such that $h(S)=w$ . Then $h(v)=w$ for every sentential form $v$ of $G$ . Since $L(G)\neq\varnothing$ , this shows that $L(G)=\{w\}$ . $\Box$
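The three requirements on a fixed point can be checked mechanically, as the following sketch shows. It assumes, for illustration only, that terminals are single characters, that every nonterminal has at most one rule (which is how forward determinism is read here), and that a grammar is given as a dictionary from nonterminals to right-hand sides; none of these representations come from the paper.

```python
UNDEF = "#"  # stands for 'undefined'

def apply_h(h, zeta):
    """Extend h to a right-hand side; the result is UNDEF if any symbol is undefined."""
    parts = []
    for sym in zeta:
        value = h.get(sym, sym)  # terminals are mapped to themselves
        if value == UNDEF:
            return UNDEF
        parts.append(value)
    return "".join(parts)

def is_fixed_point(rules, start, h):
    """Check requirements (1)-(3) for h with respect to the given rules."""
    if h[start] == UNDEF:                          # requirement (1)
        return False
    for x, value in h.items():
        if value == UNDEF:
            continue
        if value not in h[start]:                  # requirement (2): substring
            return False
        if x in rules and apply_h(h, rules[x]) != value:  # requirement (3)
            return False
    return True

# S -> aXb, X -> c, and a useless nonterminal Y -> Ya that generates no string.
rules = {"S": ["a", "X", "b"], "X": ["c"], "Y": ["Y", "a"]}
good = {"S": "acb", "X": "c", "Y": UNDEF}
bad = {"S": "acb", "X": "b", "Y": UNDEF}
```

With these definitions `is_fixed_point(rules, "S", good)` holds and witnesses $L(G)=\{acb\}$, while `bad` violates requirement (3) for the rule of `S`. The useless nonterminal `Y` shows why the value $\#$ is needed when the grammar is not reduced.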

Theorem 62

$\mbox{\sf TT}\circ\mbox{\sf dTT}\subseteq\mbox{\sf LOGCFL}$ .

**Proof.** Let $M_{1}=(\Sigma,\Omega,P,P_{0},R_{1})$ be a tt, and let $M_{2}=(\Omega,\Delta,Q,q_{0},R_{2})$ be a dtt. We will denote $\tau_{M_{1}}$ and $\tau_{M_{2}}$ by $\tau_{1}$ and $\tau_{2}$, respectively. Since it is easy to prove (as in the proof of Corollary 38) that $\mbox{\sf TT}\circ\mbox{\sf dTT}=\mbox{\sf TT}\ast\mbox{\sf dTT}$, we may assume that $(\tau_{1},\tau_{2})$ is linear-bounded. We may also assume, by Lemma 10 and Theorem 20, that $M_{2}$ is local. That does not change the linear-boundedness of the composition: if $(\tau_{1},\tau^{\prime}_{2}\circ\tau^{\prime\prime}_{2})$ is linear-bounded and $\tau^{\prime}_{2}\in\mbox{\sf TT}_{\mathrm{rel}}$, then $(\tau_{1}\circ\tau^{\prime}_{2},\tau^{\prime\prime}_{2})$ is linear-bounded because $\tau^{\prime}_{2}$ is size-preserving. Similarly, we may assume that $\mathrm{ran}(\tau_{1})\subseteq\mathrm{dom}(\tau_{2})$ by Corollaries 14 and 21. Finally we assume (as in the proofs of Lemmas 17 and 19) that $M_{1}$ keeps track in its finite state of the child number of the output node to be generated, through a mapping $\chi:P\to[0,{\mathit{mx}}_{\Omega}]$.

On the basis of Lemma 60, we will describe a multi-head tt $M$ that accepts $L_{\tau}$ in polynomial time, where $\tau=\tau_{1}\circ\tau_{2}$. Initially $M$ verifies by a regular test that the input tree is of the form $\#(t,s)$ with $t\in T_{\Sigma}$ and $s\in T_{\Delta}$. We will denote the root of $\#(t,s)$ by its label $\#$. As mentioned before, on input $\#(t,s)$ the transducer $M$ simulates $M_{1}$ on $t$ generating an output tree $r$ of $M_{1}$, which is in the domain of $M_{2}$ because $\mathrm{ran}(\tau_{1})\subseteq\mathrm{dom}(\tau_{2})$. It keeps the state $p$ of $M_{1}$ in its finite state, uses one of its heads to point at a node of $t$ (which it initially moves to the root of $t$), and instead of a regular test $T$ applies the regular test $\{(\#(t,s),1u)\mid(t,u)\in T\}$. (Footnote 16: Note that a node of $t$ has the same label and child number in $t$ and $\#(t,s)$, except when it has child number 1 in $\#(t,s)$, in which case it has child number $0$ or $1$ in $t$, depending on whether or not its parent in $\#(t,s)$ has label $\#$.)

While generating $r$ it guesses a tree fixed point $h:\operatorname{Con}(r)\to T_{\Delta}\cup\{\#\}$ of the regular tree grammar $G_{M_{2},r}$ such that $h(\langle q_{0},\mathrm{root}_{r}\rangle)=s$ . If that fixed point can be guessed, then $\tau_{2}(r)=s$ by Lemma 61, and hence $(t,s)\in\tau$ .

Initially, $M$ guesses the values under $h$ of the configurations in $\operatorname{Con}(r)$ that contain the root of $r$ , in linear time. For each $q\in Q$ the value of $\langle q,\mathrm{root}_{r}\rangle$ is guessed by nondeterministically moving a reading head named $(q,{\rm stay})$ to a node $x$ of $s$ , i.e., node $2x$ of $\#(t,s)$ , meaning that $h(\langle q,\mathrm{root}_{r}\rangle)=s|_{x}$ , or to node $\#$ , meaning that $h(\langle q,\mathrm{root}_{r}\rangle)=\#$ (i.e., that $h(\langle q,\mathrm{root}_{r}\rangle)$ is “undefined”). In particular, the head $(q_{0},{\rm stay})$ is moved to the root of $s$ , thus guessing that $\tau_{2}(r)=s$ .

Suppose that $M$ is going to produce a node $v$ of $r$ with label $\omega$ , by simulating an output rule $\langle p,\sigma,j,T\rangle\to\omega(\langle p_{1},\alpha_{1}\rangle,\dots,\langle p_{k},\alpha_{k}\rangle)$ of $M_{1}$ . In such a situation, $M$ has already guessed the values under $h$ of the configurations in $\operatorname{Con}(r)$ that contain $v$ , and also of those that contain the parent $v^{\prime}$ of $v$ (if it has one). For each $q\in Q$ the value of $\langle q,v\rangle$ is stored using the reading head named $(q,{\rm stay})$ , as explained above for $v=\mathrm{root}_{r}$ , and the value of $\langle q,v^{\prime}\rangle$ is stored in a similar way using a reading head named $(q,{\rm up})$ . Now $M$ guesses the values of the configurations that contain the children of $v$ , in linear time. For every $q\in Q$ and $i\in[1,k]$ , the value $h(\langle q,vi\rangle)$ is guessed by nondeterministically moving a reading head named $(q,{\rm down}_{i})$ to some node of $s$ or to $\#$ . Then $M$ checks that these values satisfy requirement (3) of a fixed point of $G_{M_{2},r}$ as follows, in linear time. If $\langle q,\omega,\chi(p)\rangle\to\langle q^{\prime},\alpha\rangle$ is a move rule of $M_{2}$ such that head $(q,{\rm stay})$ does not point to $\#$ , then $M$ checks that the heads $(q,{\rm stay})$ and $(q^{\prime},\alpha)$ point to nodes with the same subtree. It can do this using two auxiliary heads that simultaneously perform a depth-first left-to-right traversal of those subtrees. Similarly, if $\langle q,\omega,\chi(p)\rangle\to\delta(\langle q_{1},\alpha_{1}\rangle,\dots,\langle q_{m},\alpha_{m}\rangle)$ is an output rule of $M_{2}$ such that head $(q,{\rm stay})$ does not point to $\#$ , then $M$ checks that it points to a node with label $\delta$ and that the subtree at the $i$ -th child of that node equals the subtree at the head $(q_{i},\alpha_{i})$ , for every $i\in[1,m]$ . 
After checking the fixed point requirement (3), $M$ outputs the node $v$ and branches in the same way as $M_{1}$ . In the $i$ -th branch (apart from simulating $M_{1}$ ’s rule in the obvious way) it moves head $(q,{\rm up})$ to the position of head $(q,{\rm stay})$ and then moves head $(q,{\rm stay})$ to the position of head $(q,{\rm down}_{i})$ , for every $q\in Q$ , in linear time.
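The equality test on two subtrees by two heads that walk in lockstep can be sketched as follows. The tuple encoding of trees, the encoding of nodes as tuples of 1-based child indices, and the function names are assumptions made for illustration; the sketch also assumes a ranked alphabet, so that equal labels imply equal numbers of children. Each head only moves by child and parent steps, as a walking head would.

```python
def subtree(t, u):
    """Subtree of t at node u, where u is a tuple of 1-based child indices."""
    for i in u:
        t = t[i]
    return t

def rank_at(t, u):
    """Number of children of node u in t."""
    return len(subtree(t, u)) - 1

def dfs_step(t, start, u):
    """Next node after u in the depth-first left-to-right walk below start, or None."""
    if rank_at(t, u) > 0:
        return u + (1,)                  # go down to the first child
    while u != start:
        parent, j = u[:-1], u[-1]
        if j < rank_at(t, parent):
            return parent + (j + 1,)     # go to the next sibling
        u = parent                       # otherwise climb towards start
    return None                          # back at start: traversal finished

def same_subtree(t, x, y):
    """Compare the subtrees of t at x and y with two heads moving simultaneously."""
    u, v = x, y
    while True:
        if subtree(t, u)[0] != subtree(t, v)[0]:
            return False
        u, v = dfs_step(t, x, u), dfs_step(t, y, v)
        if u is None or v is None:
            return u is None and v is None
```

The number of steps is linear in the size of the compared subtrees up to the cost of climbing, which matches the role this check plays in the construction.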

This ends the description of $M$ . It should be clear that $\tau_{M}$ is the set of all pairs $(\#(t,s),r)$ such that $(t,r)\in\tau_{1}$ (because $M$ simulates $M_{1}$ ) and $\tau_{2}(r)=s$ (because $M$ computes a tree fixed point $h$ of $G_{M_{2},r}$ such that $h(\langle q_{0},\mathrm{root}_{r}\rangle)=s$ ). Hence $\mathrm{dom}(M)=\{\#(t,s)\mid\exists\,r:(t,r)\in\tau_{1},\,\tau_{2}(r)=s\}=L_{\tau}$ . It remains to show that $M$ accepts $L_{\tau}$ in polynomial time.

There is a computation of $M_{1}$ of length at most $\#(P)\cdot|t|\cdot|r|$ that translates $t$ into $r$ , because if the number of move rules applied between two output rules is more than the number of configurations of $M_{1}$ on $t$ , then there is a loop in the computation that can be removed. Since $(\tau_{1},\tau_{2})$ is linear-bounded, we may assume that the size of $r$ is at most linear in the size of $s$ . Hence the length of that computation is polynomial in $|t|$ and $|s|$ , and hence in $|\#(t,s)|$ . Since $M$ simulates $M_{1}$ , and each simulated computation step takes linear time (as shown above), $M$ accepts $\#(t,s)$ in polynomial time. $\Box$
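The time bound can be made explicit by multiplying the number of simulated steps by the cost per step. The following estimate is only a crude unpacking of the argument above, with $n=|\#(t,s)|=|t|+|s|+1$ and $c$ the constant of the linear bound $|r|\leq c\cdot|s|$; the cubic exponent is merely what this estimate yields and is not a bound stated in the paper.

$$
\#(P)\cdot|t|\cdot|r|\cdot O(n)\;\leq\;\#(P)\cdot|t|\cdot c\cdot|s|\cdot O(n)\;=\;O(n^{3}).
$$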

From Theorem 62 and Lemma 28, which says that $\mbox{\sf mrMT}_{\text{{\sc io}}}\subseteq\mbox{\sf f\,TT}_{\downarrow}\circ\mbox{\sf dTT}$, we obtain the following corollary. Note that $\mbox{\sf TT}\circ\mbox{\sf dTT}$ is larger than $\mbox{\sf f\,TT}_{\downarrow}\circ\mbox{\sf dTT}$ in two respects. First, it contains non-finitary translations. Second, it contains total functions for which the height of the output tree can be double exponential in the height of the input tree, viz. $\tau_{\mathrm{exp}}^{2}$ in the proof of Proposition 7, whereas that is at most exponential for total functions in $\mbox{\sf TT}_{\downarrow}\circ\mbox{\sf dTT}$ by Theorem 35, Lemma 24, and the paragraph after Corollary 25.

Corollary 63

$\mbox{\sf MT}_{\text{{\sc io}}}\subseteq\mbox{\sf mrMT}_{\text{{\sc io}}}\subseteq\mbox{\sf LOGCFL}$.

As another corollary we even obtain an upper bound on the complexity of the output languages of dTT that improves the one of Theorem 50. It was proved for attribute grammars in [25].

Corollary 64

$\mbox{\sf dTT}(\mbox{\sf REGT})\subseteq\mbox{\sf LOGCFL}$ .

**Proof.** Let $L$ be a regular tree language over $\Omega$ and let $\tau_{2}\subseteq T_{\Omega}\times T_{\Delta}$ be in $\mbox{\sf dTT}$. Let $\Sigma=\{e\}$ with $\operatorname{rank}(e)=0$, and let $\tau_{1}=\{(e,r)\mid r\in L\}$. The one-state tt$^{\ell}$ with rules $\langle p,e,0\rangle\to\omega(\langle p,{\rm stay}\rangle,\dots,\langle p,{\rm stay}\rangle)$ for every $\omega\in\Omega$ realizes the translation $\{(e,r)\mid r\in T_{\Omega}\}$, and hence $\tau_{1}\in\mbox{\sf TT}$ by Corollary 21. Let $\tau=\tau_{1}\circ\tau_{2}$. Then $L_{\tau}=\{\#(e,s)\mid\exists r:r\in L,\,\tau_{2}(r)=s\}=\{\#(e,s)\mid s\in\tau_{2}(L)\}$. By Theorem 62, $L_{\tau}\in\mbox{\sf LOGCFL}$, and hence $\tau_{2}(L)\in\mbox{\sf LOGCFL}$ because $\tau_{2}(L)$ is log-space reducible to $L_{\tau}$. $\Box$

Theorem 62 and Corollary 64 can be extended to deal with the yields of the output trees, as also proved in [25] for attribute grammars (generalizing the proof in [3] of $\mbox{\sf IO}(1)\subseteq\mbox{\sf LOGCFL}$ ). For a ranked alphabet $\Sigma$ we define the mapping $y_{\Sigma}:T_{\Sigma}\to(\Sigma^{(0)})^{*}$ such that $y_{\Sigma}(t)=yt$ , the yield of $t$ . Let yield be the class of all such mappings $y_{\Sigma}$ . In what follows we will identify each string $w$ with the monadic tree $\operatorname{mon}(w)$ as defined in the proof of Corollary 51. Hence, as mentioned in that proof, $\mbox{\sf yield}\subseteq\mbox{\sf dTT$ {}^{\ell} $}$ . This even holds if we assume the existence of special symbols in $\Sigma^{(0)}$ that are skipped when taking the yield of $t$ (such as the symbols $X_{0}$ in the derivation trees of context-free grammars with $\varepsilon$ -rules, cf. Section 2).

Corollary 65

$\mbox{\sf TT}\circ\mbox{\sf dTT}\circ\mbox{\sf yield}\subseteq\mbox{\sf LOGCFL}$ and $y\mbox{\sf dTT}(\mbox{\sf REGT})\subseteq\mbox{\sf LOGCFL}$.

**Proof. **It is straightforward to show that $\mbox{\sf yield}\subseteq\mbox{\sf dTT$ {}_{\mathrm{pru}} $}\ast\mbox{\sf yield}$ . In fact, the deterministic pruning tt removes all nodes of rank 1 and, using regular look-ahead, all subtrees of which the yield is the empty string $\varepsilon$ (due to the special symbols mentioned above). Consequently, as in the proof of Corollary 38, $\mbox{\sf TT}\circ\mbox{\sf dTT}\circ\mbox{\sf yield}=\mbox{\sf TT}\ast(\mbox{\sf dTT}\circ\mbox{\sf yield})$ . This allows us to repeat the proof of Theorem 62, this time with respect to the forward deterministic context-free grammar $G^{\prime}_{M_{2},r}$ that generates the yields of the trees generated by $G_{M_{2},r}$ : if $X\to\zeta$ is a rule of $G_{M_{2},r}$ , then $X\to y\zeta$ is a rule of $G^{\prime}_{M_{2},r}$ . Thus, this time the mhtt $M$ guesses a fixed point $h$ of $G^{\prime}_{M_{2},r}$ , rather than a tree fixed point. To do this it uses two heads $\langle q,{\rm stay},\text{left}\rangle$ and $\langle q,{\rm stay},\text{right}\rangle$ instead of the one head $\langle q,{\rm stay}\rangle$ , to guess the left- and right-end of the substring generated by the configuration $\langle q,v\rangle$ , and similarly for ${\rm up}$ and ${\rm down}_{i}$ . It should be clear that the fixed point requirement (3) can easily be checked, showing that one such substring equals another one or is the concatenation of several other ones. $\Box$

The inclusion $\mbox{\sf TT}\circ\mbox{\sf dTT}\subseteq\mbox{\sf LOGCFL}$ of Theorem 62 has consequences for both space and time complexity. We first consider space complexity.

Since $\mbox{\sf LOGCFL}\subseteq\mbox{\sf DSPACE}(n)$ , we obtain that $\mbox{\sf TT}\subseteq\mbox{\sf DSPACE}(n)$ from Theorem 62. This can easily be generalized to arbitrary compositions of tt’s.

Theorem 66

For every $k\geq 1$ , $\mbox{\sf TT}^{k}\subseteq\mbox{\sf DSPACE}(n)$ .

**Proof.** The proof is by induction on $k$, with an induction step similar to the one in the proof of Theorem 47.

Let $\tau=\tau_{1}\circ\tau_{2}$ such that $\tau_{1}\in\mbox{\sf TT}$ and $\tau_{2}\in\mbox{\sf TT}^{k}$ , $k\geq 1$ . For a given input string $\#ts$ it has to be checked whether $(t,s)\in\tau$ . By Corollary 38(1) we may assume that $(\tau_{1},\tau_{2})$ is linear-bounded. Hence there is a constant $c\in{\mathbb{N}}$ such that for every $(t,s)\in\tau$ there is an intermediate tree $r$ such that $|r|\leq c\cdot|s|$ . To check whether $(t,s)\in\tau$ a deterministic Turing machine systematically enumerates all trees $r$ such that $|r|\leq c\cdot|s|$ (cf. the proof of Theorem 50). For each such $r$ it can check in linear space whether $(t,r)\in\tau_{1}$ by the case $k=1$ . Moreover, by induction it can check in linear space whether $(r,s)\in\tau_{2}$ . Thus it uses space $O(|t|+|r|)+O(|r|+|s|)=O(|t|+|s|)$ . $\Box$
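The enumeration step of this proof can be pictured with a small Python sketch. It is only an illustration under stated assumptions: trees are nested tuples, the membership tests `in_tau1` and `in_tau2` are hypothetical oracles supplied by the caller (standing in for the linear-space checks of the proof), and Python itself gives no space guarantee. The point is the control structure: every candidate intermediate tree $r$ with $|r|\leq c\cdot|s|$ is tried in turn.

```python
from itertools import product

def size(tree):
    """Number of nodes of a tree given as a nested tuple (label, child, ...)."""
    return 1 + sum(size(child) for child in tree[1:])

def compositions(total, parts):
    """All ways to write `total` as an ordered sum of `parts` positive integers."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def trees_of_size(n, ranks):
    """All trees with exactly n nodes over the ranked alphabet `ranks` (label -> rank)."""
    for label, rank in ranks.items():
        if rank == 0:
            if n == 1:
                yield (label,)
            continue
        for sizes in compositions(n - 1, rank):
            child_lists = [list(trees_of_size(m, ranks)) for m in sizes]
            for children in product(*child_lists):
                yield (label,) + children

def in_composition(t, s, in_tau1, in_tau2, ranks, c):
    """Decide whether (t, s) is in tau1 composed with tau2, assuming the pair
    is linear-bounded with constant c: try every r with |r| <= c * |s|."""
    bound = c * size(s)
    for n in range(1, bound + 1):
        for r in trees_of_size(n, ranks):
            if in_tau1(t, r) and in_tau2(r, s):
                return True
    return False

# Toy instance (an assumption of this sketch): monadic trees f^k e,
# tau1 doubles the number of f's and tau2 is the identity.
def mon(k):
    tree = ("e",)
    for _ in range(k):
        tree = ("f", tree)
    return tree

def doubling(t, r):
    return size(r) - 1 == 2 * (size(t) - 1)

def identity(r, s):
    return r == s

RANKS = {"f": 1, "e": 0}
```

With these toy oracles, `in_composition(mon(1), mon(2), doubling, identity, RANKS, 1)` succeeds, whereas the same call with `mon(3)` as output tree fails.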

This result allows us to prove one of our main results, viz. that the output languages of $\mbox{\sf TT}^{k}$ are in $\mbox{\sf DSPACE}(n)$ , originally proved in [48]. It generalizes the main result of [4] from classical top-down tree transducers to tree-walking tree transducers and macro tree transducers.

Theorem 67

For every $k\geq 1$ ,

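$\mbox{\sf TT}^{k}(\mbox{\sf REGT})\subseteq\mbox{\sf DSPACE}(n)$ and $\mbox{\sf MT}^{k}(\mbox{\sf REGT})\subseteq\mbox{\sf DSPACE}(n)$.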

**Proof. **The proof is similar to the one of Theorem 50. Let $L\in\mbox{\sf REGT}$ and $\tau\in\mbox{\sf TT}^{k}$ . By Corollary 38(1), $\tau=\tau_{1}\circ\tau_{2}$ such that $\tau_{1}\in\mbox{\sf TT$ {}_{\mathrm{pru}} $}$ , $\tau_{2}\in\mbox{\sf TT}^{k}$ , and $(\tau_{1},\tau_{2})$ is linear-bounded for some constant $c$ . Let $L^{\prime}=\tau_{1}(L)$ , and note that $\tau(L)=\tau_{2}(L^{\prime})$ and that $L^{\prime}\in\mbox{\sf REGT}$ by Lemma 15. It is straightforward to show that for every $s\in\tau(L)$ there exists $t\in L^{\prime}$ such that $(t,s)\in\tau_{2}$ and $|t|\leq c\cdot|s|$ . To check whether a given tree $s$ is in $\tau(L)$ , a deterministic Turing machine enumerates all input trees $t$ (of $\tau_{2}$ ) such that $|t|\leq c\cdot|s|$ . For each such $t$ it first checks that $t\in L^{\prime}$ in space $O(|t|)=O(|s|)$ . Then it uses the algorithm of Theorem 66 to check that $(t,s)\in\tau_{2}$ in space $O(|t|+|s|)=O(|s|)$ .

The inclusion for $\mbox{\sf MT}^{k}$ is now immediate from Lemma 27. $\Box$

As before, Theorems 66 and 67 also hold for io (multi-return) macro tree translations, pebble tree translations, and high-level tree translations, which can be realized by compositions of tt’s (see Section 9).

By the proof of Corollary 51, Theorem 67 implies that

[TABLE]

for every $k\geq 1$ . Hence the oi-hierarchy is also contained in $\mbox{\sf DSPACE}(n)$ , cf. Corollaries 59 and 52.

Corollary 68

For every $k\geq 1$ , $\mbox{\sf OI}(k)\subseteq\mbox{\sf DSPACE}(n)$ .

Next we consider time complexity. Since $\mbox{\sf LOGCFL}\subseteq\mbox{\sf PTIME}$ , it follows from Theorem 62 that $\mbox{\sf TT}\circ\mbox{\sf dTT}\subseteq\mbox{\sf PTIME}$ . This result can be generalized as follows.

One way to increase the power of the tt is to give it a more powerful feature of look-around. For a class ${\cal L}$ of tree or string languages, we define the tt with ${\cal L}$ look-around by allowing the tt to use node tests $T$ such that $\operatorname{mark}(T)\in{\cal L}$ . Similarly we obtain the mhtt with ${\cal L}$ look-around. We now consider in particular the case where ${\cal L}=\mbox{\sf PTIME}$ . Obviously, (the proof of) the first sentence of Lemma 60 is still valid for a multi-head tt $M$ with PTIME look-around. Thus, the domain of an mhtt with PTIME look-around is in PTIME, and hence, in particular, the domain of a tt with PTIME look-around is in PTIME. This implies that Lemma 19, and hence Theorem 20, also holds if the first transducer has PTIME look-around. From the proof of Theorem 62 it now easily follows that $\mbox{\sf TT$ {}^{\text{P}} $}\circ\mbox{\sf dTT}\subseteq\mbox{\sf PTIME}$ , where the feature of PTIME look-around is indicated by a superscript P. This, in its turn, implies the following variant of Corollary 63 for (multi-return) io macro tree transducers with PTIME look-around (appropriately defined): $\mbox{\sf MT$ {}{\text{{\sc io}}}^{\text{P}} $}\subseteq\mbox{\sf mrMT$ {}{\text{{\sc io}}}^{\text{P}} $}\subseteq\mbox{\sf PTIME}$ . Examples of tree languages in PTIME that can be used as look-around are those in $\mbox{\sf dTT}(\mbox{\sf REGT})$ , by Corollary 64, and the tree languages defined by bottom-up tree automata with equality and disequality constraints ([8]), which can obviously be accepted by a multi-head tt.

In the remainder of this section we show that there are translations in $\mbox{\sf dTT}\circ\mbox{\sf TT}$ , even in $\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf TT}$ , for which the membership problem is NP-complete. We will use a reduction of SAT, the satisfiability problem of boolean formulas (see, e.g., [42]), to such a membership problem.

Let $\Delta=\{\vee,\wedge,\neg,\mathsf{v},\mathsf{e}\}$ with $\Delta^{(2)}=\{\vee,\wedge\}$ , $\Delta^{(1)}=\{\neg,\mathsf{v}\}$ , and $\Delta^{(0)}=\{\mathsf{e}\}$ . Let ${\cal B}$ be the set of all trees over $\Delta$ generated by the regular tree grammar with nonterminals $F$ and $V$ , initial nonterminal $F$ , and rules $F\to\vee(F,F)$ , $F\to\wedge(F,F)$ , $F\to\neg(F)$ , $F\to V$ , $V\to\mathsf{v}(V)$ , and $V\to\mathsf{v}(\mathsf{e})$ . Thus, ${\cal B}$ is the set of all boolean formulas that use boolean variables of the form $\mathsf{v}^{\ell}\mathsf{e}$ for $\ell\geq 1$ . For a boolean formula $\varphi$ we define $\nu(\varphi)$ to be the nesting-depth of its boolean operators, i.e., $\nu(\varphi)=0$ if $\varphi$ is a variable, $\nu(\vee(\varphi_{1},\varphi_{2}))=\nu(\wedge(\varphi_{1},\varphi_{2}))=\max\{\nu(\varphi_{1}),\nu(\varphi_{2})\}+1$ , and $\nu(\neg(\varphi))=\nu(\varphi)+1$ . For every $m\geq 0$ and $n\geq 1$ , let ${\cal B}(m,n)$ be the set of all formulas $\varphi\in{\cal B}$ such that $\nu(\varphi)\leq m$ , and $\ell\in[1,n]$ for every $\mathsf{v}^{\ell}\mathsf{e}$ that occurs in $\varphi$ . Thus, the formulas in ${\cal B}(m,n)$ have nesting-depth at most $m$ and use at most the variables $\mathsf{v}\mathsf{e},\mathsf{v}\mathsf{v}\mathsf{e},\dots,\mathsf{v}^{n}\mathsf{e}$ .
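The definitions of $\nu$ and ${\cal B}(m,n)$ translate directly into code. The following Python sketch is an illustration only; the tuple representation of formulas and the function names are assumptions of the sketch, not notation of the text. A variable $\mathsf{v}^{\ell}\mathsf{e}$ is represented by `("var", l)` with `l >= 1`.

```python
# Formulas as nested tuples (a representation chosen for this sketch):
#   ("or", f, g), ("and", f, g), ("not", f), ("var", l)  with l >= 1.

def nesting_depth(phi):
    """nu(phi): the nesting-depth of the boolean operators of phi."""
    op = phi[0]
    if op == "var":
        return 0
    if op == "not":
        return nesting_depth(phi[1]) + 1
    if op in ("or", "and"):
        return max(nesting_depth(phi[1]), nesting_depth(phi[2])) + 1
    raise ValueError(f"unknown operator: {op!r}")

def max_variable_index(phi):
    """Largest l such that the variable v^l e occurs in phi."""
    if phi[0] == "var":
        return phi[1]
    return max(max_variable_index(sub) for sub in phi[1:])

def in_B(phi, m, n):
    """Membership of a well-formed formula phi in B(m, n)."""
    return nesting_depth(phi) <= m and max_variable_index(phi) <= n

# Example: (not v^2 e) or (v^1 e and v^3 e), of nesting-depth 2.
example = ("or", ("not", ("var", 2)), ("and", ("var", 1), ("var", 3)))
```

For the example formula, `in_B(example, 2, 3)` holds, while `in_B(example, 1, 3)` and `in_B(example, 2, 2)` do not.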

The proof of the next lemma is essentially a variant of the one of [74, Theorem 3.1]. Let $\Sigma=\{c,d,0,1,a\}$ with $\Sigma^{(1)}=\{c,d,0,1\}$ and $\Sigma^{(0)}=\{a\}$ .

Lemma 69

There is a translation $\tau\in\mbox{\sf f\,TT$ {}^{\ell}_{\downarrow} $}$ such that, for every $m\geq 0$ and every string $w\in\{0,1\}^{*}$ of length $n\geq 1$ , the set $\tau(d^{m}cwa)$ consists of all boolean formulas $\varphi\in{\cal B}(m,n)$ such that $\varphi$ is true when the value of $\mathsf{v}^{\ell}\mathsf{e}$ is the $\ell$ -th symbol of $w$ for every $\ell\in[1,n]$ .

**Proof.** We construct the top-down local tt $M=(\Sigma,\Delta,\{q_{0},q_{1}\},\{q_{1}\},R)$. Note that the initial state is $q_{1}$. The boolean operations $i\vee j$, $i\wedge j$, and $\neg\,i$ on $\{0,1\}$ are defined as usual, where $0$ stands for ‘false’ and $1$ for ‘true’. Since the child numbers of the nodes of the input tree will be irrelevant, we omit them from the left-hand sides of the rules of $M$. The only instruction used in the right-hand sides of the rules is $\alpha={\rm down}_{1}$. The rules are the following, for every $i,j\in\{0,1\}$.

[TABLE]

Let $u$ be the node of the input tree $t=d^{m}cwa$ with label $c$ . After consuming $d^{m}$ , the tt $M$ has nondeterministically generated any output form that is a boolean formula $\varphi$ of nesting-depth at most $m$ and with the two configurations $\langle q_{i},u\rangle$ as variables, such that $\varphi$ is true when the value of $\langle q_{i},u\rangle$ is $i$ . For instance, in the first step of that computation $M$ consumes $d$ and changes the initial output form $\langle q_{1},\mathrm{root}_{t}\rangle$ into one of the output forms $\vee(\langle q_{1},x\rangle,\langle q_{0},x\rangle)$ , $\vee(\langle q_{0},x\rangle,\langle q_{1},x\rangle)$ , $\vee(\langle q_{1},x\rangle,\langle q_{1},x\rangle)$ , $\wedge(\langle q_{1},x\rangle,\langle q_{1},x\rangle)$ , $\neg(\langle q_{0},x\rangle)$ , or $\langle q_{1},x\rangle$ , where $x$ is the child of $\mathrm{root}_{t}$ . After that, each $\langle q_{i},u\rangle$ generates any variable $\mathsf{v}^{\ell}\mathsf{e}$ such that the $\ell$ -th symbol of $w$ is $i$ . Note that since $i$ and $j$ are not necessarily distinct, $M$ has in particular the rule $\langle q_{i},i\rangle\to\mathsf{v}(\langle q_{i},\alpha\rangle)$ for every $i\in\{0,1\}$ . Thus, $q_{i}$ can nondeterministically choose any occurrence of $i$ in $w$ to output $\mathsf{e}$ and end the computation. $\Box$
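The behaviour of $M$ described in this proof can be simulated on small inputs. The Python sketch below is hedged accordingly: the rule set it encodes is reconstructed from the prose of the proof and is an assumption of the sketch, not a copy of the rule table. In state $q_{k}$ at a $d$-labeled node it may skip the node, or emit $\vee$, $\wedge$, or $\neg$ with argument states $q_{i},q_{j}$ such that $i\vee j=k$, $i\wedge j=k$, or $\neg\,i=k$; at $c$ it moves down; at a symbol of $w$ it emits $\mathsf{v}$ and continues, or emits $\mathsf{v}(\mathsf{e})$ if the symbol equals the index of the state. Formulas are nested tuples, with `("var", l)` for $\mathsf{v}^{\ell}\mathsf{e}$.

```python
def outputs(state, pos, tree):
    """All output trees generated from the configuration <q_state, node pos>.
    `tree` is the monadic input d^m c w a written as a string."""
    label = tree[pos]
    if label == "a":
        return set()                      # no rule applies at the leaf a
    sub = {i: outputs(i, pos + 1, tree) for i in (0, 1)}
    if label == "c":
        return set(sub[state])            # <q_i, c> -> <q_i, down>
    if label in ("0", "1"):
        # <q_i, j> -> v(<q_i, down>): one more v in front of the variable
        result = {("var", l + 1) for (_, l) in sub[state]}
        if int(label) == state:
            result.add(("var", 1))        # <q_i, i> -> v(e)
        return result
    if label == "d":
        result = set(sub[state])          # skip this d
        for i in (0, 1):
            if 1 - i == state:
                result |= {("not", f) for f in sub[i]}
            for j in (0, 1):
                pairs = {(f, g) for f in sub[i] for g in sub[j]}
                if (i | j) == state:
                    result |= {("or", f, g) for f, g in pairs}
                if (i & j) == state:
                    result |= {("and", f, g) for f, g in pairs}
        return result
    raise ValueError(f"unexpected label: {label!r}")

def translate(m, w):
    """The set tau(d^m c w a), starting in the initial state q_1."""
    return outputs(1, 0, "d" * m + "c" + w + "a")

def evaluate(phi, w):
    """Truth value of phi when v^l e has the value of the l-th symbol of w."""
    op = phi[0]
    if op == "var":
        return w[phi[1] - 1] == "1"
    if op == "not":
        return not evaluate(phi[1], w)
    if op == "or":
        return evaluate(phi[1], w) or evaluate(phi[2], w)
    return evaluate(phi[1], w) and evaluate(phi[2], w)
```

For $m=1$ and $w=01$ the sketch yields six formulas, matching the six output forms listed for the first step, and every formula in `translate(2, "01")` evaluates to true under the valuation $w$.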

Applying the translation $\tau$ of Lemma 69 to the regular tree language $L$ consisting of all trees $d^{m}cwa$ such that $m\geq 0$ and $w$ is a nonempty string over $\{0,1\}$, produces the set $\tau(L)$ of all satisfiable formulas in ${\cal B}$. Thus, since the membership problem for that set is NP-complete, we obtain the following corollary that was proved in [67], as already mentioned after Corollary 59. Note that it is easy to prove that $\mbox{\sf f\,TT$ {}_{\downarrow} $}(\mbox{\sf REGT})\subseteq y\mbox{\sf f\,TT$ {}_{\downarrow} $}(\mbox{\sf REGT})$: just change every output rule $\langle q,\sigma,j,T\rangle\to\delta(\langle q_{1},\alpha_{1}\rangle,\dots,\langle q_{k},\alpha_{k}\rangle)$ into the (general) rule $\langle q,\sigma,j,T\rangle\to\omega_{k+1}(\delta,\langle q_{1},\alpha_{1}\rangle,\dots,\langle q_{k},\alpha_{k}\rangle)$ where $\omega_{k+1}$ has rank $k+1$ (and $\delta$ now has rank $0$).

Corollary 70

There is an NP-complete language in $\mbox{\sf f\,TT$ {}_{\downarrow} $}(\mbox{\sf REGT})$, and hence there is one in $y\mbox{\sf f\,TT$ {}_{\downarrow} $}(\mbox{\sf REGT})$.

We now prove the existence of a translation in $\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf TT}$ for which the membership problem is NP-complete. Recall that, for a tree translation $\tau$ , we denote by $L_{\tau}$ the tree language $\{\#(t,s)\mid(t,s)\in\tau\}$ .

Theorem 71

There is a translation $\tau\in\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf dTT$ {}^{\ell} $}\circ\mbox{\sf TT$ {}^{\ell}_{\mathrm{pru}} $}\subseteq\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf f\,TT}$ such that $L_{\tau}$ is NP-complete.

**Proof.** The inclusion $\mbox{\sf dTT$ {}^{\ell} $}\circ\mbox{\sf TT$ {}^{\ell}_{\mathrm{pru}} $}\subseteq\mbox{\sf f\,TT}$ is immediate from Lemma 19. We first describe a translation $\tau\in\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf f\,TT$ {}^{\ell} $}$ such that $L_{\tau}$ is NP-complete. Let $\Gamma=\{a,b,c,d,e\}$ with $\Gamma^{(1)}=\{a,b,c,d\}$ and $\Gamma^{(0)}=\{e\}$. The translation $\tau\subseteq T_{\Gamma}\times T_{\Delta}$ transforms each tree $t=ab^{n}cd^{m}e$ into all satisfiable boolean formulas in ${\cal B}(m,n)$. This will be realized by the composition of two tt’s $M_{1}$ and $M_{2}$ such that the deterministic tt $M_{1}$ transforms $t$ into a tree $s$ of which the path language$^{17}$ consists of all strings $awcd^{m}e$ with $w\in\{0,1\}^{*}$ of length $n$, and $M_{2}$ nondeterministically chooses a leaf of $s$ and then walks back to the root of $s$ while simulating the transducer $M$ of (the proof of) Lemma 69 on the tree $d^{m}cwa\in T_{\Sigma}$. Thus, $M_{1}$ provides all possible valuations of the variables $\mathsf{v}\mathsf{e},\mathsf{v}\mathsf{v}\mathsf{e},\dots,\mathsf{v}^{n}\mathsf{e}$ and $M_{2}$ chooses one such valuation and produces all formulas in ${\cal B}(m,n)$ that are true for that valuation.

$^{17}$ The path language of a tree $s\in T_{\Omega}$ consists of all strings in $\Omega^{*}$ that are obtained by walking along a path from the root of $s$ to one of its leaves, writing down the labels of the nodes of that path from left to right.

Let $\Omega=\{a,0,1,c,d,e\}$ with $\Omega^{(2)}=\{a,0,1\}$, $\Omega^{(1)}=\{c,d\}$, and $\Omega^{(0)}=\{e\}$. We define $\tau=\tau_{M_{1}}\circ\tau_{M_{2}}\subseteq T_{\Gamma}\times T_{\Delta}$ where $M_{1}$ and $M_{2}$ are the following tt’s. The deterministic tt$^{\ell}_{\downarrow}$ $M_{1}=(\Gamma,\Omega,\{q,q_{0},q_{1},p\},\{q\},R_{1})$ has the following rules, for $i\in\{0,1\}$ and $\alpha={\rm down}_{1}$.

[TABLE]

It should be clear that for an input tree $ab^{n}cd^{m}e$ , with $m\geq 0$ and $n\geq 1$ , the path language of the tree $\tau_{M_{1}}(ab^{n}cd^{m}e)$ consists of all strings $awcd^{m}e$ with $w\in\{0,1\}^{*}$ of length $n$ .

The ttℓ $M_{2}=(\Omega,\Delta,Q,Q_{0},R_{2})$ has states $Q=\{q_{2},q_{0},q_{1}\}$ and $Q_{0}=\{q_{2}\}$ . On an input tree $\tau_{M_{1}}(ab^{n}cd^{m}e)$ , it walks nondeterministically in state $q_{2}$ from the root to some leaf (without producing output), moves to the parent of that leaf, and then simulates the transducer $M$ of Lemma 69 on the tree $d^{m}cwa\in T_{\Sigma}$ while walking back to the root. It starts that simulation in the state $q_{1}$ of $M$ , and then uses the rules of $M$ with $\alpha={\rm up}$ .

With this definition of $M_{1}$ and $M_{2}$ , it follows from Lemma 69 that the set $\tau(ab^{n}cd^{m}e)$ consists of all boolean formulas $\varphi\in{\cal B}(m,n)$ such that $\varphi$ is satisfiable. Thus, for a formula $\varphi\in{\cal B}(m,n)$ , $\varphi$ is satisfiable if and only if $\#(ab^{n}cd^{m}e,\varphi)$ is in $L_{\tau}$ . This shows that satisfiability is reducible to membership in $L_{\tau}$ , because the nesting-depth $m$ of $\varphi$ and the number $n$ of variables it uses, can easily be computed from any $\varphi\in{\cal B}$ in polynomial time.

We finally show that $\tau_{M_{2}}\in\mbox{\sf dTT$ {}^{\ell} $}\circ\mbox{\sf TT$ {}^{\ell}_{\mathrm{pru}} $}$ , by a standard technique (see, e.g., [34, Section 6.1]). In fact, we will show that $\tau_{M_{2}}\in\mbox{\sf dTT$ {}^{\ell} $}\circ\mbox{\sf SET}$ , cf. the proof of Lemma 27. Let $+$ be a new symbol of rank 2, and $\theta$ a new symbol of rank 0. Let $M_{2}^{\prime}$ be the deterministic ttℓ with output alphabet $\Delta\cup\{+,\theta\}$ that is obtained from $M_{2}$ as follows. For every triple $\langle q,\omega,j\rangle$ such that $q\in Q$ , $\omega\in\Omega$ , and $j\in[0,{\mathit{m}x}_{\Omega}]$ , if $\langle q,\omega,j\rangle\to\zeta_{1},\dots,\langle q,\omega,j\rangle\to\zeta_{r}$ are all the rules of $M_{2}$ with left-hand side $\langle q,\omega,j\rangle$ , then $M^{\prime}_{2}$ has the rule $\langle q,\omega,j\rangle\to+(\zeta_{1},+(\zeta_{2},\zeta_{3}))$ if $r=3$ , the rule $\langle q,\omega,j\rangle\to+(\zeta_{1},\zeta_{2})$ if $r=2$ , the rule $\langle q,\omega,j\rangle\to\zeta_{1}$ if $r=1$ , and the rule $\langle q,\omega,j\rangle\to\theta$ if $r=0$ . Let $M_{3}$ be the pruning tt with one state $p$ and rules $\langle p,\delta,j\rangle\to\delta(\langle p,{\rm down}_{1}\rangle,\dots,\langle p,{\rm down}_{k}\rangle)$ for every $\delta\in\Delta^{(k)}$ , plus the rules $\langle p,+,j\rangle\to\langle p,{\rm down}_{1}\rangle$ and $\langle p,+,j\rangle\to\langle p,{\rm down}_{2}\rangle$ (for every child number $j$ ). Since $M_{2}$ first moves from the root to a leaf, and then moves back to the root, it does not have infinite computations. From that it should be clear that $\tau_{M_{2}}=\tau_{M^{\prime}_{2}}\circ\tau_{M_{3}}$ . $\Box$
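The effect of the pruning transducer $M_{3}$ on the output of $M^{\prime}_{2}$ can be pictured as follows. This Python sketch is an illustration only: it computes, by plain recursion rather than by a tree-walking transducer, the set of trees over $\Delta$ that $M_{3}$ can produce from a tree with choice nodes $+$ and dead ends $\theta$. The tuple representation and the label strings `"+"` and `"theta"` are assumptions of the sketch.

```python
from itertools import product

def prune_choices(tree):
    """All trees obtainable by replacing each '+' node with one of its two
    children; a 'theta' leaf yields nothing, since M_3 has no rule for it."""
    label, *children = tree
    if label == "+":
        return prune_choices(children[0]) | prune_choices(children[1])
    if label == "theta":
        return set()
    results = set()
    # An ordinary symbol is copied; its children are resolved independently.
    for combo in product(*(prune_choices(child) for child in children)):
        results.add((label,) + combo)
    return results

V1 = ("v", ("e",))                                   # the variable v e
choice = ("+", ("not", V1), ("+", ("theta",), V1))   # three alternatives
```

Here `prune_choices(choice)` contains exactly the two trees `("not", V1)` and `V1`; the alternative ending in $\theta$ contributes nothing, just as a configuration of $M_{2}$ without applicable rule produces no output.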

Corollary 72

There is a translation $\tau\in\mbox{\sf MT}$ such that $L_{\tau}$ is NP-complete.

**Proof.** By Lemma 24, $\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf dTT$ {}^{\ell} $}\subseteq\mbox{\sf dMT}$. Moreover, by [34, Theorem 7.6(3)], $\mbox{\sf dMT}\circ\mbox{\sf TT$ {}^{\ell}_{\mathrm{pru}} $}\subseteq\mbox{\sf MT}$. Hence the translation $\tau$ of Theorem 71 is in MT. $\Box$

Since $\mbox{\sf MT}\subseteq\mbox{\sf MT$ {}^{2}_{\text{{\sc io}}} $}$ by [34, Theorem 6.10], this also shows that there is a translation $\tau\in\mbox{\sf MT$ {}^{2}_{\text{{\sc io}}} $}$ such that $L_{\tau}$ is NP-complete, cf. Corollary 63.

13 Forest Transducers

Whereas we have considered ranked trees until now, i.e., trees over a ranked alphabet, XML documents naturally correspond to unranked trees or forests, over an ordinary unranked alphabet. For that reason we now consider transducers that transform forests into forests. Rather than generalizing the tt to a “forest-walking forest transducer”, we take the equivalent, natural approach of letting the tt transform representations of forests by (ranked) trees, cf. [63] and [28, Section 11].

For an ordinary (unranked) alphabet $\Sigma$ the set $F_{\Sigma}$ of forests over $\Sigma$ is the language generated by the context-free grammar with nonterminals $F$ and $T$, initial nonterminal $F$, set of terminals $\Sigma\cup\{[\,,]\}$, where $\{[\,,]\}$ is the set consisting of the left and right square bracket, and rules $F\to\varepsilon$, $F\to TF$, and $T\to\sigma[F]$ for every $\sigma\in\Sigma$. Thus, intuitively, a forest is a sequence of unranked trees, and an unranked tree is of the form $\sigma[t_{1}\cdots t_{n}]$ where each $t_{i}$ is an unranked tree. Note that every nonempty forest $f\in F_{\Sigma}$ can be uniquely written as $f=\sigma[f_{1}]f_{2}$ with $\sigma\in\Sigma$ and $f_{1},f_{2}\in F_{\Sigma}$.

As usual, forests can be encoded as binary trees. With $\Sigma$ we associate the ranked alphabet $\Sigma_{e}=\Sigma\cup\{e\}$ where $e$ has rank 0 and every $\sigma\in\Sigma$ has rank 2. The mapping ${\rm enc}_{\Sigma}:F_{\Sigma}\to T_{\Sigma_{e}}$ is defined as follows. The encoding of the empty forest is ${\rm enc}_{\Sigma}(\varepsilon)=e$ , and recursively, the encoding of a forest $f=\sigma[f_{1}]f_{2}$ is ${\rm enc}_{\Sigma}(f)=\sigma({\rm enc}_{\Sigma}(f_{1}),{\rm enc}_{\Sigma}(f_{2}))$ . The mapping ${\rm enc}_{\Sigma}$ is a bijection, and the inverse decoding is denoted by ${\rm dec}_{\Sigma}$ . Let enc and dec denote the classes of encodings ${\rm enc}_{\Sigma}$ and decodings ${\rm dec}_{\Sigma}$ , respectively, for all alphabets $\Sigma$ . We define $\mbox{\sf FT}=\mbox{\sf enc}\circ\mbox{\sf TT}\circ\mbox{\sf dec}$ to be the class of tt forest translations. Thus, a tt forest translation is of the form $\tau={\rm enc}_{\Sigma}\circ\tau_{M}\circ{\rm dec}_{\Delta}$ where $\Sigma$ and $\Delta$ are alphabets and $M$ is a tt with input alphabet $\Sigma_{e}$ and output alphabet $\Delta_{e}$ , which in this context can be called a tt forest transducer. We first restrict attention to deterministic tt forest transducers, i.e., to the class $\mbox{\sf dFT}=\mbox{\sf enc}\circ\mbox{\sf dTT}\circ\mbox{\sf dec}$ .
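The encoding and decoding are easily made concrete. In the following Python sketch, which is an illustration and not part of the formal development, a forest is a list of pairs `(label, children)` where `children` is again a forest, and a tree over $\Sigma_{e}$ is a nested tuple; both representations and the function names are assumptions of the sketch.

```python
E = ("e",)  # the leaf e that encodes the empty forest

def enc(forest):
    """enc(epsilon) = e and enc(sigma[f1] f2) = sigma(enc(f1), enc(f2))."""
    if not forest:
        return E
    (label, children), *rest = forest
    return (label, enc(children), enc(rest))

def dec(tree):
    """Inverse of enc: first child holds the children, second the siblings."""
    if tree == E:
        return []
    label, first, second = tree
    return [(label, dec(first))] + dec(second)

# The forest a[b[]] c[] and its encoding a(b(e,e), c(e,e)).
forest = [("a", [("b", [])]), ("c", [])]
tree = enc(forest)
```

Decoding the encoded forest gives back the original, in line with the statement that ${\rm enc}_{\Sigma}$ is a bijection with inverse ${\rm dec}_{\Sigma}$.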

The next simple lemma shows that the encodings of compositions are the compositions of encodings (of deterministic tt’s).

Lemma 73

For every $k\geq 1$ , $\mbox{\sf dFT}^{k}=\mbox{\sf enc}\circ\mbox{\sf dTT}^{k}\circ\mbox{\sf dec}$ .

**Proof. **The inclusion $\mbox{\sf dFT}^{k}\subseteq\mbox{\sf enc}\circ\mbox{\sf dTT}^{k}\circ\mbox{\sf dec}$ is obvious, because ${\rm dec}_{\Delta}\circ{\rm enc}_{\Delta}$ is the identity on $T_{\Delta_{e}}$ for every (unranked) alphabet $\Delta$ . To show that $\mbox{\sf enc}\circ\mbox{\sf dTT}^{k}\circ\mbox{\sf dec}\subseteq\mbox{\sf dFT}^{k}$ , it suffices to prove that $\mbox{\sf dTT}\circ\mbox{\sf dTT}\subseteq\mbox{\sf dTT}\circ\mbox{\sf dec}\circ\mbox{\sf enc}\circ\mbox{\sf dTT}$ . Let $\Gamma$ be the (ranked) output alphabet of a first transducer, which is also the input alphabet of the second, and let $\mathrm{id}_{\Gamma}$ be the identity on $T_{\Gamma}$ . By the composition results of Theorems 18 and 23, it now suffices to show that $\mathrm{id}_{\Gamma}\in\mbox{\sf dTT$ {}^{\ell}{\downarrow} $}\circ\mbox{\sf dec}\circ\mbox{\sf enc}\circ\mbox{\sf dTT$ {}^{\ell}{\mathrm{su}} $}$ . We do this by encoding the trees over $\Gamma$ as binary trees, similar to the transformation of the derivation trees of a context-free grammar into those of its Chomsky Normal Form. Let $\omega$ be a new symbol, and let $\Delta$ be the unranked alphabet $\Gamma\cup\{\omega\}$ . We encode the trees over $\Gamma$ as trees over the ranked alphabet $\Delta_{e}$ , which are the usual encodings of forests over $\Delta$ . The encoding $h:T_{\Gamma}\to T_{\Delta_{e}}$ is defined as follows: for every $\gamma\in\Gamma^{(k)}$ , if $h(t_{i})=t^{\prime}_{i}$ for every $i\in[1,k]$ , then $h(\gamma(t_{1},t_{2},\dots,t_{k}))=\gamma(e,\omega(t^{\prime}_{1},\omega(t^{\prime}_{2},\dots\omega(t^{\prime}_{k},e)\cdots)))$ . It should be clear that $h$ is an injection. It should also be clear that $h\in\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}$ (in fact, $h$ is a tree homomorphism, which can be realized by a classical top-down tree transducer). Finally, it is also easy to construct a local top-down single-use tt $M$ such that $\tau_{M}(h(t))=t$ for every $t\in T_{\Gamma}$ . 
It has the set of states $Q=\{q_{i}\mid i\in[0,{\mathit{m}x}_{\Gamma}]\}$ with initial state $q_{0}$ , and the following rules (where $\gamma\in\Gamma^{(k)}$ , $j\in[0,2]$ , and $q_{i}\in Q$ , $i\neq 1$ ):

[TABLE]

Note that $\gamma$ and $\omega$ have rank 2 in $\Delta_{e}$ . $\Box$
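The encoding $h$ used in this proof, and the fact that it can be undone, can be checked on examples. The Python sketch below is illustrative only: `h_inverse` is a plain recursive decoder and not the local top-down single-use tt $M$ of the proof, and the tuple representation and the label string `"omega"` are assumptions of the sketch.

```python
E = ("e",)

def h(tree):
    """h(gamma(t1,...,tk)) = gamma(e, omega(h(t1), omega(h(t2), ... omega(h(tk), e)...)))."""
    label, *children = tree
    chain = E
    for child in reversed(children):
        chain = ("omega", h(child), chain)
    return (label, E, chain)

def h_inverse(tree):
    """Recover t from h(t) by walking down the chain of omega nodes."""
    label, _, chain = tree
    children = []
    while chain != E:
        _, first, chain = chain
        children.append(h_inverse(first))
    return (label,) + tuple(children)

# A tree g(a, f(a)) over a ranked alphabet with g of rank 2, f of rank 1, a of rank 0.
t = ("g", ("a",), ("f", ("a",)))
```

Every node of `h(t)` has rank 0 or 2, so `h(t)` is a tree over $\Delta_{e}$, and `h_inverse(h(t))` returns `t`.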

We now wish to show that our main results also hold for deterministic tt forest translations. Let us first consider the complexity results of Section 10. It is easy to see that for every alphabet $\Sigma$, the mappings ${\rm enc}_{\Sigma}$ and ${\rm dec}_{\Sigma}$ can be computed by a deterministic Turing machine in linear time and space, simulating a one-way pushdown transducer.$^{18}$

$^{18}$ In fact, ${\rm enc}_{\Sigma}$ can even be computed without pushdown: for every forest $f\in F_{\Sigma}$, ${\rm enc}_{\Sigma}(f)$ can be obtained from $f$ by removing all left-brackets, changing each right-bracket into $e$, and adding one $e$ at the end.

This implies, by Lemma 73, that Theorems 47 and 49 also hold for $\mbox{\sf dFT}^{k}$ . We define a set of forests $L\subseteq F_{\Sigma}$ to be a regular forest language if ${\rm enc}_{\Sigma}(L)\in\mbox{\sf REGT}$ , and we denote the class of regular forest languages by REGF. Then, for every $k\geq 1$ , the class $\mbox{\sf dFT}^{k}(\mbox{\sf REGF})$ of output forest languages is included in the class $\mbox{\sf dec}(\mbox{\sf dTT}^{k}(\mbox{\sf REGT}))$ by Lemma 73. Let $L\in\mbox{\sf REGT}$ and $\tau\in\mbox{\sf dTT}^{k}$ with output alphabet $\Delta_{e}$ . Then a forest $f$ over $\Delta$ is in ${\rm dec}_{\Delta}(\tau(L))$ if and only if ${\rm enc}_{\Delta}(f)$ is in $\tau(L)$ . That implies that Theorem 50 also holds for $\mbox{\sf dFT}^{k}$ , in the sense that $\mbox{\sf dFT}^{k}(\mbox{\sf REGF})\subseteq\mbox{\sf DSPACE}(n)$ .

Next we consider the results of Section 9, and extend the class LSIF in the obvious way to forest translations. Since it is easy to show that for every forest $f\in F_{\Sigma}$ , we have $|{\rm enc}_{\Sigma}(f)|=\frac{2}{3}|f|+1$ (see footnote 18), a translation $\tau^{\prime}={\rm enc}_{\Sigma}\circ\tau\circ{\rm dec}_{\Delta}$ is of linear size increase if and only if $\tau$ is of linear size increase. Thus, since $\mbox{\sf dFT}^{k}=\mbox{\sf enc}\circ\mbox{\sf dTT}^{k}\circ\mbox{\sf dec}$ by Lemma 73, it is decidable for a given composition of deterministic tt forest transducers whether or not it is of linear size increase. And if so, an equivalent deterministic tt forest transducer can be constructed: $\mbox{\sf dFT}^{k}\cap\mbox{\sf LSIF}=\mbox{\sf enc}\circ(\mbox{\sf dTT}^{k}\cap\mbox{\sf LSIF})\circ\mbox{\sf dec}=\mbox{\sf enc}\circ\mbox{\sf dTT$ {}{\mathrm{su}} $}\circ\mbox{\sf dec}\subseteq\mbox{\sf dFT}$ . Intuitively, $\mbox{\sf enc}\circ\mbox{\sf dTT$ {}{\mathrm{su}} $}\circ\mbox{\sf dec}$ is the class of translations realized by “single-use forest-walking forest transducers”. Since $\mbox{\sf dTT$ {}_{\mathrm{su}} $}=\mbox{\sf dMSOT}$ by Proposition 29, it is also the class $\mbox{\sf enc}\circ\mbox{\sf dMSOT}\circ\mbox{\sf dec}$ . Viewing forests as graphs, and hence as logical structures, in the obvious way (just as trees), every encoding ${\rm enc}_{\Sigma}$ and every decoding ${\rm dec}_{\Sigma}$ is a deterministic (i.e., parameterless) mso translation, as defined in [14, Chapter 7]. Hence, by the closure of mso translations under composition [14, Theorem 7.14], $\mbox{\sf enc}\circ\mbox{\sf dMSOT}\circ\mbox{\sf dec}$ equals the (natural) class of deterministic mso translations from forests to forests.
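The size identity $|{\rm enc}_{\Sigma}(f)|=\frac{2}{3}|f|+1$ can be checked with the bracket-free computation of footnote 18. The Python sketch below is an illustration under stated assumptions: forests are strings with single-character labels different from `e`, `[` and `]`, and the encoded tree is written in prefix notation, so that its length equals its number of nodes.

```python
def enc_string(forest):
    """Prefix notation of enc(f): remove every '[', turn every ']' into 'e',
    and append one final 'e' (the computation described in footnote 18)."""
    return forest.replace("[", "").replace("]", "e") + "e"

f = "a[b[]]c[]"          # the forest a[b[]] c[], of length 9
t = enc_string(f)        # prefix notation of a(b(e,e), c(e,e))

# Each node of f contributes three symbols to f (label and two brackets) and
# two nodes to enc(f) (the label and one e), plus the final e: 3*|enc(f)| = 2*|f| + 3.
size_identity_holds = 3 * len(t) == 2 * len(f) + 3
```

For this forest the encoded string has length $7=\frac{2}{3}\cdot 9+1$.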

As observed in [65] for macro tree transducers, whereas the encoding of forests as binary trees is quite natural for the input forest of a tt, for the output forest it is less natural, because it forces the tt to generate the output forest $f$ in its unique form $f=\sigma[f_{1}]f_{2}$. It is more natural to additionally allow the tt to generate $f$ as a concatenation $f_{1}f_{2}$ of two forests $f_{1}$ and $f_{2}$. To formalize this, as in [26, Section 7] and in accordance with [65], we associate with an alphabet $\Delta$ the ranked alphabet $\Delta_{@}=\Delta\cup\{@,e\}$ where $@$ has rank 2, $e$ has rank 0, and every $\delta\in\Delta$ has rank 1. The mapping ${\rm flat}_{\Delta}:T_{\Delta_{@}}\to F_{\Delta}$ is a “flattening” defined as follows (for $t_{1},t_{2}\in T_{\Delta_{@}}$ and $\delta\in\Delta$): ${\rm flat}_{\Delta}(e)=\varepsilon$, ${\rm flat}_{\Delta}(@(t_{1},t_{2}))={\rm flat}_{\Delta}(t_{1}){\rm flat}_{\Delta}(t_{2})$, the concatenation of ${\rm flat}_{\Delta}(t_{1})$ and ${\rm flat}_{\Delta}(t_{2})$, and ${\rm flat}_{\Delta}(\delta(t_{1}))=\delta[{\rm flat}_{\Delta}(t_{1})]$. The mapping ${\rm flat}_{\Delta}$ is surjective but, in general, not injective. Let flat denote the class of flattenings ${\rm flat}_{\Delta}$, for all alphabets $\Delta$. We define $\mbox{\sf FT}_{@}=\mbox{\sf enc}\circ\mbox{\sf TT}\circ\mbox{\sf flat}$ to be the class of extended tt forest translations. An extended tt forest transducer is a tt with input alphabet $\Sigma_{e}$ and output alphabet $\Delta_{@}$. Again, we first restrict attention to deterministic transducers, i.e., to the class $\mbox{\sf dFT}_{@}=\mbox{\sf enc}\circ\mbox{\sf dTT}\circ\mbox{\sf flat}$.
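The flattening and its lack of injectivity can be seen in a few lines of Python. This is a sketch for illustration: trees over $\Delta_{@}$ are nested tuples, forests are lists of pairs `(label, children)`, and it is assumed that the labels in $\Delta$ differ from the strings `"@"` and `"e"`.

```python
def flat(tree):
    """flat(e) = epsilon, flat(@(t1,t2)) = flat(t1) flat(t2), flat(d(t1)) = d[flat(t1)]."""
    if tree == ("e",):
        return []
    if tree[0] == "@":
        return flat(tree[1]) + flat(tree[2])
    label, child = tree
    return [(label, flat(child))]

E = ("e",)
D = ("d", E)                      # the tree d(e), flattening to the forest d[]

# Three different trees over the extended alphabet with the same flattening.
same_forest = [D, ("@", E, D), ("@", D, E)]
```

All three trees in `same_forest` flatten to the one-tree forest `[("d", [])]`.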

Let us show that there is an extended tt forest translation in $\mbox{\sf dFT}_{@}$ that is not in dFT. That was shown for macro tree transducers in [65, Theorem 8] by a similar argument. Let $\Gamma=\{\sigma\}$ and $\Omega=\{\delta\}$ be alphabets, and let us identify the forest $\sigma[\;]$ with the symbol $\sigma$ , and similarly $\delta[\;]$ with $\delta$ . Then $\Gamma^{*}\subseteq F_{\Gamma}$ and $\Omega^{*}\subseteq F_{\Omega}$ . There is a deterministic extended tt forest transducer that translates the string $\sigma^{n}$ into the string $\delta^{2^{n+1}}$ for every $n\in{\mathbb{N}}$ . In fact, let $M$ be the dtt (with general rules) that is obtained from the dtt $M_{\mathrm{exp}}$ of Example 5 by changing its output alphabet into $\Omega_{@}=\{@,\delta,e\}$ , and changing $\sigma$ into $@$ and $e$ into $\delta(e)$ in the right-hand sides of its rules. Note that the input alphabet $\Sigma$ of $M_{\mathrm{exp}}$ and $M$ equals $\Gamma_{e}$ . The input tree $t_{n}={\rm enc}_{\Gamma}(\sigma^{n})=\sigma(e,\sigma(e,\dots\sigma(e,e)\cdots))$ is translated by $M_{\mathrm{exp}}$ into the full binary tree $s_{n}$ over $\Sigma$ with $2^{n+1}$ leaves. Clearly, $M$ translates $t_{n}$ into the tree $s^{\prime}_{n}$ that is obtained from $s_{n}$ by changing every $\sigma$ into $@$ and every $e$ into $\delta(e)$ . Thus, ${\rm flat}_{\Omega}(s^{\prime}_{n})=\delta^{2^{n+1}}$ . This forest translation is not in dFT, because $|{\rm enc}_{\Gamma}(\sigma^{n})|=|t_{n}|=2n+1$ but the height of $s^{\prime\prime}_{n}={\rm enc}_{\Omega}(\delta^{2^{n+1}})$ is $2^{n+1}$ , and so, by Lemma 6, there is no dtt that translates $t_{n}$ into $s^{\prime\prime}_{n}$ .

We will show that $\mbox{\sf dFT}\subseteq\mbox{\sf dFT}_{@}\subseteq\mbox{\sf dFT}^{2}$ . A similar result was proved for macro tree transducers in [65, Theorem 8 and Corollary 12]. To compare dFT and $\mbox{\sf dFT}_{@}$ , and their compositions, we establish two relationships between dec and flat in the next lemma.

Lemma 74

$\mbox{\sf dec}\subseteq\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf flat}$ and $\mbox{\sf flat}\subseteq\mbox{\sf dTT$ {}^{\ell}_{\mathrm{su}} $}\circ\mbox{\sf dec}$.

**Proof.** To show the first inclusion, let $\Delta$ be an alphabet and define the mapping $h\colon T_{\Delta_{e}}\to T_{\Delta_{@}}$ such that $h(e)=e$ and if $h(t_{1})=t^{\prime}_{1}$ and $h(t_{2})=t^{\prime}_{2}$, then $h(\delta(t_{1},t_{2}))=@(\delta(t^{\prime}_{1}),t^{\prime}_{2})$. It is straightforward to prove that $h\circ{\rm flat}_{\Delta}={\rm dec}_{\Delta}$. It is also easy to show that $h\in\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}$ (as in the proof of Lemma 73, $h$ is a tree homomorphism, which can be realized by a classical top-down tree transducer). Hence ${\rm dec}_{\Delta}\in\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf flat}$.
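The identity $h\circ{\rm flat}_{\Delta}={\rm dec}_{\Delta}$ can be tested on examples with the following Python sketch. It is an illustration only; the representations (nested tuples for trees, lists of pairs `(label, children)` for forests) and the function names are assumptions of the sketch, and labels are assumed to differ from `"@"` and `"e"`.

```python
E = ("e",)

def dec(tree):
    """Decode a binary tree over Delta_e into a forest."""
    if tree == E:
        return []
    label, first, second = tree
    return [(label, dec(first))] + dec(second)

def flat(tree):
    """Flatten a tree over Delta_@ into a forest."""
    if tree == E:
        return []
    if tree[0] == "@":
        return flat(tree[1]) + flat(tree[2])
    label, child = tree
    return [(label, flat(child))]

def h(tree):
    """h(e) = e and h(d(t1, t2)) = @(d(h(t1)), h(t2))."""
    if tree == E:
        return E
    label, first, second = tree
    return ("@", (label, h(first)), h(second))

# The encoding a(b(e,e), c(e,e)) of the forest a[b[]] c[].
t = ("a", ("b", E, E), ("c", E, E))
```

Applying first `h` and then `flat` to `t` gives the same forest as `dec(t)`, namely `[("a", [("b", [])]), ("c", [])]`.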

For the second inclusion, let $\Delta$ be an alphabet. The mapping ${\rm flat}_{\Delta}\circ{\rm enc}_{\Delta}$ can be realized by a local single-use dtt $M=(\Delta_{@},\Delta_{e},Q,q_{0},R)$ that performs a depth-first left-to-right tree traversal in a special way. Rather than performing this traversal in one branch, it does so in all its branches together, each branch performing a separate piece of the traversal. When $M$ arrives from above at a node $u$ with label $\delta\in\Delta$ , it outputs $\delta$ and splits into two branches. The first branch traverses the subtree at $u$ , and the second branch continues the traversal after that subtree. Each branch outputs $e$ when arriving from below at a $\Delta$ -labeled node (or at the root, at the end of the traversal). Formally, $M$ has the state set $Q=\{d,u_{1},u_{2}\}$ with initial state $q_{0}=d$ , cf. Examples 4 and 5. It has the following (general) rules, where $j^{\prime}\in[0,{\mathit{m}x}_{\Sigma}]$ , $j\in[1,{\mathit{m}x}_{\Sigma}]$ , and $\delta\in\Delta$ :

[TABLE]

Thus, since $\tau_{M}={\rm flat}_{\Delta}\circ{\rm enc}_{\Delta}$ , it follows that ${\rm flat}_{\Delta}=\tau_{M}\circ{\rm dec}_{\Delta}\in\mbox{\sf dTT$ {}^{\ell}_{\mathrm{su}} $}\circ\mbox{\sf dec}$ .

We note that the mapping ${\rm flat}_{\Delta}\circ{\rm enc}_{\Delta}$ is denoted ‘eval’ in [65, Section 4], ‘APP’ in [61], and ‘app’ in [26, Section 7]. For the reader familiar with mso translations we observe that it is also easy to show that both ${\rm flat}_{\Delta}$ and ${\rm enc}_{\Delta}$ are deterministic mso translations, and hence their composition is one. The second inclusion then follows from Proposition 29. $\Box$
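The two inclusions of Lemma 74 can be checked on small examples. The following is a minimal sketch, assuming the usual definitions from earlier in this section (not repeated in this excerpt): ${\rm enc}$ is the first-child/next-sibling encoding with ${\rm enc}(\varepsilon)=e$ and ${\rm enc}(\delta[f_1]f_2)=\delta({\rm enc}(f_1),{\rm enc}(f_2))$, ${\rm dec}$ is its inverse, and ${\rm flat}$ maps $e$ to the empty forest, $@(t_1,t_2)$ to the concatenation ${\rm flat}(t_1){\rm flat}(t_2)$, and $\delta(t)$ to $\delta[{\rm flat}(t)]$. The data representation and the function name `app` are illustrative choices. Note that `app` computes ${\rm flat}_{\Delta}\circ{\rm enc}_{\Delta}$ with an accumulating parameter; it is not the tree-walking transducer $M$ itself.

```python
# Forests over Delta: a list of (label, child_forest) pairs.
# Binary trees over Delta_e: E, or (label, t1, t2).
# Trees over Delta_@: E, ("@", t1, t2), or (label, t) with label in Delta.
# Assumption: the labels "e" and "@" do not occur in Delta.
E = ("e",)

def enc(f):
    """First-child/next-sibling encoding of a forest as a binary tree."""
    if not f:
        return E
    (label, children), rest = f[0], f[1:]
    return (label, enc(children), enc(rest))

def dec(t):
    """Inverse of enc: binary tree over Delta_e back to a forest."""
    if t == E:
        return []
    label, t1, t2 = t
    return [(label, dec(t1))] + dec(t2)

def flat(t):
    """Flatten a tree over Delta_@: '@' is concatenation of forests."""
    if t == E:
        return []
    if t[0] == "@":
        return flat(t[1]) + flat(t[2])
    label, sub = t
    return [(label, flat(sub))]

def h(t):
    """The homomorphism h of the proof: delta(t1, t2) -> @(delta(h t1), h t2)."""
    if t == E:
        return E
    label, t1, t2 = t
    return ("@", (label, h(t1)), h(t2))

def app(t, cont=E):
    """flat followed by enc; cont encodes the forest that follows t."""
    if t == E:
        return cont
    if t[0] == "@":
        return app(t[1], app(t[2], cont))
    label, sub = t
    return (label, app(sub, E), cont)

forest = [("a", [("b", []), ("c", [])]), ("d", [])]
tree = enc(forest)
assert flat(h(tree)) == dec(tree)   # first inclusion: h followed by flat is dec

s = ("@", ("a", ("@", ("b", E), ("c", E))), ("d", E))
assert app(s) == enc(flat(s))       # app is flat followed by enc
assert dec(app(s)) == flat(s)       # second inclusion: app followed by dec is flat
```

The parameter `cont` plays the role of the second branch of $M$: it stands for the part of the traversal that continues after the current subtree.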

It follows from the first inclusion of Lemma 74 that $\mbox{\sf dFT}\subseteq\mbox{\sf dFT}_{@}$ . In fact, $\mbox{\sf enc}\circ\mbox{\sf dTT}\circ\mbox{\sf dec}\subseteq\mbox{\sf enc}\circ\mbox{\sf dTT}\circ\mbox{\sf dTT$ {}^{\ell}_{\downarrow} $}\circ\mbox{\sf flat}$ , which is included in $\mbox{\sf enc}\circ\mbox{\sf dTT}\circ\mbox{\sf flat}$ by Theorem 18. It follows from the second inclusion that $\mbox{\sf dFT}_{@}^{k}\subseteq\mbox{\sf dFT}^{k+1}$ for every $k\geq 1$ . In fact, $\mbox{\sf dFT}_{@}^{k}=(\mbox{\sf enc}\circ\mbox{\sf dTT}\circ\mbox{\sf flat})^{k}\subseteq(\mbox{\sf enc}\circ\mbox{\sf dTT}\circ\mbox{\sf dTT$ {}^{\ell}_{\mathrm{su}} $}\circ\mbox{\sf dec})^{k}\subseteq\mbox{\sf enc}\circ(\mbox{\sf dTT}\circ\mbox{\sf dTT$ {}^{\ell}_{\mathrm{su}} $})^{k}\circ\mbox{\sf dec}$ , which is included in $\mbox{\sf enc}\circ\mbox{\sf dTT}^{k}\circ\mbox{\sf dTT$ {}^{\ell}_{\mathrm{su}} $}\circ\mbox{\sf dec}$ by Theorem 23 and hence in $\mbox{\sf enc}\circ\mbox{\sf dTT}^{k+1}\circ\mbox{\sf dec}$ , which equals $\mbox{\sf dFT}^{k+1}$ by Lemma 73.

Corollary 75

$\mbox{\sf dFT}^{k}\subseteq\mbox{\sf dFT}_{@}^{k}\subseteq\mbox{\sf dFT}^{k+1}$ for every $k\geq 1$ .

From the second inclusion we obtain that our main results also hold for deterministic extended tt forest transducers. It is decidable whether or not a composition of such transducers is of linear size increase, and

[TABLE]

The complexity results of Theorems 47, 49, and 50 also hold for $\mbox{\sf dFT}_{@}^{k}$ .

The class of deterministic macro forest translations of [65] can be defined as $\mbox{\sf dMFT}_{@}=\mbox{\sf enc}\circ\mbox{\sf dMT}\circ\mbox{\sf flat}$ . Since $\mbox{\sf dTT}\subseteq\mbox{\sf dMT}\subseteq\mbox{\sf dTT}^{2}$ by Lemma 24, we conclude by similar arguments as for $\mbox{\sf dFT}_{@}$ that $\mbox{\sf dMFT}_{@}^{k}\subseteq\mbox{\sf dFT}^{2k+1}$ and hence our main results also hold for deterministic macro forest transducers. It is decidable whether or not a composition of such transducers is of linear size increase, and

[TABLE]

The complexity results of Theorems 47, 49, and 50 also hold for $\mbox{\sf dMFT}_{@}^{k}$ .
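The "similar arguments" used for $\mbox{\sf dMFT}_{@}$ can be spelled out as the chain below. This is a sketch under two assumptions: that Theorem 23 is applied exactly as in the $\mbox{\sf dFT}_{@}$ case (absorbing each inner $\mbox{\sf dTT$ {}^{\ell}_{\mathrm{su}} $}$ into the $\mbox{\sf dTT}$ that follows it), and that ${\rm dec}$ and ${\rm enc}$ are mutually inverse, so that the inner pairs $\mbox{\sf dec}\circ\mbox{\sf enc}$ cancel.

```latex
\begin{align*}
\mbox{\sf dMFT}_{@}^{k}
 &= (\mbox{\sf enc}\circ\mbox{\sf dMT}\circ\mbox{\sf flat})^{k} \\
 &\subseteq (\mbox{\sf enc}\circ\mbox{\sf dTT}^{2}\circ\mbox{\sf dTT$ {}^{\ell}_{\mathrm{su}} $}\circ\mbox{\sf dec})^{k}
   && \text{(Lemmas 24 and 74)} \\
 &\subseteq \mbox{\sf enc}\circ(\mbox{\sf dTT}^{2}\circ\mbox{\sf dTT$ {}^{\ell}_{\mathrm{su}} $})^{k}\circ\mbox{\sf dec} \\
 &\subseteq \mbox{\sf enc}\circ\mbox{\sf dTT}^{2k}\circ\mbox{\sf dTT$ {}^{\ell}_{\mathrm{su}} $}\circ\mbox{\sf dec}
   && \text{(Theorem 23)} \\
 &\subseteq \mbox{\sf enc}\circ\mbox{\sf dTT}^{2k+1}\circ\mbox{\sf dec}
  = \mbox{\sf dFT}^{2k+1}
   && \text{(Lemma 73)}
\end{align*}
```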

The main results of Sections 9 and 11 also hold for nondeterministic forest transducers. Instead of Lemma 73 we use the obvious fact that $\mbox{\sf TT}\circ\mbox{\sf dec}\circ\mbox{\sf enc}\circ\mbox{\sf TT}\subseteq\mbox{\sf TT}\circ\mbox{\sf TT}$ . (Footnote 19: It can be shown that the nondeterministic version of Lemma 73 also holds, but we will not do that here.)

This implies, together with Lemma 74, that it suffices to prove that the results for $\mbox{\sf TT}^{k}$ also hold for the class $\mbox{\sf enc}\circ\mbox{\sf TT}^{k}\circ\mbox{\sf dec}$ . For the nondeterministic version of Theorem 43 in Section 9, we note that a translation ${\rm enc}_{\Sigma}\circ\tau\circ{\rm dec}_{\Delta}$ is a function if and only if $\tau$ is a function. Consequently, $(\mbox{\sf enc}\circ\mbox{\sf TT}^{k}\circ\mbox{\sf dec})\cap{\cal F}\subseteq\mbox{\sf enc}\circ(\mbox{\sf TT}^{k}\cap{\cal F})\circ\mbox{\sf dec}\subseteq\mbox{\sf enc}\circ\mbox{\sf dTT}^{k+1}\circ\mbox{\sf dec}=\mbox{\sf dFT}^{k+1}$ by Theorem 35 and Lemma 73. Hence $(\mbox{\sf enc}\circ\mbox{\sf TT}^{k}\circ\mbox{\sf dec})\cap\mbox{\sf LSIF}=\mbox{\sf enc}\circ\mbox{\sf dTT$ {}_{\mathrm{su}} $}\circ\mbox{\sf dec}$ . Obviously, the complexity results of Theorems 57 and 58 in Section 11 hold for $\mbox{\sf enc}\circ\mbox{\sf TT}^{k}\circ\mbox{\sf dec}$ , with the same proof as in the deterministic case. The class of nondeterministic macro forest translations of [65] can be defined as $\mbox{\sf MFT}_{@}=\mbox{\sf enc}\circ\mbox{\sf MT}\circ\mbox{\sf flat}$ . From Lemmas 27 and 74 we obtain that $\mbox{\sf MFT}^{k}_{@}\subseteq\mbox{\sf enc}\circ\mbox{\sf TT}^{3k}\circ\mbox{\sf dec}$ , and hence all these results also hold for macro forest transducers.

We finally show that the results of Section 12 also hold for nondeterministic forest transducers. We first consider $\mbox{\sf enc}\circ\mbox{\sf TT}\circ\mbox{\sf dTT}\circ\mbox{\sf dec}$ and $\mbox{\sf enc}\circ\mbox{\sf TT}\circ\mbox{\sf dTT}\circ\mbox{\sf flat}$ . For a forest translation $\tau$ we define the forest language $L_{\tau}=\{\#[fg]\mid(f,g)\in\tau\}$ . If $\tau={\rm enc}_{\Sigma}\circ\tau^{\prime}\circ{\rm dec}_{\Delta}$ with $\tau^{\prime}\in\mbox{\sf TT}\circ\mbox{\sf dTT}$ , then $\#[fg]\in L_{\tau}$ if and only if $\#({\rm enc}_{\Sigma}(f),{\rm enc}_{\Delta}(g))\in L_{\tau^{\prime}}$ . Since ${\rm enc}_{\Sigma}(f)$ can be computed by a deterministic finite-state transducer (see footnote 18), and similarly for ${\rm enc}_{\Delta}(g)$ , $L_{\tau}$ is log-space reducible to $L_{\tau^{\prime}}$ . Hence $\mbox{\sf enc}\circ\mbox{\sf TT}\circ\mbox{\sf dTT}\circ\mbox{\sf dec}\subseteq\mbox{\sf LOGCFL}$ by Theorem 62. Similarly if $\tau^{\prime}\in\mbox{\sf dTT}$ , then $g\in\tau(L)$ if and only if ${\rm enc}_{\Delta}(g)\in\tau^{\prime}({\rm enc}_{\Sigma}(L))$ for every $L\in\mbox{\sf REGF}$ , and hence $\mbox{\sf dFT}(\mbox{\sf REGF})\subseteq\mbox{\sf LOGCFL}$ by Corollary 64. To show the same results for flat instead of dec, we need the following small lemma.

Lemma 76

$\mbox{\sf flat}\subseteq\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf yield}$ .

**Proof.** For an alphabet $\Delta$ , let $\Omega$ be the ranked alphabet $\Delta\cup\{[\,,]\}\cup\{\lambda,@,\omega\}$ such that $\Omega^{(0)}=\Delta\cup\{[\,,],\lambda\}$ , $\Omega^{(2)}=\{@\}$ , and $\Omega^{(4)}=\{\omega\}$ . We define the deterministic tt ${}^{\ell}_{\downarrow}$ $N=(\Delta_{@},\Omega,\{p\},p,R)$ with the following (general) rules.

[TABLE]

for every $\delta\in\Delta$ . Assuming that the symbol $\lambda$ is skipped when taking yields (cf. the sentence before Corollary 65), it should be clear that ${\rm flat}_{\Delta}(t)$ is the yield of $\tau_{N}(t)$ for every $t\in T_{\Delta_{@}}$ . $\Box$
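The idea of Lemma 76 can be illustrated as follows. Since the rule table of $N$ is not reproduced in this text, the rules used in this sketch are a guess that is consistent with the ranks of $\Omega$: $e$ is mapped to $\lambda$, $@$ is copied, and a node $\delta(t)$ is mapped to $\omega(\delta,[\,,t^{\prime},])$. They should not be read as the authors' exact rules. Forests are represented as token lists over $\Delta\cup\{[\,,]\}$, and all function names are hypothetical.

```python
# Trees over Delta_@: E, ("@", t1, t2), or (label, t) with label in Delta.
# Output trees over Omega: leaves (x,), ("@", t1, t2), ("omega", a, b, c, d).
# Assumption: "e", "@", "lambda", "omega" are not labels in Delta.
E = ("e",)
LAM = ("lambda",)

def run_N(t):
    """Guessed top-down homomorphism realized by N (see the hedge above)."""
    if t == E:
        return LAM
    if t[0] == "@":
        return ("@", run_N(t[1]), run_N(t[2]))
    label, sub = t
    return ("omega", (label,), ("[",), run_N(sub), ("]",))

def tree_yield(t):
    """Left-to-right sequence of leaf labels, skipping lambda."""
    if len(t) == 1:
        return [] if t == LAM else [t[0]]
    return [sym for child in t[1:] for sym in tree_yield(child)]

def flat_tokens(t):
    """flat(t), written as a bracketed string (list of tokens)."""
    if t == E:
        return []
    if t[0] == "@":
        return flat_tokens(t[1]) + flat_tokens(t[2])
    label, sub = t
    return [label, "["] + flat_tokens(sub) + ["]"]

s = ("@", ("a", ("@", ("b", E), ("c", E))), ("d", E))
assert tree_yield(run_N(s)) == flat_tokens(s)
print(" ".join(flat_tokens(s)))
```

For the sample tree `s` both sides produce the token sequence of the forest $a[b[\,]c[\,]]d[\,]$.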

It follows from Lemma 76 and Theorem 18 that $\mbox{\sf enc}\circ\mbox{\sf TT}\circ\mbox{\sf dTT}\circ\mbox{\sf flat}\subseteq\mbox{\sf enc}\circ\mbox{\sf TT}\circ\mbox{\sf dTT}\circ\mbox{\sf yield}$ and $\mbox{\sf dFT}_{@}=\mbox{\sf enc}\circ\mbox{\sf dTT}\circ\mbox{\sf flat}\subseteq\mbox{\sf enc}\circ\mbox{\sf dTT}\circ\mbox{\sf yield}$ . If $\tau$ is a forest translation such that $\tau={\rm enc}_{\Sigma}\circ\tau^{\prime}$ with $\tau^{\prime}\in\mbox{\sf TT}\circ\mbox{\sf dTT}\circ\mbox{\sf yield}$ , then $\#[fg]\in L_{\tau}$ if and only if $\#({\rm enc}_{\Sigma}(f),g)\in L_{\tau^{\prime}}$ . Hence $\mbox{\sf enc}\circ\mbox{\sf TT}\circ\mbox{\sf dTT}\circ\mbox{\sf flat}\subseteq\mbox{\sf LOGCFL}$ by Corollary 65. Similarly if $\tau^{\prime}\in\mbox{\sf dTT}\circ\mbox{\sf yield}$ , then $\tau(L)=\tau^{\prime}({\rm enc}_{\Sigma}(L))$ for every $L\in\mbox{\sf REGF}$ , and so $\mbox{\sf dFT}_{@}(\mbox{\sf REGF})\subseteq\mbox{\sf LOGCFL}$ by Corollary 65. If we define the class of io macro forest translations to be $\mbox{\sf enc}\circ\mbox{\sf MT}_{\text{{\sc io}}}\circ\mbox{\sf flat}$ , then that class is included in $\mbox{\sf enc}\circ\mbox{\sf TT}\circ\mbox{\sf dTT}\circ\mbox{\sf flat}$ by Lemma 28 and hence in LOGCFL by the above. Thus, Corollary 63 also holds for macro forest transducers.

For a forest translation $\tau={\rm enc}_{\Sigma}\circ\tau^{\prime}\circ{\rm dec}_{\Delta}$ with $\tau^{\prime}\in\mbox{\sf TT}^{k}$ it is easy to prove that $L_{\tau}\in\mbox{\sf DSPACE}(n)$ and that $\tau(L)\in\mbox{\sf DSPACE}(n)$ for every $L\in\mbox{\sf REGF}$ , as we did above for $\tau^{\prime}\in\mbox{\sf TT}\circ\mbox{\sf dTT}$ and $\tau^{\prime}\in\mbox{\sf dTT}$ , respectively, thus generalizing Theorems 66 and 67. That also holds for ${\rm flat}_{\Delta}$ instead of ${\rm dec}_{\Delta}$ , because $\mbox{\sf enc}\circ\mbox{\sf TT}^{k}\circ\mbox{\sf flat}\subseteq\mbox{\sf enc}\circ\mbox{\sf TT}^{k+1}\circ\mbox{\sf dec}$ by Lemma 74.

The NP-completeness results of Section 12 also hold for extended forest translations. The translation $\tau$ of Theorem 71 can be changed into a translation in $\mbox{\sf enc}\circ\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf f\,TT}\circ\mbox{\sf flat}$ as follows. First, change $M_{1}$ in the proof of Theorem 71 such that it obtains as input the encodings of the strings $ab^{n}cd^{m}$ (viewed as forests). Second, change $M_{2}$ such that it outputs trees over $\Delta_{@}$ rather than $\Delta$ (by changing the rule $\langle q_{i\vee j},d\rangle\to\vee(\langle q_{i},\alpha\rangle,\langle q_{j},\alpha\rangle)$ of $M$ in the proof of Lemma 69 into the general rule $\langle q_{i\vee j},d\rangle\to\vee(@(\langle q_{i},\alpha\rangle,\langle q_{j},\alpha\rangle))$ , and similarly for $\wedge$ ). As a result $\tau$ outputs boolean expressions as forests rather than ranked trees. Thus we obtain an NP-complete extended forest translation in $\mbox{\sf enc}\circ\mbox{\sf dTT$ {}_{\downarrow} $}\circ\mbox{\sf f\,TT}\circ\mbox{\sf flat}$ , and hence one in $\mbox{\sf MFT}_{@}$ . In a similar way we also obtain an NP-complete forest language in $\mbox{\sf FT}_{@}(\mbox{\sf REGF})$ . The details are left to the reader. It is not clear whether these results hold for dec instead of flat.

14 Conclusion

Our main technical result transforms a composition of $k$ tt’s into a linear-bounded composition of $k$ tt’s, cf. Corollary 38. As observed in Remark 41, our proof of this result can involve a $2(k-1)$ -fold blow-up of the sizes of the transducers, which also influences the constants of their time and space complexities, cf. the sentence after Theorem 47. We do not know whether this transformation can be realized in a more efficient way.

Our main result on expressivity is that $\mbox{\sf dTT}^{k}\cap\mbox{\sf LSIF}\subseteq\mbox{\sf dTT}$ for every $k\geq 1$ , i.e., that every composition of dtt’s that is of linear size increase can be realized by one dtt. Moreover, it is decidable whether or not such a composition is of linear size increase. Do similar results hold for polynomial size increase? For instance, does there exist $m\geq 1$ such that every translation in $\bigcup_{k\geq 1}\mbox{\sf dTT}^{k}$ of quadratic size increase is in $\mbox{\sf dTT}^{m}$ ? The same question can be asked for $\ell$ -fold exponential size increase, for each fixed $\ell\in{\mathbb{N}}$ .

We have shown in Section 9 that even $\mbox{\sf TT}^{k}\cap\mbox{\sf LSIF}\subseteq\mbox{\sf dTT}$ for every $k\geq 1$ , generalizing Theorem 43. Although this result is effective, we do not know whether Theorem 44 can also be generalized, i.e., whether it is decidable for a nondeterministic tt $M$ whether or not $\tau_{M}$ is a function of linear size increase. This would be solved if it were decidable whether or not $\tau_{M}$ is a function. But that is also unknown, whereas it has been proved for classical top-down tree transducers (with regular look-ahead) in [36, Theorem 8]. Note that deciding functionality of $\tau_{M}$ also solves the equivalence problem for dtt’s, which is already a long-standing open problem (cf. [22, 60]); in fact, $\tau_{1},\tau_{2}\in\mbox{\sf dTT}$ are the same if and only if they have the same domain and $\tau_{1}\cup\tau_{2}$ is functional.
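The reduction from equivalence to functionality in the last sentence can be illustrated on finite relations. This is only a toy sketch: real translations are infinite sets of pairs of trees, and the decidability of functionality for them is exactly the open problem. The function names are hypothetical, and strings stand in for trees.

```python
def domain(rel):
    """Set of inputs on which the relation is defined."""
    return {x for x, _ in rel}

def is_functional(rel):
    """True if no input is related to two different outputs."""
    seen = {}
    for x, y in rel:
        if seen.setdefault(x, y) != y:
            return False
    return True

def equal_via_functionality(tau1, tau2):
    """For functional tau1, tau2: equal iff same domain and functional union."""
    return domain(tau1) == domain(tau2) and is_functional(tau1 | tau2)

tau1 = {("s1", "t1"), ("s2", "t2")}
tau2 = {("s1", "t1"), ("s2", "t2")}
tau3 = {("s1", "t1"), ("s2", "t3")}   # differs from tau1 on input s2
tau4 = {("s1", "t1")}                 # smaller domain than tau1

assert equal_via_functionality(tau1, tau2)
assert not equal_via_functionality(tau1, tau3)
assert not equal_via_functionality(tau1, tau4)
```

The domain test is needed because the union of two functions with different domains can still be functional, as `tau1` and `tau4` show.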

Another open question for nondeterministic tt’s is whether or not there exists $m\geq 1$ such that the inclusion $\mbox{\sf TT}^{k}\cap\mbox{\sf LSIR}\subseteq\mbox{\sf TT}^{m}$ holds for every $k\geq 1$ , where LSIR consists of all relations $\tau\subseteq T_{\Sigma}\times T_{\Delta}$ of linear size increase, which means that there is a constant $c\in{\mathbb{N}}$ such that $|s|\leq c\cdot|t|$ for every $(t,s)\in\tau$ . It follows from (the proof of) [48, Theorem 3.21] (see also [49, 50]) that $\mbox{\sf TT}^{2}\cap\mbox{\sf LSIR}$ is not included in MT, and hence not in TT by the remark following Lemma 27.

Similar questions can be asked for macro tree transducers, i.e., for the classes dMT and MT.

We have shown in Lemma 12 that $\mbox{\sf dTT$ {}_{\downarrow} $}=\mbox{\sf dTT$ {}^{\mathrm{s}}_{\downarrow} $}$ , but we do not know whether or not $\mbox{\sf dTT}=\mbox{\sf dTT$ {}^{\mathrm{s}} $}$ . In other words, we do not know whether for every tt there is an equivalent sub-testing tt, in which the regular test of a rule only inspects the subtree of the current node. Or even more informally, can regular look-around be simulated by regular look-ahead?

We have shown in Corollary 59 that the string languages in the oi-hierarchy, which are generated by high-level grammars, are in $\mbox{\sf NSPACE}(n)\wedge\mbox{\sf NPTIME}$ , and in Corollary 68 that they are in $\mbox{\sf DSPACE}(n)$ . However, the languages of the oi-hierarchy are generated by so-called “safe” high-level grammars, and it is not known whether the same results hold for unsafe high-level grammars. It is proved in [54] that the languages generated by unsafe level-2 grammars, the unsafe version of $\mbox{\sf OI}(2)$ , are in $\mbox{\sf NSPACE}(n)$ .

In Section 12 we have shown that $\mbox{\sf dTT}^{k}\subseteq\mbox{\sf PTIME}$ , that $\mbox{\sf TT}\circ\mbox{\sf dTT}\subseteq\mbox{\sf LOGCFL}\subseteq\mbox{\sf PTIME}$ , and that $\mbox{\sf dTT}\circ\mbox{\sf TT}$ contains an NP-complete translation. It remains to find out for $k\geq 2$ whether $\mbox{\sf TT}\circ\mbox{\sf dTT}^{k}\subseteq\mbox{\sf PTIME}$ or whether it contains an NP-complete translation.

Acknowledgements. We are grateful to the reviewers for their constructive comments.

Bibliography

[1] Aho AV (1968) Indexed grammars - an extension of context-free grammars. Journal of the ACM 15: 647–671
[2] Aho AV, Ullman JD (1971) Translations on a context-free grammar. Information and Control 19: 439–475
[3] Asveld PRJ (1981) Time and space complexity of inside-out macro languages. International Journal of Computer Mathematics 10: 3–14
[4] Baker BS (1978) Generalized syntax-directed translation, tree transducers, and linear space. SIAM Journal on Computing 7: 376–391
[5] Bartha M (1982) An algebraic definition of attributed transformations. Acta Cybernetica 5: 409–421
[6] Bloem R, Engelfriet J (1997) Monadic second order logic and node relations on graphs and trees. In: Mycielski J, Rozenberg G, Salomaa A (eds) Structures in Logic and Computer Science, Lecture Notes in Computer Science 1261, Springer-Verlag, pp 144–161. A corrected version is available at https://www.researchgate.net/publication/221350026
[7] Bloem R, Engelfriet J (2000) A comparison of tree translations defined by monadic second order logic and by attribute grammars. Journal of Computer and System Sciences 61: 1–50
[8] Bogaert B, Tison S (1992) Equality and disequality constraints on direct subterms in tree automata. In: Finkel A, Jantzen M (eds) Proc. STACS’92, Lecture Notes in Computer Science 577, Springer-Verlag, pp 161–171
