Riemann Tensor Polynomial Canonicalization by Graph Algebra Extension

Hongbo Li; Zhang Li; Yang Li

arXiv:1701.08487·cs.SC·January 31, 2017

TL;DR

This paper introduces a novel graph algebra extension theory to develop a canonicalization algorithm for Riemann tensor polynomials, addressing the longstanding challenge of multiterm tensor expression simplification.

Contribution

It presents the first extension theory of graph algebra specifically designed for multiterm Riemann tensor polynomial canonicalization.

Findings

01. Developed a new canonicalization algorithm based on graph algebra extension

02. Achieved improved efficiency in tensor polynomial simplification

03. Provided theoretical foundations for multiterm tensor canonicalization

Abstract

Tensor expression simplification is an "ancient" topic in computer algebra, a representative of which is the canonicalization of Riemann tensor polynomials. Practically fast algorithms exist for monoterm canonicalization, but not for multiterm canonicalization. Targeting the multiterm difficulty, in this paper we establish the extension theory of graph algebra, and propose a canonicalization algorithm for Riemann tensor polynomials based on this theory.

Equations

(\lambda_{1}D_{1}+\dots+\lambda_{k-1}D_{k-1})(\lambda_{k}D_{k}):=(\lambda_{1}\lambda_{k})(D_{1}D_{k})+\dots+(\lambda_{k-1}\lambda_{k})(D_{k-1}D_{k}).

\{v(S_{\sigma(1)}S_{\sigma(2)},S_{\sigma(3)}S_{\sigma(4)})\mid\sigma\in S_{4}\}

Ext(v(S_{1}S_{2},S_{3}S_{4})):=\sum_{\sigma\in S_{4}}\lambda_{\sigma}v(S_{\sigma(1)}S_{\sigma(2)},S_{\sigma(3)}S_{\sigma(4)}),

D=\prod_{i=1}^{n}v_{i}(S_{i1}S_{i2},S_{i3}S_{i4}),

\begin{array}[]{lrl}Ext(D)&:=&\displaystyle\prod_{i=1}^{n}\Big{(}\sum_{\sigma_{i}\in S_{4}}(\lambda_{\sigma_{i}}v_{i}(S_{\sigma_{i}(1)}S_{\sigma_{i}(2)},S_{\sigma_{i}(3)}S_{\sigma_{i}(4)}))\Big{)}\\ &=&\displaystyle\sum_{(\sigma_{1},\ldots,\sigma_{n})\in(S_{4})^{n}}\hskip-3.41418pt\Big{(}\prod_{i=1}^{n}\lambda_{\sigma_{i}}\Big{)}\Big{(}\prod_{i=1}^{n}v_{i}(S_{\sigma_{i}(1)}S_{\sigma_{i}(2)},S_{\sigma_{i}(3)}S_{\sigma_{i}(4)})\Big{)},\end{array}

v(S_{1}S_{2},S_{3}S_{4}),\quad v(S_{1}S_{3},S_{4}S_{2}),\quad v(S_{1}S_{4},S_{2}S_{3}).

\begin{array}[]{l}D=\prod_{i=1}^{n}v_{i}(S_{i1}S_{i2},S_{i3}S_{i4}),\ \hbox{ where }\\ \hbox{the }v_{i}\hbox{ for }i\leq r\hbox{ have loop, and for }i>r\hbox{ are loop-free},\hbox{\vrule height=12.5pt,depth=5.0pt,width=0.0pt}\end{array}

\begin{array}[]{lll}Ext(D)&=&\displaystyle\Big{(}\lambda_{r}\prod_{i=1}^{r}v_{i}(S_{i1}S_{i2},S_{i3}S_{i4})\Big{)}\Big{(}\prod_{i=r+1}^{n}(\lambda_{i2}v_{i}(S_{i1}S_{i2},S_{i3}S_{i4})+\lambda_{i3}v_{i}(S_{i1}S_{i3},S_{i4}S_{i2})+\lambda_{i4}v_{i}(S_{i1}S_{i4},S_{i2}S_{i3}))\Big{)}\\ &=&\displaystyle\sum_{i=r+1}^{n}\ \sum_{\sigma_{i}\in BP(1,2)}\Big{(}\lambda_{r}\prod_{i=r+1}^{n}{\rm sign}(\sigma_{i})\lambda_{i\sigma_{i}(2)}\Big{)}\\ &&\displaystyle\Big{(}\prod_{i=1}^{r}v_{i}(S_{i1}S_{i2},S_{i3}S_{i4})\Big{)}\Big{(}\prod_{i=r+1}^{n}v_{i}(S_{i1}S_{i\sigma_{i}(2)},S_{i\sigma_{i}(3)}S_{i\sigma_{i}(4)})\Big{)},\end{array}

v(S_{\sigma(1)}S_{\sigma(2)},S_{\sigma(3)}S_{\sigma(4)})+v(S_{\sigma(1)}S_{\sigma(3)},S_{\sigma(4)}S_{\sigma(2)})+v(S_{\sigma(1)}S_{\sigma(4)},S_{\sigma(2)}S_{\sigma(3)})=0,

\Big{\{}\Big{(}\prod_{i=1}^{r}v_{i}(S_{i1}S_{i2},S_{i3}S_{i4})\Big{)}\Big{(}\prod_{i=r+1}^{n}v_{i}(S_{i1}S_{i\sigma_{i}(2)},S_{i\sigma_{i}(3)}S_{i4})\Big{)}\,\Big{|}\,\sigma_{i}\in S_{2}\hbox{ acting upon }2,3\Big{\}}.

\begin{array}[]{l}\displaystyle{pre\hbox{-}normal}\Big{\{}\Big{(}\prod_{s\leq r}v_{s}(S_{s1}S_{s2},S_{s3}S_{s4})\Big{)}\Big{(}\prod_{r<j<i}v_{j}(S_{j1}S_{j\sigma_{j}(2)},S_{j\sigma_{j}(3)}S_{j4})\Big{)}\hbox{\vrule height=17.5pt,depth=5.0pt,width=0.0pt}\\ \displaystyle\phantom{{pre\hbox{-}normal}\Big{\{}}\Big{(}v_{i}(S_{i1}S_{i2},S_{i3}S_{i4})+v_{i}(S_{i1}S_{i3},S_{i4}S_{i2})+v_{i}(S_{i1}S_{i4},S_{i2}S_{i3})\hbox{\vrule height=17.5pt,depth=5.0pt,width=0.0pt}\Big{)}\\ \displaystyle\phantom{{pre\hbox{-}normal}\Big{\{}}\Big{(}\prod_{k>i}v_{k}(S_{k1}S_{k\sigma_{k}(2)},S_{k\sigma_{k}(3)}S_{k\sigma_{k}(4)})\Big{)}\hbox{\vrule height=17.5pt,depth=5.0pt,width=0.0pt}\Big{\}}\ =0.\end{array}

free indices ≺ new dummy indices (integers) ≺ input dummy indices.

d_{2} \to 1, d_{1} d_{5} \to 23, d_{4} \to 4, d_{3} \to 5.

\begin{array}[]{llll}(1)&d_{5}\rightarrow 1,&d_{3}\rightarrow 2,&d_{4}\rightarrow 3;\\ (2)&d_{5}\rightarrow 1,&d_{3}\rightarrow 3,&d_{4}\rightarrow 2;\\ (3)&d_{6}\rightarrow 1,&d_{1}\rightarrow 2,&d_{2}\rightarrow 3;\\ (4)&d_{6}\rightarrow 1,&d_{1}\rightarrow 3,&d_{2}\rightarrow 2.\end{array}

\begin{array}[]{llll}(1)&Q_{F}=[\phantom{-}R(12,13),\phantom{-}R(23,d_{1}d_{2})],&Q_{D}=\{R(d_{1}d_{6},d_{2}d_{6})\};\\ (2)&Q_{F}=[-R(12,13),\phantom{-}R(23,d_{1}d_{2})],&Q_{D}=\{R(d_{1}d_{6},d_{2}d_{6})\};\\ (3)&Q_{F}=[\phantom{-}R(12,13),-R(23,34)],&Q_{D}=\{R(d_{3}d_{5},d_{4}d_{5})\};\\ (4)&Q_{F}=[\phantom{-}R(12,13),\phantom{-}R(23,34)],&Q_{D}=\{R(d_{3}d_{5},d_{4}d_{5})\}.\end{array}

\begin{array}[]{ll}&{pre\hbox{-}normal}(Ext(f))\\ =&{pre\hbox{-}normal}\Big{(}\lambda_{0}(\lambda_{12}R(12,34)+\lambda_{13}R(13,42)+\lambda_{14}R(14,23))\hbox{\vrule height=17.5pt,depth=5.0pt,width=0.0pt}\\ &\phantom{{pre\hbox{-}normal}\Big{(}\lambda_{0}}(\lambda_{22}R(13,24)+\lambda_{23}R(12,43)+\lambda_{24}R(14,32))\Big{)}\\ =&\lambda_{0}(\lambda_{12}\lambda_{22}+\lambda_{12}\lambda_{24}+\lambda_{13}\lambda_{23}+\lambda_{13}\lambda_{24}+\lambda_{14}\lambda_{22}+\lambda_{14}\lambda_{23})R(12,34)R(13,24)\hbox{\vrule height=17.5pt,depth=5.0pt,width=0.0pt}\\ &\hfill-\lambda_{0}(\lambda_{12}\lambda_{23}+\lambda_{13}\lambda_{22}+\lambda_{14}\lambda_{24})R(12,34)R(12,34).\end{array}

2(\lambda_{22}+\lambda_{23}+\lambda_{24})x_{2}-(\lambda_{22}+\lambda_{23}+\lambda_{24})x_{1}=0.

2(\lambda_{12}+\lambda_{13}+\lambda_{14})x_{2}-(\lambda_{12}+\lambda_{13}+\lambda_{14})x_{1}=0.

Ax=0,

sub_{\check{\mathbb{Q}}}({Rre\hskip-1.13791ptf}_{N-2})\subseteq\Big{\langle}sub_{\mathbb{Q}}({Rre\hskip-1.13791ptf}_{N-1}),sub_{\check{\mathbb{Q}}}({Rre\hskip-1.13791ptf}_{N-1})\Big{\rangle}_{{\mathbb{Q}}(\Lambda)}\subseteq\Big{\langle}sub_{\mathbb{Q}}({Rre\hskip-1.13791ptf}_{N-1}),sub_{\mathbb{Q}}({Rre\hskip-1.13791ptf}_{N})\Big{\rangle}_{{\mathbb{Q}}(\Lambda)}

\hbox{all rows of }{\bf A}\subseteq\Big{\langle}sub_{\mathbb{Q}}({Rre\hskip-1.13791ptf}_{1}),sub_{\check{\mathbb{Q}}}({Rre\hskip-1.13791ptf}_{1})\Big{\rangle}_{{\mathbb{Q}}(\Lambda)}\subseteq\ldots\subseteq\displaystyle\Big{\langle}\bigcup_{i=1}^{N}sub_{\mathbb{Q}}({Rre\hskip-1.13791ptf}_{i})\Big{\rangle}_{{\mathbb{Q}}(\Lambda)}.

\sum_{k\in J}\Big(\gamma_{k}+\alpha_{k}+\frac{\beta_{k}}{h}\Big)x_{k}=0,

\sum_{k\in J}\Big(\gamma_{k}+\alpha_{k}|_{i}+\frac{\beta_{k}|_{i}}{h|_{i}}\Big)x_{k}=0,

\sum_{i=0}^{[m/n]}n^{2}(m-in)=n^{2}m([m/n]+1)-n^{3}[m/n]([m/n]+1)/2=O(mn(m+n)),

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Tensor decomposition and applications · Distributed and Parallel Computing Systems

Full text

Key Laboratory of Mathematics Mechanization, AMSS, UCAS, NCMIS, Chinese Academy of Sciences, Beijing 100190, China

Key words: Tensor expression simplification; Riemann tensor polynomial; Multiterm canonicalization; Graph algebra; Multigraph extension.

1 Introduction

Tensor canonicalization is a classical topic in computer algebra. Many software packages provide this function, and some of them update it regularly. An extensive collection of such software and the related literature can be found in [13].

The symmetries of the Riemann tensor on a Riemannian manifold of unknown dimension are among the most complex that occur in practice, which makes normalizing general Riemann tensor polynomials a challenging task. We adopt the following terminology in this paper:

R-factor:

indicial form of the Riemann tensor. It is of the form $R(ab,cd)$ , where $a,b,c,d$ are indices.

R-monomial:

scaled contraction of $n$ copies of the Riemann tensor; $n$ is called the degree of the R-monomial.

Ricci R-factor:

an R-factor with a loop (two dummy indices of the same name).

R-polynomial:

a tensor whose indicial form is the sum of R-monomials.

Row character:

the 4 indices of an R-factor can be written in two rows: the upper and lower ones.

The following are the symmetries within an R-monomial of degree $n$ :

• $Sym8$-symmetry: the group ${\mathbb{Z}}_{2}\times{\mathbb{Z}}_{2}\times{\mathbb{Z}}_{2}$ upon an R-factor; it has two generators: $R(ab,cd)=-R(ba,cd)$ and $R(ab,cd)=R(cd,ab)$.

• Commutativity: an R-monomial of degree $n$ has permutation symmetry $S_{n}$ among its R-factors.

• Renaming: an R-monomial is invariant under renaming of dummy indices; it has permutation symmetry $S_{d}$, where $d$ is the number of different dummy indices.

• Cyclic symmetry: Bianchi identity $R(ab,cd)+R(ac,db)+R(ad,bc)=0.$

“ $Sym8$ ” and “Commutativity” form a group of size $8^{n}n!$ , while “Renaming” forms another group of size $d!$ . The two groups are called the monoterm symmetry, while the cyclic symmetry is called the multiterm symmetry.
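The $Sym8$ action on a single R-factor can be made concrete with a short enumeration. The following is a minimal sketch, assuming an R-factor is stored as a 4-tuple of pairwise distinct index names; the function names `sym8_orbit` and `monoterm_group_size` are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import factorial


def sym8_orbit(factor):
    """Map every Sym8 image of R(ab,cd) to its sign relative to R(ab,cd).

    Assumes the four index names are pairwise distinct, so that the
    eight images are different tuples.
    """
    a, b, c, d = factor
    orbit = {}
    # An image is fixed by three binary choices: swap the two pairs,
    # reverse the (new) first pair, reverse the (new) second pair.
    for swap_pairs, flip_first, flip_second in product((False, True), repeat=3):
        first, second = ((c, d), (a, b)) if swap_pairs else ((a, b), (c, d))
        sign = 1
        if flip_first:          # R(ab,cd) = -R(ba,cd)
            first, sign = first[::-1], -sign
        if flip_second:         # follows from the two generators
            second, sign = second[::-1], -sign
        orbit[first + second] = sign
    return orbit


def monoterm_group_size(n):
    """Size 8^n * n! of the group formed by Sym8 and commutativity."""
    return 8 ** n * factorial(n)


if __name__ == "__main__":
    for image, sign in sorted(sym8_orbit(("a", "b", "c", "d")).items()):
        print("+" if sign > 0 else "-", "R(%s%s,%s%s)" % image)
```

The sign of each image is tracked alongside the index tuple because the first generator is an antisymmetry; an enumeration that drops the sign would identify R-factors that differ by a factor of $-1$.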

There is a symmetry on row character. Two indices of the same name can always invert their row characters. This row symmetry is generally not taken into account, because a free index in any term of an R-polynomial must have the same row character, a property determined by the tensor nature.

Suppose an order is given among the indices. Given an R-monomial $f$ , in the equivalence class defined by the monoterm symmetry, an R-monomial whose sequence of indices has the minimal lexicographic order is called the pre-normal form of $f$ , also known as the canonical form of $f$ under the monoterm symmetry. The canonical form of $f$ under all the symmetries of $f$ is called its normal form [2].

We first introduce several methods in the literature devoted solely to computing the pre-normal form.

Renaming preference:

When all the permutations of dummy indices inside an R-monomial are listed, sorting the R-factors has logarithmic complexity for each of the permutations. Any method of this class has factorial complexity [8].

Double coset representative:

In [9], [10], a method based on the strong generating set representation of the permutation group is proposed, which has exponential complexity in the worst case [9]. The software packages xPerm and Canon, which are based on this algorithm, turn out to be the fastest in practice [12].

Graph-theoretic method based on directed graph labeling:

A pair of dummy indices of the same name can naturally be taken as an edge connecting two vertices. To use the canonical form in graph theory, a formulation of tensor algebra as an algebra of graphs is proposed in [15]. In that Master's thesis, the first canonicalization algorithm based on graph theory is proposed, where a tensor is formulated as a graph by representing (1) its name as a vertex, (2) the indices as edges, (3) the row character of an index as the direction of the edge, (4) the position of an index by showing it on the edge. In canonical relabeling, the adjacency matrix is used to find the best permutation for identical vertices in a graph. The algorithm should have factorial complexity.

Graph-theoretic method based on undirected graph labeling:

A different graph representation of tensors is proposed in [6], where (1) each index is a vertex, (2) each pair of identical dummy indices is an edge, (3) an extra vertex is constructed to connect with every free index, (4) labels are used to describe the row character and position of an index. A fast algorithm for the graph isomorphism problem is used to find the canonical graph having the smallest sorted labeled edge set. The algorithm has factorial complexity in the worst case.

The following methods are oriented to computing the normal form instead of the pre-normal form only.

Group algebra and group representation theory:

In [3], [4], the group algebra of $Sym8\times S_{n}$ is used to reduce computing the normal form to computing the orthogonal rejection from a linear subspace in an ambient space of dimension $8^{n}n!$. In [2], Young diagrams and the Schur program are used in normalization, and are later implemented in Cadabra [17]. These methods all have factorial complexity. In [5], a nondeterministic method based on genetic algorithms is proposed.

Gröbner basis:

In [18], it is proposed that computing the normal form of an R-monomial can be separated into two stages: first, computing the pre-normal form; second, using the Gröbner basis method to deal with multiterm symmetry. To use the Gröbner basis method, all the permutations of dummy indices inside an R-monomial must be considered. As tensor contraction is highly restricted when compared with polynomial multiplication, a Gröbner basis theory needs to be properly established for tensor contraction [8]. These methods have factorial complexity.

Linear equation solving among pre-normal forms related to cyclic symmetry:

[11] introduces a method of normalization by solving large systems of linear equations derived by alternatively replacing every R-factor in the pre-normal form of an R-monomial with its 3-index antisymmetrization in all possible inequivalent ways. The method has exponential complexity.

Term rewriting by graph structure analysis:

In [7], a term rewriting method is proposed to normalize a special class of R-monomials of degree 3 by analyzing the structure of the graph associated with each R-monomial.

As summarized in [13], there are fast monoterm canonicalization algorithms, while efficient multiterm canonicalization algorithms are still missing. Some software packages set up a database that stores the hard-to-compute normal forms of a large number of R-monomials for fast lookup. For example, MathTensor [16] has an ever-growing list of RiemannRule’s; Invar stores one million random Riemann monomial scalars and their normal forms of degree ranging from 2 up to 100 [11], [12], [14].

Targeting the difficulty in manipulating multiterm symmetry, in this paper:

i. We establish the extension theory of graph algebra.

The graph algebra proposed in [15] cannot handle multiterm symmetry. We first propose a new graph algebra by representing (1) an index as a vertex, with its position in the R-factor containing it as an intrinsic property of the vertex, (2) a pair of identical indices as an edge. Then we extend Grassmann’s extension theory to the graph algebra, where in an extension graph a vertex is a high-dimensional vector space spanned by several vertices of the original graph. This framework makes it natural to handle cyclic symmetry and more general multiterm symmetry, where algebraic manipulations are used instead of graph-theoretic algorithms.

ii. We propose an easy-to-understand algorithm for pre-normal form computing, which has the same worst-case complexity as the fastest algorithm [12] in the literature.

The following observation is fundamental: when a configuration of R-factors is fixed, then to get the minimal index sequence one only needs to fill in new dummy indices by the occurrence order of the positions held by old dummy indices. It is called the “positional order” in [1], and has been implemented in the software TTC.

Our algorithm has no prerequisite of group theory. Based on the above observation, to fix a configuration our algorithm does not need to run through the whole permutation group $S_{n}$. It has worst-case complexity $O(n^{2}2^{n})$.

iii. We propose a complete algorithm for normal form computing based on graph algebra extension and linear equation solving over a field of rational functions.

There are two steps. First the pre-normal form of the extension graph is computed. Then the normal form is computed by computing the rational-number-valued RREF (reduced row echelon form) of a linear system with rational function coefficients. The complexity of the algorithm is $O(n9^{n})$ by Gauss-Jordan elimination. In contrast, the “brute-force” linear equation solving method [11] has complexity $O(27^{n})$ if Gauss-Jordan elimination is used.

Throughout this paper we set the base number field to be the rational numbers $\mathbb{Q}$, and set the order among monomials/sequences to be the lexicographic order. This paper is organized as follows. Sections 2 and 3 are on graph algebra extension theory; Sections 4 and 5 are on algorithms for pre-normal form and normal form respectively; experiments and more examples will be reported elsewhere.
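To give a feel for the gap between the two quoted bounds, the sketch below simply evaluates their leading terms, with all hidden constants ignored (an assumption made purely for illustration). The function name `gauss_jordan_bounds` is hypothetical.

```python
def gauss_jordan_bounds(n):
    """Leading terms of the two complexity bounds, constants ignored."""
    extension_method = n * 9 ** n   # O(n 9^n): graph algebra extension
    brute_force = 27 ** n           # O(27^n): brute-force linear system of [11]
    return extension_method, brute_force


if __name__ == "__main__":
    for n in (2, 5, 10):
        ext, brute = gauss_jordan_bounds(n)
        # The ratio of the two leading terms is 3^n / n.
        print(n, ext, brute, brute / ext)
```

Since $27^{n}/(n9^{n})=3^{n}/n$, the ratio of the two leading terms itself grows exponentially with the degree $n$.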

2 Connection multigraph and detailed graph

A connection multigraph is an undirected multigraph where the degree of any vertex is $\leq 4$. Obviously if the multigraph has more than one vertex, then any vertex has at most one loop. A vertex of degree $<4$ is called a free vertex, while a vertex of degree 4 is called a dummy vertex. A dummy vertex with a loop is called a Ricci vertex, while a dummy vertex without a loop is called a complete vertex.
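The vertex classification can be sketched in a few lines. This is an illustrative sketch only, assuming that the multigraph is stored as a `Counter` mapping a vertex pair `(i, j)` with `i <= j` to its edge multiplicity, and that a loop contributes 2 to the degree of its vertex (the usual convention, which is consistent with the remark that a vertex has at most one loop when there is more than one vertex). The function name `classify_vertices` is hypothetical.

```python
from collections import Counter


def classify_vertices(n_vertices, edges):
    """Label each vertex as 'free', 'Ricci' or 'complete'.

    edges: Counter {(i, j): multiplicity} with i <= j; (i, i) is a loop.
    """
    degree = [0] * n_vertices
    has_loop = [False] * n_vertices
    for (i, j), multiplicity in edges.items():
        if i == j:
            degree[i] += 2 * multiplicity   # a loop occupies two seats
            has_loop[i] = True
        else:
            degree[i] += multiplicity
            degree[j] += multiplicity
    kinds = []
    for v in range(n_vertices):
        if degree[v] > 4:
            raise ValueError("vertex %d has degree > 4" % v)
        if degree[v] < 4:
            kinds.append("free")
        elif has_loop[v]:
            kinds.append("Ricci")
        else:
            kinds.append("complete")
    return kinds


if __name__ == "__main__":
    example = Counter({(0, 1): 2, (1, 1): 1, (2, 3): 4})
    print(classify_vertices(4, example))
```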

For any vertex $v$ of a connection graph $G$ , associate with it two twin-seats: $S_{1}S_{2}$ and $S_{3}S_{4}$ . Now $v$ together with the two twin-seats associated with it is called a detailed vertex, denoted by $v(S_{1}S_{2},S_{3}S_{4})$ .

For any edge $e_{j}$ of $G$ connecting two vertices $v_{1},v_{2}$ , assign in each $v_{i}$ a seat $s_{ij}$ so that $e_{j}$ connects the two seats $s_{1j}$ and $s_{2j}$ , one from each detailed vertex. If an assignment of all the edges to the seats changes $G$ into a graph with the seats as vertices such that the degree of any seat $\leq 1$ , then the new graph is called a detailed graph of $G$ , and $G$ is called the detail-free multigraph of the detailed graph.

A detailed graph $D$ can be formally multiplied with a scalar $\lambda\in\mathbb{Q}$ , denoted by $\lambda D$ , called a multiple detailed graph. Two multiple detailed graphs can be formally added, and if they are equal up to coefficient, they can be combined by adding up their coefficients. The $\mathbb{Q}$ -space spanned by finitely many detailed graphs is called the detailed graph $\mathbb{Q}$ -space they generate.

Given a detailed graph $D$ , the set of graphs with maximal vertex degree $\leq 1$ and containing $D$ as a subgraph span a $\mathbb{Q}$ -space, called the ideal generated by $D$ . For two detailed graphs $D_{1},D_{2}$ , a third detailed graph $D$ is said to be a join of $D_{1},D_{2}$ , if both $D_{1},D_{2}$ are subgraphs of $D$ . Two detailed graphs can have more than one join.

We use “ $D_{1}D_{2}$ ” to denote a fixed join of $D_{1},D_{2}$ , and use “ $D_{1}D_{2}\cdots D_{k}$ ” to denote a fixed join among $D_{1},D_{2},\ldots,D_{k}$ . This formal product is commutative and associative. For detailed graphs $D_{1},D_{2},\ldots,D_{k}$ , we define the following join of $\lambda_{1}D_{1}+\ldots+\lambda_{k-1}D_{k-1}$ and $\lambda_{k}D_{k}$ , called the multilinear join:

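$$(\lambda_{1}D_{1}+\ldots+\lambda_{k-1}D_{k-1})(\lambda_{k}D_{k}):=(\lambda_{1}\lambda_{k})(D_{1}D_{k})+\ldots+(\lambda_{k-1}\lambda_{k})(D_{k-1}D_{k}).$$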

The multilinear join can be extended by associativity and commutativity to the general case.
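The multilinear join behaves like the product in a commutative polynomial algebra whose "variables" are detailed graphs. The sketch below is one possible encoding, not the authors' implementation: a combined detailed graph is assumed to be a dictionary from a sorted tuple of graph labels (standing for a fixed join of those graphs) to a rational coefficient. The name `multilinear_join` and the labels `D1`, `D2`, `D3` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def multilinear_join(left, right):
    """Distribute the join over two formal sums of multiple detailed graphs.

    left, right: dict {tuple of graph labels: Fraction coefficient}.
    A tuple of labels stands for a fixed join of the listed graphs;
    sorting the tuple models commutativity and associativity.
    """
    result = defaultdict(Fraction)
    for graphs_l, coeff_l in left.items():
        for graphs_r, coeff_r in right.items():
            joined = tuple(sorted(graphs_l + graphs_r))
            result[joined] += coeff_l * coeff_r
    # Drop terms whose coefficients cancelled.
    return {graphs: c for graphs, c in result.items() if c != 0}


if __name__ == "__main__":
    lhs = {("D1",): Fraction(2), ("D2",): Fraction(3)}
    rhs = {("D3",): Fraction(5)}
    print(multilinear_join(lhs, rhs))
```

Here the join itself is only recorded symbolically; choosing which concrete join of two detailed graphs is meant stays outside the sketch, as in the formal product notation of the text.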

Given finitely many detailed vertices, the formal scalar multiplication, addition and multilinear join among them generate a finite dimensional $\mathbb{Q}$ -space of detailed graphs, called the detailed graph $\mathbb{Q}$ -algebra they generate. A general element of the algebra is called a combined detailed graph.

Given a detailed graph $D$ whose seats not connected by edges are said to be free and are labeled by free indices one by one, if an order $O$ among its detailed vertices is given, then following the order all the seats of the detailed vertices can be lined up. Along the sequence of seats $S$ , if we remove all the free seats, and then preserve for each edge only the first seat it connects, we get a sequence of seats $T$ where the serial number of each seat is called the dummy index of the edge connecting it. Now label each seat of $S$ connected by an edge with the dummy index of the edge. The resulting labeled sequence of $S$ is called the serial index representation of $D$ associated with the order $O$ , also called the positional order in [1].

In the serial index representation, the order among the free indices can be prescribed arbitrarily, while the order among the dummy indices follows their natural order as integers. It is always assumed that all free indices $\prec$ all dummy indices. For a detailed graph $D$ of $n$ detailed vertices, there are $n!$ different orders among the detailed vertices, and consequently there are at most $n!$ different serial index representations of $D$. The minimal serial index representation in the lexicographic order is called the minimal index representation of $D$, denoted by $\hbox{Min}_{I\hskip-0.56905ptdx}(D)$.
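The serial index representation and its minimum over all vertex orders can be sketched as follows. This is a brute-force illustration under stated assumptions: a detailed vertex is a 4-tuple of index names (strings), free indices are given as a set of names ordered alphabetically, and new dummy indices are the integers 1, 2, ... assigned by first occurrence. The function names are hypothetical.

```python
from itertools import permutations


def serial_index_representation(vertices, free):
    """Relabel dummy indices by first occurrence along the given vertex order."""
    numbering = {}
    labeled = []
    for seats in vertices:
        for name in seats:
            if name in free:
                labeled.append(name)          # free indices keep their names
            else:
                if name not in numbering:     # first seat of this edge
                    numbering[name] = len(numbering) + 1
                labeled.append(numbering[name])
    return labeled


def min_index_representation(vertices, free):
    """Lexicographic minimum over all n! vertex orders (free before dummy)."""
    def key(sequence):
        return [(0, x) if isinstance(x, str) else (1, x) for x in sequence]

    candidates = (serial_index_representation(order, free)
                  for order in permutations(vertices))
    return min(candidates, key=key)


if __name__ == "__main__":
    print(serial_index_representation([("b", "a", "a", "c")], {"b", "c"}))
    print(serial_index_representation([("a", "b", "a", "b")], set()))
```

The enumeration of all $n!$ orders is used only to mirror the definition; it is not meant as an efficient procedure.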

Given detailed graphs $D_{1},\ldots,D_{k}$ , their various serial index representations span a $\mathbb{Q}$ -linear space, called the serial index $\mathbb{Q}$ -space they generate. For combined detailed graph $\sum_{i=1}^{k}\lambda_{i}D_{i}$ , its minimal index representation refers to $\sum_{i=1}^{k}\lambda_{i}\,\hbox{Min}_{I\hskip-0.56905ptdx}(D_{i})$ .

Given a detailed vertex $v(S_{1}S_{2},S_{3}S_{4})$ , the following 24 detailed vertices

$$\{v(S_{\sigma(1)}S_{\sigma(2)},S_{\sigma(3)}S_{\sigma(4)})\mid\sigma\in S_{4}\}$$

span a $\mathbb{Q}$-space, called the detail-free extension of the detailed vertex. It can be represented by the Grassmann exterior product of the 24 detailed vertices taken as vectors, if the vectors are linearly independent. No matter how linearly dependent the 24 detailed vertices are, their common detail-free extension can be represented uniformly in parametric form as follows:

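$$Ext(v(S_{1}S_{2},S_{3}S_{4})):=\sum_{\sigma\in S_{4}}\lambda_{\sigma}\,v(S_{\sigma(1)}S_{\sigma(2)},S_{\sigma(3)}S_{\sigma(4)}),$$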

where the $\lambda$ ’s are free parameters. This form of representation is called the detailed extension of the detailed vertex.

Given a detailed graph $D$ with $n$ detailed vertices $\{v_{i}(S_{i1}S_{i2},S_{i3}S_{i4})\,|\,i=1..n\}$, or equivalently in the notation of detailed graph $\mathbb{Q}$-algebra,

$$D=\prod_{i=1}^{n}v_{i}(S_{i1}S_{i2},S_{i3}S_{i4}),$$

the corresponding detail-free multigraph among the $n$ detail-free extensions of the detailed vertices can be represented as the following “detailed graph” among the $n$ detailed extensions of the detailed vertices:

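$$\begin{array}{lrl}Ext(D)&:=&\displaystyle\prod_{i=1}^{n}\Big(\sum_{\sigma_{i}\in S_{4}}\lambda_{\sigma_{i}}v_{i}(S_{\sigma_{i}(1)}S_{\sigma_{i}(2)},S_{\sigma_{i}(3)}S_{\sigma_{i}(4)})\Big)\\ &=&\displaystyle\sum_{(\sigma_{1},\ldots,\sigma_{n})\in(S_{4})^{n}}\Big(\prod_{i=1}^{n}\lambda_{\sigma_{i}}\Big)\Big(\prod_{i=1}^{n}v_{i}(S_{\sigma_{i}(1)}S_{\sigma_{i}(2)},S_{\sigma_{i}(3)}S_{\sigma_{i}(4)})\Big),\end{array}$$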

called the detailed extension of $D$ . Two detailed graphs are said to be multigraph-like, if they have the same connection multigraph.

So when viewed from the $\mathbb{Q}$ -space spanned by detailed vertices, the detail-free multigraph of a detailed graph $D$ with $n$ detailed vertices each having $m_{i}$ -dimensional detailed extension for $i=1..n$ , is a multigraph whose vertices are $m_{i}$ -dimensional linear subspaces for $i=1..n$ ; when viewed from the $\mathbb{Q}$ -space spanned by detailed graphs, the detail-free multigraph is a $(\prod_{i=1}^{n}m_{i})$ -dimensional linear subspace spanned by all the detailed graphs whose detailed vertices each have the same set of seats with the corresponding detailed vertex of $D$ .

In an R-monomial, if an R-factor has all its dummy indices removed, and all its free indices viewed as a set, then when the remainder is taken as a vertex, and every pair of identical dummy indices is taken as an edge, a multigraph is obtained, called the multigraph of the R-monomial.

An R-monomial is said to be connected if so is its multigraph. The multigraph of a connected R-monomial is a connection multigraph. For an R-monomial $f$ , a connected R-submonomial $h$ is said to be maximal if there is no connected R-submonomial of $f$ containing $h$ properly. The multigraph of an R-monomial is the disjoint union of the connection multigraphs of all the maximal connected R-submonomials.
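The multigraph of an R-monomial and its maximal connected R-submonomials can be computed with a small union-find. The following is a sketch under stated assumptions: each R-factor is a 4-tuple of index names, free indices are given as a set, and every dummy name occurs exactly twice. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def multigraph_edges(factors, free):
    """Return Counter {(i, j): multiplicity}, i <= j, one edge per dummy pair."""
    seats = defaultdict(list)
    for position, factor in enumerate(factors):
        for name in factor:
            if name not in free:
                seats[name].append(position)
    edges = Counter()
    for name, where in seats.items():
        if len(where) != 2:
            raise ValueError("dummy index %r must occur exactly twice" % name)
        i, j = sorted(where)
        edges[(i, j)] += 1          # (i, i) records a loop
    return edges


def connected_components(n_factors, edges):
    """Group R-factor positions into maximal connected R-submonomials."""
    parent = list(range(n_factors))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = defaultdict(list)
    for v in range(n_factors):
        groups[find(v)].append(v)
    return sorted(groups.values())


if __name__ == "__main__":
    factors = [("a", "b", "c", "d"), ("a", "e", "f", "c"), ("g", "h", "g", "h")]
    edges = multigraph_edges(factors, {"b", "d", "e", "f"})
    print(edges, connected_components(len(factors), edges))
```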

For example, the connection multigraph of $R^{a\phantom{b,c}d}_{\phantom{a}b,c}R_{a}^{\phantom{a}e,fc}$ is $R_{\{b,d\}}\asymp R_{\{e,f\}}$ , where each arc denotes an edge. The connection multigraph of $R^{a\phantom{b,}c}_{\phantom{a}b,\phantom{c}a}$ is $\stackrel{{\scriptstyle\frown}}{{R}}_{\{b,c\}}$ , while the connection multigraph of $R^{a\phantom{b,a}b}_{\phantom{a}b,a}$ is $\stackrel{{\scriptstyle\stackrel{{\scriptstyle\scriptstyle\frown}}{{\hbox{\it R}}}}}{{\scriptstyle\smile}}$ .

In a connected R-monomial, every R-factor is a 4-tuple of seats, each seat being occupied by an index. When every seat is taken as a vertex, and every pair of identical dummy indices is taken as an edge, a detailed graph is obtained with every R-factor as a detailed vertex, called the detailed graph of the R-monomial. The detailed graph of an R-monomial is the disjoint union of the detailed graphs of all the maximal connected R-submonomials.

For example, the detailed graph of $R^{\phantom{ba,}ac}_{ba,}$ is $R_{b*,}\hskip-8.5359pt\stackrel{{\scriptstyle/}}{{\phantom{\scriptscriptstyle 1}}}^{*c}$ , while the detailed graph of $R^{ab,}_{\phantom{ab,}ab}$ is $R^{\stackrel{{\scriptstyle**,}}{{\phantom{1}}}}_{\stackrel{{\scriptstyle\phantom{1}}}{{\phantom{**}**}}}\hskip-14.22636pt\backslash\backslash$ . Each edge is denoted by a line with two asterisk ends denoting the seats it connects. The serial index representations of the two detailed graphs are $b11c,1212$ respectively, which are also the minimal index representations of them.

3 Detailed Pre-R-graph and Detailed R-graph

Let $G$ be a connection multigraph. Denote by $Detail(G)$ the $\mathbb{Q}$ -space of detailed graphs having the same detail-free expansion $G$ .

Define the following equivalence relation in $Detail(G)$ : two multiple detailed graphs $\mu_{1}D_{1}$ and $\mu_{2}D_{2}$ are equivalent if one of the following is satisfied:

Sym $-$:

$\mu_{1}+\mu_{2}=0$ , and $D_{1}$ and $D_{2}$ have only one different detailed vertex: if the detailed vertex of $D_{1}$ takes the form $v(S_{1}S_{2},S_{3}S_{4})$ , then the other in $D_{2}$ is $v(S_{2}S_{1},S_{3}S_{4})$ .

Sym $+$:

$\mu_{1}-\mu_{2}=0$ , and $D_{1}$ and $D_{2}$ have only one different detailed vertex: if the detailed vertex in $D_{1}$ takes the form $v(S_{1}S_{2},S_{3}S_{4})$ , then the other in $D_{2}$ is $v(S_{3}S_{4},S_{1}S_{2})$ .

Two combined detailed graphs $\sum_{i=1}^{m}\lambda_{i}D_{1i}$ and $\sum_{j=1}^{m}\mu_{j}D_{2j}$ are equivalent if for $i=1..m$, $\lambda_{i}D_{1i}$ and $\mu_{i}D_{2i}$ are equivalent. This equivalence relation is called the pre-R-equivalence. The pre-R-equivalence class of a detailed vertex $v$ (respectively, a detailed graph $D$) is called a detailed pre-R-vertex (respectively, a detailed pre-R-graph), and is denoted by ${pre\hbox{-}\tilde{R}}(v)$ (respectively, ${pre\hbox{-}\tilde{R}}(D)$).

Sym $\pm$ generate the symmetry group $Sym8$ . From the viewpoint of graph algebra extension, depending on whether a detailed vertex $v$ has two loops or not, the corresponding detailed pre-R-vertex is a 4-D space or 8-D space spanned by the trajectory of group $Sym8$ upon $v$ , called the $Sym8$ -subextension of the detailed vertex. A detailed pre-R-graph is thus a detailed graph whose detailed vertices are $Sym8$ -subextensions of the corresponding detailed vertices of any detailed graph in the pre-R-equivalence class.
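The group generated by Sym$\pm$ can be enumerated directly. The following is a minimal sketch, assuming seats are numbered 0..3 and a seat permutation is stored as a tuple of positions; the names `compose`, `generate_group`, `SYM8` and `COSETS` are illustrative and do not come from the paper.

```python
from itertools import permutations

# A seat permutation p rearranges v(S[0]S[1], S[2]S[3]) into
# v(S[p[0]]S[p[1]], S[p[2]]S[p[3]]).
SYM_MINUS = ((1, 0, 2, 3), -1)  # swap the seats of the first pair, coefficient flips sign
SYM_PLUS = ((2, 3, 0, 1), +1)   # swap the two pairs, coefficient unchanged


def compose(p, q):
    """Apply seat permutation q to the arrangement already produced by p."""
    return tuple(p[i] for i in q)


def generate_group(generators):
    """Close the generators under composition; map each permutation to its sign."""
    group = {(0, 1, 2, 3): 1}
    frontier = [(0, 1, 2, 3)]
    while frontier:
        p = frontier.pop()
        for q, sign in generators:
            new = compose(p, q)
            if new not in group:
                group[new] = group[p] * sign
                frontier.append(new)
    return group


SYM8 = generate_group([SYM_MINUS, SYM_PLUS])

# Cosets of Sym8 inside the full permutation group S4 of the four seats.
COSETS = {frozenset(compose(h, s) for h in SYM8) for s in permutations(range(4))}

print(len(SYM8), len(COSETS))
```

The closure contains 8 seat permutations, half of them with sign $-1$, and $S_{4}$ splits into 3 cosets, which is the count behind the 3-dimensional detailed extension of a loop-free detailed pre-R-vertex.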

Since the coset space $S_{4}/Sym8$ has 3 elements, the detailed extension of a loop-free detailed pre-R-vertex $v(S_{1}S_{2},S_{3}S_{4})$ is a 3-space spanned by the following basis:

[TABLE]

The detailed extension of a detailed pre-R-vertex with loop is a 1-space spanned by itself.

For a detailed pre-R-graph

[TABLE]

the detailed extension $Ext(D)$ is a $3^{n-r}$ -dimensional $\mathbb{Q}$ -space with the following parametric representation:

[TABLE]

where $BP(1,2)$ is the set of bipartitions of 2,3,4 into two subsequences of length 1,2 respectively. The basis in (3.8) is the set of all detailed pre-R-graphs having the same connection multigraph with $D$ .

For a detailed graph $D\in Detail(G)$ , the minimum in the lexicographic order of all the minimal serial representations of elements in ${pre\hbox{-}\tilde{R}}(D)$ is called the pre-normal form of $D$ , denoted by ${pre\hbox{-}normal}(D)$ . For combined detailed graph $\sum_{i=1}^{m}\lambda_{i}D_{i}$ in the detailed graph $\mathbb{Q}$ -space, its pre-normal form is $\sum_{i=1}^{m}\lambda_{i}\ {pre\hbox{-}normal}(D_{i})$ . The pre-normal form provides a unique 1-D representation of the detailed pre-R-graph.

Define the following equivalence relation in $Detail(G)$ : let $D_{1},D_{2},D_{3}$ be detailed graphs, and let $\mu_{1},\mu_{2},\mu_{3}\in\mathbb{Q}$ , then $\mu_{1}D_{1}$ and $\mu_{2}D_{2}+\mu_{3}D_{3}$ are equivalent if one of the following is satisfied:

Pre-R:

$\mu_{2}D_{2}+\mu_{3}D_{3}$ and $\mu_{1}D_{1}$ are pre-R-equivalent.

Bianchi:

$-\mu_{1}=\mu_{2}=\mu_{3}$ , and $D_{1},D_{2},D_{3}$ have only one different detailed vertex: if the detailed vertex in $D_{1}$ takes the form $v(S_{1}S_{2},S_{3}S_{4})$ , then the other two in $D_{2},D_{3}$ separately are $v(S_{1}S_{3},S_{4}S_{2})$ , $v(S_{1}S_{4},S_{2}S_{3})$ respectively.

Two combined detailed graphs $\sum_{i=1}^{m}\lambda_{i}D_{1i}$ and $\sum_{j=1}^{2m}\mu_{j}D_{2j}$ of $Detail(G)$ are equivalent if for $i=1..m$ , $\lambda_{i}D_{1i}$ and $\mu_{2i}D_{2i}+\mu_{m+i}D_{2(m+i)}$ are equivalent. This equivalence relation is called the R-equivalence.

The R-equivalence class of a detailed vertex $v$ (respectively, a detailed graph $D$) is called a detailed R-vertex (respectively, a detailed R-graph), and is denoted by $\tilde{R}(v)$ (respectively, $\tilde{R}(D)$). The R-equivalence relation naturally induces an equivalence relation among the detailed pre-R-graphs, also called the R-equivalence: two detailed pre-R-graphs are equivalent if as detailed graphs they are R-equivalent.

Bianchi defines only one linear relation among the basis elements of (3.6). The reason is as follows. If the 4 seats of a vertex are permuted, then there are as many as 24 Bianchi relations:

[TABLE]

for all $\sigma\in S_{4}$ . It is easy to see that all of them are pre-R-equivalent.
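The claim that the 24 Bianchi relations coincide up to pre-R-equivalence can be checked mechanically. The sketch below assumes a loop-free vertex whose four seats carry the distinct labels 1,2,3,4, and takes the lexicographically smallest $Sym8$ image as the representative of a vertex; the helper names are hypothetical.

```python
from itertools import permutations


def sym8_images(t):
    """Yield (sign, arrangement) for the 8 Sym8 images of a seat 4-tuple t."""
    a, b, c, d = t
    for sign1, first in ((1, (a, b)), (-1, (b, a))):
        for sign2, second in ((1, (c, d)), (-1, (d, c))):
            yield sign1 * sign2, first + second  # Sym- applied to either pair
            yield sign1 * sign2, second + first  # followed by Sym+


def canonical(t):
    """Return (sign, rep) with v(t) = sign * v(rep), rep the smallest Sym8 image."""
    return min(sym8_images(t), key=lambda image: image[1])


def bianchi_relation(t):
    """Coefficients of v(ab,cd) + v(ac,db) + v(ad,bc) = 0 on canonical vertices."""
    a, b, c, d = t
    relation = {}
    for term in ((a, b, c, d), (a, c, d, b), (a, d, b, c)):
        sign, rep = canonical(term)
        relation[rep] = relation.get(rep, 0) + sign
    return relation


def up_to_sign(relation):
    """Normalise a relation so that its smallest vertex has coefficient +1."""
    lead = relation[min(relation)]
    return tuple(sorted((rep, coeff * lead) for rep, coeff in relation.items()))


RELATIONS = {up_to_sign(bianchi_relation(s)) for s in permutations((1, 2, 3, 4))}
print(RELATIONS)
```

All 24 seat permutations collapse to the single relation $v(12,34)-v(13,24)+v(14,23)=0$, written here with seat labels in place of $S_{1},\ldots,S_{4}$.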

So the detailed extension of a loop-free detailed R-vertex $v(S_{1}S_{2},S_{3}S_{4})$ is a 2-space spanned by the following basis: $v(S_{1}S_{2},S_{3}S_{4})$ , $v(S_{1}S_{3},S_{4}S_{2})$ . For detailed R-graph (3.7), $Ext(D)$ is a $2^{n-r}$ -dimensional $\mathbb{Q}$ -space with the following basis:

[TABLE]

(3.10) is the set of all detailed R-graphs having the same connection multigraph with $D$ .

For $D\in Detail(G)$ , the minimum in the lexicographic order of all the pre-normal forms of elements in $\tilde{R}(D)$ is called the R-normal form of $D$ , denoted by $R$ - $normal(D)$ . For combined detailed graph $\sum_{i=1}^{m}\lambda_{i}D_{i}$ , its R-normal form is $\sum_{i=1}^{m}\lambda_{i}\ R\hbox{-}normal(D_{i})$ . The R-normal form provides a unique 1-D representation of the detailed R-graph $\tilde{R}(D)$ , or equivalently the common multigraph $G$ .

For detailed graph $D$ in (3.7), as a detailed pre-R-graph it has a detailed extension of $3^{n-r}$ dimensions, while as a detailed R-graph its detailed extension has dimension $2^{n-r}$ . So the $3^{n-r}$ basis elements in (3.8) when taken as detailed R-graphs satisfy $3^{n-r}-2^{n-r}$ linear constraints. These constraints can be selected as the following $3^{n-r}-2^{n-r}$ equations: for all $r<i\leq n$ , all $\sigma_{j}\in S_{2}$ acting upon 2,3, and all $\sigma_{k}\in BP(1,2)$ acting upon 2,3,4,

[TABLE]

(3.11) can also be obtained from the pre-normal form of the detailed extension (3.8) by $3^{n-r}-2^{n-r}$ special evaluations of the $\lambda$ ’s.

(3.11) is a linear homogeneous system of $3^{n-r}-2^{n-r}$ equations in $m\leq 3^{n-r}$ unknowns, where each unknown is a pre-normal R-monomial obtained from one of the $3^{n-r}$ basis elements in (3.8). Denote the unknowns by $x_{1}\succ x_{2}\succ\cdots\succ x_{m}$ following the lexicographic order. Let ${\bf E}$ be the RREF (reduced row echelon form) of the coefficient matrix. If ${pre\hbox{-}normal}(f)$ is up to coefficient a leading variable of an equality in ${\bf E}(x_{1},x_{2},\ldots,x_{m})^{T}=0$, then the normal form of $f$ is obtained by substituting the equality into ${pre\hbox{-}normal}(f)$, else ${pre\hbox{-}normal}(f)$ is the normal form of $f$.
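The substitution step can be illustrated with exact rational arithmetic. This is a sketch only: the coefficient matrix below is a made-up toy system, not an instance of (3.11), columns are 0-based (column 0 stands for $x_{1}$), and the function names are assumptions.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row echelon form over Q; returns (nonzero rows, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m[:r], pivots


def normal_form_of_term(coeff, j, rows, pivots):
    """Rewrite coeff * x_j using the RREF row led by x_j, if there is one."""
    if j not in pivots:
        return {j: Fraction(coeff)}
    row = rows[pivots.index(j)]
    return {k: -coeff * row[k] for k in range(j + 1, len(row)) if row[k] != 0}


# Toy system in three unknowns x_1 > x_2 > x_3 (illustrative coefficients).
ROWS, PIVOTS = rref([[2, -1, 0], [0, 1, -3], [2, 0, -3]])
print(normal_form_of_term(1, 0, ROWS, PIVOTS))
```

Because the unknowns are listed in decreasing order, each leading variable is rewritten in terms of strictly lower monomials, and a monomial that leads no row is left unchanged.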

4 Algorithm for Pre-normal Form

For an R-monomial $f$ , its pre-normal form is an R-monomial $g$ whose index sequence is the pre-normal form of the detailed graph of $f$ . For an R-polynomial, its pre-normal form is the linear combination of the pre-normal forms of its terms. An R-polynomial is said to be pre-normal if it is its own pre-normal form. For example, the pre-normal form of an R-factor $R(ij,kl)$ can be obtained in three steps: (1) sort $i,j$ non-decreasingly, (2) sort $k,l$ non-decreasingly, (3) sort the two sorted pairs non-decreasingly.
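The three sorting steps for a single R-factor can be sketched as follows, with the sign tracked according to Sym$-$ (each swap inside a pair flips the sign) and Sym$+$ (swapping the two pairs does not). The function name and the `rank` dictionary encoding the index order are assumptions made for this illustration; the sketch does not detect vanishing factors.

```python
def pre_normal_factor(indices, rank):
    """Return (sign, (i, j, k, l)) with R(indices) = sign * R(ij,kl) in pre-normal form."""
    i, j, k, l = indices
    sign = 1
    if rank[i] > rank[j]:      # step (1): sort the first pair
        i, j, sign = j, i, -sign
    if rank[k] > rank[l]:      # step (2): sort the second pair
        k, l, sign = l, k, -sign
    if (rank[i], rank[j]) > (rank[k], rank[l]):  # step (3): sort the two pairs
        i, j, k, l = k, l, i, j
    return sign, (i, j, k, l)


# Index order a < b < d1 < ... < d7, as in Example 1 below.
ORDER = ["a", "b", "d1", "d2", "d3", "d4", "d5", "d6", "d7"]
RANK = {name: position for position, name in enumerate(ORDER)}

print(pre_normal_factor(("d1", "d5", "d2", "a"), RANK))
```

Applied to the third R-factor of Example 1, whose indices read $d_{1},d_{5},d_{2},a$ from left to right, the sketch returns sign $-1$ and the sequence $a,d_{2},d_{1},d_{5}$, that is, $-R(ad_{2},d_{1}d_{5})$.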

We assume that the input R-polynomial does not contain indices named after integers, so that we can introduce integers as new dummy indices. We always assume

[TABLE]

Before introducing the pre-normal form computing algorithm, let us check three typical examples.

Example 1. Let $f=R_{d_{1}\phantom{d_{2},d_{6}}d_{7}}^{\phantom{d_{1}}d_{2},d_{6}}R_{\phantom{d_{3}d_{4},d_{7}}d_{6}}^{d_{3}d_{4},d_{7}}R^{d_{1}d_{5},}_{\phantom{d_{1}d_{5},}d_{2}a}R^{b}_{\phantom{b}d_{4},d_{3}d_{5}}$ , where $a\prec b\prec d_{1}\prec\ldots\prec d_{7}$ .

First, change each R-factor into its pre-normal form. The result is $R(d_{1}d_{2},d_{6}d_{7})$ , $-R(d_{3}d_{4},d_{6}d_{7})$ , $-R(ad_{2},d_{1}d_{5})$ , $R(bd_{4},d_{3}d_{5})$ .

Second, the R-factors are classified into two groups: the first group $Q_{F}=\{-R(ad_{2},d_{1}d_{5}),R(bd_{4},d_{3}d_{5})\}$ consists of R-factors having free indices, the second group $Q_{D}=\{R(d_{1}d_{2},d_{6}d_{7}),-R(d_{3}d_{4},d_{6}d_{7})\}$ consists of the rest. Obviously all elements of $Q_{F}\prec$ all elements of $Q_{D}$ .

Third, $Q_{F}$ is sorted by free indices: $-R(ad_{2},d_{1}d_{5})\prec R(bd_{4},d_{3}d_{5})$ , making $Q_{F}$ a sequence. The serial index representation of $Q_{F}$ is then obtained by the following assignment:

[TABLE]

The assignment naturally branches into two options: (1) $d_{1}\rightarrow 2,\ d_{5}\rightarrow 3$; (2) $d_{1}\rightarrow 3,\ d_{5}\rightarrow 2$. Within sequence $Q_{F}$, option (1) gives index sequence $-a123,-b435$ while option (2) gives index sequence $a123,-b425$, because in option (2) the first R-factor becomes $-R(a1,32)=R(a1,23)$ by Sym$-$. As option (2) gives lower order, it becomes the single option.

Fourth, the corresponding old dummy indices in group $Q_{D}$ are also renamed by the new ones. While $Q_{F}$ is updated to $R(a1,23),-R(b4,25)$, $Q_{D}$ is updated to $\{-R(13,d_{6}d_{7}),R(45,d_{6}d_{7})\}$. Obviously $-R(13,d_{6}d_{7})\prec R(45,d_{6}d_{7})$. Now that all elements of $Q_{F}$ and $Q_{D}$ are ordered, they can be merged into a single sequence. In other words, the order among the 4 input R-factors has been fixed.

Finally, the remaining old dummy indices $d_{6}d_{7}$ are assigned to new ones $67$. No matter whether $d_{6}$ is renamed to 6 or 7, the resulting R-monomial is the same: $R(a1,23)R(b4,25)R(13,67)R(45,67)$. It is the pre-normal form of $f$.

Example 1 shows that if $f$ has free indices, then its pre-normal form can be computed by the loop procedure of first sorting the R-factors having fixed indices, then renaming the old dummy indices in the sorted R-factor sequence by the serial index representation. The same idea applies to the case when there is no free index.

Lemma 1

If connected R-monomial $f$ has no free index but has at least one Ricci R-factor, then the pre-normal form of $f$ must be led by a Ricci R-factor.

Proof. Any Ricci R-factor in the first position can be renamed as $R(11,23)$ or $R(12,13)$ , while a non-Ricci R-factor has the minimal index form $R(12,34)$ , which is higher in order. $\square$

Example 2. $f=R_{d_{1}\phantom{d_{2},d_{3}}d_{4}}^{\phantom{d_{1}}d_{2},d_{3}}R^{d_{5}d_{4},}_{\phantom{d_{5}d_{4},}d_{5}d_{3}}R_{d_{2}d_{6},}^{\phantom{d_{2}d_{6},}d_{1}d_{6}}$ .

The R-factors in pre-normal form are $R(d_{1}d_{2},d_{3}d_{4}),R(d_{3}d_{5},d_{4}d_{5}),R(d_{1}d_{6},d_{2}d_{6})$ . By Lemma 1, one of the two Ricci R-factors is the first in the pre-normal form of $f$ . For example, if $R_{d_{3}d_{5},d_{4}d_{5}}$ is the first, then $d_{5}\rightarrow 1$ and $d_{3}d_{4}\rightarrow 23$ . So 4 branches are generated in assigning new indices 1,2,3:

[TABLE]

In each branch, the sequence of R-factors containing fixed indices is denoted by $Q_{F}$, and the other R-factors are in the set $Q_{D}$:

[TABLE]

In branch (1), assignment $d_{1}d_{2}\rightarrow 45$ generates one more branch: while option $d_{1}\rightarrow 4,d_{2}\rightarrow 5$ leads to $R(12,13)R(23,45)R(4d_{6},5d_{6})$, option $d_{1}\rightarrow 5,d_{2}\rightarrow 4$ leads to $-R(12,13)R(23,45)R(4d_{6},5d_{6})$. Since the two options give the same R-monomial with opposite signs, $f=-f$, hence $f=0$.

Example 2 shows that a depth-first strategy is preferred in generating branches.

Example 3. Set $a=b$ in Example 1.

The R-factors of $f$ are $R(d_{1}d_{2},d_{6}d_{7})$ , $-R(d_{3}d_{4},d_{6}d_{7})$ , $-R(ad_{2},d_{1}d_{5})$ , $R(ad_{4},d_{3}d_{5})$ . Any of them may be the first in the pre-normal form of $f$ . For example, if $R(d_{1}d_{2},d_{6}d_{7})$ is the first, then assignment $d_{1}d_{2}d_{6}d_{7}\rightarrow 1234$ has 24 options. All together the assignment of new indices 1,2,3,4 generates $4\times 24=96$ branches.

Once the first new indices are assigned, then similar to Example 1, more new indices can be assigned in each branch, and new branches may be generated by different options in the new assignment. Each branch finally generates a serial index representation of $f$ , and the minimum of them gives the pre-normal form of $f$ . By the algorithm below, the pre-normal form is $R(12,34)R(12,56)R(37,48)R(57,68)$ .
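For small inputs, the pre-normal forms in Examples 2 and 3 can be cross-checked by exhaustive search. The sketch below is not the branching algorithm described in this section: it simply tries every order of the R-factors and every $Sym8$ arrangement of each factor, renames dummy indices by first occurrence, and keeps the lexicographically smallest index sequence. It assumes an R-monomial without free indices, and all names are hypothetical.

```python
from itertools import permutations, product

# The 8 seat arrangements of Sym8 together with their signs.
SYM8 = [((0, 1, 2, 3), 1), ((1, 0, 2, 3), -1), ((0, 1, 3, 2), -1), ((1, 0, 3, 2), 1),
        ((2, 3, 0, 1), 1), ((3, 2, 0, 1), -1), ((2, 3, 1, 0), -1), ((3, 2, 1, 0), 1)]


def pre_normal_bruteforce(factors):
    """Return (sign, sequence); sign 0 means the R-monomial vanishes."""
    best, signs = None, set()
    for order in permutations(factors):
        for choice in product(SYM8, repeat=len(order)):
            names, sequence, sign = {}, [], 1
            for factor, (perm, s) in zip(order, choice):
                sign *= s
                arranged = (factor[p] for p in perm)
                # Rename each dummy index by the order of its first occurrence.
                sequence.append(tuple(names.setdefault(x, len(names) + 1) for x in arranged))
            sequence = tuple(sequence)
            if best is None or sequence < best:
                best, signs = sequence, {sign}
            elif sequence == best:
                signs.add(sign)
    # The same sequence reached with both signs means f = -f.
    return (0, best) if len(signs) == 2 else (signs.pop(), best)


# Example 2, factors read left to right from the tensor expression.
EX2 = pre_normal_bruteforce([("d1", "d2", "d3", "d4"),
                             ("d5", "d4", "d5", "d3"),
                             ("d2", "d6", "d1", "d6")])
# Example 3: Example 1 with b replaced by a.
EX3 = pre_normal_bruteforce([("d1", "d2", "d6", "d7"), ("d3", "d4", "d7", "d6"),
                             ("d1", "d5", "d2", "a"), ("a", "d4", "d3", "d5")])
print(EX2[0], EX3)
```

The search visits $n!\,8^{n}$ arrangements for degree $n$, so it is only usable as a sanity check on examples of this size.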

An R-factor is said to be free if it contains any free index, otherwise it is said to be dummy. A dummy R-factor is said to be complete if it is not Ricci.

In the following,

• $J$ records the number of branches generated,

• $K$ records the serial number of the current branch in process,

• $I[k]-1$ records the number of different new indices introduced in the $k$-th branch,

• $Q_{F}[k]$ records the sequence of R-factors having fixed indices in the $k$-th branch,

• $Q_{D}[k]$ records the set of R-factors not in $Q_{F}[k]$,

• $V[k]$ records the set of fixed indices in the $k$-th branch.

Complexity analysis:

Lemma 2

Let $N$ be the total number of branches generated in pnom$(f)$, where $f$ is a connected R-monomial of degree $n$. Then $N=O(n2^{n})$.

Proof. Let the number of free, dummy Ricci, complete, R-factors in $f$ be $e,l,c$ , respectively. Then $e+l+c=n$ . The following are trivial facts:

• If $Y$ is a free R-factor, then it generates at most two branches in both pnom and SerIdx; for example, this can happen when $Y=\lambda R(f_{1}f_{2},b_{2}b_{3})$ or $\lambda R(f_{1}b_{1},b_{2}b_{3})$, where the $f_{i}$ are fixed indices and the $b_{j}$ are old dummy indices.

• If $f$ has no free R-factor but has Ricci ones, then in pnom, the leading Ricci R-factor has $2l$ options, while in SerIdx a Ricci R-factor never generates any new branch.

• If $f$ has only complete R-factors, then in pnom, the leading complete R-factor has $24c$ options, while in SerIdx a complete R-factor generates at most one more branch, just as a free R-factor does.

When $e\neq 0$, $N\leq 2^{e}\times 2^{c}=O(2^{n})$. When $e=0$ but $l\neq 0$, $N\leq(2l)\times 2^{c}=O(n2^{n})$. When $e=l=0$, $c=n$ and $N\leq(24c)\times 2^{c-1}=12n2^{n}=O(n2^{n})$. $\square$

In pnom, generating a complete branch takes $O(n)$ operations. By Lemma 2, the complexity of pnom is $O(n^{2}2^{n})$ .

5 Algorithm for Normal Form

The normal form of an R-monomial $f$ is an R-polynomial whose index sequence is the R-normal form of the detailed graph of $f$ . For an R-polynomial, its normal form is the linear combination of the normal forms of its terms.

The extension of $f$ , denoted by $Ext(f)$ , is an R-polynomial whose detailed graph is the detailed extension of the detailed graph of $f$ . All the R-monomials having the same connection multigraph with $f$ are terms of $Ext(f)$ up to coefficient.

To see how $Ext(f)$ can help computing $normal(f)$ , let us check a well-known example [2].

Example 4. Let $f=R(12,34)R(13,24)$ ; it is already in pre-normal form.

First, $f=D$ in (3.7) where $n=2$, $r=0$, and $v_{1}=v_{2}=R$. So

[TABLE]

Let $x_{1}=R(12,34)R(13,24)=f$ and $x_{2}=R(12,34)R(12,34)$ . Then $x_{1}\succ x_{2}$ in lexicographic order. Setting $\lambda_{0}=\lambda_{12}=\lambda_{13}=\lambda_{14}=1$ in (5.12), we get the following Bianchi relation on the first R-factor of $f$ :

[TABLE]

Setting $\lambda_{0}=\lambda_{22}=\lambda_{23}=\lambda_{24}=1$ in (5.12), we get the following Bianchi relation on the second R-factor of $f$ :

[TABLE]

Solving the two equations in variables $x_{1},x_{2}$, we get the solution $x_{1}=x_{2}/2$. Since $f=x_{1}$ is the leading variable, $x_{2}/2$ is the normal form of $f$.
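The relation between the two contractions in Example 4 can be checked numerically. The sketch below assumes a Euclidean metric, so that raised and lowered indices coincide, and builds an integer array with the $Sym8$ symmetries and the Bianchi identity by symmetrising random data; the construction and all names are illustrative, not part of the paper's algorithm.

```python
from itertools import product
import random


def random_algebraic_curvature(dim=4, seed=0):
    """Integer 4-index array (as a dict) with the Sym8 symmetries and the Bianchi identity."""
    rng = random.Random(seed)
    idx = list(product(range(dim), repeat=4))
    t = {i: rng.randint(-5, 5) for i in idx}
    t = {(a, b, c, d): t[a, b, c, d] - t[b, a, c, d] for a, b, c, d in idx}  # Sym- on pair 1
    t = {(a, b, c, d): t[a, b, c, d] - t[a, b, d, c] for a, b, c, d in idx}  # Sym- on pair 2
    t = {(a, b, c, d): t[a, b, c, d] + t[c, d, a, b] for a, b, c, d in idx}  # Sym+
    # Subtract the cyclic part; everything is scaled by 3 to stay in the integers.
    return {(a, b, c, d): 2 * t[a, b, c, d] - t[a, c, d, b] - t[a, d, b, c]
            for a, b, c, d in idx}


R = random_algebraic_curvature()
IDX = list(R)
BIANCHI_OK = all(R[a, b, c, d] + R[a, c, d, b] + R[a, d, b, c] == 0 for a, b, c, d in IDX)
X1 = sum(R[a, b, c, d] * R[a, c, b, d] for a, b, c, d in IDX)  # R(12,34)R(13,24)
X2 = sum(R[a, b, c, d] ** 2 for a, b, c, d in IDX)             # R(12,34)R(12,34)
print(BIANCHI_OK, 2 * X1 == X2)
```

The check `2 * X1 == X2` is exact because only integer arithmetic is used; it reflects the classical identity $R_{abcd}R^{acbd}=\frac{1}{2}R_{abcd}R^{abcd}$.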

Example 4 suggests the following procedure of normal form computing: Let connected R-monomial $f$ take the form (3.7) where every $v_{i}$ represents $R$ , then the pre-normal form of its extension (3.8) is evaluated to zero $n$ times under $n$ special evaluations of the parameters $\lambda$ ’s: for $i=1..n$ , the $i$ -th evaluation is by setting $\lambda_{r}=\lambda_{i2}=\lambda_{i3}=\lambda_{i4}=1$ . Denote by $\Lambda$ the set of parameters $\lambda$ ’s in $Ext(f)$ other than $\lambda_{r}$ . Then $\#\Lambda=3^{n-r}$ . Let there be $m$ different pre-normal R-monomials in ${pre\hbox{-}normal}(Ext(f))$ . Denote them by $x_{1}\succ x_{2}\succ\cdots\succ x_{m}$ following the lexicographic order. That ${pre\hbox{-}normal}(Ext(f))=0$ under the above $n$ special evaluations gives $n$ linear equations

[TABLE]

where ${\bf x}=(x_{1},x_{2},\ldots,x_{m})^{T}$ , and ${\bf A}_{n\times m}$ is a matrix whose entries are in ${\mathbb{Q}}[\Lambda]$ . Solving the linear system by Gauss-Jordan elimination, one gets the RREF ${Rre\hskip-1.13791ptf}_{1}$ : $x_{j}+\sum_{k>j}\mu_{k}x_{k}=0$ for $j\in I\subseteq\{1,2,\ldots,m\}$ , where $\mu_{k}\in{\mathbb{Q}}(\Lambda)$ .

In each equation of ${Rre\hskip-1.13791ptf}_{1}$ , divide the expression on the left side into two sub-expressions: sub-expression 1 contains the terms with coefficient in $\mathbb{Q}$ , and sub-expression 2 contains the rest. Denote by $sub_{\mathbb{Q}}({Rre\hskip-1.13791ptf}_{1})$ the union of the sub-expression 1’s from the equations of ${Rre\hskip-1.13791ptf}_{1}$ , and denote by $sub_{\check{\mathbb{Q}}}({Rre\hskip-1.13791ptf}_{1})$ the union of the sub-expression 2’s. Then each element of $sub_{\check{\mathbb{Q}}}({Rre\hskip-1.13791ptf}_{1})$ must be equal to zero, and we get a set of at most $\#I$ linear equations in the variables $x_{l}$ where $l$ is in a subset of $\{1,2,\ldots,m\}\backslash I$ . Computing the RREF of this linear system, we get ${Rre\hskip-1.13791ptf}_{2}$ . Continuing the selection of terms with coefficient $\notin\mathbb{Q}$ in ${Rre\hskip-1.13791ptf}_{i}$ and the computing of an RREF ${Rre\hskip-1.13791ptf}_{i+1}$ of the selected new linear system, we finally get a complete RREF with coefficient in $\mathbb{Q}$ , which is composed of the $sub_{\mathbb{Q}}$ ’s in the ${Rre\hskip-1.13791ptf}$ ’s.

Proposition 1

The complete RREF obtained by the above procedure is the RREF of linear system (3.11).

Proof. Let ${\bf C}{\bf x}=0$ be the complete RREF obtained from ${Rre\hskip-1.13791ptf}_{1},{Rre\hskip-1.13791ptf}_{2},\ldots,{Rre\hskip-1.13791ptf}_{N}$ . Then any equation in ${Rre\hskip-1.13791ptf}_{N}$ has all the coefficients in $\mathbb{Q}$ , i.e., $sub_{\check{\mathbb{Q}}}({Rre\hskip-1.13791ptf}_{N})=\emptyset$ . Since ${Rre\hskip-1.13791ptf}_{N}$ is obtained from $sub_{\check{\mathbb{Q}}}({Rre\hskip-1.13791ptf}_{N-1})=0$ by elementary row transformations with coefficients in ${\mathbb{Q}}(\Lambda)$ , every equation of $sub_{\check{\mathbb{Q}}}({Rre\hskip-1.13791ptf}_{N-1})$ is a ${\mathbb{Q}}(\Lambda)$ -linear combination of the equations in ${Rre\hskip-1.13791ptf}_{N}$ . We use $sub_{\check{\mathbb{Q}}}({Rre\hskip-1.13791ptf}_{N-1})\subseteq\langle sub_{\mathbb{Q}}({Rre\hskip-1.13791ptf}_{N})\rangle_{{\mathbb{Q}}(\Lambda)}$ to denote this relation.

Similarly, $sub_{\mathbb{Q}}({Rre\hskip-1.13791ptf}_{N-1})$ also has all its coefficients in $\mathbb{Q}$ , and

[TABLE]

Continuing this argument, we get

[TABLE]

Let ${\bf B}{\bf x}=0$ be a linear system obtained from ${\bf A}{\bf x}=0$ by $M\geq 3^{n-r}$ different generic ${\mathbb{Q}}$ -specifications of the parameters in $\Lambda$ . Then ${\bf B}$ has $nM$ rows. Let ${\bf D}{\bf x}=0$ be the linear system (3.11) of $3^{n}-2^{n}$ rows. Denote by $\langle{\bf P}\rangle_{\mathbb{Q}}$ the row space of a matrix ${\bf P}$ .

Then $\langle{\bf B}\rangle_{\mathbb{Q}}=\langle{\bf D}\rangle_{\mathbb{Q}}$ . By (5.13), $\langle{\bf B}\rangle_{\mathbb{Q}}\subseteq\langle{\bf C}\rangle_{\mathbb{Q}}$ . So $\langle{\bf D}\rangle_{\mathbb{Q}}\subseteq\langle{\bf C}\rangle_{\mathbb{Q}}$ .

Conversely, let ${\bf E}$ be the RREF of ${\bf D}$, and let the equations in ${\bf E}{\bf x}=0$ be $x_{j}+\sum_{k>j,\,k\in J}\mu_{k}x_{k}=0$ for $j\notin J$, where $J\subseteq\{1,2,\ldots,m\}$ indexes the non-leading variables and $\mu_{k}\in\mathbb{Q}$. Then for all $j\notin J$, $x_{j}\in{{\mathbb{Q}}[\{x_{l}\,|\,l\in J\}]}$ and the linear dependency is represented explicitly by a row of ${\bf E}$.

Let there be an equation $Eq\in{Rre\hskip-1.13791ptf}_{1}$ in which $sub_{\check{\mathbb{Q}}}\neq 0$ . Then $Eq$ is of the form $c+f+g/h=0,$ where

(1) $c=sub_{\mathbb{Q}}\in{{\mathbb{Q}}[\{x_{l}\,|\,l\in J\}]}$ and is linear in the $x_{l}$ for $l\in J$ ;

(2) $f,g\in{{\mathbb{Q}}[\{x_{l}\,|\,l\in J\}]}[\Lambda]$ and are both linear in the $x_{l}$ for $l\in J$ ;

(3) $h\in{\mathbb{Q}}[\Lambda]$ ;

(4) either $f=0$ or $f$ has no term in ${{\mathbb{Q}}[\{x_{l}\,|\,l\in J\}]}$ ;

(5) either $g=0$ or no term of $g$ can be divided by $h$ ;

(6) $sub_{\check{\mathbb{Q}}}=f+g/h\neq 0$ .

Using the rows of ${\bf E}$ to make elementary row transformations to $Eq$ , it is easy to see that $Eq$ is changed into

[TABLE]

where

(1) $c=\sum_{k\in J}\gamma_{k}x_{k}$ , $f=\sum_{k\in J}\alpha_{k}x_{k}$ , $g=\sum_{k\in J}\beta_{k}x_{k}$ ;

(2) $\gamma_{k}\in\mathbb{Q}$ ;

(3) either $\alpha_{k}=0$ or every term of $\alpha_{k}\in{\mathbb{Q}}[\Lambda]$ has degree $>0$ ;

(4) either $\beta_{k}=0$ or no term of $\beta_{k}\in{\mathbb{Q}}[\Lambda]$ can be divided by $h$ .

We prove that for all $k\in J$ , $\alpha_{k}=\beta_{k}=0$ .

When $Eq\in{Rre\hskip-1.13791ptf}_{1}$ is replaced by the corresponding $M$ equations $Eq_{i}\in{Rre\hskip-1.13791ptf}_{1}({\bf A}|_{i})$ for $i=1..M$ , we get $M$ equations

[TABLE]

where $h|_{i}$ denotes the value of $h$ under generic specification $i$, and similarly for $\alpha_{k}|_{i}$ and $\beta_{k}|_{i}$. These equations can all be obtained from ${\bf E}{\bf x}=0$ by $\mathbb{Q}$-coefficient elementary row transformations.

If for some $k\in J$ , $\alpha_{k}+\beta_{k}/h\neq 0$ , then in the $(3^{n-r}+1)$ -dimensional vector space with coordinates $(\Lambda,y)$ , hyperplane $y=-\gamma_{k}$ generically does not meet hypersurface $y=\alpha_{k}+\beta_{k}/h$ . This means we can choose a generic specification $i$ such that $\gamma_{k}+\alpha_{k}|_{i}+(\beta_{k}|_{i})/(h|_{i})\neq 0$ . Under this generic specification, (5.14) is a nontrivial linear relation among the $x_{l}$ where $l\in J$ . This violates the RREF property of ${\bf D}$ that the $\{x_{k}\,|\,k\in J\}$ are $\mathbb{Q}$ -linearly independent.

So $\alpha_{k}+\beta_{k}/h$ is identical to zero for all $k\in J$ , hence $f+g/h$ is identical to zero. This means $c=0$ is obtained from $Eq$ by applying some ${\mathbb{Q}}(\Lambda)$ -coefficient elementary row transformations induced by ${\bf E}$ .

Applying this argument to all equations of ${Rre\hskip-1.13791ptf}_{1}$ , we get that all elements of $sub_{\mathbb{Q}}({Rre\hskip-1.13791ptf}_{1})$ and $sub_{\check{\mathbb{Q}}}({Rre\hskip-1.13791ptf}_{1})$ can be obtained from ${\bf D}$ by ${\mathbb{Q}}(\Lambda)$ -coefficient elementary row transformations.

Continue this argument to ${Rre\hskip-1.13791ptf}_{2}$ , then inductively to all ${Rre\hskip-1.13791ptf}_{j}$ for $j>1$ , till $j=N$ . In the end, $\cup_{j=1}^{N}sub_{\mathbb{Q}}({Rre\hskip-1.13791ptf}_{j})$ can be obtained from ${\bf D}$ by ${\mathbb{Q}}(\Lambda)$ -coefficient elementary row transformations.

Now make $M$ generic specifications of $\Lambda$ to turn the ${\mathbb{Q}}(\Lambda)$ -coefficient elementary row transformations into ${\mathbb{Q}}$ -coefficient ones. We finally get $\langle{\bf C}\rangle_{\mathbb{Q}}\subseteq\langle{\bf D}\rangle_{\mathbb{Q}}$ . $\square$

As a corollary, if ${pre\hbox{-}normal}(f)$ is up to coefficient a leading variable $x_{j}$ in an equation $x_{j}+\sum_{k>j}\mu_{k}x_{k}=0$ of the complete RREF, say ${pre\hbox{-}normal}(f)=\lambda x_{j}$, then the normal form of $f$ is $\sum_{k>j}(-\lambda\mu_{k})x_{k}$, else ${pre\hbox{-}normal}(f)$ is the normal form of $f$.

Complexity analysis:

Proposition 2

Let $f$ be a connected R-monomial of degree $n$ . Then normal $(f)$ takes $O(n9^{n})$ operations.

Proof. Let $f$ be in the form of (3.7), then $Ext(f)$ has $3^{n-r}$ terms. Computing the pre-normal form of $Ext(f)$ takes $O(n^{2}6^{n})$ operations.

Let $m$ be the number of R-monomials in ${pre\hbox{-}normal}(Ext(f))$ after like term combination. Then $m=O(3^{n})$ . In Rebe, computing the RREF ${Rre\hskip-1.13791ptf}_{1}$ with coefficient field ${\mathbb{Q}}(\Lambda)$ takes $O(n^{2}m)$ arithmetic operations upon multivariate rational functions.

Let $r={\rm rank}({Rre\hskip-1.13791ptf}_{1})$ , then $sub_{\check{\mathbb{Q}}}({Rre\hskip-1.13791ptf}_{1})$ when written as a matrix has the size of $r\times(m-r)$ at most, so computing ${Rre\hskip-1.13791ptf}_{2}$ takes $O(r^{2}(m-r))$ operations, which is $O(n^{2}(m-n))$ when $m\gg n$ . Going this way, computing ${Rre\hskip-1.13791ptf}_{i+1}$ takes $O(n^{2}(m-in))$ operations, till $i=[m/n]$ . Since

[TABLE]

computing the complete RREF takes $O(nm^{2})$ operations, which is $O(n9^{n})$ when $m=\Theta(3^{n})$ . The overall complexity is thus $O(n9^{n})$ . $\square$

Function Rebe in normal can be replaced by other methods for computing the RREF of (3.11). Direct solving of (3.11) by Gauss-Jordan elimination has complexity $O(m9^{n})$. Although it is not fair to take the complexity of an arithmetic operation on rational functions as the same as that on rational numbers, function Rebe reduces the equation-solving complexity to $O(nm^{2})$, which is polynomial in $n$ when $m$ is bounded by a polynomial in $n$.

Notice that every row of (3.11) has at most three nonzero entries. It may be possible that the iterative methods for sparse $\mathbb{R}$ -linear systems be applied to the sparse $\mathbb{Q}$ -linear system (3.11) for infinitely many accurate $\mathbb{Q}$ -valued solutions by solving the corresponding normal equations [19], so that the sparse solving has complexity $O(3^{n})$ . Even so, Rebe is valuable in the case when $m=O(3^{n/2})$ .

6 Conclusion

In this paper we establish the graph algebra extension theory and develop an algorithm of normalizing Riemann tensor polynomials based on this theory. The theory can be extended to the case involving covariant derivatives of the Riemann tensor, and other types of tensor in a straightforward way. Future work includes such extensions, and application to theorem proving in Riemannian geometry.

Bibliography

[1] Balfagon A, and Jaen X. TTC: Symbolic tensor calculus with indices, Comput. Phys., 1998, 12(3): 286-289.
[2] Fulling SA, King RC, Wybourne BG, and Cummins CJ. Normal forms for tensor polynomials: I. The Riemann tensor, Class. Quantum Grav., 1992, 9: 1151-1197.
[3] Ilyin VA, and Kryukov AP. Symbolic simplification of tensor expressions using symmetries, dummy indices and identities, Proc. ISSAC’91: 224-228.
[4] Ilyin VA, and Kryukov AP. ATENSOR-REDUCE program for tensor simplification, Comput. Phys. Commun., 1996, 96: 36-52.
[5] Kavian M, McLenaghan RG, and Geddes KO. Application of genetic algorithms to the algebraic simplification of tensor polynomials, Proc. ISSAC’97, Maui, Hawaii, USA, pp. 93-100.
[6] Li Z, Shao S, and Liu W. Classifications and canonical forms of tensor product expressions in the presence of permutation symmetries. arXiv:1604.06156v1 [physics.chem-ph], Apr. 2016.
[7] Liu J, Li H, and Zhang L. A complete classification of canonical forms of a class of Riemann tensor indexed expressions and its applications in differential geometry (in Chinese), Sci. Sin. Math., 2013, 43: 399-408.
[8] Liu J. Normalization in Riemann tensor polynomial ring. In: China Computer Algebra Conference 2016, Shenzhen, Nov. 10-11, 2016.