Change of basis for m-primary ideals in one and two variables
Seung Gyu Hyun
University of Waterloo, Waterloo, ON, Canada
Stephen Melczer
University of Pennsylvania, 209 S. 33rd Street, Philadelphia, PA, USA
Éric Schost
University of Waterloo, Waterloo, ON, Canada
Catherine St-Pierre
University of Waterloo, Waterloo, ON, Canada
Abstract
Following recent work by van der Hoeven and Lecerf (ISSAC 2017), we
discuss the complexity of linear mappings, called untangling and
tangling by those authors, that arise in the context of computations
with univariate polynomials. We give a slightly faster tangling
algorithm and discuss new applications of these techniques. We
show how to extend these ideas to bivariate settings, and use them to give bounds on the arithmetic complexity of certain
algebras.
1 Introduction
In [22], van der Hoeven and Lecerf gave
algorithms for “modular composition” modulo powers of polynomials:
that is, computing $F(G) \bmod T^\mu$, for polynomials $F,G,T$ over a
field $\mathbb{F}$ and positive integer $\mu$. As an intermediate result, they
discuss a linear operation and its inverse, which they
respectively call untangling and tangling.
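Given separable $T\in\mathbb{F}[x]$ of degree $d$ and a positive integer
$\mu$, polynomials modulo $T^\mu$ can naturally be written in the
power basis $1,x,\dots,x^{d\mu-1}$. Here we consider another representation,
based on bivariate polynomials. Introduce $\mathbb{K}:=\mathbb{F}[y]/\langle T(y)\rangle$ with $\alpha$ the residue class of $y$; then, as an $\mathbb{F}$-algebra, $\mathbb{F}[x]/\langle T^\mu\rangle$
is isomorphic to $\mathbb{K}[\xi]/\langle\xi^\mu\rangle$, and untangling and
tangling are the corresponding changes of basis, induced by the map sending $x$ to
$\xi+\alpha$. Take, for instance,
$\mathbb{F}=\mathbb{Q}$, $T=x^2+x+2$ and $\mu=2$. Then $\mathbb{K}=\mathbb{Q}[y]/\langle y^2+y+2\rangle$;
untangling is the isomorphism $\mathbb{Q}[x]/\langle x^4+2x^3+5x^2+4x+4\rangle\to\mathbb{K}[\xi]/\langle\xi^2\rangle$ and tangling is its inverse.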
We now assume that 2,…,μ−1 are units in F. Van der Hoeven
and Lecerf gave algorithms of quasi-linear cost for both untangling
and tangling; their algorithm for tangling is slightly slower than
that for untangling. Our first contribution is an improved algorithm
for tangling, using duality techniques inspired
by [33]. This saves logarithmic factors compared to the
results in [22]; it may be minor in practice, but we believe
this offers an interesting new point of view. Then we discuss how
these techniques can be of further use, as in the resolution of
systems of the form F(x1,x2,x3)=G(x1,x2,x3)=0, for polynomials
F,G in F[x1,x2,x3].
Our second main contribution is an extension of these algorithms to
situations involving more than one variable. As a first step, in this
paper, we deal with certain systems in two variables.
Indeed, the discussion in [22] is closely related to the
question of how to describe isolated solutions of systems of
polynomial equations. This latter question has been the subject of
extensive work in the past; answers vary depending on what information
one is interested in.
For the sake of this discussion, suppose we consider polynomials
G1,…,Gs in the variables x1 and x2, with coefficients in F. If
one simply wants to describe set-theoretically the (finitely many)
isolated solutions of G1,…,Gs, popular choices include
description by means of univariate
polynomials [27, 9, 20, 2, 32],
or triangular representations [36, 3]. When all isolated solutions
are non-singular nothing else is needed, but further questions arise
in the presence of multiple solutions as univariate or triangular
representation may not be able to describe the local algebraic structure
at such roots.
The presence of singular isolated solutions means that the ideal
⟨G1,…,Gs⟩ admits a zero-dimensional primary
component that is not radical. Thus, let I be a
zero-dimensional primary ideal in F[x1,x2] with radical
m; we will suppose that F[x1,x2]/m is separable
(which is always the case if F is perfect, for instance) to
prevent $\mathfrak{m}$ from acquiring multiple roots over
an algebraic closure $\overline{\mathbb{F}}$ of $\mathbb{F}$.
A direct approach to describing the solutions of I, together with the
algebraic nature of I itself, is to give one of its Gröbner
bases. Following [28], one may also give a basis of the
dual of F[x1,x2]/I, or a standard basis of
I. In [28, Section 5], Marinari, Möller and Mora make
the following interesting suggestion: build the field
K:=F[y1,y2]/m~, where m~ is the ideal
m with variables renamed y1,y2. Then the polynomials in I
vanish at $\alpha:=(\alpha_1,\alpha_2)$, where $\alpha_1,\alpha_2$ are the residue classes of
y1,y2 in K. Now extend I to the
polynomial ring K[ξ1,ξ2], for new variables ξ1,ξ2, by
mapping (x1,x2) to (ξ1,ξ2). Then, the local structure of
I at α can be described by the primary component of this
extended ideal at α.
Let us show the similarities of this idea with van der Hoeven and
Lecerf’s approach, on an example from [30]. We take
$\mathbb{F}=\mathbb{Q}$, $\mathfrak{m}$ to be the maximal ideal $\langle T_1,T_2\rangle$, with
$T_1:=x_1^2+x_1+2$, $T_2:=x_2-x_1-1$, and $I=\mathfrak{m}^2$ to be the $\mathfrak{m}$-primary
ideal with generators
[TABLE]
Since $T_2$ has degree
one in $x_2$, we can simply take $\mathbb{K}:=\mathbb{Q}[y_1]/\langle y_1^2+y_1+2\rangle$, $\alpha_1$ to be the residue class of $y_1$ and
$\alpha_2=\alpha_1+1$.
The (α1,α2)-primary component J of the extension of
I in K[ξ1,ξ2], i.e., the primary component associated to the
prime ideal (ξ1−α1,ξ2−α2), is the ideal with lexicographic
Gröbner basis
[TABLE]
Its structure appears more clearly after applying the translation
$(\xi_1,\xi_2)\mapsto(\xi_1+\alpha_1,\xi_2+\alpha_2)$: the
translated ideal $J'$ admits the very simple Gröbner basis $\langle\xi_1^2,\xi_1\xi_2,\xi_2^2\rangle$. In other words, this
representation allows one to complement the set-theoretic description of
the solutions by the multiplicity structure.
Our first result in bivariate settings is the relation between the
Gröbner bases of $I$ and $J$ (or $J'$): in our example, they both
have three polynomials, and their leading terms are related by the
transformation $(\xi_1,\xi_2)\mapsto(x_1^2,x_2)$. We then prove
that, as in the univariate case, there is an $\mathbb{F}$-algebra isomorphism
$\mathbb{F}[x_1,x_2]/I\to\mathbb{K}[\xi_1,\xi_2]/J'$ given by $(x_1,x_2)\mapsto(\xi_1+\alpha_1,\xi_2+\alpha_2)$. In our example, this means that
$\mathbb{Q}[x_1,x_2]/\langle G_1,G_2,G_3\rangle$ is isomorphic to
$\mathbb{K}[\xi_1,\xi_2]/\langle\xi_1^2,\xi_1\xi_2,\xi_2^2\rangle$.
Under certain assumptions on J′, we give algorithms for this
isomorphism and its inverse that extend those for univariate
polynomials; while their runtimes are not always quasi-linear, they are
subquadratic in the degree of I (that is, the dimension of
F[x1,x2]/I). We end with a first application: upper bounds
on the cost of arithmetic operations in an algebra such as
F[x1,x2]/I; these are new, to the best of our knowledge. Note that with a strong regularity assumption and in a different setting, it has been shown in [35] that multiplication in F[x1,x2]/I can be done in quasi-linear time.
Although our results are still partial (we make assumptions and deal only with bivariate systems), we believe it is
worthwhile to investigate these questions. In future work, we plan to
examine the impact of these techniques on issues arising from
polynomial system solving algorithms: one direction to consider is
lifting techniques in the presence of multiplicities, as
in [21] for instance; another is the computation of GCDs
modulo ideals such as I above. See, for instance, [13] for
a discussion of the latter question.
2 Preliminaries
In the rest of this paper, F is a perfect field. The costs of
all our algorithms are measured in number of operations
(+,−,×,÷) in F.
2.1.
We let $M:\mathbb{N}\to\mathbb{N}$ be such that the product of two elements
of degree less than $n$ in $\mathbb{F}[x]$ can be computed in $M(n)$
operations, and such that $M$ satisfies the super-linearity
properties of [17, Chapter 8].
Below, we will freely use all usual consequences of fast
multiplication (on fast GCD, Newton iteration, …) and refer the
reader to e.g. [17] for details. In particular,
multiplication in an F-algebra of the form A:=F[x]/⟨T(x)⟩ with T monic in x, or A:=F[x1,x2]/⟨T1(x1),T2(x1,x2)⟩ with T1 monic in x1 and T2 monic
in x2, can be done in time O(M(δ)), with δ:=dimF(A). Inversion, when possible, is slower by a logarithmic factor.
For $A=\mathbb{F}[x_1,x_2]/I$, for a zero-dimensional monomial ideal $I$,
multiplication and inversion in $A$ can be done in time
$O(M(\delta)\log(\delta))$, resp. $O(M(\delta)\log(\delta)^2)$,
with $\delta=\dim_{\mathbb{F}}(A)$ (see the appendix).
2.2.
We will use the transposition
principle [10, 23], which is an algorithmic theorem
stating that if the F-linear map encoded by an n×m matrix
over F can be computed in time T, the transposed map can be
computed in time T+O(n+m). This result has been used in a variety
of contexts; our main sources of inspiration
are [33, 7].
2.3.
If A is an F-vector space, its dual A∗:=HomF(A,F)
is the F-vector space of F-linear mappings A→F. When
A is an F-algebra, A∗ becomes an A-module: to a linear
mapping ℓ:A→F and F∈A we can associate the linear
mapping F⋅ℓ:G∈A↦ℓ(FG). This operation is
called the transposed product in A∗, since it is the transpose
of the multiplication-by-F mapping.
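To make the definition concrete, here is a minimal Python sketch of the transposed product in $A=\mathbb{Q}[x]/\langle T\rangle$, computed directly from the formula $(F\cdot\ell)(x^i)=\ell(F x^i \bmod T)$. It is a naive quadratic-time illustration, not the fast algorithm obtained from the transposition principle; the function names and the sample data ($T=x^2+x+2$, $F=1+x$) are assumptions made for this example.

```python
from fractions import Fraction

def mulmod(a, b, T):
    """Product a*b reduced modulo the monic polynomial T (coefficient lists, low degree first)."""
    n = len(T) - 1
    prod = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj
    # Eliminate the terms of degree >= n, highest first, using that T is monic.
    for i in range(len(prod) - 1, n - 1, -1):
        c = prod[i]
        for j in range(n + 1):
            prod[i - n + j] -= c * T[j]
    prod = prod[:n]
    return prod + [Fraction(0)] * (n - len(prod))

def apply_form(ell, g):
    """Evaluate the linear form ell, stored as [ell(1), ell(x), ..., ell(x^(n-1))], at g."""
    return sum(l * c for l, c in zip(ell, g))

def transposed_product(f, ell, T):
    """Values of f.ell on the monomial basis: (f.ell)(x^i) = ell(f * x^i mod T)."""
    n = len(T) - 1
    monomials = ([Fraction(0)] * i + [Fraction(1)] for i in range(n))
    return [apply_form(ell, mulmod(f, m, T)) for m in monomials]

T = [2, 1, 1]            # T = x^2 + x + 2 (hypothetical sample modulus)
f = [1, 1]               # f = 1 + x
ell = [0, 1]             # ell(1) = 0, ell(x) = 1
f_ell = transposed_product(f, ell, T)
```

With this data, `f_ell` holds the values of $F\cdot\ell$ at $1$ and $x$, and evaluating it at any $G$ agrees with $\ell(FG)$.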
Given a basis B of A, elements of A∗ are
represented on the dual basis, by their values
on B. In terms of complexity, if A is an algebra such
as those in 2.1, the transposition principle implies that
transposed products can be done in time O(M(δ)),
resp. O(M(δ)log(δ)), with again
δ:=dimF(A). See [34] for detailed algorithms in
the cases A=F[x]/⟨T(x)⟩ and A=F[x1,x2]/⟨T1(x1),T2(x1,x2)⟩.
An element $\ell\in A^*$ is called a generator of $A^*$ if $A\cdot\ell=A^*$ (in other words, for any $\ell'$ in $A^*$ there exists $F\in A$, which must be unique, such that $F\cdot\ell=\ell'$). When $A=\mathbb{F}[x]/\langle T(x)\rangle$, with $n:=\deg(T)$,
$\ell$ defined by $\ell(1)=\cdots=\ell(x^{n-2})=0$ and
$\ell(x^{n-1})=1$ is known to generate $A^*$. For
$A=\mathbb{F}[x_1,x_2]/\langle T_1(x_1),T_2(x_1,x_2)\rangle$, $\ell$ given
by $\ell(x_1^{n_1-1}x_2^{n_2-1})=1$, with all other $\ell(x_1^ix_2^j)=0$, is a generator (here, we write $n_1:=\deg(T_1,x_1)$ and
$n_2:=\deg(T_2,x_2)$). For more general $A$, $A^*$ may not be
free: see for example Subsection 4.4.
3 The univariate case revisited
In this section, we work with univariate polynomials. Suppose that $T\in\mathbb{F}[x]$ is monic and separable (that is, without repeated roots in an algebraic closure $\overline{\mathbb{F}}$ of $\mathbb{F}$) with degree $d$, and let $\mu$ be a positive integer. We start from the following hypothesis:
H1.
F has characteristic at least μ.
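Define $\mathbb{K}:=\mathbb{F}[y]/\langle T(y)\rangle$,
and let $\alpha$ be the residue class of $y$ in $\mathbb{K}$. Van der Hoeven
and Lecerf proved that the $\mathbb{F}$-algebra mapping
$$\pi_{T,\mu}:\ \mathbb{F}[x]/\langle T^\mu\rangle\to\mathbb{K}[\xi]/\langle\xi^\mu\rangle,\qquad x\mapsto\xi+\alpha$$
is well-defined and realizes an isomorphism of $\mathbb{F}$-algebras.
The mapping $\pi_{T,\mu}$ is called untangling, and its inverse $\pi_{T,\mu}^{-1}$ tangling.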
Note that $\pi_{T,\mu}(F)$ simply computes the first $\mu$ terms of the Taylor expansion
of $F$ at $\alpha$, that is, $\pi_{T,\mu}(F)=\sum_{0\le i<\mu}F^{(i)}(\alpha)\,\xi^i/i!$.
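The Taylor-expansion description can be checked on the example from the introduction. The following Python sketch evaluates $F$ at $\xi+\alpha$ by Horner's rule in $\mathbb{K}[\xi]/\langle\xi^\mu\rangle$, with $T=y^2+y+2$ and $\mu=2$ over $\mathbb{Q}$. This is a naive quadratic-time illustration and not the quasi-linear algorithm of [22]; all identifiers are hypothetical.

```python
from fractions import Fraction

T = [Fraction(2), Fraction(1), Fraction(1)]   # T = y^2 + y + 2, low degree first (monic)
MU = 2                                        # truncation order: work modulo xi^MU
D_T = len(T) - 1                              # d = deg(T) = [K : Q]

def k_reduce(a):
    """Reduce a polynomial in y modulo T; returns D_T coordinates on 1, alpha, ..."""
    a = [Fraction(c) for c in a]
    for i in range(len(a) - 1, D_T - 1, -1):
        c = a[i]
        for j in range(D_T + 1):
            a[i - D_T + j] -= c * T[j]
    a = a[:D_T]
    return a + [Fraction(0)] * (D_T - len(a))

def k_mul(a, b):
    """Product of two elements of K = Q[y]/<T>."""
    prod = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj
    return k_reduce(prod)

def k_add(a, b):
    return [x + y for x, y in zip(a, b)]

def trunc_mul(g, h):
    """Product in K[xi]/<xi^MU>; elements are lists of MU elements of K."""
    out = [[Fraction(0)] * D_T for _ in range(MU)]
    for i in range(MU):
        for j in range(MU - i):
            out[i + j] = k_add(out[i + j], k_mul(g[i], h[j]))
    return out

def untangle(f):
    """Naive untangling: Horner evaluation of f (rational coefficients) at xi + alpha."""
    zero, one = k_reduce([0]), k_reduce([1])
    alpha = k_reduce([0, 1])
    shift = ([alpha, one] + [zero] * MU)[:MU]      # the element xi + alpha
    acc = [zero[:] for _ in range(MU)]
    for c in reversed(f):
        acc = trunc_mul(acc, shift)
        acc[0] = k_add(acc[0], k_reduce([c]))
    return acc

image_of_T = untangle([2, 1, 1])            # T(xi + alpha) = (2*alpha + 1) * xi
image_of_T2 = untangle([4, 4, 5, 2, 1])     # T^2 = x^4 + 2x^3 + 5x^2 + 4x + 4
```

Here `image_of_T` has constant term $T(\alpha)=0$ and $\xi$-coefficient $T'(\alpha)=1+2\alpha$, while `image_of_T2` is zero, as expected for a map that is well-defined modulo $T^2$.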
Reference [22] gives algorithms for both
untangling and tangling, the latter calling the former recursively;
the untangling algorithm runs in $O(M(d\mu)\log(\mu))$ operations
in $\mathbb{F}$, while the tangling algorithm takes $O(M(d\mu)\log(\mu)^2+M(d)\log(d))$ operations. Using transposition techniques
from [33], we prove the following.
Proposition 3.1.
Given $G$ in $\mathbb{K}[\xi]/\langle\xi^\mu\rangle$, one can compute
$\pi_{T,\mu}^{-1}(G)$ in $O(M(d\mu)\log(\mu)+M(d)\log(d))$ operations in $\mathbb{F}$.
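The $\mathbb{F}$-algebra $\mathbb{K}$ admits the basis
$(1,\dots,\alpha^{d-1})$; $\mathbb{F}[x]/\langle T^\mu\rangle$ has
basis $B=(1,x,\dots,x^{d\mu-1})$ and
$\mathbb{K}[\xi]/\langle\xi^\mu\rangle$ admits the bivariate basis
$C=(1,\dots,\alpha^{d-1},\ \xi,\dots,\alpha^{d-1}\xi,\ \dots,\ \xi^{\mu-1},\dots,\alpha^{d-1}\xi^{\mu-1})$.
As per 2.3, we represent a linear form
$L\in(\mathbb{F}[x]/\langle T^\mu\rangle)^*$ by the vector
$[L(x^i)\mid 0\le i<d\mu]\in\mathbb{F}^{d\mu}$,
and a linear form $\ell\in(\mathbb{K}[\xi]/\langle\xi^\mu\rangle)^*$
by the bidimensional vector
$[\ell(\alpha^i\xi^j)\mid 0\le i<d,\ 0\le j<\mu]\in\mathbb{F}^{d\times\mu}$.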
3.1 A faster tangling algorithm
This section shows that using the transpose of untangling allows us to
deduce an algorithm for tangling; see [33, 14] for a similar use of
transposition techniques. We start by describing useful subroutines.
3.1.1.
The first algorithmic result we will
need concerns the cost of inversion in $\mathbb{F}[x]/\langle T^\mu\rangle$. To compute $1/F \bmod T^\mu$ for some $F\in\mathbb{F}[x]$ of
degree less than $d\mu$, we may start by computing $\bar G:=1/\bar F \bmod T$, with $\bar F:=F \bmod T$; this costs $O(M(d\mu)+M(d)\log(d))$ operations in $\mathbb{F}$. Then we lift $\bar G$ to $G:=1/F \bmod T^\mu$ by Newton iteration modulo the powers of $T$, at the cost
of another $O(M(d\mu))$.
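The lifting step can be sketched as follows in Python, using the classical Newton update $G\leftarrow G(2-FG)$, which doubles the power of $T$ modulo which $FG\equiv 1$ holds. The sketch uses schoolbook arithmetic over $\mathbb{Q}$ and recomputes $T^k$ naively, so it illustrates only the iteration, not the stated cost; the function names and the sample input ($F=x$, $T=x^2+x+2$, $\mu=3$) are assumptions.

```python
from fractions import Fraction

def mul(a, b):
    """Schoolbook product of two coefficient lists (low degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def rem(a, m):
    """Remainder of a modulo the monic polynomial m, padded to length deg(m)."""
    a = [Fraction(c) for c in a]
    n = len(m) - 1
    for i in range(len(a) - 1, n - 1, -1):
        c = a[i]
        for j in range(n + 1):
            a[i - n + j] -= c * m[j]
    a = a[:n]
    return a + [Fraction(0)] * (n - len(a))

def power(T, k):
    """T^k by repeated multiplication (naive, for illustration only)."""
    out = [Fraction(1)]
    for _ in range(k):
        out = mul(out, T)
    return out

def inverse_mod_power(f, g0, T, mu):
    """Lift g0 = 1/f mod T to 1/f mod T^mu with the update g <- g * (2 - f*g)."""
    g, k = list(g0), 1
    while k < mu:
        k = min(2 * k, mu)                 # precision doubles, capped at mu
        m = power(T, k)
        fg = rem(mul(f, g), m)
        two_minus_fg = [-c for c in fg]
        two_minus_fg[0] += 2
        g = rem(mul(g, two_minus_fg), m)
    return g

T = [2, 1, 1]                              # T = x^2 + x + 2
f = [0, 1]                                 # f = x
g0 = [Fraction(-1, 2), Fraction(-1, 2)]    # 1/x mod T, since x*(x + 1) = -2 mod T
g = inverse_mod_power(f, g0, T, 3)
```

One can confirm the result by checking that `rem(mul(f, g), power(T, 3))` is the constant polynomial 1.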
3.1.2.
Next, we discuss the solution of certain Hankel systems.
Consider $L$ and $L'$, two $\mathbb{F}$-linear forms $\mathbb{F}[x]/\langle T^\mu\rangle\to\mathbb{F}$; our goal is to find $F$ in $\mathbb{F}[x]/\langle T^\mu\rangle$ such that $F\cdot L=L'$, under the assumption that $L$ generates
the dual space $(\mathbb{F}[x]/\langle T^\mu\rangle)^*$.
In matrix terms, this is equivalent to finding
coefficients $f_0,\dots,f_{d\mu-1}$ of $F$ such that $H\,[f_0,\dots,f_{d\mu-1}]^{T}=B$ with $H_{i,j}=L(x^{i+j})$ and $B_i=L'(x^i)$, for $0\le i,j<d\mu$.
The system can be solved in $O(M(d\mu)\log(d\mu))$
operations in $\mathbb{F}$ [8], but we will derive an improvement
from the fact that $T^\mu$ is a $\mu$th power.
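For illustration, the Hankel formulation can be written out on a small instance. The Python sketch below builds $H_{i,j}=L(x^{i+j})$ for $P=T^2$ with $T=x^2+x+2$, taking for $L$ the standard generator ($L(x^3)=1$, zero on lower monomials), and recovers a chosen $F$ from $L'=F\cdot L$ by exact Gaussian elimination. Dense elimination is used only for clarity and is an assumption of this sketch; it is neither the solver of [8] nor the faster method derived in this section, and the helper names are hypothetical.

```python
from fractions import Fraction

def extend(values, P, t):
    """First t values L(x^k), using x^n = -(p_0 + ... + p_(n-1) x^(n-1)) modulo monic P."""
    n = len(P) - 1
    v = [Fraction(x) for x in values]
    for k in range(n, t):
        v.append(-sum(P[j] * v[k - n + j] for j in range(n)))
    return v

def hankel(values, P):
    """Matrix H with H[i][j] = L(x^(i+j)) for 0 <= i, j < n."""
    n = len(P) - 1
    v = extend(values, P, 2 * n - 1)
    return [[v[i + j] for j in range(n)] for i in range(n)]

def solve(H, B):
    """Exact Gauss-Jordan elimination; assumes H is invertible."""
    n = len(B)
    M = [row[:] + [b] for row, b in zip(H, B)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                M[r] = [vr - M[r][c] * vc for vr, vc in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

P = [4, 4, 5, 2, 1]        # P = (x^2 + x + 2)^2
L = [0, 0, 0, 1]           # generator: L(x^3) = 1, zero on 1, x, x^2
F = [1, 2, 0, 1]           # hypothetical unknown, F = 1 + 2x + x^3
H = hankel(L, P)
L_prime = [sum(H[i][j] * F[j] for j in range(4)) for i in range(4)]   # values of F.L
recovered = solve(H, L_prime)
```

For this choice of $L$ the matrix $H$ is anti-triangular with ones on the anti-diagonal, which makes its invertibility visible directly.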
An algorithm that realizes the transposed product $(L,F)\mapsto L'$ is in [6, Lemma 2.5]: let $\zeta:\mathbb{F}^{d\mu}\to\mathbb{F}^{d\mu}$ be the upper triangular Hankel operator with first column
the coefficients of degree $1,\dots,d\mu$ of $T^\mu$, and let
$\Lambda$ and $\Lambda'$ be the two polynomials in $\mathbb{F}[x]$ with respective
coefficients $\zeta(L)$ and $\zeta(L')$. Then $\Lambda'=F\Lambda \bmod T^\mu$.
Given the values of $L$ and $L'$ at $1,\dots,x^{d\mu-1}$, we
compute $\zeta(L)$ and $\zeta(L')$ in $O(M(d\mu))$ operations. Since
$L$ generates $(\mathbb{F}[x]/\langle T^\mu\rangle)^*$, $\Lambda$ is invertible
modulo $T^\mu$; then, using 3.1.1, we compute its
inverse in $O(M(d\mu)+M(d)\log(d))$ operations. Multiplication by
$\Lambda'$ takes another $O(M(d\mu))$ operations, for a total of
$O(M(d\mu)+M(d)\log(d))$.
3.1.3.
We now recall van der Hoeven and Lecerf’s algorithm for the mapping
πT,μ, and deduce an algorithm for its transpose, with the same
asymptotic runtime. Van der Hoeven and Lecerf’s algorithm is
recursive, with a divide-and-conquer structure; the key idea is that
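the coefficients of $\pi_{T,\mu}(F)$, for $F$ in $\mathbb{F}[x]/\langle T^\mu\rangle$,
are the values of $F,F',\dots,F^{(\mu-1)}$ at $\alpha$, divided
respectively by $0!,1!,\dots,(\mu-1)!$.
The runtime $T(d,\mu)$ of $\pi_{\mathrm{rec}}$ satisfies $T(d,\mu)\le 2\,T(d,\mu/2)+O(M(d\mu))$, since each call makes two recursive calls on inputs of half the size; this results in an algorithm for $\pi_{T,\mu}$
that takes $O(M(d\mu)\log(\mu))$ operations. Since $\pi_{T,\mu}$ is an
$\mathbb{F}$-linear mapping $\mathbb{F}[x]/\langle T^\mu\rangle\to\mathbb{K}[\xi]/\langle\xi^\mu\rangle$, its transpose $\pi_{T,\mu}^{\perp}$ is an $\mathbb{F}$-linear mapping
$(\mathbb{K}[\xi]/\langle\xi^\mu\rangle)^*\to(\mathbb{F}[x]/\langle T^\mu\rangle)^*$. The transposition principle implies that $\pi_{T,\mu}^{\perp}$ can
be computed in $O(M(d\mu)\log(\mu))$ operations; we make the
corresponding algorithm explicit as follows.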
We transpose all steps of the algorithm above, in reverse order. As
input we take ℓ∈K[ξ]/⟨ξμ⟩∗, which we
see as a bidimensional vector in Fd×μ; we also
write ℓ=[ℓi∣0≤i<μ], with all ℓi in Fd.
The transpose of the concatenation at the last step allows one
to apply the two recursive calls to the first and second halves of
input ℓ. Each of them is followed by an application of the
transpose of Euclidean division (see below), and after
“transpose differentiating” the second intermediate result (see
below), we return their sum.
Correctness follows from the correctness of van der Hoeven and
Lecerf’s algorithm. Following [7], given a polynomial $S\in\mathbb{F}[x]$, a vector $u$ of length $\deg(S)$ and
an integer $t\ge\deg(S)$,
$\mathrm{mod}^\perp(u,S,t)$ returns the first $t$ terms of the sequence
defined by initial conditions $u$ and minimal polynomial $S$ in time $O(M(t))$.
Given a vector $u$ of
length $t-\lambda$, $v:=\mathrm{diff}^\perp(u,\lambda)$ is the vector of
length $t$ given by $v_0=\cdots=v_{\lambda-1}=0$ and
$v_i=i(i-1)\cdots(i-\lambda+1)\,u_{i-\lambda}$ for $i=\lambda,\dots,t-1$.
It can be computed in linear time $O(t)$.
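Both transposed subroutines can be illustrated, and their defining adjointness property checked numerically, with the Python sketch below. It assumes a monic $S$ and uses naive loops: `mod_transpose` runs in time $O(t\deg S)$ rather than $O(M(t))$. The function names are hypothetical; the check is that $\langle f \bmod S, u\rangle=\langle f,\mathrm{mod}^\perp(u,S,t)\rangle$ and $\langle f^{(\lambda)},u\rangle=\langle f,\mathrm{diff}^\perp(u,\lambda)\rangle$ for $f$ of degree less than $t$.

```python
from fractions import Fraction

def rem(f, S):
    """f mod S for monic S (coefficient lists, low degree first), padded to length deg(S)."""
    f = [Fraction(c) for c in f]
    n = len(S) - 1
    for i in range(len(f) - 1, n - 1, -1):
        c = f[i]
        for j in range(n + 1):
            f[i - n + j] -= c * S[j]
    f = f[:n]
    return f + [Fraction(0)] * (n - len(f))

def mod_transpose(u, S, t):
    """First t terms of the sequence with initial conditions u and characteristic polynomial S."""
    n = len(S) - 1
    v = [Fraction(x) for x in u]
    for k in range(n, t):
        v.append(-sum(S[j] * v[k - n + j] for j in range(n)))
    return v

def falling(i, lam):
    """Falling factorial i * (i - 1) * ... * (i - lam + 1)."""
    out = 1
    for k in range(lam):
        out *= i - k
    return out

def diff(f, lam):
    """Coefficients of the lam-th derivative of f."""
    return [falling(k + lam, lam) * f[k + lam] for k in range(len(f) - lam)]

def diff_transpose(u, lam):
    """v_i = 0 for i < lam, and v_i = i (i-1) ... (i-lam+1) u_(i-lam) otherwise."""
    return [0] * lam + [falling(i + lam, lam) * ui for i, ui in enumerate(u)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

S = [2, 1, 1]                  # S = x^2 + x + 2
f = [1, 2, 3, 4, 5]            # sample polynomial of degree < t = 5
seq = mod_transpose([1, 0], S, 5)
dvec = diff_transpose([1, 1, 1], 2)
```

With these inputs, `seq` is `[1, 0, -2, 2, 2]` and `dvec` is `[0, 0, 2, 6, 12]`.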
Overall, as in [22], the runtime is O(M(dμ)log(μ)).
3.1.4.
We can now give our algorithm for the tangling operator πT,μ−1;
it is inspired by a similar result due to
Shoup [33].
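Take $G$ in $\mathbb{K}[\xi]/\langle\xi^\mu\rangle$: we want to find $F\in\mathbb{F}[x]/\langle T^\mu\rangle$ such that $\pi_{T,\mu}(F)=G$. Let $\ell:\mathbb{K}[\xi]/\langle\xi^\mu\rangle\to\mathbb{F}$ be defined by
$\ell(\alpha^{d-1}\xi^{\mu-1})=1$ and $\ell(\alpha^i\xi^j)=0$ for all
other values of $i<d$, $j<\mu$; as pointed out in 2.3, this
is a generator of $(\mathbb{K}[\xi]/\langle\xi^\mu\rangle)^*$. Define further
$\ell':=G\cdot\ell$. Then $\ell'$ is a transposed product as
in 2.3, and we saw that it can be computed in
$O(M(d\mu))$ operations. This implies $\pi_{T,\mu}(F)\cdot\ell=\ell'$.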
Let now $L:=\pi_{T,\mu}^\perp(\ell)$ and $L':=\pi_{T,\mu}^\perp(\ell')$; we obtain them by
applying our transpose untangling algorithm to $\ell$, resp.
$\ell'$, in time $O(M(d\mu)\log(\mu)+M(d)\log(d))$. Since $\ell$
is a generator of $(\mathbb{K}[\xi]/\langle\xi^\mu\rangle)^*$, $L$ is a
generator of $(\mathbb{F}[x]/\langle T^\mu\rangle)^*$. The equation
$\pi_{T,\mu}(F)\cdot\ell=\ell'$ then implies that $F\cdot L=L'$, which
is an instance of the problem discussed in 3.1.2;
applying the algorithm there takes another $O(M(d\mu)+M(d)\log(d))$. Summing all costs, this gives an algorithm for
$\pi_{T,\mu}^{-1}$ with cost $O(M(d\mu)\log(\mu)+M(d)\log(d))$,
proving Proposition 3.1.
3.2 Applications
3.2.1.
For $P$ in $\mathbb{F}[x]$ one can compute $x^D \bmod P$ using $O(\log(D))$ multiplications modulo $P$ by repeated
squaring.
Applications include Fiduccia’s algorithm for the computation of terms
in linearly recurrent sequences [16] or
of high powers of matrices
[31, 19]. This algorithm takes
O(M(n)log(D)) operations in F, with n:=deg(P). We
assume without loss of generality that D≥n.
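As a baseline for the comparison that follows, repeated squaring can be sketched in a few lines of Python. The sketch assumes a monic $P$ with integer coefficients and uses schoolbook multiplication, so each product costs $O(n^2)$ rather than $O(M(n))$; the function names are hypothetical. The Fibonacci recurrence, with characteristic polynomial $x^2-x-1$, gives a convenient check, since $x^D \bmod (x^2-x-1)$ has coefficients $(F_{D-1},F_D)$.

```python
def mulmod(a, b, P):
    """a*b mod P for monic P (coefficient lists, low degree first), padded to length deg(P)."""
    n = len(P) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj
    for i in range(len(prod) - 1, n - 1, -1):
        c = prod[i]
        for j in range(n + 1):
            prod[i - n + j] -= c * P[j]
    prod = prod[:n]
    return prod + [0] * (n - len(prod))

def x_power_mod(D, P):
    """x^D mod P by binary exponentiation: O(log D) multiplications modulo P."""
    n = len(P) - 1
    result = [1] + [0] * (n - 1)
    base = mulmod([0, 1], [1], P)          # x mod P
    while D > 0:
        if D & 1:
            result = mulmod(result, base, P)
        base = mulmod(base, base, P)
        D >>= 1
    return result

fib = x_power_mod(10, [-1, -1, 1])         # P = x^2 - x - 1
double_root = x_power_mod(7, [1, -2, 1])   # P = (x - 1)^2, not squarefree
```

The second call uses a non-squarefree modulus: modulo $(x-1)^2$ one has $x^D\equiv 1+D(x-1)$, which is the kind of structure the rest of this subsection exploits.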
We can do better, in cases where $P$ is not
squarefree. For computations of terms in recurrent sequences, such
$P$’s appear when computing terms of bivariate recurrent
sequences $(a_{i,j})$ defined by $\sum_{i,j}a_{i,j}x^iy^j=N(x,y)/Q(x,y)$, for some polynomials $N,Q\in\mathbb{F}[x,y]$ with $Q(0,0)\ne 0$. Then, the $j$-th row $\sum_i a_{i,j}x^i$ has
characteristic polynomial $P^j$, where $P$ is the reverse polynomial
of $Q(x,0)$ [5].
First, assume that $P=T^\mu$ with $T$ separable of degree $d$. Then
we compute $x^D \bmod P$ by tangling $r:=(\xi+\alpha)^D$. The quantity $r=\sum_{i=0}^{\mu-1}\binom{D}{i}\xi^i\alpha^{D-i}$ can be computed in time
$O(M(d)(\log(D)+\mu))$, by computing
$\alpha^{D-\mu+1},\alpha^{D-\mu+2},\dots,\alpha^D$ and multiplying
them by the binomial coefficients (which themselves are obtained by
using the recurrence they satisfy).
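The idea is easiest to see in the degenerate case $d=1$, which the following Python sketch assumes: $T=x-a$ with $a\in\mathbb{Q}$, so that $\mathbb{K}=\mathbb{Q}$, $\alpha=a$, and tangling reduces to substituting $\xi=x-a$. The sketch computes the coefficients $\binom{D}{i}a^{D-i}$ with the recurrence $\binom{D}{i+1}=\binom{D}{i}(D-i)/(i+1)$ and then expands $\sum_i r_i(x-a)^i$ by Horner's rule. It does not cover general $d$, where the tangling algorithm of Proposition 3.1 would be needed, and the function name is hypothetical.

```python
from fractions import Fraction

def x_power_mod_linear_power(D, a, mu):
    """x^D mod (x - a)^mu, for D >= mu - 1, via the truncated Taylor expansion at a."""
    a = Fraction(a)
    # r_i = binom(D, i) * a^(D - i): coefficients of (xi + a)^D modulo xi^mu.
    binom, r = 1, []
    for i in range(mu):
        r.append(binom * a ** (D - i))
        binom = binom * (D - i) // (i + 1)     # exact: this is binom(D, i + 1)
    # Tangling for d = 1: expand sum_i r_i (x - a)^i by Horner's rule.
    out = [Fraction(0)] * mu
    for c in reversed(r):
        shifted = [Fraction(0)] + out[:-1]     # multiply by x (top coefficient is still 0)
        out = [s - a * o for s, o in zip(shifted, out)]
        out[0] += c
    return out

small = x_power_mod_linear_power(7, 1, 2)      # x^7 mod (x - 1)^2
cubic = x_power_mod_linear_power(5, 2, 3)      # x^5 mod (x - 2)^3
```

Here `small` equals $-6+7x$ and `cubic` equals $192-240x+80x^2$, which matches $32+80(x-2)+80(x-2)^2$.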
By Proposition 3.1, the cost of tangling is $O(M(d\mu)\log(\mu)+M(d)\log(d))$, which brings the total to
$O(M(d)\log(D)+M(d\mu)\log(\mu))$, since $d\le D$. To
compute $x^D$ modulo an arbitrary $P$, one may compute the squarefree
decomposition of $P$, apply the previous algorithm modulo each factor
and obtain the result by applying the Chinese Remainder Theorem. The
overall runtime becomes $O(M(m)\log(D)+M(n)\log(n))$, where $n$
and $m$ are the degrees of $P$ and its squarefree part, respectively;
this is to be compared with the cost $O(M(n)\log(D))$ of repeated
squaring.
While this algorithm improves over the
direct approach, practical gains show up only for astronomical values of
the parameters.
3.2.2.
Assume F=Q. In [26], Lebreton, Mehrabi and Schost gave
an algorithm to compute the intersection of surfaces in 3d-space, that
is, to solve polynomial systems of the form F(x1,x2,x3)=G(x1,x2,x3)=0. Assuming that the ideal K:=⟨F,G⟩⊂Q(x1)[x2,x3] is radical and that we are in generic
coordinates, the output is polynomials S,T,U in Q[x1,x2] such
that K is equal to ⟨S,Ux3−T⟩ (so S describes
the projection of the common zeros of F and G on the
x1,x2-plane, and T and U allow us to recover x3). The
algorithm of [26] is Monte Carlo, with runtime $O(D^{4.7})$
where $D$ is an upper bound on $\deg(F)$ and $\deg(G)$. The output
has $\Theta(D^4)$ terms in the worst case, and the result
in [26] is the best to date.
The case of non-radical systems was discussed in [29]. It was
pointed out in the introduction of that paper that quasi-linear time
algorithms for untangling and tangling (which were not explicitly
called by these names) would make it possible to extend the results
of [26] to general systems. Hence, already with the results
by van der Hoeven and Lecerf a runtime $O(D^{4.7})$ was made possible
for the problem of surface intersection, without a radicality assumption.
4 The bivariate case
We now generalize the previous questions to the bivariate
setting. We expect several of these ideas to carry over to higher
numbers of variables, but some adaptations may be non-trivial (for
instance, we rely on Lazard’s structure theorem on lexicographic
bivariate Gröbner bases). As an application, we give results on the
complexity of arithmetic modulo certain primary ideals.
4.1 Setup
4.1.1.
For the rest of the paper, the degree deg(I) of a
zero-dimensional ideal I in F[x1,x2] is defined as the
dimension of F[x1,x2]/I as a vector space (the same definition
will hold for polynomials over any field).
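Let $\mathfrak{m}$ be a maximal ideal of degree $d$ in $\mathbb{F}[x_1,x_2]$; we
consider two new variables $y_1,y_2$, we let $\gamma:\mathbb{F}[x_1,x_2]\to\mathbb{F}[y_1,y_2]$ be the $\mathbb{F}$-algebra isomorphism mapping $(x_1,x_2)$ to
$(y_1,y_2)$ and let $\tilde{\mathfrak{m}}:=\gamma(\mathfrak{m})$. This is a maximal
ideal as well, and $\mathbb{K}:=\mathbb{F}[y_1,y_2]/\tilde{\mathfrak{m}}$ is a field
extension of degree $d$ of $\mathbb{F}$. We then let $\alpha_1,\alpha_2$ be the
respective residue classes of $y_1,y_2$ in $\mathbb{K}$.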
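Next, let $J\subset\mathbb{K}[\xi_1,\xi_2]$, for two new variables
$\xi_1,\xi_2$, be a zero-dimensional primary ideal at
$\alpha:=(\alpha_1,\alpha_2)$. Finally, let $I:=\Phi^{-1}(J)$, where
$\Phi$ is the natural embedding $\mathbb{F}[x_1,x_2]\to\mathbb{K}[\xi_1,\xi_2]$
given by $(x_1,x_2)\mapsto(\xi_1,\xi_2)$. One easily checks that $I$
is $\mathfrak{m}$-primary (that is, $\mathfrak{m}$ is the radical of $I$), and
that $J$ is the primary component at $\alpha$ of the ideal $I\cdot\mathbb{K}[\xi_1,\xi_2]$ generated by $\Phi(I)$. Note that since $\mathbb{F}$ is
perfect, $\mathbb{F}\to\mathbb{K}$ is separable, so
over an algebraic closure $\overline{\mathbb{F}}$ of $\mathbb{F}$, $\mathfrak{m}$
has $d$ distinct solutions.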
We make the following assumption:
H2.
F has characteristic at least n, with n:=deg(I).
Finally, we let J′⊂K[ξ1,ξ2] be the ideal obtained by
applying the translation (ξ1,ξ2)↦(ξ1+α1,ξ2+α2) to J; it is primary at (0,0).
4.1.2.
Although our construction starts from the datum of m
and J⊂K[ξ1,ξ2] and defines I from them, we may also
take as starting points m and an m-primary ideal I⊂F[x1,x2] (this is what we did for the example in the
introduction).
Under that point of view, consider the ideal I⋅K[ξ1,ξ2]
generated by Φ(I), for Φ:F[x1,x2]→K[ξ1,ξ2] as
above, and let $J$ be the primary component of $I\cdot\mathbb{K}[\xi_1,\xi_2]$ at $\alpha$. One verifies that $I$ is equal to
$\Phi^{-1}(J)$, so we are indeed in the same situation as
in 4.1.1.
4.1.3.
For the rest of the paper, we use the lexicographic monomial
ordering in F[x1,x2] induced by x1<x2, and its analogue in
K[ξ1,ξ2]; “the” Gröbner basis of an ideal is its minimal
reduced Gröbner basis for this order. Our first goal in this section is then to
describe the relation between the Gröbner bases of I and J:
viz., they have the same number of polynomials, and their leading
terms are related in a simple fashion (as seen on the example above).
Let T be the Gröbner basis of m. Since m is maximal,
T consists of two polynomials (T1,T2), with T1 of degree d1
in F[x1] and T2 in F[x1,x2], monic of degree d2 in
x2. Note that d1d2=d=deg(m).
Next, let $H=(H_1,\dots,H_t)$ be the Gröbner basis of $J$, with
$H_1<\cdots<H_t$; we let
$\xi_1^{\mu_1}\xi_2^{\nu_1},\dots,\xi_1^{\mu_t}\xi_2^{\nu_t}$ be the
respective leading terms of $H_1,\dots,H_t$. Thus, the $\mu_i$’s are
decreasing, the $\nu_i$’s are increasing, and $\nu_1=\mu_t=0$.
Finally, we let $\mu:=\deg(J)=\deg(J')$. Remark that the Gröbner
basis of $J'$ admits the same leading terms as $H$.
In our example, we have t=3, (μ1,ν1)=(2,0),
(μ2,ν2)=(1,1) and (μ3,ν3)=(0,2). The integers d1,d2
are respectively 2 and 1, so d=2, the degree n is 6 and the
multiplicity μ is 3. The key result in this subsection is
the following.
Proposition 4.1.
The Gröbner basis of $I$ has the form $(R_1,\dots,R_t)$, where for
$j=1,\dots,t$, $R_j=T_1^{\mu_j}\tilde R_j$, for some polynomial
$\tilde R_j\in\mathbb{F}[x_1,x_2]$ monic of degree $d_2\nu_j$ in $x_2$.
In particular, $n=d\mu$.
As a result, for all $j$ the leading term of $R_j$ is $x_1^{d_1\mu_j}x_2^{d_2\nu_j}$, whereas that of $H_j$ is $\xi_1^{\mu_j}\xi_2^{\nu_j}$, as in our example. The next two
sub-sections are devoted to the proof of this proposition.
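The identity $n=d\mu$ can be checked on the running example by counting the monomials that lie under each staircase. The Python sketch below assumes that the leading terms include a pure power of each variable, which holds for zero-dimensional ideals; the function name is hypothetical.

```python
def count_standard_monomials(leading_exponents):
    """Number of monomials x1^a x2^b divisible by none of the given leading monomials."""
    max_a = max(e[0] for e in leading_exponents)   # exponent of the pure power of x1
    max_b = max(e[1] for e in leading_exponents)   # exponent of the pure power of x2
    return sum(
        1
        for a in range(max_a)
        for b in range(max_b)
        if not any(a >= m and b >= n for m, n in leading_exponents)
    )

d1, d2 = 2, 1                                      # degrees of T1 and T2 in the example
H_leading = [(2, 0), (1, 1), (0, 2)]               # (mu_j, nu_j) for J
R_leading = [(d1 * m, d2 * n) for m, n in H_leading]

mu = count_standard_monomials(H_leading)           # multiplicity deg(J)
n = count_standard_monomials(R_leading)            # degree deg(I)
```

For the example this gives `mu == 3` and `n == 6`, in agreement with $n=d\mu$ for $d=d_1d_2=2$.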
4.1.4.
We define here a family
of polynomials G1,…,Gt, and prove that they form a
(non-reduced) Gröbner basis of I in 4.1.5.
Because the extension F→K is separable, it admits a primitive
element β, with minimal polynomial F∈F[t]; this
polynomial has degree [K:F]=d. Let L be a splitting field
for F containing K and let I⋅L[ξ1,ξ2] and K be
the extensions of I⋅K[ξ1,ξ2] and J in
L[ξ1,ξ2], respectively. Then deg(J)=deg(K), and K is the
primary component of I⋅L[ξ1,ξ2] at α.
Let β1=β,β2,…,βd be the roots of F in L.
For all i=1,…,d, we let σi be an element in the Galois
group of L/F such that βi=σi(β), as well as
α(i):=(σi(α1),σi(α2)). Note that
these elements are pairwise distinct: since β is in
F[α1,α2] and all σi’s fix F,
α(i)=α(j) implies βi=βj, and thus
i=j. Therefore, α(1),…,α(d) can be seen as all the
roots of m, with α(1)=α.
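For $i=1,\dots,d$, let $K_i$ be the primary component of $I\cdot\mathbb{L}[\xi_1,\xi_2]$ at $\alpha^{(i)}$, so that $K_1=K$. By
construction, these ideals are pairwise coprime, and their product is
$I\cdot\mathbb{L}[\xi_1,\xi_2]$. Take $i$ in $1,\dots,d$, and let $D$ be a
large enough integer such that $K=I\cdot\mathbb{L}[\xi_1,\xi_2]+\mathfrak{n}^D$ and $K_i=I\cdot\mathbb{L}[\xi_1,\xi_2]+\mathfrak{n}_i^D$, with
$\mathfrak{n}$ and $\mathfrak{n}_i$ the maximal ideals at $\alpha$ and
$\alpha^{(i)}$ respectively. Since $I\cdot\mathbb{L}[\xi_1,\xi_2]$ is defined over $\mathbb{F}$,
$\sigma_i$ thus maps the generators of $K$ to those of $K_i$. This
implies that the Gröbner basis of $K_i$ is
$(H_{i,1},\dots,H_{i,t})$, with $H_{i,j}:=\sigma_i(H_j)$ for all
$j\le t$.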
By definition of the integers d1,d2, we can partition the roots
{α(1),…,α(d)} of m according to their
first coordinate, into d1 classes C1,…,Cd1 of cardinality d2
each: for κ≤d1, all α(i) in Cκ have the
same first coordinate, say ζκ, and the ζκ’s
are pairwise distinct. Remark that ζ1,…,ζd1 are the
roots of T1.
Fix $\kappa\le d_1$ and take $i$ such that $\alpha^{(i)}$ is in
$C_\kappa$. Because $K_i$ is primary at $\alpha^{(i)}$, Lazard’s structure
theorem on bivariate lexicographic Gröbner bases [24]
implies that for $j=1,\dots,t$, $H_{i,j}=(\xi_1-\zeta_\kappa)^{\mu_j}\tilde H_{i,j}$, for some polynomial
$\tilde H_{i,j}\in\mathbb{L}[\xi_1,\xi_2]$, monic of degree $\nu_j$ in
$\xi_2$, and of degree less than $\mu_1-\mu_j$ in $\xi_1$.
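For $1\le\kappa\le d_1$ and $1\le j\le t$, let us then define
$\tilde G_{\kappa,j}:=\prod_i\tilde H_{i,j}$, where the product is
taken over all $i$ such that $\alpha^{(i)}\in C_\kappa$. This is a
polynomial in $\mathbb{L}[\xi_1,\xi_2]$, with leading term $\xi_2^{d_2\nu_j}$. Finally, let $\tilde G_1:=1$, and for $2\le j\le t$ let
$\tilde G_j$ be the unique polynomial in $\mathbb{L}[\xi_1,\xi_2]$ of degree
less than $d_1(\mu_1-\mu_j)$ in $\xi_1$ such that $\tilde G_j \bmod (\xi_1-\zeta_\kappa)^{\mu_1-\mu_j}=\tilde G_{\kappa,j}$ holds for all
$\kappa\le d_1$. We claim that $(G_1,\dots,G_t)$, with $G_j:=T_1^{\mu_j}\tilde G_j$ for all $j$, is a Gröbner basis of $I\cdot\mathbb{L}[\xi_1,\xi_2]$, minimal but not necessarily reduced.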
4.1.5.
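To establish this claim, we first prove that $I\cdot\mathbb{L}[\xi_1,\xi_2]=\langle G_1,\dots,G_t\rangle$ in
$\mathbb{L}[\xi_1,\xi_2]$. The first step is to determine the common zeros of
$G_1,\dots,G_t$. Since $G_1=T_1^{\mu_1}$, the $\xi_1$-coordinates
of the solutions are the roots $\{\zeta_1,\dots,\zeta_{d_1}\}$ of
$T_1$. Fix $\kappa\le d_1$, and let $(\zeta_\kappa,\eta)$ be a root
of $G_1,\dots,G_t$. In particular, $G_t(\zeta_\kappa,\eta)=\tilde G_t(\zeta_\kappa,\eta)=0$. This implies that $\tilde G_{\kappa,t}(\zeta_\kappa,\eta)=0$, so there exists $i\le d$ such
that $(\zeta_\kappa,\eta)=\alpha^{(i)}$. Conversely, any
$\alpha^{(i)}$ cancels $G_1,\dots,G_t$, so that the zero-sets of
$G_1,\dots,G_t$ and $I\cdot\mathbb{L}[\xi_1,\xi_2]$ are equal. Next, we
determine the primary component $Q_i$ of $\langle G_1,\dots,G_t\rangle$ at a given $\alpha^{(i)}$.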
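Take such an index $i$, and assume that $\alpha^{(i)}$ is in
$C_\kappa$, for some $\kappa\le d_1$ (so the first coordinate of
$\alpha^{(i)}$ is $\zeta_\kappa$). Take $D$ large enough, so that
$D\ge\mu_1$ and $(\xi_1-\zeta_\kappa)^D$ belongs to $Q_i$; hence
$Q_i$ is also the primary component of the ideal $\langle G_1,\dots,G_t,(\xi_1-\zeta_\kappa)^D\rangle$ at $\alpha^{(i)}$. This
ideal is generated by the polynomials $(\xi_1-\zeta_\kappa)^{\mu_1}$
and $(\xi_1-\zeta_\kappa)^{\mu_j}\tilde G_j$, for $2\le j\le t$.
For such $j$, since $\tilde G_j \bmod (\xi_1-\zeta_\kappa)^{\mu_1-\mu_j}=\tilde G_{\kappa,j}$, we get
that $(\xi_1-\zeta_\kappa)^{\mu_j}\tilde G_j \bmod (\xi_1-\zeta_\kappa)^{\mu_1}=(\xi_1-\zeta_\kappa)^{\mu_j}\tilde G_{\kappa,j}$. As a result, the ideal above also admits the generators
$(\xi_1-\zeta_\kappa)^{\mu_1},(\xi_1-\zeta_\kappa)^{\mu_2}\tilde G_{\kappa,2},\dots,\tilde G_{\kappa,t}$. Now, recall that $\tilde G_{\kappa,j}=\prod_\iota\tilde H_{\iota,j}$, where the product is
taken over all $\iota$ such that $\alpha^{(\iota)}$ is in
$C_\kappa$. For $\iota\ne i$, $\tilde H_{\iota,j}$ does not vanish at
$\alpha^{(i)}$ [24, Th. 2.(i)], so it is invertible locally
at $\alpha^{(i)}$. It follows that the primary component of $\langle G_1,\dots,G_t\rangle$ at
$\alpha^{(i)}$ is generated by
$(\xi_1-\zeta_\kappa)^{\mu_1},(\xi_1-\zeta_\kappa)^{\mu_2}\tilde H_{i,2},\dots,\tilde H_{i,t}$, that is, $H_{i,1},\dots,H_{i,t}$. This
is precisely the ideal $K_i$.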
To summarize, $\langle G_1,\dots,G_t\rangle$ and $I\cdot\mathbb{L}[\xi_1,\xi_2]$ have the same primary components $K_1,\dots,K_d$, so
these ideals coincide. It remains to prove that $(G_1,\dots,G_t)$ is a
Gröbner basis of $I\cdot\mathbb{L}[\xi_1,\xi_2]$. The shape of the leading
terms of $G_1,\dots,G_t$ implies that the number of monomials reduced with
respect to these polynomials is $d\deg(J)=d\mu$. Now, since all its
primary components $K_i$ have degree $\mu=\deg(J)$, the ideal $I\cdot\mathbb{L}[\xi_1,\xi_2]=\langle G_1,\dots,G_t\rangle$ has degree $d\mu$ as
well. As a result, $G_1,\dots,G_t$ form a Gröbner basis (since
otherwise, applying the Buchberger algorithm to them would yield fewer
reduced monomials, a contradiction).
The polynomials G1,…,Gt are a Gröbner basis, minimal,
as can be seen from their leading terms, but not reduced; we
let R1,…,Rt be the corresponding reduced minimal Gröbner
basis. For all j, T1μj divides Gj, and we obtain Rj
by reducing Gj by multiples of T1μj, so that each Rj
is a multiple of T1μj as well. In addition, the leading
terms of Gj and Rj are the same. Hence, our proposition is
proved.
4.1.6.
As a corollary, the following proposition and its proof
extend [22, Lemma 9] to bivariate contexts. We will still use
the names untangling and tangling for πm,J′ as
defined below and its inverse.
Proposition 4.2.
Assume m is a maximal ideal in F[x1,x2] and I is an
m-primary zero-dimensional ideal in F[x1,x2], with F
perfect of characteristic at least deg(I).
Let m~ be the
image of m through the isomorphism F[x1,x2]≃F[y1,y2], let α1,α2 be the residue classes of
y1,y2 in K:=F[y1,y2]/m~ and let J be the
primary component of I⋅K[ξ1,ξ2] at
(α1,α2). Finally, let J′ be the image of J through
(ξ1,ξ2)↦(ξ1+α1,ξ2+α2). Then,
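there exists an $\mathbb{F}$-algebra isomorphism
$$\pi_{\mathfrak{m},J'}:\ \mathbb{F}[x_1,x_2]/I\to\mathbb{K}[\xi_1,\xi_2]/J'$$
given by $(x_1,x_2)\mapsto(\xi_1+\alpha_1,\xi_2+\alpha_2)$.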
Proof.
We prove that the embedding Φ:F[x1,x2]→K[ξ1,ξ2]
given by (x1,x2)↦(ξ1,ξ2) induces an isomorphism of
F-algebras F[x1,x2]/I→K[ξ1,ξ2]/J. From
this, applying the change of variables (ξ1,ξ2)↦(ξ1+α1,ξ2+α2) gives the result.
Since $\Phi(I)$ is contained in $J$, the embedding $\Phi$ induces a
homomorphism $\phi:\mathbb{F}[x_1,x_2]/I\to\mathbb{K}[\xi_1,\xi_2]/J$. By the
previous proposition, both sides have dimension $d\mu$ over
$\mathbb{F}$, so it is enough to prove that $\phi$ is injective. But this
amounts to verifying that $\Phi^{-1}(J)=I$, which is true by definition.
∎
4.2 Untangling for monomial ideals
4.2.1.
In this section, we give an algorithm for the mapping
πm,J′ of Proposition 4.2 under a simplifying
assumption. To state it, recall that $J'$ is primary at $(0,0)\in\mathbb{K}^2$. Then, our assumption is
H3.
J′ is a monomial ideal.
In view of the shape of the leading terms given in 4.1.3
for the ideal $J$, we deduce that $J'=\langle\xi_1^{\mu_1},\xi_1^{\mu_2}\xi_2^{\nu_2},\dots,\xi_2^{\nu_t}\rangle$. In the rest of this subsection, $B$ is the
monomial basis of $\mathbb{F}[x_1,x_2]/I$ induced by the Gröbner basis
exhibited in Proposition 4.1 and $B'$ is the
monomial basis of $\mathbb{K}[\xi_1,\xi_2]/J'$. Then, the inputs of the
algorithms in this subsection are in $\mathrm{Span}_{\mathbb{F}}B:=\bigoplus_{b\in B}\mathbb{F}b$, and the outputs in $\mathrm{Span}_{\mathbb{K}}B':=\bigoplus_{b'\in B'}\mathbb{K}b'$. This
being said, our result is the following.
Proposition 4.3.
Under H2 and H3, given $F$ in
$\mathbb{F}[x_1,x_2]/I$ one can compute $\pi_{\mathfrak{m},J'}(F)$ using either $O(M(dn))$
or $O(M(\mu n)\log(\mu))$ operations in $\mathbb{F}$, and
in particular in $O(M(n^{1.5})\log(n))$ operations.
We prove the first two bounds in 4.2.2
and 4.2.3 respectively. The last statement readily follows, since
n=dμ (Proposition 4.1).
4.2.2.
We start with an efficient algorithm for those cases
where d=[K:F] is small. The idea is simple: as in the
univariate case, the untangling mapping πm,J′ can be rephrased in
terms of Taylor expansion. Explicitly, for F in F[x1,x2]/I,
πm,J′(F) is simply
[TABLE]
We compute F(ξ1+α1,ξ2+α2), proceeding
one variable at a time.
Step 1. Compute
F∗:=F(ξ1+α1,ξ2)∈K[ξ1,ξ2]. Because
2,…,n are units in F, given a univariate polynomial P of
degree t≤n in K[ξ1] one can compute P(ξ1+α1)
in O(M(t)) operations (+,×) in K (see [1]).
Using Kronecker substitution [17, Chapter 8.4], this
translates to O(M(dt)) operations in F (we will systematically
use such techniques, see e.g. Lemma 2.2 in [18] for details).
Computing F∗ is done by
applying this procedure coefficient-wise with respect to ξ2; in
particular, all ξ1-degrees involved are at most n, and add up
to n. The super-linearity of M implies that this takes a total
of O(M(dn)) operations in F.
Step 2. Compute
F∗(ξ1,ξ2+α2)=F(ξ1+α1,ξ2+α2). This is
done in the same manner, applying the translation with respect to
ξ2 instead; the runtime is still O(M(dn)) operations in F.
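Step 3. Since $F$ is in $\mathrm{Span}_{\mathbb{F}}B$, and $B$ is stable by division,
$F(\xi_1+\alpha_1,\xi_2+\alpha_2)$ is in $\mathrm{Span}_{\mathbb{K}}B:=\bigoplus_{b\in B}\mathbb{K}b$. By
Proposition 4.1, all monomials in $B'$ are in
$B$, so we can obtain $\pi_{\mathfrak{m},J'}(F)$ by discarding
from $F(\xi_1+\alpha_1,\xi_2+\alpha_2)$ all monomials not in
$B'$.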
Overall, the runtime is O(M(dn)) operations in F. For small d,
when the multiplicity μ is large, this is close to being
linear in n=deg(I).
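The three steps can be sketched in Python in the simplest situation $d=1$, which is an assumption of this example: $\mathfrak{m}=\langle x_1-a_1,x_2-a_2\rangle$ with rational $a_1,a_2$, so that $\mathbb{K}=\mathbb{Q}$ and no extension-field arithmetic is needed. Each translation is done by naive binomial expansion rather than by the fast Taylor shift of [1], and polynomials are stored as dictionaries mapping exponent pairs to coefficients; all names are hypothetical.

```python
from fractions import Fraction
from math import comb

def shift_var(F, var, a):
    """Substitute x_var -> x_var + a in F, a dict {(e1, e2): coefficient}."""
    out = {}
    for exps, c in F.items():
        e = exps[var]
        for k in range(e + 1):                     # (x + a)^e = sum_k binom(e, k) x^k a^(e-k)
            new = list(exps)
            new[var] = k
            key = tuple(new)
            out[key] = out.get(key, 0) + c * comb(e, k) * Fraction(a) ** (e - k)
    return out

def untangle_monomial(F, a1, a2, leading):
    """Steps 1-3: translate in x1, then in x2, then discard the monomials lying in J'."""
    G = shift_var(shift_var(F, 0, a1), 1, a2)
    return {m: c for m, c in G.items()
            if c != 0 and not any(m[0] >= u and m[1] >= v for u, v in leading)}

J_leading = [(2, 0), (1, 1), (0, 2)]               # J' = <xi1^2, xi1*xi2, xi2^2>
a1, a2 = 1, 2                                      # hypothetical rational point

image_x1x2 = untangle_monomial({(1, 1): 1}, a1, a2, J_leading)
# (x1 - a1) * (x2 - a2) lies in I, so its image must vanish.
in_ideal = {(1, 1): 1, (1, 0): -a2, (0, 1): -a1, (0, 0): a1 * a2}
image_zero = untangle_monomial(in_ideal, a1, a2, J_leading)
```

For $F=x_1x_2$ the result is $a_1a_2+a_2\xi_1+a_1\xi_2$, the product term $\xi_1\xi_2$ having been discarded, while the element of $I$ maps to the empty dictionary, that is, to zero.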
4.2.3.
Next we give another solution, which will perform well in
cases where the multiplicity $\mu=\deg(J')$ is small.
Again the idea is simple: given $F$ in $\mathrm{Span}_{\mathbb{F}}B$,
compute $F(\xi_1+\alpha_1,\xi_2+\alpha_2)\bmod\langle\xi_1^{\mu_1},\xi_2^{\nu_t}\rangle$, and again discard unwanted
terms (this is correct, since all coefficients of $\pi_{\mathfrak{m},J'}(F)$
are among those we compute). As in the previous paragraph, this is
done one variable at a time; in the following, recall that $\mathfrak{m}=\langle T_1(x_1),T_2(x_1,x_2)\rangle$, with $\deg(T_1,x_1)=d_1$ and
$\deg(T_2,x_2)=d_2$, so that $d_1d_2=d=\deg(\mathfrak{m})$. Also, we let
$\mathbb{K}'$ be the subfield $\mathbb{F}[y_1]/\langle T_1(y_1)\rangle$ of $\mathbb{K}$, so
that $\mathbb{K}=\mathbb{K}'[y_2]/\langle T_2(\alpha_1,y_2)\rangle$; we have
$[\mathbb{K}':\mathbb{F}]=d_1$ and $[\mathbb{K}:\mathbb{K}']=d_2$.
Step 1. By Proposition 4.1, we
can write $F=\sum_{0\le i<d_2\nu_t}F_i(x_1)x_2^i$, with all
$F_i$’s of degree at most $d_1\mu_1$. Compute all
$F_i^*:=\pi_{T_1,\mu_1}(F_i)\in\mathbb{K}'[\xi_1]/\langle\xi_1^{\mu_1}\rangle$, so as to obtain $G:=\sum_{0\le i<d_2\nu_t}F_i^*x_2^i$. The cost of this step is $O(d_2\nu_tM(d_1\mu_1)\log(\mu_1))$ operations in $\mathbb{F}$. Since $\nu_t\mu_1\le\mu^2$ and $d_1d_2\mu=d\mu=n$, with $n=\deg(I)$, this is
$O(M(\mu n)\log(\mu))$.
Step 2. Rewrite $G$ as $G=\sum_{i<\mu_1}G_i(x_2)\xi_1^i$, with all $G_i$’s in $\mathbb{K}'[x_2]$ of degree
at most $d_2\nu_t$. Compute all $G_i^*:=\pi_{T_2,\nu_t}(G_i)\in\mathbb{K}[\xi_2]/\langle\xi_2^{\nu_t}\rangle$.
To compute the Gi∗’s, we apply the univariate untangling algorithm
with coefficients in K′ instead of F.
The runtime of this second step is O(μ1M(d2νt)log(νt))
operations (+,×) in K′, which becomes O(μ1M(d1d2νt)log(νt)) operations in F, once we use Kronecker
substitution to do arithmetic in K′. As for the first step, this is
O(M(μn)log(μ)) operations in F.
Step 3.
At this stage, we have ∑i<μ1Gi∗ξ1i∈K[ξ1,ξ2]/⟨ξ1μ1,ξ2νt⟩, which is equal to F(ξ1+α1,ξ2+α2)mod⟨ξ1μ1,ξ2νt⟩. Discard all monomials lying in J′ and
return the result – this involves no arithmetic operation.
On our example, the untangling
algorithm would pass from an ideal in x1,x2 (figure (a) below) to
the monomial ideal ⟨ξ12,ξ22⟩ (step 2,
figure (b) below) then the monomial ξ1ξ2 would be discarded to
get a result defined modulo J′=⟨ξ12,ξ1ξ2,ξ22⟩ (step 3, figure (c) below).
[Figure: three panels. (a) the ideal in x1,x2, with labels x1^4, x1^2x2, x2^2; (b) the monomial ideal ⟨ξ12,ξ22⟩, with labels (μ1,ν1)=(2,0) and (μ3,ν3)=(0,2); (c) the ideal J′, with labels (μ1,ν1)=(2,0), (μ2,ν2)=(1,1) and (μ3,ν3)=(0,2).]
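Both steps of this second approach rely on the univariate map πT,μ, which sends F(x) to F(ξ+α)modξμ with coefficients in K=F[y]/⟨T⟩. The following Python sketch computes this map naively by Horner's rule over F=Q; it is not the quasi-linear algorithm referred to in the text, and the helper names `k_mul` and `untangle` are hypothetical. Elements of K are lists of d rational coefficients in the residue class α of y, and T is assumed monic.

```python
from fractions import Fraction

def k_mul(a, b, T):
    """Product of a and b in K = Q[y]/<T>; T monic, lowest degree first."""
    d = len(T) - 1
    prod = [Fraction(0)] * (2 * d - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj
    # reduce with y^d = -(T[0] + T[1] y + ... + T[d-1] y^(d-1))
    for k in range(2 * d - 2, d - 1, -1):
        c = prod[k]
        if c:
            for j in range(d):
                prod[k - d + j] -= c * T[j]
            prod[k] = Fraction(0)
    return prod[:d]

def untangle(f, T, mu):
    """Coefficients in K of F(xi + alpha) mod xi^mu, for F given by the list f."""
    d = len(T) - 1
    alpha = [Fraction(0)] * d
    if d == 1:
        alpha[0] = -Fraction(T[0])    # alpha is the root of the linear T
    else:
        alpha[1] = Fraction(1)        # alpha is the class of y
    res = [[Fraction(0)] * d for _ in range(mu)]
    for c in reversed(f):             # Horner: res <- res * (xi + alpha) + c
        new = [[Fraction(0)] * d for _ in range(mu)]
        for i in range(mu):
            t = k_mul(res[i], alpha, T)
            for k in range(d):
                new[i][k] += t[k]
                if i + 1 < mu:        # truncation modulo xi^mu
                    new[i + 1][k] += res[i][k]
        new[0][0] += c
        res = new
    return res
```

On the example of the introduction (T=y2+y+2, μ=2), `untangle([2, 1, 1], [2, 1, 1], 2)` returns the coefficients of (1+2α)ξ, in agreement with T(ξ+α)=T(α)+T′(α)ξ+ξ2 and T(α)=0.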
4.3 Recursive tangling for monomial ideals
The ideas used to perform univariate tangling, that is, to invert
πT,μ, carry over to bivariate situations. In this section, we
discuss the first of them, namely, a bivariate version of van der
Hoeven and Lecerf’s recursive algorithm. We still work under
assumption H3 that J′ is a monomial ideal. As before,
B is the monomial basis of F[x1,x2]/I induced by the
Gröbner basis exhibited in Proposition 4.1.
Proposition 4.4.
Under H2 and H3, given G in
K[ξ1,ξ2]/J′ one can compute πm,J′−1(G)
using either O(M(dn)log(n)+M(n)log(n)2), or O(M(μn)log(n)2) operations in F. In particular, this can be done
in O(M(n1.5)log(n)2) operations.
As in [22], our procedure is recursive; the recursion here is
based on the integer μ1. Given G in K[ξ1,ξ2]/J′, we
explain how to find F in F[x1,x2]/I such that
πm,J′(F)=G, starting from the case μ1=1.
4.3.1.
If μ1=1, the ideal J′ is of the form
⟨ξ1,ξ2ν2⟩, and πm,J′ maps
F(x1,x2) to G:=F(α1,ξ2+α2)modξ2ν2.
In this case, note that the degree n of I is simply d1d2ν2.
Step 1. Apply our univariate tangling
algorithm to G in the variable x2 to compute F(α1,x2):=πT2,ν2−1(G)∈K′[x2]/⟨T2ν2⟩,
working over the field K′=F[y1]/⟨T1(y1)⟩ instead of
F.
This takes O(M(d2ν2)log(ν2)+M(d2)log(d2)) operations
(+,×) in K′, together with O(d2) inversions in
K′. Using Kronecker substitution for multiplications, this results
in a total of O(M(d1d2ν2)log(ν2)+M(d1d2)log(d1d2)) operations in F. We will use the simplified upper
bound O(M(d1d2ν2)log(d1d2ν2))=O(M(n)log(n)).
Step 2. The polynomial F has degree
less than d1 in x1 and d2ν2 in x2; for such F’s,
knowing F(α1,x2)∈K′[x2]/⟨T2ν2⟩ is
equivalent to knowing F(x1,x2) in F[x1,x2]. Thus, we are
done.
4.3.2.
Assume now that μ1>1, let G be in K[ξ1,ξ2]/J′ and
let μˉ:=⌈μ1/2⌉. The following steps closely
mirror Algorithm 9 in [22]. For the cost analysis, we let
S(m,J′) be the cost of applying
πm,J′ (see Proposition 4.3) and
T(m,J′) be the cost of the recursive algorithm for
πm,J′−1.
Step 1.
Let Gˉ:=Gmodξ1μˉ, and compute
recursively Fˉ:=πm,J0′−1(Gˉ), with J0′:=J′+⟨ξ1μˉ⟩. This costs T(m,J0′).
Step 2. Compute H:=(G−πm,J′(Fˉ)) div ξ1μˉ, where the div operator
maps ξ1i to 0 for i<μˉ and to
ξ1i−μˉ otherwise. This costs S(m,J′).
Step 3.
Define W:=ξ1/πm,J′(T1)∈K[ξ1,ξ2]/⟨ξ1μ1,ξ2μ2⟩. Because T1(α1)=0
and T1′(α1)≠0 (by our separability assumption), W is
well-defined. This costs S(m,J′) for πm,J′(T1)
and O(M(d1μ1)) for inversion (since it involves ξ1 only),
which is O(M(n)).
Step 4. Compute recursively Eˉ:=πm,J1′−1(WμˉHmodJ1′), where
J1′ is the colon ideal J′:ξ1μˉ. Since W depends only on ξ1, a
multiplication by W, or one of its powers, is done coefficient-wise
in ξ2, for O(M(n)) operations in F. Thus, the cost to
compute WμˉHmodJ1′ is O(M(n)log(n)); to this, we
add T(m,J1′).
Step 5. Return F:=Fˉ+T1μˉEˉ.
The product T1μˉEˉ requires no reduction,
since all its terms are in B. Proceeding coefficient-wise
with respect to x2, and using super-additivity, it
costs O(M(n)).
On our example, we have J′=⟨ξ12,ξ1ξ2,ξ22⟩ (a), Step 1 uses J0′=⟨ξ1,ξ22⟩ (b) and Steps 2-5 work on the colon ideal J1′=⟨ξ1,ξ2⟩ (c).
[Figure: panels (a) J′, (b) J0′ and (c) J1′, with exponent labels (ξ1^0,ξ2^2), (ξ1^1,ξ2^1), (ξ1^2,ξ2^0) and (ξ1^1,ξ2^0).]
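The two ideals used by the recursion, and the div operator of Step 2, can be computed directly on exponent vectors. The sketch below is only an illustration: a monomial ideal is represented by a list of exponent pairs (a,b) standing for ξ1aξ2b, and the names `minimal_generators`, `split_ideal`, `div_xi1` and `degree` are hypothetical.

```python
def minimal_generators(gens):
    """Drop every monomial that is a proper multiple of another one in the list."""
    gens = set(gens)
    return sorted(
        (g for g in gens
         if not any(h != g and h[0] <= g[0] and h[1] <= g[1] for h in gens)),
        reverse=True,
    )

def split_ideal(gens, mu_bar):
    """Return generators of J0' = J' + <xi1^mu_bar> and J1' = J' : xi1^mu_bar."""
    J0 = minimal_generators(list(gens) + [(mu_bar, 0)])
    # the colon by a monomial divides each generator by its gcd with xi1^mu_bar
    J1 = minimal_generators([(max(a - mu_bar, 0), b) for a, b in gens])
    return J0, J1

def div_xi1(G, mu_bar):
    """The div operator: xi1^i -> 0 if i < mu_bar, and xi1^(i - mu_bar) otherwise."""
    return {(i - mu_bar, j): c for (i, j), c in G.items() if i >= mu_bar}

def degree(gens):
    """Number of standard monomials of a zero-dimensional monomial ideal."""
    a_max = max(a for a, _ in gens)
    b_max = max(b for _, b in gens)
    return sum(
        1
        for i in range(a_max) for j in range(b_max)
        if not any(a <= i and b <= j for a, b in gens)
    )
```

With J′=⟨ξ12,ξ1ξ2,ξ22⟩ and μˉ=1, `split_ideal([(2, 0), (1, 1), (0, 2)], 1)` gives the generators of ⟨ξ1,ξ22⟩ and ⟨ξ1,ξ2⟩, as in the example. The degrees of J0′ and J1′ add up to the degree of J′, which is what keeps the total size constant across the two recursive calls.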
Let us justify that this algorithm is correct, by computing
πm,J′(F), which is equal to πm,J′(Fˉ)+πm,J′(T1)μˉπm,J′(Eˉ)modJ′. Note first that πm,J′(Fˉ)modξ1μˉ=Gmodξ1μˉ. Equivalently, πm,J′(Fˉ)=Gmodξ1μˉ+ξ1μˉ(πm,J′(Fˉ) div ξ1μˉ). Using the definition of H, this
is also Gmodξ1μˉ+ξ1μˉ(G div ξ1μˉ−H), that
is, G−ξ1μˉH.
On the other hand, by definition of Eˉ, we have
$$\pi_{\mathfrak m,J_1'}(\bar E) = W^{\bar\mu} H \bmod J_1',$$
so that
πm,J′(Eˉ)modJ1′=WμˉHmodJ1′. Now, πm,J′(T1) is a multiple of ξ1, so
πm,J′(T1)μˉ is a multiple of ξ1μˉ. Since ξ1μˉJ1′ is in J′, we deduce
that πm,J′(T1)μˉπm,J′(Eˉ)modJ′ is equal to πm,J′(T1)μˉWμˉHmodJ′,
and thus to ξ1μˉH. Adding the two intermediate results so far,
we deduce that πm,J′(F)=G, as claimed.
Finally, we do the cost analysis. The runtime T(m,J′) satisfies the recurrence
relation
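$$T(\mathfrak m,J') \le T(\mathfrak m,J_0') + T(\mathfrak m,J_1') + 2\,S(\mathfrak m,J') + O(M(n)\log(n)).$$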
Using 4.3.1 and the super-linearity of M, we
see that the total cost at the leaves is O(M(n)log(n)).
Without loss of generality, we can assume that S(m,J′) is
super-linear, in the sense that
S(m,J0′)+S(m,J1′)≤S(m,J′) holds at every level of the recursion.
Since the recursion has depth O(log(n)), we get that
T(m,J′) is in O(S(m,J′)log(n)+M(n)log(n)2).
4.4 Tangling for monomial ideals using duality
We finally present a bivariate analogue of the algorithm
introduced in Section 3. Since the runtimes obtained
are in general worse than those in the previous subsection, we
only sketch the construction.
All notation being as before, let G be in K[ξ1,ξ2]/J′, and
let F∈F[x1,x2]/I be such that πm,J′(F)=G. Following ideas
from [30], we now use several linear forms. Thus, let
ℓ1,…,ℓγ be module generators of
(K[ξ1,ξ2]/J′)∗, where the ∗ means that we look at
the dual of K[ξ1,ξ2]/J′ as an F-vector space. Define ℓ1′:=G⋅ℓ1,…,ℓγ′:=G⋅ℓγ, as well as
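$$L_i := \ell_i \circ \pi_{\mathfrak m,J'}, \qquad L_i' := \ell_i' \circ \pi_{\mathfrak m,J'}, \qquad i=1,\dots,\gamma,$$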
in (F[x1,x2]/I)∗. As in the one variable case, for i=1,…,γ
the relation πm,J′(F)⋅ℓi=ℓi′ implies that F⋅Li=Li′.
The first question is to determine suitable
ℓ1,…,ℓγ. Consider generators
ξ1μ1ξ2ν1,…,ξ1μtξ2νt of J′,
with the μi’s decreasing and νi’s increasing as before. For
i=1,…,t−1, define ℓi by ℓi(α1d1−1α2d2−1ξ1μi−1ξ2νi+1−1)=1, all
other ℓi(α1e1α2e2ξ1r1ξ2r2) being set to zero. Then, following
e.g. [15, Section 21.1], one verifies that these
linear forms are module generators of (K[ξ1,ξ2]/J′)∗.
As in the univariate case, we can compute all Li and Li′ by
transposing the untangling algorithm, incurring O(t) times the cost
reported in Proposition 4.4. Then, it remains to solve all
equations F⋅Li=Li′, i=1,…,t−1 (this
system is not square, unless t=2).
We are not aware of a quasi-linear time algorithm to solve such
systems. The matrix of an equation such as F⋅Li=Li′ is
sometimes called multi-Hankel [4]. Such an equation can be solved using structured
linear algebra techniques [4]; here, we have several such systems to solve at once,
which can be dealt with as in [11]. As
in [4], using the results from [6] on
structured linear system solving, we can find F in Monte
Carlo time O((st)ω−1M(tn)log(tn)), with s:=min(μ1,νt), where ω is the exponent of linear algebra (the best
value to date is ω≤2.38 [12, 25]).
Thus, unless both s and t are small, the overhead induced by the
linear algebra phase may make this solution inferior to the one in
the previous subsection.
4.5 An Application
To conclude, we describe a direct application of our results to the
complexity of multiplication and inverse in A:=F[x1,x2]/I: under
assumptions H2 and H3, both can be done in the time
reported in Proposition 4.4, to which we add O(M(n)log(n)3) in the case of inversion. Even though the algorithms are
not quasi-linear time in the worst case, to our knowledge no previous
non-trivial algorithm was known for such operations.
The algorithms are simple: untangle the input, do the multiplication,
resp. inversion, in A′:=K[ξ1,ξ2]/J′, and tangle the result.
The cost of tangling dominates that of untangling. The appendix below
discusses the cost of arithmetic in A′: multiplication and inverse
take respectively O(M(μ)log(μ)) and O(M(μ)log(μ)2)
operations (+,−,×) in K, plus one inverse in K for the
latter. Using Kronecker substitution, the runtimes become
O(M(n)log(n)) and O(M(n)log(n)2) operations in F, with n=deg(I); this is thus negligible compared with the cost of tangling.
Appendix: Bivariate power series arithmetic
We prove that for a field F and zero-dimensional monomial ideal I⊂F[x1,x2], multiplication and inversion in F[x1,x2]/I
can be done in softly linear time in δ:=deg(I),
starting with multiplication.
For an ideal such as I=⟨x1μ,x2ν⟩, the
claim is clear. Indeed, to multiply elements F and G of
F[x1,x2]/I we multiply them as bivariate polynomials and discard
unwanted terms. Bivariate multiplication in partial degrees less than
μ, resp. ν, can be done by Kronecker substitution in time
O(M(μν))=O(M(δ)), which is softly linear in δ, as
claimed. However, this direct approach does not perform well for cases
such as I=⟨x1μ,x1x2,x2ν⟩: in this case,
for F and G reduced modulo I, the product FG as polynomials
has μν terms, but δ=μ+ν−1. The following result
shows that, in general, we can obtain a cost almost as good as in the
first case, up to a logarithmic factor. Whether this extra factor can
be removed is unclear to us. In the rest of this appendix, we write I=⟨x1μ1x2ν1,x1μ2x2ν2,…,x1μtx2νt⟩, with μi’s decreasing, νi’s
increasing and ν1=μt=0.
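The Kronecker substitution used for the ideal ⟨x1μ,x2ν⟩ can be sketched as follows. The schoolbook univariate product stands in for a fast multiplication routine, so the sketch does not achieve the cost O(M(μν)); the function names are hypothetical, and polynomials are dictionaries mapping exponent pairs (i,j) to coefficients.

```python
def kronecker_multiply(F, G, mu, nu):
    """Product of F and G (deg_x1 < mu, deg_x2 < nu) via one univariate product."""
    stride = 2 * mu - 1            # x1-degrees in the product stay below stride

    def pack(P):
        # substitute x1 -> z and x2 -> z^stride
        out = [0] * (stride * nu)
        for (i, j), c in P.items():
            out[i + stride * j] += c
        return out

    f, g = pack(F), pack(G)
    h = [0] * (len(f) + len(g) - 1)
    for a, fa in enumerate(f):     # schoolbook stand-in for a fast product
        if fa:
            for b, gb in enumerate(g):
                h[a + b] += fa * gb
    # undo the substitution: index k corresponds to x1^(k % stride) x2^(k // stride)
    return {(k % stride, k // stride): c for k, c in enumerate(h) if c}

def multiply_mod_box(F, G, mu, nu):
    """FG mod <x1^mu, x2^nu>: multiply, then discard unwanted terms."""
    H = kronecker_multiply(F, G, mu, nu)
    return {(i, j): c for (i, j), c in H.items() if i < mu and j < nu}
```

The choice of stride 2μ−1 ensures that distinct monomials of the product are mapped to distinct powers of z, so that unpacking is unambiguous.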
Proposition 4.5.
Let I be a zero-dimensional monomial ideal in F[x1,x2]
of degree δ. Given F,G reduced modulo I, one can compute
FGmodI in O(M(δ)log(δ)) operations (+,−,×) in F.
A.1. We start by giving an algorithm
of complexity O(tM(δ)) for multiplication modulo I.
Let F and G be two polynomials reduced modulo I. To compute H:=FGmodI it suffices to compute Hi:=FGmod⟨x1μi,x2νi+1⟩ for i=1,…,t−1; all monomials in H
appear in one of the Hi’s (some of them in several Hi’s). We
saw that multiplication modulo ⟨x1μi,x2νi+1⟩ takes O(M(μiνi+1)) operations in
F, which is O(M(δ)), so the total cost is O(tM(δ)).
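As a minimal sketch of A.1, the code below computes each Hi by a truncated schoolbook product (standing in for the Kronecker-based product) and merges the results. The ideal is given by the lists `mus` and `nus` of exponents of its generators, with indices starting at 0; the function names are hypothetical.

```python
def mul_mod_box(F, G, mu, nu):
    """FG mod <x1^mu, x2^nu>, for F, G given as {(i, j): coeff}."""
    H = {}
    for (a, b), fa in F.items():
        if a >= mu or b >= nu:
            continue                      # this term of F cannot contribute
        for (c, d), gc in G.items():
            if a + c < mu and b + d < nu:
                H[(a + c, b + d)] = H.get((a + c, b + d), 0) + fa * gc
    return H

def mul_mod_monomial_ideal(F, G, mus, nus):
    """FG mod I, where I is generated by the x1^mus[i] x2^nus[i]."""
    H = {}
    for i in range(len(mus) - 1):
        # H_i = FG mod <x1^mu_i, x2^nu_(i+1)>; overlapping H_i agree on shared terms
        H.update(mul_mod_box(F, G, mus[i], nus[i + 1]))
    return {m: c for m, c in H.items() if c != 0}
```

For I=⟨x12,x1x2,x22⟩, that is `mus = [2, 1, 0]` and `nus = [0, 1, 2]`, the product of 1+2x1+3x2 and 4+5x1+6x2 is 4+13x1+18x2 modulo I.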
A.2.
In the general case, define
i1:=1. We let i2≤t be the smallest index greater than i1
and such that μi2<μi1/2, and iterate the process to
define a sequence i1=1<i2<⋯<is=t. The ideal I′ is
then defined by the monomials
x1μi1x2νi1,…,x1μisx2νis.
By construction, I contains I′; hence, to compute a product
modulo I, we may compute it modulo I′ and discard unwanted terms.
Multiplication modulo I′ is done using the algorithm of A.1,
in time O(sM(δ′)), with δ′:=deg(I′). Hence, we need
to estimate the degree δ′ of I′, as well as its number of
generators s.
The degree δ of I can be written as ∑r=1s−1∑i=irir+1−1μi(νi+1−νi); this is simply
counting the number of standard monomials along the rows. For a given
r, all indices i in the inner sum are such that μi≥μir/2, so the sum is at least 1/2∑r=1s−1μir(νir+1−νir), which is the degree of I′.
Hence, δ≥1/2δ′, that is, δ′≤2δ. To
estimate the number s, the inequalities μir+1<μir/2 for all r≤s imply that μis−1<μ1/2s. We deduce that 2s≤μ1/μis−1≤μ1
(since μis−1≥1), which itself is at most δ.
Thus, s∈O(log(δ)). Overall, the cost of multiplication modulo
I′, and thus modulo I, is O(M(δ)log(δ)).
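The selection of the indices i1,…,is and the bound δ′≤2δ can be checked on examples with the following sketch. Indices are 0-based, so the returned list starts at 0 and ends at t−1; the function names are hypothetical.

```python
def degree(mus, nus):
    """Degree of the ideal, counting standard monomials row by row."""
    return sum(mus[i] * (nus[i + 1] - nus[i]) for i in range(len(mus) - 1))

def thin_indices(mus):
    """Indices i_1 < ... < i_s: each step at least halves the x1-exponent."""
    idx = [0]
    while idx[-1] != len(mus) - 1:
        last = idx[-1]
        # smallest later index whose exponent is less than half of mus[last];
        # it exists because the last exponent is 0 and mus[last] > 0
        idx.append(next(i for i in range(last + 1, len(mus))
                        if 2 * mus[i] < mus[last]))
    return idx

def thin_ideal(mus, nus):
    """Exponent lists of the generators of I'."""
    idx = thin_indices(mus)
    return [mus[i] for i in idx], [nus[i] for i in idx]
```

For the staircase with `mus = [8, 7, ..., 0]` and `nus = [0, 1, ..., 8]`, the ideal I has t=9 generators and degree 36, while I′ keeps s=4 of them and has degree 47, which is indeed at most 2δ=72.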
Corollary 4.6.
For I as in the previous proposition and F reduced modulo I,
with F(0,0)≠0, 1/FmodI can be computed in
O(M(δ)log(δ)2) operations (+,−,×) in F,
and one inverse.
A.3. We proceed by
induction using Newton iteration. If μ1=1 then I=⟨x1,x2ν2⟩, so inversion modulo I is
inversion in F[x2]/⟨x2ν2⟩. It can be done in
time O(M(δ)) using univariate Newton iteration, involving only
the inversion of the constant term of the input.
Otherwise, define μˉ:=⌈μ1/2⌉, and let Iˉ be the ideal
with generators x1μˉ,x1μ2x2ν2,…,x2νt (all monomials in this list with μi≥μˉ may be
discarded). Given F in F[x1,x2]/I, we start by computing the
inverse Gˉ of Fˉ:=FmodIˉ in F[x1,x2]/Iˉ. Since Iˉ2
is contained in I, knowing Gˉ, one step of Newton iteration allows
us to compute G:=1/FmodI as G=2Gˉ−Gˉ2FmodI.
Using the previous proposition, we deduce G from Gˉ in
O(M(δ)log(δ)) operations. We repeat the
recursion for O(log(δ)) steps, and the degrees of the ideals
we consider decrease, so the overall runtime is
O(M(δ)log(δ)2).
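The recursion of A.3 can be sketched as follows over F=Q. Products modulo I are computed by a schoolbook routine and the base case uses a naive power series inversion, so the sketch illustrates the structure of the Newton iteration but not its cost; all function names are hypothetical.

```python
from fractions import Fraction

def is_standard(m, mus, nus):
    """True if x1^m[0] x2^m[1] is not in the ideal with generators (mus[i], nus[i])."""
    return any(m[0] < mus[i] and m[1] < nus[i + 1] for i in range(len(mus) - 1))

def reduce_mod(P, mus, nus):
    return {m: c for m, c in P.items() if c != 0 and is_standard(m, mus, nus)}

def mul_mod(P, Q, mus, nus):
    H = {}
    for (a, b), p in P.items():
        for (c, d), q in Q.items():
            m = (a + c, b + d)
            if is_standard(m, mus, nus):
                H[m] = H.get(m, 0) + p * q
    return {m: c for m, c in H.items() if c != 0}

def inverse_mod(F, mus, nus):
    """1/F mod I, assuming the constant term F(0,0) is nonzero."""
    if mus[0] == 1:
        # I = <x1, x2^nu>: naive inversion of a power series in x2
        nu = nus[-1]
        f = [F.get((0, j), 0) for j in range(nu)]
        inv0 = 1 / Fraction(f[0])
        g = [inv0]
        for k in range(1, nu):
            g.append(-inv0 * sum(f[j] * g[k - j] for j in range(1, k + 1)))
        return {(0, j): c for j, c in enumerate(g) if c != 0}
    mu_bar = (mus[0] + 1) // 2
    # generators of I-bar: x1^mu_bar, then the generators of I with mu_i < mu_bar
    keep = [i for i in range(1, len(mus)) if mus[i] < mu_bar]
    mus_bar = [mu_bar] + [mus[i] for i in keep]
    nus_bar = [0] + [nus[i] for i in keep]
    G_bar = inverse_mod(reduce_mod(F, mus_bar, nus_bar), mus_bar, nus_bar)
    # one Newton step: G = 2 G_bar - G_bar^2 F mod I
    correction = mul_mod(mul_mod(G_bar, G_bar, mus, nus), F, mus, nus)
    G = {m: 2 * c for m, c in G_bar.items()}
    for m, c in correction.items():
        G[m] = G.get(m, 0) - c
    return reduce_mod(G, mus, nus)
```

For I=⟨x12,x1x2,x22⟩ and F=1+x1+x2, the recursion first inverts 1+x2 modulo ⟨x1,x22⟩, giving 1−x2, and one Newton step then yields 1−x1−x2.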