This paper provides a new proof of the Pederson-Roy-Szpirglas theorem by connecting counting real solutions of polynomial equations to invariants of trace forms on finite algebras over real closed fields.
Contribution
It offers a novel proof of a counting theorem using linear algebra and algebraic invariants, linking algebraic geometry and form theory.
Findings
01
Proof of Pederson-Roy-Szpirglas theorem using trace form signatures
02
Establishes equality between rational points count and trace form signature
03
Connects algebraic invariants with real zero counting
Abstract
The main goal of this article is to provide a proof of the Pederson-Roy-Szpirglas theorem about counting common real zeros of real polynomial equations by using basic results from Linear algebra and Commutative algebra. The main tools are symmetric bilinear forms, Hermitian forms, trace forms, and their invariants such as rank, types, and signatures. Further, we use the equality (proved in [3]) of the number of K-rational points of a zero-dimensional affine algebraic set over a real closed field K with the signature of the trace form of its coordinate ring to prove the Pederson-Roy-Szpirglas theorem, see [16].
Full text
Rational points and trace forms on a finite algebra over a real closed field†
Dilip P. Patil¹ ∗
1 Department of Mathematics, Indian Institute of Science Bangalore
Key words and phrases:
Real closed fields, Finite K-algebra, Hermitian forms, Quadratic forms, Sylvester’s law of inertia, Type, Signature, Trace forms.
2010 Mathematics Subject Classification:
Primary 13-02, 13B22, 13F30, 13H15, 14C17
† This is an expository article on the Pederson-Roy-Szpirglas theorem about counting real roots of real polynomial equations. Most of the exposition is influenced by the discussions of the first author with the late Prof. Dr. Uwe Storch (1940-2017) and by the lecture courses delivered by him.
∗ During the preparation of this work the first author was visiting Department of Mathematics, Indian Institute of Technology Bombay. He would like to express his gratitude for the generous financial support from IIT Bombay and encouraging cooperation. He was also partially supported by his project MATRICS-DSTO-1983 during the final preparation of this manuscript.
1. Introduction
The objective of this paper is to present an exposition of classical and modern results concerning the number of real or complex points in the solution space of a finite system of polynomial equations with real coefficients in an arbitrary number of variables. More precisely, for polynomials $F_1,\ldots,F_m\in\mathbb{R}[X_1,\ldots,X_n]$, if the residue-class $\mathbb{R}$-algebra $\mathbb{R}[X_1,\ldots,X_n]/\langle F_1,\ldots,F_m\rangle$ is finite dimensional over $\mathbb{R}$, then the set of common zeros
$$V_{\mathbb{R}}(F_1,\ldots,F_m):=\{(a_1,\ldots,a_n)\in\mathbb{R}^n\mid F_j(a_1,\ldots,a_n)=0\ \text{for all}\ j=1,\ldots,m\}$$
of $F_1,\ldots,F_m$ in $\mathbb{R}^n$ is finite. The converse is not true: for example, for $F_1=X_1^2+1$, the set $V_{\mathbb{R}}(F_1)=\emptyset$ is finite, but $\mathbb{R}[X_1,\ldots,X_n]/\langle F_1\rangle\stackrel{\sim}{\longrightarrow}\mathbb{C}[X_2,\ldots,X_n]$ is not finite dimensional over $\mathbb{R}$ if $n\geq 2$. However, for polynomials $F_1,\ldots,F_m\in\mathbb{C}[X_1,\ldots,X_n]$, the residue-class $\mathbb{C}$-algebra $\mathbb{C}[X_1,\ldots,X_n]/\langle F_1,\ldots,F_m\rangle$ is finite dimensional over $\mathbb{C}$ if and only if the set of common zeros $V_{\mathbb{C}}(F_1,\ldots,F_m)$ of $F_1,\ldots,F_m$ in $\mathbb{C}^n$ is finite.
Moreover, by Hilbert's classical Nullstellensatz, $V_{\mathbb{C}}(F_1,\ldots,F_m)\neq\emptyset$ if and only if the ideal $\langle F_1,\ldots,F_m\rangle$ generated by $F_1,\ldots,F_m$ in $\mathbb{C}[X_1,\ldots,X_n]$ is a non-unit ideal. This is not true over the field $\mathbb{R}$ or, more generally, over real closed fields. Therefore the natural questions one deals with are when exactly $V_K(F_1,\ldots,F_m)\neq\emptyset$ and how to find its cardinality, where $K$ is an arbitrary real closed field.
Many researchers have studied these problems and devised effective algorithms. For example, already in the 19th century Sturm, Jacobi, Sylvester, Hermite and Hurwitz proved fundamental results for counting real points (in a small number of variables, $n\leq 2$) by using the signature of appropriate quadratic forms.
In Section 2 and Section 3, we collect standard results on symmetric bilinear and Hermitian forms over a real closed field $K$ and its algebraic closure $\mathbb{C}_K=K[\mathrm{i}]$ with $\mathrm{i}^2=-1$. For the sake of completeness, we recall them without proofs in the format in which they are used in later sections. With these preliminaries, at the end of Section 3 we state the important Rigidity theorem for quadratic forms (see [3]), which is used in Section 4.
In Section 4, we collect some elementary concepts from commutative algebra and recall the important Theorem 4.5 from [3], which relates the $K$-rational points of a finite dimensional algebra $A$ over a real closed field $K$ with the type of the trace form $\mathrm{Tr}^A_K$ on $A$, and derive some consequences.
In Section 5, we compute the cardinality of the set of $K$-rational points of a finite algebra over a real closed field $K$. The main ingredient in this section is the Shape Lemma 5.3, which guarantees a distinguished generating set for a zero-dimensional radical ideal $\mathfrak{A}\subseteq K[X_1,\ldots,X_n]$, from which one can reduce the problem of counting the number of $K$-rational points in $V_K(\mathfrak{A})$ to the one variable case.
In Theorem 5.5, using the results from Section 4, we relate the type, signature and rank of generalized trace forms on $A=K[X_1,\ldots,X_n]/\mathfrak{A}$ with the number of points in $V_K(\mathfrak{A})$ and in $V_{\mathbb{C}_K}(\mathfrak{A})$. Finally, we give a proof of the theorem of Pederson-Roy-Szpirglas [16, Theorem 2.1].
2. Decomposition theorem for Hermitian forms
The main aim of this section is to recall the Decomposition Theorem (see 2.13) which guarantees the existence of orthogonal bases (with respect to Hermitian forms). For this we recall basic concepts and steps which lead to its proof. Most of these results can be found in standard graduate text books, for instance see [18, Ch. V, §12], [17, Ch. IX] or [9, Ch. 11], [1, Ch. 7], or [14, Ch. XV]. However, for setting the notation, terminology and for the sake of completeness, we recall them without proofs in the format that they are used in this as well as in the later sections.
2.1 Notation and Assumptions
In order to define symmetric and Hermitian forms together and prove results about them, we fix the following convenient notation :
Let $K$ be a field and let $\kappa:K\to K$ be a fixed involution (an automorphism whose square is the identity, i.e. its inverse is itself). We denote by $K':=K^{\kappa}:=\{a\in K\mid\kappa(a)=a\}\subseteq K$ the fixed field of $\kappa$. There are exactly two cases: (i) $\kappa=\mathrm{id}_K$ and (ii) $\kappa\neq\mathrm{id}_K$. In the first case, we assume that $\operatorname{Char}K\neq 2$.
The involution $\kappa$ of $K$ is simply denoted by the standard bar-notation $\kappa:K\to K$, $a\mapsto\overline{a}$, and called the conjugation of $K$.
Therefore we have: $\overline{a+b}=\overline{a}+\overline{b}$, $\overline{ab}=\overline{a}\,\overline{b}$ and $\overline{\overline{a}}=a$ for all $a,b\in K$. Furthermore, the fixed field $K'=K$ in the first case and $K'\subsetneq K$ in the second case.
2.2 Examples
(1)
For an arbitrary field $K$, the identity map $\mathrm{id}_K:K\to K$ is an involution. For $K=\mathbb{R}$, the identity $\mathrm{id}_{\mathbb{R}}$ is the only involution. For $K=\mathbb{C}$, besides the identity $\mathrm{id}_{\mathbb{C}}$, the usual complex conjugation $\mathbb{C}\to\mathbb{C}$, $z\mapsto\overline{z}$, is the only other continuous involution of $\mathbb{C}$; it plays an important role. (Footnote 1: In the case $V=W=\mathbb{C}$, the distance of a point $z$ from the origin is not given by the bilinear form $(z,w)\mapsto z\cdot w$, but by using the map $(z,w)\mapsto z\cdot\overline{w}$, namely $|z|=\sqrt{z\overline{z}}$.)
(2)
The complex-conjugation is a special case of the conjugation of a quadratic algebra $A$ over an arbitrary field $K$: If $1,\omega\in A$ is a $K$-basis of $A$ with $\omega^2=\alpha+\beta\omega$, $\alpha,\beta\in K$, then the conjugation $A\to A$ of $A$ is defined by $\overline{a+b\omega}=(a+b\beta)-b\omega$, $a,b\in K$. It is easy to see that this is an involution of the $K$-algebra $A$ and is not equal to $\mathrm{id}_A$.
For an arbitrary element $x\in A$, the norm, the trace and the characteristic polynomial of $x$ over $K$ are defined by the equations $\mathrm{N}^A_K(x)=x\overline{x}$, $\mathrm{Tr}^A_K(x)=x+\overline{x}$, $\chi_x=X^2-(x+\overline{x})X+x\overline{x}=(X-x)(X-\overline{x})$, respectively.
There are many examples of this type. For example, if $L$ is a field and if $\kappa\in\mathrm{Aut}\,L$ is an involution of $L$ with $\kappa\neq\mathrm{id}_L$, then $L$ is a quadratic algebra over the fixed field $K:=L^{\kappa}:=\{a\in L\mid\kappa(a)=a\}$, the involution $\kappa$ of $L$ coincides with the conjugation of the quadratic algebra $L$ over $K$, and the Galois group is $\mathrm{Gal}(L\,|\,L^{\kappa})=\mathrm{Aut}_{L^{\kappa}\text{-alg}}L=\{\mathrm{id}_L,\kappa\}$. A typical example of this type is the algebraic closure $\mathbb{C}_K=K[\mathrm{i}]$, where $\mathrm{i}^2=-1$, of a real closed field $K$.
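The conjugation, norm and trace of a quadratic algebra from Example (2) can be checked mechanically. The following is a minimal sketch, assuming $K=\mathbb{Q}$ (modelled by Python's `Fraction`); the class name `QuadraticElement` and the helpers `norm` and `trace` are hypothetical names chosen only for this illustration.

```python
from fractions import Fraction


class QuadraticElement:
    """Element a + b*w of K[w] with w^2 = alpha + beta*w, where K = Q (assumption)."""

    def __init__(self, a, b, alpha, beta):
        self.a, self.b = Fraction(a), Fraction(b)
        self.alpha, self.beta = Fraction(alpha), Fraction(beta)

    def _new(self, a, b):
        return QuadraticElement(a, b, self.alpha, self.beta)

    def __add__(self, other):
        return self._new(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b w)(c + d w) = ac + (ad + bc) w + bd w^2, with w^2 = alpha + beta w
        a, b, c, d = self.a, self.b, other.a, other.b
        return self._new(a * c + b * d * self.alpha,
                         a * d + b * c + b * d * self.beta)

    def conj(self):
        # conjugate of a + b w is (a + b*beta) - b w
        return self._new(self.a + self.b * self.beta, -self.b)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)


def norm(x):
    """Norm x * conj(x); its w-coefficient is always 0."""
    return x * x.conj()


def trace(x):
    """Trace x + conj(x); its w-coefficient is always 0."""
    return x + x.conj()


# Example with w^2 = 1 + w and x = 2 + 3w.
x = QuadraticElement(2, 3, 1, 1)
print(norm(x).a, norm(x).b)    # 1 0
print(trace(x).a, trace(x).b)  # 7 0
```

For $x=2+3\omega$ with $\omega^2=1+\omega$ one has $\overline{x}=5-3\omega$, so the norm is $1$ and the trace is $7$, both in $K$, in agreement with the formulas above.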
2.3 Definitions
Let V and W be K-vector spaces.
(1)
A map $f:V\to W$ of $K$-vector spaces is called semilinear (or conjugate-linear) (with respect to the conjugation of $K$) if $f$ is additive and $f(ax)=\overline{a}f(x)$ for all $a\in K$ and all $x\in V$. The semilinear maps from $V$ into $W$ coincide with the $K$-linear maps from $V$ into the anti-vector space $\overline{W}$ corresponding to $W$ and also with the $K$-linear maps from the $K$-vector space $\overline{V}$ into $W$, where $\overline{W}$ (resp. $\overline{V}$) is the $K$-vector space with the scalar multiplication $(a,y)\mapsto\overline{a}y$ defined by using the given scalar multiplication on $W$ (resp. $V$).
(2)
A function Φ:V×W→K is called sesquilinear if Φ is K-linear in the first component and semilinear (with respect to the conjugation of K) in the second component, i. e. if for all a, a′∈K and all x, x′∈V, y, y′∈W, we have :
(a) $\Phi(ax+a'x',y)=a\,\Phi(x,y)+a'\,\Phi(x',y)$.
(b) $\Phi(x,ay+a'y')=\overline{a}\,\Phi(x,y)+\overline{a'}\,\Phi(x,y')$.
The set of sesquilinear functions $V\times W\to K$ is denoted by $\mathrm{Sesq}_K(V,W)$, which is clearly a subspace of the $K$-vector space $K^{V\times W}$. If $V=W$, then sesquilinear functions are also called sesquilinear forms on $V$. Note that if the conjugation of $K$ is equal to $\mathrm{id}_K$, then the sesquilinear functions are linear in both variables, i.e. they are bilinear, and hence $\mathrm{Sesq}_K(V,W)=\mathrm{Mult}_K(V,W)$ (= the set of bilinear functions).
The map $\mathrm{Sesq}_K(V,W)\longrightarrow(V\otimes_K\overline{W})^{*}$, $\Phi\longmapsto(x\otimes y\mapsto\Phi(x,y))$, is an isomorphism of $K$-vector spaces (with inverse $\varphi\longmapsto((x,y)\mapsto\varphi(x\otimes y))$).
2.4 Gram's Matrix
Let $V$, $W$ be finite dimensional $K$-vector spaces with bases $\mathbf{x}:=\{x_i\mid i\in I\}$, $I$ finite indexed set, and $\mathbf{y}:=\{y_j\mid j\in J\}$, $J$ finite indexed set, respectively.
Then every sesquilinear function $\Phi:V\times W\to K$ is uniquely determined by the values $\Phi(x_i,y_j)$, $(i,j)\in I\times J$. Conversely, for an arbitrary family $c_{ij}\in K$, $(i,j)\in I\times J$, there is a (unique) sesquilinear function $\Phi:V\times W\to K$ defined by $\Phi\bigl(\sum_{i\in I}a_ix_i,\sum_{j\in J}b_jy_j\bigr):=\sum_{(i,j)\in I\times J}a_i\overline{b_j}c_{ij}$. Moreover, the map
$$\mathrm{Sesq}_K(V,W)\longrightarrow\mathrm{M}_{I,J}(K),\qquad\Phi\longmapsto\bigl(\Phi(x_i,y_j)\bigr)_{(i,j)\in I\times J},$$
is an isomorphism of $K$-vector spaces.
For a sesquilinear function $\Phi:V\times W\to K$ the $I\times J$-matrix
$$G_\Phi(\mathbf{x};\mathbf{y}):=\bigl(\Phi(x_i,y_j)\bigr)_{(i,j)\in I\times J}\in\mathrm{M}_{I,J}(K)$$
is called the Gram's matrix or the fundamental matrix of $\Phi$ with respect to the bases $\mathbf{x}$ and $\mathbf{y}$. If $V=W$ and $\mathbf{x}=\mathbf{y}$, then we simply write $G_\Phi(\mathbf{x})$. Further, if $I=J$, then the determinant $\operatorname{Det}G_\Phi(\mathbf{x};\mathbf{y})$ is called the Gram's determinant with respect to the bases $\mathbf{x}$ and $\mathbf{y}$.
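For the computation with Gram's matrices, it is convenient to extend the conjugation of $K$ to matrices over $K$: For a matrix $A=(a_{ij})\in\mathrm{M}_{I,J}(K)$ with $I$, $J$ finite indexed sets, put $\overline{A}:=(\overline{a_{ij}})\in\mathrm{M}_{I,J}(K)$. Then the map $\mathrm{M}_{I,J}(K)\to\mathrm{M}_{I,J}(K)$, $A\mapsto\overline{A}$, is a semilinear (with respect to the conjugation of $K$) involution of the $K$-vector space $\mathrm{M}_{I,J}(K)$.
Further, ${}^{t}\overline{A}=\overline{{}^{t}A}$ (where for an $I\times J$-matrix $A\in\mathrm{M}_{I,J}(K)$, ${}^{t}A$ denotes the transpose of $A$) and $\overline{AB}=\overline{A}\,\overline{B}$ if $B\in\mathrm{M}_{J,R}(K)$, $R$ finite indexed set.
For a square matrix $A\in\mathrm{M}_I(K)$, $\operatorname{Det}\overline{A}=\overline{\operatorname{Det}A}$, and if $A\in\mathrm{GL}_I(K)$, then $\overline{A}^{\,-1}=\overline{A^{-1}}$.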
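2.4.1
Let $\mathbf{x}'=(x_i')_{i\in I}$, $\mathbf{y}'=(y_j')_{j\in J}$ be other $K$-bases of $V$, $W$ and let $A=(a_{ri})\in\mathrm{GL}_I(K)$, $B=(b_{sj})\in\mathrm{GL}_J(K)$ be the transition matrices of the bases from $\mathbf{x}$ to $\mathbf{x}'$ and from $\mathbf{y}$ to $\mathbf{y}'$, respectively, i.e. $x_i=\sum_{r\in I}a_{ri}x_r'$ and $y_j=\sum_{s\in J}b_{sj}y_s'$. Then for a sesquilinear function $\Phi:V\times W\to K$, we have the transformation formula:
$$G_\Phi(\mathbf{x};\mathbf{y})={}^{t}A\,G_\Phi(\mathbf{x}';\mathbf{y}')\,\overline{B}.$$
In particular, if $V=W$, $\mathbf{x}=\mathbf{y}$ and $\mathbf{x}'=\mathbf{y}'$, then
$$G_\Phi(\mathbf{x})={}^{t}A\,G_\Phi(\mathbf{x}')\,\overline{A}.$$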
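The transformation formula for Gram's matrices can be tested on a small numerical instance. The sketch below assumes $K=\mathbb{C}$ with the usual complex conjugation and $V=W=\mathbb{C}^2$; the matrix `C`, the transition matrix `A` and all helper names are illustrative choices, not taken from the article. It compares the Gram's matrix computed directly in the new basis with ${}^{t}A\,G_\Phi(\mathbf{x}')\,\overline{A}$.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def conjugate(M):
    return [[z.conjugate() for z in row] for row in M]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def gram(phi, xs, ys):
    """Gram's matrix (phi(x_i, y_j)) of a sesquilinear function phi."""
    return [[phi(x, y) for y in ys] for x in xs]


# An arbitrary sesquilinear form on C^2 (not assumed Hermitian), given in the standard basis.
C = [[1, 2 + 1j], [3j, -1]]


def phi(u, v):
    # linear in u, conjugate-linear in v
    return sum(u[r] * C[r][s] * v[s].conjugate()
               for r in range(2) for s in range(2))


A = [[1, 1j], [2, 1]]            # transition matrix: x_j = sum_i a_ij x'_i
x_prime = [[1, 0], [0, 1]]       # basis x' (standard basis)
x = transpose(A)                 # basis x: the columns of A

direct = gram(phi, x, x)
via_formula = matmul(matmul(transpose(A), gram(phi, x_prime, x_prime)), conjugate(A))
print(direct == via_formula)     # True
```

All entries are Gaussian integers, so the floating point comparison is exact in this example.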
2.5 Examples
(1)
(Standard forms) The standard form on the standard $K$-vector space $K^{(I)}$, $I$ an indexed set, with the standard basis $e_i$, $i\in I$, of $K^{(I)}$ is the sesquilinear form defined by the unit matrix $E_I\in\mathrm{M}_I(K)$ and is denoted by $\langle-,-\rangle$, that is, $\langle e_i,e_j\rangle=\delta_{ij}$ for all $i,j\in I$.
Therefore $\bigl\langle(a_i),(b_i)\bigr\rangle=\sum_{i\in I}a_i\overline{b_i}={}^{t}\mathscr{A}\,\overline{\mathscr{B}}$, where $\mathscr{A}:=(a_i)$, $\mathscr{B}:=(b_i)\in K^{(I)}$ (are column vectors).
In particular, if $I=\{1,\ldots,n\}$, then $\langle(a_1,\ldots,a_n),(b_1,\ldots,b_n)\rangle=a_1\overline{b_1}+\cdots+a_n\overline{b_n}$.
(2)
(Natural Duality) Let $K$ be an arbitrary field with $\mathrm{id}_K$ as conjugation, $V$ a $K$-vector space and let $V^{*}=\mathrm{Hom}_K(V,K)$ denote the dual space of $V$.
The canonical evaluation map $E:V\times V^{*}\longrightarrow K$, $(x,f)\longmapsto\langle x,f\rangle:=f(x)$, is bilinear and is called the natural duality between $V$ and $V^{*}$. If $V$ is finite dimensional with basis $\mathbf{x}=\{x_1,\ldots,x_n\}$, and if $\mathbf{x}^{*}=\{x_1^{*},\ldots,x_n^{*}\}$ is the corresponding dual basis, then the Gram's matrix of this natural duality $G_E(\mathbf{x},\mathbf{x}^{*})=E_n$ is the unit matrix in $\mathrm{M}_n(K)$.
2.6 Non-degeneracy and Complete Duality
An important motivation for the study of sesquilinear functions is the description of linear forms through vectors (see Example 2.9).
Let $V$ and $W$ be $K$-vector spaces and let $\Phi:V\times W\longrightarrow K$ be a sesquilinear function. The canonical semilinear maps defined by
$$\Phi_1:V\longrightarrow W^{*},\ x\longmapsto(y\mapsto\Phi(x,y)),\qquad\text{and}\qquad\Phi_2:W\longrightarrow V^{*},\ y\longmapsto(x\mapsto\Phi(x,y)),$$
are simply denoted by $\Phi_1(x)=\Phi(x,-)$ and $\Phi_2(y)=\Phi(-,y)$.
Further, from each one of the maps $\Phi_1$ resp. $\Phi_2$, one can recover $\Phi$, since $\Phi(x,y)=(\Phi_1(x))(y)=(\Phi_2(y))(x)$ for all $x\in V$ and all $y\in W$.
2.6*.1 Suppose that both V, W are finite dimensional over K with bases \mathbcalx=(xi)i∈I, \mathbcaly=(yj)j∈J, respectively. Then the matrices of the canonical semilinear maps
Φ1 and Φ2 with respect to bases \mathbcalx, \mathbcaly∗ and \mathbcaly, \mathbcalx∗, where \mathbcalx∗ and \mathbcaly∗ are dual bases of \mathbcalx and \mathbcaly, respectively, are given by :
Further, since the rank of a matrix is unaltered by taking the transpose and by conjugation, both $\Phi_1$ and $\Phi_2$ have the same rank. This common rank of the maps $\Phi_1$, $\Phi_2$ is called the rank of the sesquilinear function $\Phi$ and is denoted by $\operatorname{rank}\Phi$. Therefore, $\operatorname{rank}\Phi$ is the rank of the Gram's matrix of $\Phi$ with respect to arbitrary bases of $V$ and $W$.
The cases in which $\Phi_1$ and $\Phi_2$ are both injective or both bijective are important:
2.7 Definition
Let $V$ and $W$ be $K$-vector spaces and let $\Phi:V\times W\longrightarrow K$ be a sesquilinear function. We say that
(1) $\Phi$ is non-degenerate if $\Phi_1$ and $\Phi_2$ are both injective.
(2) $\Phi$ defines a complete duality (between $V$ and $W$) if $\Phi_1$ and $\Phi_2$ are both bijective.
2.8 Example
(Trace form)
Let $V$ be a finite dimensional $K$-vector space. The map $\mathrm{End}_KV\times\mathrm{End}_KV\to K$, $(f,g)\longmapsto\mathrm{Tr}(fg)$, is a symmetric bilinear form on the $K$-vector space $\mathrm{End}_KV$ of $K$-endomorphisms of $V$ and is called the trace form on $\mathrm{End}_KV$.
2.8.1
Let $V$ be a finite dimensional $K$-vector space. Then the trace form defines a complete duality on $\mathrm{End}_KV$.
Let $A$ be a finite (dimensional) $K$-algebra. For an element $x\in A$, let $\lambda_x:A\to A$ denote the left multiplication map on $A$ by $x$. Then the map $A\times A\to K$,
$(x,y)\mapsto\mathrm{Tr}^A_K(xy)=\mathrm{Tr}(\lambda_x\lambda_y)$,
defines a symmetric bilinear form on $A$ and is called the trace form of the $K$-algebra $A$.
The trace form on $A$ reflects many important properties of the $K$-algebra $A$, see Section 4.
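As an illustration of the trace form of a $K$-algebra, the following sketch computes its Gram's matrix for $A=\mathbb{Q}[X]/\langle f\rangle$ with respect to the basis $1,X,\ldots,X^{n-1}$, where $f$ is a monic polynomial of degree $n$. It assumes $K=\mathbb{Q}$ with integer coefficients and uses the fact that $\lambda_X$ is represented by the companion matrix of $f$; the function names are hypothetical and chosen only for this example.

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def companion(coeffs):
    """Matrix of multiplication by X on K[X]/<f> in the basis 1, X, ..., X^(n-1),
    for f = X^n + c[n-1] X^(n-1) + ... + c[0] and coeffs = [c[0], ..., c[n-1]]."""
    n = len(coeffs)
    M = [[0] * n for _ in range(n)]
    for i in range(1, n):
        M[i][i - 1] = 1            # X * X^(i-1) = X^i
    for i in range(n):
        M[i][n - 1] = -coeffs[i]   # X * X^(n-1) = -(c[0] + ... + c[n-1] X^(n-1))
    return M


def trace_form_gram(coeffs):
    """Gram's matrix (Tr(X^i * X^j)) of the trace form of K[X]/<f>."""
    n = len(coeffs)
    M = companion(coeffs)
    powers = [identity(n)]
    for _ in range(2 * n - 2):
        powers.append(matmul(powers[-1], M))
    tr = [sum(P[i][i] for i in range(n)) for P in powers]  # tr[k] = Tr(lambda_X^k)
    return [[tr[i + j] for j in range(n)] for i in range(n)]


print(trace_form_gram([1, 0]))      # f = X^2 + 1 -> [[2, 0], [0, -2]]
print(trace_form_gram([0, -1, 0]))  # f = X^3 - X -> [[3, 0, 2], [0, 2, 0], [2, 0, 2]]
```

For $f=X^2+1$, which has no real zero, the Gram's matrix is $\mathrm{Diag}(2,-2)$; for $f=X^3-X$, which has the three real zeros $0,\pm1$, the leading principal minors of the Gram's matrix are $3$, $6$, $4$.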
Note that if $A=\mathrm{End}_KV$, then the trace form of the $K$-algebra $\mathrm{End}_KV$ is different from the trace form on the $K$-vector space $\mathrm{End}_KV$ introduced above. Indeed, for every endomorphism $f\in\mathrm{End}_KV$:
$$\mathrm{Tr}^{\mathrm{End}_KV}_K f=n\cdot\mathrm{Tr}f,\qquad n:=\mathrm{Dim}_KV.$$
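The relation between the two traces can be verified directly in coordinates. The sketch below assumes $V=K^n$ and identifies $\mathrm{End}_KV$ with $\mathrm{M}_n(K)$, using the basis of matrix units $E_{ij}$; the helper names are illustrative. The trace of the left multiplication $\lambda_f$ is the sum of the coefficients of $E_{ij}$ in $fE_{ij}$.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matrix_unit(n, i, j):
    """The n x n matrix E_ij with a single entry 1 in position (i, j)."""
    return [[int((r, s) == (i, j)) for s in range(n)] for r in range(n)]


def trace_of_left_multiplication(f):
    """Trace of g -> f*g on M_n(K), computed in the basis of matrix units."""
    n = len(f)
    total = 0
    for i in range(n):
        for j in range(n):
            product = matmul(f, matrix_unit(n, i, j))  # lambda_f(E_ij)
            total += product[i][j]                     # coefficient of E_ij
    return total


f = [[1, 2], [3, 4]]
print(trace_of_left_multiplication(f))  # 10 = 2 * (1 + 4)
```

Since the coefficient of $E_{ij}$ in $fE_{ij}$ is the diagonal entry $f_{ii}$, summing over all $i,j$ gives $n\cdot\mathrm{Tr}f$.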
2.9 Example
(Gradient of a linear form)
For a finite dimensional $K$-vector space $V$ the natural duality between $V$ and $V^{*}$ (see Example 2.5 (2)) is a complete duality, since its Gram's matrix with respect to dual bases is the unit matrix. Further, in this case $\Phi_1:V\to(V^{*})^{*}$ is the canonical evaluation map $x\mapsto E_x$, $E_x(f):=f(x)$, and $\Phi_2:V^{*}\to V^{*}$ is the identity map $\mathrm{id}_{V^{*}}$.
In particular, for every linear form $\varphi:V^{*}\to K$, there exists a unique vector $x\in V$ (depending on $\varphi$) such that $\varphi(f)=f(x)$ for every $f\in V^{*}$, i.e. $\varphi$ is the evaluation of the linear forms in $V^{*}$ at the vector $x\in V$.
If $V$ is not finite dimensional, then the natural duality is non-degenerate but never complete.
The non-degeneracy follows from the fact that every $x\in V$, $x\neq0$ (also in the infinite dimensional case), can be extended to a basis of $V$, and hence there exists a linear form $f\in V^{*}$ with $\langle x,f\rangle=f(x)\neq0$.
More generally, if $\Phi:V\times W\longrightarrow K$ defines a complete duality, then one can use $\Phi_1$ and $\Phi_2$ to identify the $K$-vector spaces $V$ and $W^{*}$, resp. $W$ and $V^{*}$. Therefore, for every linear form $f\in W^{*}$, there exists a unique vector $x_f\in V$ with $f=\Phi(x_f,-)$, and for every linear form $\varphi\in V^{*}$, there exists a unique vector $y_\varphi\in W$ with $\varphi=\Phi(-,y_\varphi)$. The vectors $x_f$ resp. $y_\varphi$ are called the gradients of $f$ resp. $\varphi$ (with respect to $\Phi$) and are denoted by $\operatorname{grad}f$ resp. $\operatorname{grad}\varphi$. Therefore the linear forms on $W$ resp. $V$ correspond to their respective gradients, i.e. $f=\Phi_1(\operatorname{grad}f)$ and $\varphi=\Phi_2(\operatorname{grad}\varphi)$.
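In coordinates, finding a gradient amounts to solving a linear system. The following sketch assumes the bilinear case (conjugation $\mathrm{id}_K$) with $K=\mathbb{Q}$, a form $\Phi$ on $K^n\times K^n$ given by an invertible Gram's matrix `G`, and a linear form $f$ on $W$ given by its values $f(y_j)$ on the basis; the function names `solve` and `gradient` are hypothetical. The coordinates $a$ of $\operatorname{grad}f$ satisfy $\sum_i a_iG_{ij}=f(y_j)$, i.e. ${}^{t}G\,a=(f(y_j))_j$.

```python
from fractions import Fraction


def solve(M, b):
    """Solve M a = b over Q by Gauss-Jordan elimination; M is assumed invertible."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def gradient(G, f_values):
    """Coordinates of grad f, where Phi has Gram's matrix G and f(y_j) = f_values[j]."""
    G_transposed = [list(col) for col in zip(*G)]
    return solve(G_transposed, f_values)


G = [[1, 2], [0, 1]]
a = gradient(G, [3, 5])
print(a)  # [Fraction(3, 1), Fraction(-1, 1)]
```

Here $\Phi(\operatorname{grad}f,y_1)=3\cdot1+(-1)\cdot0=3$ and $\Phi(\operatorname{grad}f,y_2)=3\cdot2+(-1)\cdot1=5$, as required.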
2.10 Orthogonality, perpendicular relation and Hermitian forms
The concept of orthogonality has its origin in Euclidean geometry.
Let $V$ and $W$ be $K$-vector spaces and let $\Phi:V\times W\longrightarrow K$ be a sesquilinear function.
(1) The vectors $x\in V$ and $y\in W$ are called orthogonal or perpendicular to each other with respect to $\Phi$ if $\Phi(x,y)=0$. In this case we write $x\perp_\Phi y$ or simply $x\perp y$ (if $\Phi$ is fixed).
(2) Two subsets $M\subseteq V$ and $N\subseteq W$ are called orthogonal if $x\perp y$ for all $x\in M$ and for all $y\in N$. In this case we write $M\perp N$. Further, we put
$$M^{\perp}:=\{y\in W\mid M\perp\{y\}\}\qquad\text{and}\qquad{}^{\perp}N:=\{x\in V\mid\{x\}\perp N\}.$$
Obviously, $M^{\perp}$ and ${}^{\perp}N$ are $K$-subspaces of $W$ and $V$, resp.
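For example, if $f\in W^{*}$ is a linear form with a gradient (see Example 2.9) $\operatorname{grad}f\in V$, i.e. $f(y)=\Phi(\operatorname{grad}f,y)$ for all $y\in W$, then $\operatorname{Ker}f=\{\operatorname{grad}f\}^{\perp}$. Analogously, if $\varphi\in V^{*}$ is a linear form with a gradient $\operatorname{grad}\varphi\in W$, then $\operatorname{Ker}\varphi={}^{\perp}\{\operatorname{grad}\varphi\}$.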
Note that if $\Phi:V\times V\longrightarrow K$ is a sesquilinear form on $V$, then for a subset $M\subseteq V$, the subsets $M^{\perp}=\{y\in V\mid M\perp\{y\}\}$ and ${}^{\perp}M=\{x\in V\mid\{x\}\perp M\}$ are not equal in general, since the relation of perpendicularity is not symmetric.
To remove this difference, one considers the symmetric (resp. Hermitian, skew-Hermitian) forms if the conjugation of $K$ (see 2.1) is $\mathrm{id}_K$ (resp. $\neq\mathrm{id}_K$):
2.10.1 Definition Let $V$ be a finite dimensional $K$-vector space and let $\Phi:V\times V\longrightarrow K$ be a sesquilinear form on $V$. We say that $\Phi$ is Hermitian (resp. skew-Hermitian) if $\Phi(x,y)=\overline{\Phi(y,x)}$ (resp. $\Phi(x,y)=-\overline{\Phi(y,x)}$) for all $x,y\in V$.
With the notation and assumptions in 2.1, we note that the terms “Hermitian” and “skew-Hermitian” mean “symmetric” and “skew-symmetric” if the conjugation of $K$ is $\mathrm{id}_K$. If the conjugation of $K$ is $\neq\mathrm{id}_K$, then sometimes we use the terms “pure-Hermitian” and “pure-skew-Hermitian” forms on $V$. In the case $K=\mathbb{C}$ with the usual complex-conjugation, the Hermitian (resp. skew-Hermitian) forms are simply called complex-Hermitian (resp. complex-skew-Hermitian).
2.10.2 If $\Phi_1:V\longrightarrow V^{*}$ and $\Phi_2:V\longrightarrow V^{*}$ are the canonical semilinear maps associated to the sesquilinear form $\Phi$ on $V$, see 2.6, then $\Phi$ is Hermitian (resp. skew-Hermitian) if and only if $\Phi_1=\Phi_2$ (resp. $\Phi_1=-\Phi_2$).
Further, for a Hermitian (resp. skew-Hermitian) form $\Phi$ on $V$, the relation $\perp$ on $V$ is symmetric. In this case the subspace ${}^{\perp}V=V^{\perp}=\operatorname{Ker}\Phi_1=\operatorname{Ker}\Phi_2$ of $V$ is called the degeneration space or the radical of $\Phi$ and is also denoted by $\operatorname{Rad}(V,\Phi)=\operatorname{Rad}\Phi$.
2.10.3
A sesquilinear form $\Phi:V\times V\to K$ on a finite dimensional $K$-vector space $V$ is Hermitian (resp. skew-Hermitian) if and only if the Gram's matrix $G_\Phi(\mathbf{x})=(\Phi(x_i,x_j))\in\mathrm{M}_I(K)$ of $\Phi$ with respect to every basis $\mathbf{x}=\{x_i\mid i\in I\}$ of $V$ is Hermitian (resp. skew-Hermitian).
Recall that a square matrix $A\in\mathrm{M}_I(K)$, $I$ finite indexed set, is Hermitian (resp. skew-Hermitian) if $A={}^{t}\overline{A}$ (resp. $A=-{}^{t}\overline{A}$).
A matrix $A\in\mathrm{M}_I(K)$ is symmetric (resp. skew-symmetric) if $A={}^{t}A$ (resp. $A=-{}^{t}A$).
2.10.4 Let $V$ be a finite dimensional $K$-vector space and let $\mathbf{x}=\{x_i\mid i\in I\}$ be a basis of $V$ (recall that $K'$ is the fixed field of the conjugation of $K$, see 2.1).
The map $\Phi\longmapsto G_\Phi(\mathbf{x})$ is a $K'$-linear isomorphism of the $K'$-vector space of Hermitian (resp. skew-Hermitian) forms on $V$ onto the $K'$-vector space of Hermitian (resp. skew-Hermitian) matrices in $\mathrm{M}_I(K)$.
Moreover, if $\mathbf{x}'=\{x_i'\mid i\in I\}$ is another basis of $V$ with transition matrix $A=(a_{ij})\in\mathrm{GL}_I(K)$, i.e. $x_j=\sum_{i\in I}a_{ij}x_i'$, then the Gram's matrices $G_\Phi(\mathbf{x})$ and $G_\Phi(\mathbf{x}')$ are related by the rule:
$$G_\Phi(\mathbf{x})={}^{t}A\,G_\Phi(\mathbf{x}')\,\overline{A}\qquad\text{or}\qquad G_\Phi(\mathbf{x}')={}^{t}A^{-1}\,G_\Phi(\mathbf{x})\,\overline{A}^{\,-1}.$$
2.10.5 In important cases a sesquilinear form on a $K$-vector space $V$ is completely determined by its values on the diagonal $\Delta_V=\{(x,x)\mid x\in V\}$. More precisely, we have:
2.10.5a Polarisation identity
Let $V$ be a $K$-vector space. Then:
(1) If the conjugation (see 2.1) of $K$ is $\neq\mathrm{id}_K$, then for every sesquilinear form $\Phi:V\times V\to K$, for all $x,y\in V$ and $a\in K$ with $a\neq\overline{a}$, (using Cramer's rule) we have:
$$\Phi(x,y)=\frac{1}{a-\overline{a}}\Bigl(\Phi(ax+y,ax+y)-\overline{a}\,\Phi(x+y,x+y)-\overline{a}(a-1)\,\Phi(x,x)-(1-\overline{a})\,\Phi(y,y)\Bigr).$$
2.10.5b Corollary
If the conjugation of $K$ is $\neq\mathrm{id}_K$ (see 2.1) and $V$ is a $K$-vector space, then a sesquilinear form $\Phi:V\times V\to K$ is Hermitian (resp. skew-Hermitian) if and only if $\Phi(x,x)=\overline{\Phi(x,x)}$ (resp. $\Phi(x,x)=-\overline{\Phi(x,x)}$) for all $x\in V$; in the Hermitian case this means $\Phi(x,x)\in K'$ (the fixed field of $K$ with respect to the conjugation of $K$, see 2.1). In particular, a complex-sesquilinear form is complex-Hermitian (resp. complex-skew-Hermitian) if and only if the values $\Phi(x,x)$, $x\in V$, are all real (resp. purely imaginary).
2.10.5c Corollary
Let $V$ be a vector space over a field $K$ of characteristic $\neq2$. Then a symmetric bilinear form $\Phi:V\times V\to K$ on $V$ is the zero form if and only if $\Phi(x,x)=0$ for all $x\in V$.
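The polarisation identity in 2.10.5a (1) can be checked numerically. The sketch below assumes $K=\mathbb{C}$ with complex conjugation and an arbitrary sesquilinear form on $\mathbb{C}^2$ given by a matrix `C`; the matrix, the vectors and the scalar $a=\mathrm{i}$ are illustrative choices, and the function names are hypothetical. It recovers $\Phi(x,y)$ from values of $\Phi$ on the diagonal only.

```python
def phi(C, u, v):
    """Sesquilinear form with matrix C: linear in u, conjugate-linear in v."""
    return sum(u[r] * C[r][s] * v[s].conjugate()
               for r in range(len(u)) for s in range(len(v)))


def polarised(C, x, y, a):
    """Right-hand side of the polarisation identity; uses only values phi(z, z).
    Requires a != conjugate(a)."""
    ax_plus_y = [a * xi + yi for xi, yi in zip(x, y)]
    x_plus_y = [xi + yi for xi, yi in zip(x, y)]
    a_bar = a.conjugate()
    numerator = (phi(C, ax_plus_y, ax_plus_y)
                 - a_bar * phi(C, x_plus_y, x_plus_y)
                 - a_bar * (a - 1) * phi(C, x, x)
                 - (1 - a_bar) * phi(C, y, y))
    return numerator / (a - a_bar)


C = [[1, 2 + 1j], [3j, -1]]   # not assumed Hermitian
x = [1 + 1j, 2]
y = [3, -1j]
a = 1j                        # any scalar with a != conjugate(a) works

print(abs(polarised(C, x, y, a) - phi(C, x, y)) < 1e-12)  # True
```

The identity follows by expanding $\Phi(ax+y,ax+y)$ and $\overline{a}\,\Phi(x+y,x+y)$ and subtracting: the terms in $\Phi(y,x)$ cancel and the coefficient of $\Phi(x,y)$ is $a-\overline{a}$.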
2.11 Orthogonal direct sums
Let $V_i$, $i\in I$, and $W_i$, $i\in I$, be two families of $K$-vector spaces and let $\Phi_i:V_i\times W_i\to K$, $i\in I$, be a family of sesquilinear functions. Then the map
$$\Phi:\Bigl(\bigoplus_{i\in I}V_i\Bigr)\times\Bigl(\bigoplus_{i\in I}W_i\Bigr)\longrightarrow K,\qquad\bigl((v_i)_{i\in I},(w_i)_{i\in I}\bigr)\longmapsto\sum_{i\in I}\Phi_i(v_i,w_i),$$
is a sesquilinear function and its restrictions are $\Phi|V_i\times W_i=\Phi_i$ for all $i\in I$, where $V_i$ (resp. $W_i$) is considered canonically as a subspace of $\bigoplus_{i\in I}V_i$ (resp. $\bigoplus_{i\in I}W_i$). Further, $V_i\perp W_j$ with respect to $\Phi$ for all $i,j\in I$, $i\neq j$. This sesquilinear function $\Phi$ is called the orthogonal direct sum of the family $\Phi_i$, $i\in I$, and is denoted by $⦹_{i\in I}\Phi_i$.
Conversely, let $\Phi:V\times W\to K$ be a sesquilinear function and let $V$ (resp. $W$) be a direct sum of the $K$-subspaces $V_i$, $i\in I$ (resp. $W_i$, $i\in I$) with $V_i\perp W_j$ for all $i,j\in I$, $i\neq j$, with respect to $\Phi$, so that $\Phi\bigl(\sum_{i\in I}v_i,\sum_{j\in I}w_j\bigr)=\sum_{i\in I}\Phi(v_i,w_i)$ for $v_i\in V_i$, $w_j\in W_j$. Then we say that $\Phi$ is the orthogonal direct sum of the $\Phi_i:=\Phi|V_i\times W_i$, $i\in I$. In particular, if $V=W$ and $V_i=W_i$, $i\in I$, then $V$ is the orthogonal direct sum of the subspaces $V_i$, $i\in I$, with respect to $\Phi$, which is denoted by $V=⦹_{i\in I}V_i$.
2.12 Orthogonal basis
Let $V$ be a $K$-vector space and let $\Phi:V\times V\to K$ be a sesquilinear form on $V$.
A family of vectors $x_i$, $i\in I$, in $V$ is called orthogonal with respect to $\Phi$ if $x_i\perp x_j$ for all $i,j\in I$ with $i\neq j$. Moreover, if $\Phi(x_i,x_i)=1$ for all $i\in I$, then it is called orthonormal with respect to $\Phi$.
If $x_i$, $i\in I$, is an orthogonal basis of $V$ with respect to $\Phi$, then $V$ is the orthogonal direct sum of the 1-dimensional subspaces $Kx_i$, $i\in I$. Moreover, if $I$ is finite, then the Gram's matrix of $\Phi$ with respect to the basis $x_i$, $i\in I$, is the diagonal matrix $\mathrm{Diag}(\Phi(x_i,x_i))_{i\in I}$.
The following Decomposition Theorem 2.13, which guarantees the existence of orthogonal bases (with respect to Hermitian forms), is the starting point for the classification of Hermitian forms:
2.13 Decomposition Theorem
Let $K$ be a field with notations and assumptions as in 2.1 and let $\Phi:V\times V\to K$ be a sesquilinear form on a finite dimensional $K$-vector space $V$. Then in each of the following cases:
(a) The conjugation of $K$ (see 2.1) is $\neq\mathrm{id}_K$ and $\Phi$ is Hermitian or skew-Hermitian,
(b) $\operatorname{Char}K\neq2$ and $\Phi$ is a symmetric bilinear form,
$V$ has an orthogonal basis $\mathbf{x}=\{x_1,\ldots,x_n\}$, $n=\mathrm{Dim}_KV$, with respect to $\Phi$. In other words: $V$ is the orthogonal direct sum $V=⦹_{i=1}^{n}Kx_i$ into 1-dimensional subspaces $Kx_i$, $i=1,\ldots,n$, with respect to $\Phi$, and $\Phi=⦹_{i=1}^{n}\Phi|Kx_i$ is the orthogonal direct sum of its restrictions $\Phi|Kx_i$, $i=1,\ldots,n$.
Matrix formulation: If either $G\in\mathrm{M}_I(K)$, $I$ finite indexed set, is a Hermitian or skew-Hermitian matrix, or if the conjugation is $=\mathrm{id}_K$, $\operatorname{Char}K\neq2$ and $G$ is a symmetric matrix, then there exists an invertible matrix $A\in\mathrm{GL}_I(K)$ such that ${}^{t}A\,G\,\overline{A}$ is a diagonal matrix.
Proof
Use induction on DimKV, 2.10.5a, 2.10.5b and 2.10.5c.
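The induction in the proof can be turned into an explicit procedure in the symmetric case (b). The following is a minimal sketch, assuming $K=\mathbb{Q}$ (so $\operatorname{Char}K\neq2$ and the conjugation is $\mathrm{id}_K$); it is one possible algorithmic reading of the argument, not the authors' formulation, and the function name is hypothetical. If a diagonal entry vanishes, a basis vector $x_k$ is replaced by $x_k\pm x_j$ with $\Phi(x_k,x_j)\neq0$, which makes $\Phi(x_k,x_k)\neq0$ (this is where the polarisation argument enters); then the remaining vectors are made orthogonal to $x_k$.

```python
from fractions import Fraction


def diagonalise_by_congruence(G):
    """For a symmetric matrix G over Q, return (A, D) with D = tA * G * A diagonal
    and A invertible; the columns of A form an orthogonal basis."""
    n = len(G)
    H = [[Fraction(v) for v in row] for row in G]
    A = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def combine(src, dst, c):
        """Basis change x_dst := x_dst + c * x_src, applied to H and recorded in A."""
        for row in H:                                          # column operation on H
            row[dst] += c * row[src]
        H[dst] = [d + c * s for d, s in zip(H[dst], H[src])]   # matching row operation
        for row in A:                                          # same column operation on A
            row[dst] += c * row[src]

    for k in range(n):
        if H[k][k] == 0:
            j = next((j for j in range(k + 1, n) if H[k][j] != 0), None)
            if j is None:
                continue  # x_k is already orthogonal to all remaining vectors
            # Phi(x_k + c x_j, x_k + c x_j) = 2c Phi(x_k, x_j) + Phi(x_j, x_j) for c = +-1;
            # both values cannot vanish because Char K != 2.
            c = 1 if 2 * H[k][j] + H[j][j] != 0 else -1
            combine(j, k, Fraction(c))
        for j in range(k + 1, n):
            if H[k][j] != 0:
                combine(k, j, -H[k][j] / H[k][k])
    return A, H


A, D = diagonalise_by_congruence([[0, 1], [1, 0]])
print(D)  # [[Fraction(2, 1), Fraction(0, 1)], [Fraction(0, 1), Fraction(-1, 2)]]
```

For the hyperbolic plane with Gram's matrix $\begin{pmatrix}0&1\\1&0\end{pmatrix}$ the sketch produces the orthogonal basis $x_1=e_1+e_2$, $x_2=\tfrac12(e_2-e_1)$ with $\Phi(x_1,x_1)=2$ and $\Phi(x_2,x_2)=-\tfrac12$.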
2.14 Automorphisms and Congruence
Let $\Phi:V\times V\to K$ and $\Psi:W\times W\to K$ be sesquilinear forms on the $K$-vector spaces $V$ and $W$, respectively. A map $f:V\to W$ is called a homomorphism of $(V,\Phi)$ into $(W,\Psi)$ if it is $K$-linear and is compatible with the forms $\Phi$ and $\Psi$, i.e. $\Phi(x,y)=\Psi(f(x),f(y))$ for all $x,y\in V$. A bijective homomorphism $f:(V,\Phi)\to(W,\Psi)$ is called an isomorphism of $(V,\Phi)$ onto $(W,\Psi)$.
A homomorphism $(V,\Phi)\to(V,\Phi)$ is called an endomorphism of $(V,\Phi)$ or of $\Phi$. The set of endomorphisms $\mathrm{End}_K(V,\Phi)$ (with composition) is a monoid. An isomorphism $(V,\Phi)\to(V,\Phi)$ is called an automorphism of $(V,\Phi)$ or of $\Phi$. The set of automorphisms $\mathrm{Aut}_K(V,\Phi)$ of $(V,\Phi)$ is the unit group of the monoid $\mathrm{End}_K(V,\Phi)$ and is called the automorphism group of $(V,\Phi)$ or of $\Phi$.
If there exists an isomorphism from $(V,\Phi)$ onto $(W,\Psi)$, then $(V,\Phi)$ and $(W,\Psi)$, or also the forms $\Phi$ and $\Psi$, are said to be congruent. If $f:(V,\Phi)\to(W,\Psi)$ is an isomorphism, then the map $\mathrm{Aut}\,\Phi\to\mathrm{Aut}\,\Psi$, $g\mapsto fgf^{-1}$, is an isomorphism of groups.
Two square matrices $C,C'\in\mathrm{M}_n(K)$ are said to be congruent if there exists an invertible matrix $A\in\mathrm{GL}_n(K)$ with $C={}^{t}A\,C'\,\overline{A}$.
On sesquilinear forms on finite dimensional $K$-vector spaces (resp. on square matrices over $K$) the relation of “being congruent” is an equivalence relation.
2.14.1
Let $V$, $W$ be finite dimensional $K$-vector spaces with $\mathrm{Dim}_KV=\mathrm{Dim}_KW$, let $\mathbf{x}=\{x_i\mid i\in I\}$, $\mathbf{y}=\{y_i\mid i\in I\}$ be bases of $V$, $W$ and let $\Phi$, $\Psi$ be sesquilinear forms on $V$, $W$, resp. Further, let $f:V\to W$ be a $K$-isomorphism of vector spaces, with matrix $M^{\mathbf{x}}_{\mathbf{y}}(f)$ with respect to the bases $\mathbf{x}$ and $\mathbf{y}$.
Then $f$ is an isomorphism $(V,\Phi)\to(W,\Psi)$ if and only if
$$G_\Phi(\mathbf{x})={}^{t}M^{\mathbf{x}}_{\mathbf{y}}(f)\,G_\Psi(\mathbf{y})\,\overline{M^{\mathbf{x}}_{\mathbf{y}}(f)}.$$
In particular, $\Phi$ and $\Psi$ are congruent if and only if there exists $A\in\mathrm{GL}_I(K)$ with $G_\Phi(\mathbf{x})={}^{t}A\,G_\Psi(\mathbf{y})\,\overline{A}$.
2.14.2 Corollary
Let $\Phi$ be a sesquilinear form on a finite dimensional $K$-vector space $V$ with basis $\mathbf{x}=\{x_i\mid i\in I\}$. Then an automorphism $f\in\mathrm{Aut}_KV$ is an automorphism of $(V,\Phi)$ if and only if $G_\Phi(\mathbf{x})={}^{t}M^{\mathbf{x}}_{\mathbf{x}}(f)\,G_\Phi(\mathbf{x})\,\overline{M^{\mathbf{x}}_{\mathbf{x}}(f)}$.
2.15 Classification problem for sesquilinear forms
The classification problem for the sesquilinear forms on finite dimensional $K$-vector spaces is to find a well arranged representative system for the equivalence classes of congruent sesquilinear forms.
For example, from the Decomposition Theorem 2.13 it follows immediately that:
2.15.1
Let $K$ be a field with notation and assumptions as in 2.1. Then every pure-Hermitian or pure-skew-Hermitian (resp. if $\operatorname{Char}K\neq2$, then every symmetric) matrix is congruent to a diagonal matrix $\mathrm{Diag}(c_i)_{i\in I}\in\mathrm{M}_I(K)$, $I$ finite indexed set.
Moreover, the form $K^I\times K^I\to K$ defined by $(e_i,e_j)\mapsto\delta_{ij}c_i$, where $e_i$, $i\in I$, is the standard basis of $K^I$, is also congruent to the form defined by $\bigl((a_i),(b_i)\bigr)\mapsto\sum_{i\in I}a_i\overline{b_i}c_i$.
The form defined in 2.15.1 by the diagonal matrix $\mathrm{Diag}(c_i)_{i\in I}\in\mathrm{M}_I(K)$ (and every other form which is congruent to this form) is denoted by $[c_i]_{i\in I}$, and for $I=\{1,\ldots,n\}$ also by $[c_1,\ldots,c_n]$.
The form $[c_i]_{i\in I}$ is the orthogonal direct sum of the forms $[c_i]:(a,b)\mapsto a\overline{b}c_i$, $i\in I$, on $K$, therefore:
$$[c_i]_{i\in I}=⦹_{i\in I}[c_i].$$
In general, it is difficult to classify the forms $[c_i]_{i\in I}$ up to congruence. Obviously, the form $[c_i]_{i\in I}$ is congruent to the form $[a_i\overline{a_i}c_i]_{i\in I}$, where $a_i\in K^{\times}$, $i\in I$, since this is the transition from the basis $e_i$, $i\in I$, to the basis $a_ie_i$, $i\in I$. Therefore, one can replace the elements $c_i$ (if non-zero) by their images in the residue class group $K^{\times}/\mathrm{N}(K^{\times})$ of $K^{\times}$ modulo the subgroup $\mathrm{N}(K^{\times}):=\{a\overline{a}\mid a\in K^{\times}\}$. Note that $\mathrm{N}(K^{\times})\subseteq K'^{\times}$, where $K'$ is the fixed field of the conjugation of $K$ (see 2.1), and if the conjugation is $\mathrm{id}_K$, then $\mathrm{N}(K^{\times})={}^{2}K^{\times}$ is the subgroup of the squares (quadratic units) in $K^{\times}$.
If the form $[c_i]_{i\in I}$ is Hermitian, then $c_i\in K'$ for every $i\in I$ and from the Decomposition Theorem 2.13, we have:
2.16 Theorem
Let $K$ be a field with notation and assumptions as in 2.1 and $\mathrm{N}(K^{\times})=K'^{\times}$, where $K'$ is the fixed field of the conjugation of $K$. Then every Hermitian form of rank $r$ on an $n$-dimensional $K$-vector space is congruent to the form $[1,\ldots,1,0,\ldots,0]$, where $1$ occurs $r$ times and $0$ occurs $n-r$ times. In particular, every non-degenerate Hermitian form on an $n$-dimensional $K$-vector space is congruent to the standard form $[1,\ldots,1]$ on $K^n$.
2.17 Corollary
Let $K$ be a field with $\operatorname{Char}K\neq2$ and $K={}^{2}K$, i.e. every element of $K$ is a square (e.g., if $K$ is algebraically closed, $K=\mathbb{C}$). Then all symmetric matrices in $\mathrm{M}_n(K)$ of equal rank are congruent. The diagonal matrices
$\mathrm{Diag}(0,\ldots,0)$, $\mathrm{Diag}(1,0,\ldots,0)$, $\ldots$, $\mathrm{Diag}(1,\ldots,1)=E_n$
form a complete representative system for the congruence classes of symmetric matrices over $K$.
3. Type and signature of Hermitian forms
In this section, we recall the classification of symmetric and Hermitian forms on finite dimensional vector spaces over a real closed field and its algebraic closure, up to congruence, see Definition 2.10.1.
Most of these results can be found in the graduate text books, [18, Ch. V, §12], [17, Ch. IX] or [9, Ch. 11], [1, Ch. 7], or [14, Ch. XV]. However, for the sake of completeness, we recall them without proofs.
3.1 Notation (See also 2.1)
Let $K$ be a real closed field (see Footnote 2 below). Then $\mathrm{Aut}\,K=\{\mathrm{id}_K\}$ and the field $\mathbb{C}_K:=K[\mathrm{i}]$, where $\mathrm{i}^2=-1$, of (complex) numbers over $K$ is the algebraic closure of $K$ with the Galois group $\mathrm{Gal}(\mathbb{C}_K|K)=\{\mathrm{id}_{\mathbb{C}_K},\kappa\}$, where $\kappa:\mathbb{C}_K\to\mathbb{C}_K$ is the (complex-)conjugation defined by $\mathrm{i}\mapsto-\mathrm{i}$, see 2.1 and Examples 2.2.
Further, we denote by $\mathbb{K}$ either the field $K$ or the field $\mathbb{C}_K$. The involution in the case $\mathbb{K}=K$ is $\mathrm{id}_K$ and in the case $\mathbb{K}=\mathbb{C}_K$ is the (complex-)conjugation $\kappa:\mathbb{C}_K\to\mathbb{C}_K$, which we will simply denote by the standard bar-notation, i.e. $a\mapsto\overline{a}$, $a\in\mathbb{C}_K$.
With these notation and assumptions, we note that the term “Hermitian” means “real-symmetric” in the case $\mathbb{K}=K$ and “complex-Hermitian” if $\mathbb{K}=\mathbb{C}_K$.
Footnote 2 (Real closed fields): A field $K$ is called real closed if it is real, i.e. if for all $a_1,\ldots,a_n\in K$, $a_1^2+\cdots+a_n^2=0$ implies $a_1=\cdots=a_n=0$, and if it has no nontrivial real algebraic extension $L|K$, $L\neq K$. For example, the field $\mathbb{R}$ of real numbers is real closed. The algebraic closure of $\mathbb{Q}$ in $\mathbb{R}$ is real closed. The field $\mathbb{Q}$ is real, but not real closed. In 1927 Artin-Schreier proved: A field $K$ is real if and only if there is an order $\leq$ on $K$ such that $(K,\leq)$ is an ordered field. In particular, the characteristic of a real field is $0$.
Theorem (Euler-Lagrange) Let $(K,\leq)$ be an ordered field satisfying the properties: (i) Every polynomial $f\in K[X]$ of odd degree has a zero in $K$. (ii) Every positive element in $K$ is a square in $K$. Then the field $\mathbb{C}_K=K(\mathrm{i})$ obtained from $K$ by adjoining a square root $\mathrm{i}$ of $-1$ is algebraically closed. In particular, $K$ itself is real closed. For a proof see [9, Ch. 11, §11.1].
Remark: Since the field $\mathbb{R}$ of real numbers is ordered and satisfies the properties (i) and (ii), the Euler-Lagrange theorem proves the Fundamental Theorem of Algebra: The field $\mathbb{C}=\mathbb{R}(\mathrm{i})$ of complex numbers is algebraically closed. The Euler-Lagrange Theorem has a remarkable complement:
Theorem (Artin-Schreier) Let $L$ be an algebraically closed field. If $K\subseteq L$ is a subfield of $L$ such that $L|K$ is finite and $K\neq L$, then $L=K(\mathrm{i})$ with $\mathrm{i}^2+1=0$ and $K$ is a real closed field. For a proof see [9, Ch. 11, §11.7].
3.2 Proposition
With the notation as in 3.1, let Φ:V×V⟶\mathdsK be a Hermitian form on the finite dimensional \mathdsK-vector space V. Then there exists an orthogonal basis \mathbcalx={x1,…,xn}, n:=Dim\mathdsKV, of V with respect to Φ such that the Gram’s matrix GΦ(\mathbcalx) is the diagonal matrix
\[ G_{\Phi}(\mathbcal{x}) \;=\; E_n^{p,q} \;:=\; \operatorname{Diag}(\underbrace{1,\ldots,1}_{p},\underbrace{-1,\ldots,-1}_{q},\underbrace{0,\ldots,0}_{n-p-q}). \]
Below in 3.4 we note that the numbers p and q are uniquely determined by Φ. Obviously, p+q=rankΦ. In particular, Φ is non-degenerate if and only if p+q=n. For the characterization of the invariants p and q the following concepts are useful :
3.3 Definition
Let Φ:V×V⟶\mathdsK be a Hermitian form on the finite dimensional \mathdsK-vector space V. Then Φ is called :
(1) positive definite if Φ(x,x)>0 for all x∈V, x≠0.
(2) negative definite if Φ(x,x)<0 for all x∈V, x≠0.
(3) positive semi - definite if Φ(x,x)≥0 for all x∈V.
(4) negative semi - definite if Φ(x,x)≤0 for all x∈V.
(5) indefinite if there are vectors x,y∈V with Φ(x,x)>0 and Φ(y,y)<0.
3.4
Sylvester’s Law of Inertia
Let Φ be a Hermitian form on the finite dimensional \mathdsK-vector space V and let \mathbcalx={x1,…,xn}, n:=Dim\mathdsKV, be an orthogonal basis of V with respect to Φ such that the Gram’s matrix GΦ(\mathbcalx) of Φ is the diagonal matrix
\[ G_{\Phi}(\mathbcal{x}) \;=\; E_n^{p,q} \;=\; \operatorname{Diag}(\underbrace{1,\ldots,1}_{p},\underbrace{-1,\ldots,-1}_{q},\underbrace{0,\ldots,0}_{n-p-q}). \]
Then p is the maximum of the dimensions of subspaces of V on which Φ is positive definite, and q is the maximum of the dimensions of subspaces of V on which Φ is negative definite. In particular, p and q do not depend on the special choice of the orthogonal basis x1,…,xn of V.
3.5 Definition
The pair (p,q) as in the Sylvester’s Law of Inertia 3.4 is called the type of the form Φ. The natural number p is called the (inertia -) index, the integer p−q is called the signature and the natural number q is called the Morse - index of Φ. We denote the rank, signature and type of a Hermitian form Φ by rankΦ, signΦ and typeΦ, resp.
The type of a Hermitian matrix C∈Mn(\mathdsK) is by definition the type of the form with C as the Gram’s matrix with respect to an (arbitrary) \mathdsK-basis of \mathdsKn. The matrix analog of Sylvester’s Law of Inertia 3.4 is the following :
3.6 Corollary
Let Φ be a Hermitian form on the n-dimensional \mathdsK-vector space V with \mathdsK-basis \mathbcalx={x1,…,xn}.
Then Φ is of type (p,q) if and only if the Gram’s matrix GΦ(\mathbcalx) is congruent to the matrix $E_n^{p,q}$, i. e. there exists an invertible matrix A∈GLn(\mathdsK) such that $G_{\Phi}(\mathbcal{x})={}^{t}\!A\,E_n^{p,q}\,\overline{A}$. Two Hermitian matrices C and C′∈Mn(\mathdsK) have the same type if and only if they are congruent. In particular, a Hermitian matrix C∈Mn(\mathdsK) has type (p,q) if and only if C is congruent to the matrix $E_n^{p,q}$.
If \mathdsK=K (real closed), then one can choose A∈GLn+(K), i. e. DetA>0. (Footnote 3 : Use the following observation : Let V be an oriented vector space over a real closed field K of dimension n∈\mathdsN∗ and Φ be a Hermitian form of type (p,q) on V. Then there exists a basis x1,…,xn of V representing the orientation of V such that the Gram’s matrix of Φ is equal to the matrix $E_n^{p,q}$.)
In the situation of Corollary 3.6, if Φ is non-degenerate, i. e. if p+q=n, then $\operatorname{Det}G_{\Phi}(\mathbcal{x})=(-1)^{q}\,|\operatorname{Det}A|^{2}$, i. e. $\operatorname{Sign}(\operatorname{Det}G_{\Phi}(\mathbcal{x}))=(-1)^{q}$. Therefore, the sign of the Gram’s determinant DetGΦ(\mathbcalx) determines the parity of q. From this the following useful criterion for the determination of the type follows :
3.7
Hurwitz’s Criterion
Let Φ be a Hermitian form on the n-dimensional \mathdsK-vector space V with basis \mathbcalx={x1,…,xn}. Suppose that the principal minors
\[ D_i \;:=\; \operatorname{Det}\bigl(\Phi(x_r,x_s)\bigr)_{1\le r,s\le i}, \qquad i=1,\ldots,n, \]
of the Gram’s matrix GΦ(\mathbcalx)=(Φ(xi,xj))∈Mn(\mathdsK) of Φ with respect to the basis \mathbcalx are all ≠0. Then the type of Φ is (n−q,q), where q is the number of sign changes in the sequence 1=D0,D1,…,Dn=DetGΦ(\mathbcalx).
(Footnote 4 : Recall that a sequence a0,…,an of non-zero real numbers changes sign at the i-th place if 0≤i<n and $a_ia_{i+1}<0$. For an arbitrary sequence of real numbers b0,…,bm, a change of sign means a change of sign in the sequence obtained by removing the zeros from the original sequence.)
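The criterion can be tried out on a small example. The following is a minimal sketch, assuming K=\mathdsQ⊆\mathdsR and exact rational arithmetic; the names `det`, `hurwitz_type` and the sample matrix `example` are illustrative and not taken from the paper.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if not m:
        return Fraction(1)  # empty determinant, used for D_0 = 1
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def hurwitz_type(gram):
    """Type (n - q, q), where q counts the sign changes in D_0, D_1, ..., D_n."""
    n = len(gram)
    g = [[Fraction(x) for x in row] for row in gram]
    minors = [det([row[:i] for row in g[:i]]) for i in range(n + 1)]
    if any(d == 0 for d in minors):
        raise ValueError("Hurwitz's criterion needs all principal minors non-zero")
    q = sum(1 for a, b in zip(minors, minors[1:]) if a * b < 0)
    return (n - q, q)

# Sample symmetric matrix (an assumption for illustration); its minors are 2, -3, -9.
example = [[2, 1, 0], [1, -1, 0], [0, 0, 3]]
print(hurwitz_type(example))  # (2, 1): one sign change in 1, 2, -3, -9
```

The sketch raises an error when a principal minor vanishes, since the criterion as stated does not cover that case.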
3.8 Corollary
Let Φ be a Hermitian form on the n-dimensional \mathdsK-vector space V with basis \mathbcalx={x1,…,xn} and let
\[ D_0:=1, \qquad D_i \;:=\; \operatorname{Det}\bigl(\Phi(x_r,x_s)\bigr)_{1\le r,s\le i}, \qquad i=1,\ldots,n, \]
be the principal minors of the Gram’s matrix GΦ(\mathbcalx). Then :
(1) Φ is positive definite if and only if Di>0 for all i=1,…,n.
(2) Φ is negative definite if and only if $(-1)^{i}D_i>0$ for all i=1,…,n, i. e. at every position in the sequence D0,D1,…,Dn there is a sign change.
3.9 Example
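Let {v1,v2} be a basis of the 2-dimensional K-vector space V and let Φ=⟨−,−⟩ be a symmetric bilinear form on V. Let D1=⟨v1,v1⟩ and
\[ D_2=\operatorname{Det}\begin{pmatrix}\langle v_1,v_1\rangle & \langle v_1,v_2\rangle\\ \langle v_2,v_1\rangle & \langle v_2,v_2\rangle\end{pmatrix}=\langle v_1,v_1\rangle\langle v_2,v_2\rangle-|\langle v_1,v_2\rangle|^{2}. \]
Then the following table shows the dependence of the type of Φ on signD1 and signD2 in the non-degenerate case D2≠0 (by Hurwitz’s Criterion 3.7 and Sylvester’s Law of Inertia 3.4; note that D2>0 forces D1≠0) :
signD1 | signD2 | type of Φ
+ | + | (2,0)
− | + | (0,2)
arbitrary | − | (1,1)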
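3.10 Example
Let z∈\mathdsCK∖K, $\pi:=(X-z)(X-\overline{z})\in K[X]$, A:=K[X]/⟨π⟩=K[x], where x is the image of X modulo ⟨π⟩. Further, let H∈K[X], H∉⟨π⟩, h=h(x)∈A be the image of H in A and let Φh:A×A⟶K be the symmetric bilinear form defined by Φh(f,g)=TrKA(hfg), f, g∈A. Then the Gram’s matrix of Φh with respect to the basis {1,x}
\[ G_{\Phi_h}(1,x)=\begin{pmatrix} h(z)+h(\overline{z}) & z\,h(z)+\overline{z}\,h(\overline{z})\\ z\,h(z)+\overline{z}\,h(\overline{z}) & z^{2}h(z)+\overline{z}^{\,2}h(\overline{z})\end{pmatrix} \]
is a symmetric matrix with $D_1=h(z)+h(\overline{z})=2\operatorname{Re}h(z)$ and $D_2=\operatorname{Det}G_{\Phi_h}(1,x)=h(z)\,h(\overline{z})\,(z-\overline{z})^{2}=-4\,|h(z)|^{2}(\operatorname{Im}z)^{2}<0$, since h(z)≠0 and Imz≠0. Therefore, by the table in Example 3.9, the type of Φh is (1,1).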
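The computation of D1 and D2 above can be checked numerically. The following is a minimal sketch, assuming K=\mathdsR, the sample point z=1+2i and the sample polynomial H=X+3 (both chosen only for illustration); the helper names are hypothetical.

```python
def type_from_minors(d1, d2):
    """Type of a non-degenerate symmetric form on a 2-dimensional space (needs D2 != 0)."""
    if d2 == 0:
        raise ValueError("degenerate form: D2 = 0")
    if d2 < 0:
        return (1, 1)
    return (2, 0) if d1 > 0 else (0, 2)

z = complex(1, 2)  # assumed sample point in C \ R

def h(t):
    """Assumed sample polynomial H = X + 3."""
    return t + 3

def trace(w):
    """Tr(f) = f(z) + f(conj z) = 2 Re f(z) on A = R[X]/<(X - z)(X - conj z)>."""
    return 2 * w.real

# Gram matrix of Phi_h in the basis {1, x}: entries Tr(h x^(k+l)).
gram = [
    [trace(h(z)), trace(z * h(z))],
    [trace(z * h(z)), trace(z * z * h(z))],
]
d1 = gram[0][0]
d2 = gram[0][0] * gram[1][1] - gram[0][1] ** 2
print(d1, d2, type_from_minors(d1, d2))
```

For this choice, D2 agrees with −4|h(z)|²(Imz)² and the type is (1,1), independently of the sign of D1.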
The type of a Hermitian form on a finite dimensional vector space V over \mathdsCK can also be determined by using the eigenvalues of the Gram’s matrix, see Theorem 3.12 below. The usual proofs of this fact in standard textbooks use the Principal Axis Theorem for self-adjoint operators (also known as the Spectral Theorem). We give here a direct proof using the following interesting Lemma 3.11 :
3.11 Lemma
Let K be a real closed field with notation as in 3.1, let V be an n-dimensional \mathdsCK-vector space with a positive definite complex-Hermitian form Φ on V and let f:V→V be a \mathdsCK-linear map. Then there exists an orthonormal basis \mathbcalx=(x1,…,xn) of V
such that the matrix M\mathbcalx\mathbcalx(f) of f with respect to \mathbcalx is an upper triangular matrix.
3.12 Theorem
Let K be a real closed field with notation as in 3.1 and let C∈Mn(\mathdsCK) be a Hermitian matrix. Then all the eigenvalues of C are in K and C is of type (p,q), where p is the number of positive eigenvalues and q is the number of negative eigenvalues of C counted with their multiplicities in the characteristic polynomial χC of C.
3.13 Remark
The proof of the above Theorem 3.12 also shows that every Hermitian matrix in Mn(\mathdsCK) is diagonalizable (even with respect to an orthonormal basis of \mathdsCKn).
3.14 Corollary
Let K be a real closed field with notation as in 3.1 and let C∈Mn(\mathdsCK) be a Hermitian matrix. Then the characteristic polynomial $\chi_C=c_0+c_1X+\cdots+c_{n-1}X^{n-1}+X^{n}$ belongs to K[X] and C is of type (p,q), where p is the number of sign changes in the sequence c0,c1,…,cn−1,cn:=1 and q is the number of sign changes in the sequence $c_0,-c_1,\ldots,(-1)^{n-1}c_{n-1},(-1)^{n}c_n=(-1)^{n}$. If c0=c1=⋯=cr−1=0 and cr≠0, then p+q=n−r.
Proof
Note that, since all the eigenvalues of C belong to K by Theorem 3.12, indeed χC∈K[X]. The assertion is immediate from Theorem 3.12 and the following classical theorem of Descartes :
3.15 Theorem
(Descartes’ Rule of Signs)*
Let K be a real closed field with notation as in 3.1 and let
$f=a_0+a_1X+\cdots+a_{n-1}X^{n-1}+a_nX^{n}\in K[X]$, an≠0, be a polynomial of degree n. Further, let V+ (resp. V−) denote the number of sign changes in the sequence a0,a1,…,an−1,an (resp. in the sequence $a_0,-a_1,\ldots,(-1)^{n-1}a_{n-1},(-1)^{n}a_n$) and let N+ (resp. N−) denote the number of positive (resp. negative) zeros of f (each zero of f is counted with its multiplicity). Then there exist natural numbers r+ and r−∈\mathdsN such that
N+=V+−2r+ and N−=V−−2r−. Moreover, if all zeros of f belong to K, i. e. if f splits into linear factors in K[X], then N+=V+ and N−=V−.
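Corollary 3.14 together with Descartes’ Rule of Signs gives a way to read off the type of a Hermitian matrix from its characteristic polynomial alone. The following is a minimal sketch for a real symmetric matrix with rational entries; the Faddeev-LeVerrier recursion used to compute the characteristic polynomial is a choice made here for illustration and is not taken from the paper, and all function names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def char_poly(c):
    """Coefficients [c_0, ..., c_n] of Det(X*E_n - C) via the Faddeev-LeVerrier recursion."""
    n = len(c)
    a = [[Fraction(x) for x in row] for row in c]
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        am = matmul(a, m)
        m = [[am[i][j] + (coeffs[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        am = matmul(a, m)
        coeffs[n - k] = -sum(am[i][i] for i in range(n)) / k
    return coeffs

def sign_changes(seq):
    """Number of sign changes after removing the zeros from the sequence."""
    nz = [x for x in seq if x != 0]
    return sum(1 for a, b in zip(nz, nz[1:]) if a * b < 0)

def type_by_descartes(c):
    """Type (p, q) of a symmetric matrix from the coefficients of its characteristic polynomial."""
    coeffs = char_poly(c)
    p = sign_changes(coeffs)
    q = sign_changes([(-1) ** i * x for i, x in enumerate(coeffs)])
    return (p, q)

# Sample symmetric matrix (an assumption for illustration): chi_C = X^3 - 4X^2 + 9.
example = [[2, 1, 0], [1, -1, 0], [0, 0, 3]]
print(char_poly(example), type_by_descartes(example))
```

The count is only valid because all zeros of the characteristic polynomial of a Hermitian matrix lie in K (Theorem 3.12); for a general polynomial the sign changes give only upper bounds.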
We now recall (from [3]) that “being of type (p,q)” is an open property with respect to the strong topology (see Footnote 5 below), which is an easy consequence of Hurwitz’s Criterion 3.7, see Lemma 3.16.
Footnote 5 (Strong topology). Let K be a real closed field (see Footnote 2). Then K is equipped with the order topology, which is determined by the base of the open intervals ]a,b[, a,b∈K, a<b. The K-vector spaces Kn, n∈\mathdsN, are endowed with the product topology (with the base given by the open cuboids ]a1,b1[×⋯×]an,bn[, ai<bi, i=1,…,n). With the order and product topologies, the addition, the multiplication and the inverse are continuous functions on K×K and K×=K∖{0}, respectively.
Further, polynomial functions (resp. rational functions F/G, F, G∈K[X1,…,Xn], G≠0) in n variables are continuous K-valued functions on Kn (resp. on Kn∖VK(G), where VK(G):={a∈Kn∣G(a)=0} is the zero set of the denominator G, which is closed in Kn).
The product topology on Kn transfers uniquely to every n-dimensional K-vector space V by a K-linear isomorphism f:V→Kn. Any other isomorphism g:V→Kn defines the same topology, since gf⁻¹:Kn→Kn and (gf⁻¹)⁻¹=fg⁻¹:Kn→Kn are continuous (polynomial) maps. Therefore, polynomial and rational functions are also defined on any finite dimensional vector space V by an isomorphism f:V→Kn.
This topology on V may be characterized as the smallest topology for which the K-linear functions V→K are continuous and is called the strong topology on V, since it is stronger than the Zariski topology on V if V≠0.
3.16 Lemma
(cf. [3, Lemma 1.2])
Let K be a real closed field with notation as in 3.1 and let Fij∈K[T] be polynomials such that Fij=Fji, 1≤i,j≤n. If the bilinear form defined by the symmetric matrix (Fij(s))1≤i,j≤n∈Mn(K) at s∈K is non-degenerate, then there exists an ε>0 such that the type of the symmetric matrices (Fij(t))1≤i,j≤n is the same for all t∈]s−ε,s+ε[. In particular, for non-degenerate symmetric bilinear forms over K, “being of type (p,q)” is an open property.
We end this section by noting the following important Rigidity Theorem for symmetric bilinear forms (see [3]), which is proved by using Hurwitz’s Criterion 3.7, the above Lemma 3.16 and the Intermediate Value Theorem for polynomial functions (see Footnote 6 below).
Footnote 6 (Intermediate Value Theorem for polynomial functions). Let K be a real closed field and F∈K[T] be a polynomial with coefficients in K such that F(a)F(b)<0 for some a,b∈K. Then F has a zero in [a,b]. In other words, the values F(t), t∈[a,b], have the same sign if F has no zero on [a,b]. In particular, every polynomial of odd degree has a zero in K. A field with this property is called a 2-field. Therefore, a real closed field is a 2-field. Furthermore, every monic polynomial F over a real closed field K has a positive zero in K if F(0)<0 (since F(x)>0 for “large” x).
3.17
Rigidity Theorem for Quadratic Forms (cf. [3, 1.3])
Let K be a real closed field with notation as in 3.1 and let Rij(t)=Rij(t1,…,tn), 1≤i,j≤n, be rational functions on a line-connected (see Footnote 7 below) subset U⊆Kn such that the matrices R(t)=(Rij(t))1≤i,j≤n∈Mn(K), t∈U, are symmetric, i. e. Rij=Rji for all 1≤i,j≤n, with DetR(t)≠0 for all t∈U. Then all the matrices R(t)∈Mn(K), t∈U, have the same type (p,q), or equivalently, the same signature p−q.
Footnote 7 (Line-connected subsets). Let V be a vector space over a real closed field K. For two points x,y∈V, the subset [x,y]=[y,x]:={(1−t)x+ty∣t∈K,0≤t≤1}⊆V is called the (closed) line-segment connecting x and y. For x0,…,xr∈V, r≥1, the subset $[x_0,\ldots,x_r]:=\bigcup_{i=1}^{r}[x_{i-1},x_i]$ is called the broken line from x0 to xr. A subset V′⊆V is called line-connected if for any two points x,y∈V′ there is a broken line from x to y which lies entirely in V′.
Note that, if K=\mathdsR and U⊆V is open (in the strong topology, see Footnote 5), then the notion “line-connected” is equivalent to the topological notion of “connected”. The only topologically connected subspaces of K=\mathdsQ are the singletons. If V is a line, i. e. 1-dimensional, and if x∈V, then V∖{x} is not line-connected. However, if DimKV≥2, then V∖{x} is always line-connected : If u,w∈V∖{x} are arbitrary points, there is always a point v∈V∖{x} such that [u,v,w]⊆V∖{x}.
4. Trace forms and Rational points
In this section, we recall the results from [3] (based on the talk of Prof. U. Storch at IIT Bombay in November 2009) on trace forms, their invariants such as rank, type, signature and their relations with the number of rational points of a finite algebra A over a real closed field. For detailed proofs of these results the reader is recommended to see [3, § 3].
4.1
Preliminaries *
In this subsection, we recall the basic concepts from elementary commutative algebra (see [2], [12] and [15]).*
*Let A be an arbitrary commutative ring (with unity).
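The set SpecA (resp. SpmA) of prime (resp. maximal) ideals in A is called the prime (resp. maximal) spectrum of A. The nil-radical $\mathfrak{n}_A:=\sqrt{0}=\bigcap_{\mathfrak{p}\in\operatorname{Spec}A}\mathfrak{p}$ is the intersection of all prime ideals in A.
More generally, (formal Nullstellensatz) $\sqrt{\mathfrak{a}}=\bigcap\{\mathfrak{p}\in\operatorname{Spec}A\mid\mathfrak{a}\subseteq\mathfrak{p}\}$ for every ideal 𝔞 in A, see [2], [12].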
The intersection MA:=∩M∈SpmAM of all maximal ideals in A is called the Jacobson radical of A.*
(a)
The K-Spectrum and the set of K-rational points of a K-algebra* (see [15])
Let K be a field. Using the universal property of the polynomial algebra K[X1,…,Xn], the affine space Kn can be identified with the set of K-algebra homomorphisms
HomK-alg(K[X1,…,Xn],K) by identifying the point a=(a1,…,an)∈Kn with the substitution homomorphism
ξa:K[X1,…,Xn]→K, Xi↦ai, whose kernel Kerξa is
the maximal ideal ma=⟨X1−a1,…,Xn−an⟩ in K[X1,…,Xn]. Moreover, every maximal ideal m in K[X1,…,Xn] with K[X1,…,Xn]/m=K is of the type ma for a unique point a=(a1,…,an)∈Kn; the component ai is determined by the congruence Xi≡aimodm.*
The subset K-SpecK[X1,…,Xn]:={ma∣a∈Kn} of SpmK[X1,…,Xn] is called the K-spectrum of
K[X1,…,Xn]. We have the identifications :
\[ K^{n}\;\longleftrightarrow\;\operatorname{Hom}_{K\text{-alg}}(K[X_1,\ldots,X_n],K)\;\longleftrightarrow\;K\text{-}\operatorname{Spec}K[X_1,\ldots,X_n],\qquad a\;\longleftrightarrow\;\xi_a\;\longleftrightarrow\;\mathfrak{m}_a. \]
More generally, for any K-algebra A, the map
HomK-alg(A,K)⟶{M∈SpmA∣A/M=K}, ξ⟼Kerξ,
is bijective. Therefore we make the following definition :
For any K-algebra A of finite type, the subset K-SpecA:={M∈SpmA∣A/M=K} of SpmA is called the K-spectrum of A.
Further, if $A\xrightarrow{\ \sim\ }K[X_1,\ldots,X_n]/\mathfrak{A}$ is a representation of the K-algebra A, then the K-algebraic set VK(𝔄):={a∈Kn∣F(a)=0 for all F∈𝔄} defined by the ideal 𝔄 is called the set of K-rational points of A.
Under the above bijective maps, we have the identification VK(𝔄)=HomK-alg(A,K)=K-SpecA.
For example, since \mathdsC is an algebraically closed field, Spm\mathdsC[X]=\mathdsC-Spec\mathdsC[X], but \mathdsR-Spec\mathdsR[X]⫋Spm\mathdsR[X]. In fact, the maximal ideal m:=⟨X²+1⟩∈Spm\mathdsR[X] does not belong to \mathdsR-Spec\mathdsR[X]. More generally, a field K is algebraically closed if and only if SpmK[X]=K-SpecK[X], see [2], [12] or [6, Theorem 2.10, HNS 3].
(b)
Local components of a finite algebra*
Let A be a finite algebra over a field K, i. e. A is finite dimensional as a K-vector space, of dimension DimKA. Then SpmA=SpecA (since any finite K-algebra which is an integral domain is already a field). Moreover, from the Chinese Remainder Theorem, it follows immediately that SpmA is a finite set. More precisely, #SpmA≤DimKA and equality holds if and only if A is isomorphic to the product K-algebra $K^{\operatorname{Dim}_KA}$.
Further, let SpmA={M1,…,Mr}. Then the unit group A× of A is $A\smallsetminus\bigcup_{i=1}^{r}M_i$ and the canonical homomorphism $A\longrightarrow\prod_{i=1}^{r}A_{M_i}$ is injective (where Ap denotes the localization of A at a prime ideal p∈SpecA). In our special case, it is also surjective and hence an isomorphism, cf. [17, Corollary 55.16].
Therefore, A is the direct product of the local finite K-algebras Ai:=AMi, i=1,…,r, which are called the local components of A. Furthermore, we have :
\[ \operatorname{Dim}_KA=\sum_{i=1}^{r}\operatorname{Dim}_KA_i=\sum_{i=1}^{r}\ell(A_i)\cdot[K_i:K], \]
where, for i=1,…,r, Ki=A/Mi is the residue class field of A at Mi and ℓ(Ai) is the (finite) length of Ai, i. e. the length ℓ of a composition series 0=A0⊊A1⊊⋯⊊Aℓ=Ai with Aj+1/Aj≅A/Mi, j=0,…,ℓ−1.
For example, if K is a 2-field, then [Ki:K] is even if Ki is a non-trivial field extension of K and, in particular, K-SpecA≠∅ if DimKA is odd.
Further, MA=M1∩⋯∩Mr=∩p∈SpecAp=nA, and MA=nA=0, i. e. A is reduced, if and only if A=K1×⋯×Kr is the product of its residue class fields. Moreover, if all the field extensions Ki of K are separable, then A is called a (finite) separable K-algebra.
4.2
The trace form*
Let A be a finite algebra over the field K. The trace form on A over K is the
symmetric K-bilinear form Tr:=TrKA:A×A→K, (f,g)↦TrKA(fg) on A. It is a classical tool used to study the K-algebra A.
The decomposition of A=A1×⋯×Ar into its local components (cf. 4.1 (b)) yields the orthogonal decomposition (see Decomposition Theorem 2.13)
\[ \operatorname{Tr}^{A}_{K}=\operatorname{Tr}^{A_1}_{K}\perp\cdots\perp\operatorname{Tr}^{A_r}_{K} \]
of the trace form. The degeneration space A⊥=A⊥Tr={f∈A∣Tr(Af)=0} is an ideal in A.
4.3 Lemma
(cf. [3, Lemma 3.1])*
Let A be a finite algebra over an arbitrary field K
and let A⊥ be the degeneration space of the trace form TrKA. Then the radical MA=nA is contained in A⊥. Moreover, equality holds if and only if all the residue class fields of A are separable over K, i. e. if and only if the reduction Ared=A/nA is a separable K-algebra. In particular, the trace form is non-degenerate if and only if A is a separable K-algebra.
4.4 Corollary
Let A be a finite separable algebra over an arbitrary field K. Then
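\[ \operatorname{rank}\operatorname{Tr}^{A}_{K}=\operatorname{Dim}_K(A/M_A)=\sum_{i=1}^{r}[K_i:K]. \]
Moreover, if K is an ordered field, then :
\[ \operatorname{type}\operatorname{Tr}^{A}_{K}=\sum_{i=1}^{r}\operatorname{type}\operatorname{Tr}^{K_i}_{K}\quad\text{and}\quad\operatorname{sign}\operatorname{Tr}^{A}_{K}=\sum_{i=1}^{r}\operatorname{sign}\operatorname{Tr}^{K_i}_{K}. \]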
Now, we state the following important and classical criterion for the existence of K-rational points for real closed fields which is proved in [3].
4.5 Theorem
(cf. [3, Theorem 3.2])*
Let A be a finite algebra over a real closed field K. Then :*
signTrKA=#K-SpecA.
In particular, K is a residue class field of A if and only if signTrKA≠0.
4.6 Example
Let K be a real closed field. Then there is a unique (up to isomorphism) non-trivial finite field extension L∣K, namely, the quadratic field L=\mathdsCK=K[i] with i²=−1, of complex numbers over K (which is the algebraic closure of K). The Gram’s matrix of the trace form TrK\mathdsCK of \mathdsCK over K with respect to the basis 1, i is the matrix
\[ \begin{pmatrix}\operatorname{Tr}(1) & \operatorname{Tr}(i)\\ \operatorname{Tr}(i) & \operatorname{Tr}(i^{2})\end{pmatrix}=\begin{pmatrix}2 & 0\\ 0 & -2\end{pmatrix}. \]
Therefore typeTrK\mathdsCK=(1,1) and signTrK\mathdsCK=0.
4.7 Corollary
Let A be a finite algebra over a real closed field K. Then the trace form TrKA is positive definite if and only if A is separable over K and A splits over K, i. e. there exists an isomorphism of K-algebras $A\xrightarrow{\ \sim\ }K^{\operatorname{Dim}_KA}$.
4.8 Corollary
Let K be a real closed field and f∈K[X] be a monic polynomial. Then all zeros of f (in the algebraic closure \mathdsCK of K) belong to K and are simple if and only if the trace form TrKA of the K-algebra A:=K[X]/⟨f⟩ is positive definite.
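Theorem 4.5 and Corollary 4.8 can be illustrated for A=K[X]/⟨f⟩ in one variable. The following is a minimal sketch, assuming f is monic with rational coefficients and K=\mathdsR; the traces Tr(x^k) are computed as traces of powers of the companion matrix of f, and the type is found by diagonalizing the Gram matrix with simultaneous row and column operations. All function names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def companion(c):
    """Matrix of multiplication by x on K[X]/<f>, f = X^m + c[m-1] X^(m-1) + ... + c[0]."""
    m = len(c)
    mat = [[Fraction(0)] * m for _ in range(m)]
    for j in range(m - 1):
        mat[j + 1][j] = Fraction(1)
    for i in range(m):
        mat[i][m - 1] = Fraction(-c[i])
    return mat

def trace_form_gram(c):
    """Gram matrix (Tr(x^(i+j))) of the trace form in the basis 1, x, ..., x^(m-1)."""
    m = len(c)
    comp = companion(c)
    power = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    traces = []
    for _ in range(2 * m - 1):
        traces.append(sum(power[i][i] for i in range(m)))
        power = matmul(power, comp)
    return [[traces[i + j] for j in range(m)] for i in range(m)]

def form_type(gram):
    """Type (p, q) of a symmetric matrix via congruence diagonalization."""
    a = [[Fraction(x) for x in row] for row in gram]
    n = len(a)
    p = q = 0
    for k in range(n):
        piv = next((i for i in range(k, n) if a[i][i] != 0), None)
        if piv is None:
            pair = next(((i, j) for i in range(k, n) for j in range(i + 1, n)
                         if a[i][j] != 0), None)
            if pair is None:
                break  # the remaining block is zero
            i, j = pair
            for col in range(n):   # row_i += row_j
                a[i][col] += a[j][col]
            for row in range(n):   # col_i += col_j, so a[i][i] becomes 2*a[i][j] != 0
                a[row][i] += a[row][j]
            piv = i
        a[k], a[piv] = a[piv], a[k]
        for row in a:
            row[k], row[piv] = row[piv], row[k]
        d = a[k][k]
        if d > 0:
            p += 1
        else:
            q += 1
        for r in range(k + 1, n):
            f = a[r][k] / d
            for col in range(n):
                a[r][col] -= f * a[k][col]
            for row in range(n):
                a[row][r] -= f * a[row][k]
    return (p, q)

def real_root_count(c):
    """Signature of the trace form, i.e. the number of distinct zeros of f in K."""
    p, q = form_type(trace_form_gram(c))
    return p - q

print(real_root_count([0, -1, 0]))   # f = X^3 - X
print(real_root_count([-1, 0, 0]))   # f = X^3 - 1
```

For f=X²+1 the sketch reproduces the Gram matrix of Example 4.6, and for f=X³−X the trace form comes out positive definite, in line with Corollary 4.8.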
For a partial generalization (see Theorem 4.10 below) of Theorem 4.5 and applications one can also consider the following more general trace forms :
4.9
Generalized trace forms *
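Let A be a finite algebra over a field K, let
SymK(A,K):={Φ∈MultK(A,K)∣Φ is symmetric}
be the K-vector space of all symmetric bilinear forms on A and consider the K-linear embedding
\[ E_{A|K}:=\operatorname{Hom}_K(A,K)\longrightarrow\operatorname{Sym}_K(A,K),\qquad\alpha\longmapsto\Phi_\alpha,\qquad\Phi_\alpha(f,g):=\alpha(fg). \]
The elements of the image of this map are called generalized trace forms on A.
The A-module EA∣K (with the scalar multiplication (gα)(f):=α(fg) for α∈EA∣K, g, f∈A) is called the dualizing module of A.
Therefore :
4.9.1 Φα(f,g)=(gα)(f)=(fα)(g) and the degeneration space A⊥α of Φα is the largest ideal of A contained in Kerα.
4.9.2 Let $\overline{\alpha}:A/A^{\perp_\alpha}\to K$ be the linear form on $\overline{A}:=A/A^{\perp_\alpha}$ induced by α. Then $\operatorname{rank}\Phi_\alpha=\operatorname{rank}\Phi_{\overline{\alpha}}$ and the induced bilinear form $\Phi_{\overline{\alpha}}$ is non-degenerate on $\overline{A}$.
4.9.3 Moreover, if K is an ordered field, then $\operatorname{type}\Phi_\alpha=\operatorname{type}\Phi_{\overline{\alpha}}$ and $\operatorname{sign}\Phi_\alpha=\operatorname{sign}\Phi_{\overline{\alpha}}$.
For example, for a fixed h∈A, the symmetric bilinear form Φh:A×A→K, (f,f′)↦TrKA(hff′), is the generalized trace form on A with respect to the K-linear form α:=TrKA∘λh:A→K, g↦TrKA(hg), where λh:A→A, g↦hg, denotes the multiplication by h.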
We shall use these particular generalized trace forms on A and the following partial generalization of the Theorem 4.5 in the proof of Theorem 5.5.
4.10 Theorem
(cf. [3, Theorem 3.4])
Let α be a K-linear form on a finite algebra A over a real closed field K. If signΦα≠0, then A has a K-rational point, i. e. K-SpecA≠∅.
5. Counting rational points of 0-dimensional affine algebraic sets
In this section we will apply results from Section 4 on trace forms to count the rational points of finite affine algebraic sets over real closed fields. Our method is a modern version of old results of Hermite and Sylvester who had used signatures of quadratic forms to count real zeros of polynomials in one variable, see [7], [8] and [19]. We use elementary commutative algebra to treat multivariate versions of these problems.
5.1
Notation, Assumptions and Consequences* Throughout this section, we use the following notation and assumptions and their consequences :*
Let K be a real closed field with notation as in 3.1. For an ideal 𝔄⊆K[X1,…,Xn] in the polynomial ring K[X1,…,Xn] over K, let
VK(𝔄):={a∈Kn∣F(a)=0 for all F∈𝔄}
and
V\mathdsK(𝔄):={a∈\mathdsKn∣F(a)=0 for all F∈𝔄}
be the affine algebraic sets in Kn and in \mathdsKn defined by 𝔄, respectively.
Polynomials in K[X1,…,Xn] are denoted by capital letters F, G, H, … and their images in the residue class K-algebra A:=K[X1,…,Xn]/𝔄 are denoted by small letters f, g, h, … .
Every element f∈A defines a (regular or polynomial) function on VK(𝔄), namely f:VK(𝔄)⟶K, a⟼F(a). Further, if f, g∈A, then, clearly :
f=g on VK(𝔄)⟺f=g in A⟺F≡G (mod 𝔄), i. e. F−G∈𝔄.
We assume that the residue class K-algebra A:=K[X1,…,Xn]/𝔄 is a finite dimensional K-vector space, or equivalently, the affine algebraic set VK(𝔄)⊆Kn is a finite set.
These assumptions are equivalent to the conditions : the \mathdsK-algebra \mathdsK⊗KA=A\mathdsK=\mathdsK[X1,…,Xn]/⟨𝔄⟩ is finite dimensional over \mathdsK, or equivalently, the affine algebraic subset V\mathdsK(𝔄)⊆\mathdsKn is a finite set.
Further, since 𝔄⊆K[X1,…,Xn], it follows that if a∈V\mathdsK(𝔄), then its conjugate $\overline{a}$ belongs to V\mathdsK(𝔄), too. Therefore, since VK(𝔄)⊆V\mathdsK(𝔄), renumbering we assume that :
5.1.a
\[ V_K(\mathfrak{A})=\{a_1,\ldots,a_r\}\subseteq V_{\mathds{K}}(\mathfrak{A})=\{a_1,\ldots,a_r,a_{r+1},\overline{a}_{r+1},\ldots,a_{r+s},\overline{a}_{r+s}\}, \]
where r:=#VK(𝔄), r+s=#SpmA and
m:=r+2s=DimKA=Dim\mathdsKA\mathdsK=#V\mathdsK(𝔄).
Furthermore, since K is a real closed field, CharK=0, in particular, K is infinite, and hence by a linear change of coordinates over K (for instance, Yi=Xi for all i=1,…,n−1 and $Y_n=X_n+\sum_{i=1}^{n-1}t^{i}X_i$ for a suitable t∈K, avoiding finitely many values), we may assume that V\mathdsK(𝔄) is in general Xn-position, or that the ideal 𝔄 is in general Xn-position (the intention is to separate all zeros in an algebraic closure of K by their last coordinate), i. e. :
5.1.b The n-th coordinates ain of the points ai=(ai1,…,ain)∈\mathdsKn, i=1,…,m, are all distinct.
Note that VK(𝔄)=V\mathdsK(𝔄)∩Kn is the set of K-rational points of $V_{\mathds{K}}(\mathfrak{A})\xrightarrow{\ \sim\ }\mathds{K}\text{-}\operatorname{Spec}A_{\mathds{K}}=\operatorname{Spm}A_{\mathds{K}}=\operatorname{Spec}A_{\mathds{K}}$ (the first equality follows from Hilbert’s Nullstellensatz, see [12] or [6, Theorem 2.10, HNS 3]) and $V_K(\mathfrak{A})\xrightarrow{\ \sim\ }K\text{-}\operatorname{Spec}A\subseteq\operatorname{Spm}A=\operatorname{Spec}A$, see 4.1 (b). Further, since A and A\mathdsK are reduced, the local components (see 4.1 (b)) of A corresponding to the K-rational points ai∈VK(𝔄), i=1,…,r, are isomorphic to K and those corresponding to M∈SpmA∖K-SpecA are isomorphic to \mathdsK, whereas the local components of A\mathdsK corresponding to the points a∈V\mathdsK(𝔄) are all isomorphic to \mathdsK. Therefore the explicit structures of the K-algebra A and the \mathdsK-algebra A\mathdsK are determined by the algebra isomorphisms which are defined by the substitutions :
5.1.c $A\xrightarrow{\ \sim\ }K^{r}\times\mathds{K}^{s}$, $h\mapsto(h \bmod M)_{M\in\operatorname{Spm}A}$, where r, s are as in 5.1.a, and
$A_{\mathds{K}}\xrightarrow{\ \sim\ }\mathds{K}^{m}$, $f\mapsto(f(a))_{a\in V_{\mathds{K}}(\mathfrak{A})}$, where m:=r+2s.
Note that m=DimKA=Dim\mathdsKA\mathdsK=#V\mathdsK(𝔄).
Furthermore, we have the following eigenvector theorem (see [4, Ch. 2, §4, Theorem 4.5]), which follows directly from 5.1.b :
5.1.d
For every h∈A,
the eigenvalues of the K-linear map λh:A→A, f↦hf, are the values $h(a_1),\ldots,h(a_r),\,h(a_{r+1}),h(\overline{a}_{r+1}),\ldots,h(a_{r+s}),h(\overline{a}_{r+s})$ of the function h:V\mathdsK(𝔄)→\mathdsK.
To determine the signature of the trace form TrKA more easily, we need a nice basis of A over K. The following key observation, the so-called Shape Lemma (see [4], [5] and [11]), guarantees a distinguished generating set for a radical ideal 𝔄 in K[X1,…,Xn].
We give a proof of the Shape Lemma by using the natural action of the Galois group $\operatorname{Gal}(\overline{K}|K)$ on $V_{\overline{K}}(\mathfrak{A})$.
5.2 Lemma
( Shape Lemma )*
Let K be an infinite perfect field, let 𝔄⊆K[X1,…,Xn] be a radical ideal and let A:=K[X1,…,Xn]/𝔄 be a finite dimensional K-vector space. With further notation and assumptions as in 5.1, there exist polynomials g1,…,gn−1,gn∈K[T] (where T is an indeterminate over K) with gn≠0 square free of degree m, such that 𝔄 is generated by
X1−g1(Xn),…,Xn−1−gn−1(Xn), gn(Xn).
In particular, $\mathbcal{x}=\{1,x_n,\ldots,x_n^{m-1}\}$ is a K-basis of A, where xn is the image of Xn in A.
Proof
Let $\overline{K}$ be an algebraic closure of K. Since K is perfect, the field extension $\overline{K}|K$ is a Galois extension. Let $\operatorname{Gal}(\overline{K}|K)$ be its Galois group.
Let $V_{\overline{K}}(\mathfrak{A}):=\{a\in\overline{K}^{n}\mid F(a)=0\ \text{for all}\ F\in\mathfrak{A}\}$. Then $V_{\overline{K}}(\mathfrak{A})$ is a finite set by the assumption on A and the projection map $q:V_{\overline{K}}(\mathfrak{A})\to\overline{K}$, (a1,…,an)↦an, is injective by assumption 5.1.b (i. e. q separates the points of $V_{\overline{K}}(\mathfrak{A})$).
The Galois group $\operatorname{Gal}(\overline{K}|K)$ operates on $V_{\overline{K}}(\mathfrak{A})$ with the natural operation :
\[ \operatorname{Gal}(\overline{K}|K)\times V_{\overline{K}}(\mathfrak{A})\longrightarrow V_{\overline{K}}(\mathfrak{A}),\qquad(\sigma,(a_1,\ldots,a_n))\longmapsto(\sigma(a_1),\ldots,\sigma(a_n)). \]
Obviously, the image $q(V_{\overline{K}}(\mathfrak{A}))=W_1\uplus\cdots\uplus W_\ell$ is the union of orbits of this operation and each orbit $W_k=V_{\overline{K}}(\pi_k)$ is the zero set of an irreducible polynomial πk∈K[T], k=1,…,ℓ, see [9] or [17, Ch. XI, §93, 93.2]. Therefore, since K is perfect, the polynomial gn:=π1⋯πℓ∈K[T] is square free and $q(V_{\overline{K}}(\mathfrak{A}))=V_{\overline{K}}(g_n)$, $\deg g_n=\#V_{\overline{K}}(\mathfrak{A})=m$.
5.2.a There exist polynomials g1,…,gn−1∈K[T] with deggi<deggn=m such that, for all $a_n\in q(V_{\overline{K}}(\mathfrak{A}))$, the point (g1(an),…,gn−1(an),an) is the unique point lying over an.
To prove 5.2.a, let $a_n\in q(V_{\overline{K}}(\mathfrak{A}))$ and let (a1,…,an−1,an) be the unique point lying over an. We may assume that $a_n\in W_1=\{\sigma_j(a_n)\mid j=1,\ldots,d_1\}$, $\sigma_1=\operatorname{id}_{\overline{K}}$, with d1=#W1. Let Wi′ denote the orbit of ai. Then, since q is injective, #Wi′≤#W1=d1. Moreover, for all i=1,…,n−1,
$W_i'=\{\sigma_j(a_i)\mid j=1,\ldots,d_1\}$, but the elements σj(ai), j=1,…,d1, need not be distinct.
Now, since the last coordinates bn of the points $b=(b_1,\ldots,b_n)\in V_{\overline{K}}(\mathfrak{A})$ are pairwise distinct elements of $\overline{K}$, by Lagrange’s Interpolation Formula (see Footnote 8 below), for each i=1,…,n−1, there exists a unique polynomial $g_i\in\overline{K}[X]$, deggi<m=deggn, such that gi(bn)=bi for all these points b; in particular, gi(σj(an))=σj(ai) for all j=1,…,d1. Since $V_{\overline{K}}(\mathfrak{A})$ is stable under $\operatorname{Gal}(\overline{K}|K)$ and the interpolating polynomial of degree <m is unique, each gi is invariant under the Galois group, and hence g1,…,gn−1∈K[X].
Finally we claim the equality 𝔄′:=⟨X1−g1(Xn),…,Xn−1−gn−1(Xn),gn(Xn)⟩=𝔄. To prove this, first note that the substitution homomorphism K[X1,…,Xn−1,Xn]→K[Xn], Xi↦gi(Xn), i=1,…,n−1, and Xn↦Xn, induces a K-algebra isomorphism
$K[X_1,\ldots,X_n]/\mathfrak{A}'\xrightarrow{\ \sim\ }K[X_n]/\langle g_n\rangle$, and K[X1,…,Xn]/𝔄′ is reduced, since gn is separable over K. Therefore 𝔄′ is a radical ideal. Further, from 5.2.a it follows that $V_{\overline{K}}(\mathfrak{A}')=V_{\overline{K}}(\mathfrak{A})$. Now, use Hilbert’s Nullstellensatz (see [2], [12] or [15, Theorem 2.10, HNS 2]) to conclude the equality 𝔄′=𝔄. ∎
Footnote 8. Although named after Lagrange, J. L. (1736 - 1813), who published it in 1795, the method was first discovered in 1779 by Waring, E. (1734 - 1798). It is also an easy consequence of a formula of Euler, L. (1707 - 1783) published in 1783. Lagrange’s Interpolation Formula : Let K be a field and let x1,…,xn∈K be distinct elements. Then for arbitrary elements y1,…,yn∈K, there exists a polynomial g∈K[X] of degree degg<n such that g(xi)=yi for every i=1,…,n. For a proof consider the polynomial $g:=\sum_{i=1}^{n}\frac{y_i}{z_i}\prod_{j\neq i}(X-x_j)$, where $z_i:=\prod_{j\neq i}(x_i-x_j)$.
5.3 Remark
The Shape Lemma 5.2 first appeared in [5]; it may be regarded as a natural generalization of the Primitive Element Theorem. Further, it gives a very useful presentation of the radical ideal 𝔄 which allows one to read off the solution space VK(𝔄) immediately, namely :
VK(𝔄)={(g1(a),…,gn−1(a),a)∈Kn∣gn(a)=0}.
In other words, the last coordinates are zeros of gn and, for a fixed last coordinate an, all the other coordinates are determined by evaluating the polynomials gn−1,…,g1 at an : gn(an)=0, an−1=gn−1(an),…,a1=g1(an). This simple shape of the solution space VK(𝔄) is quite convenient to work with.
The primary decomposition of 𝔄 is given by the prime factorization of the polynomial gn.
Under the conditions on the polynomials g1,…,gn−1, gn∈K[X] as in the proof of the Shape Lemma 5.2, one can easily verify that X1−g1(Xn),…,Xn−1−gn−1(Xn),gn(Xn)
form a reduced (= minimal) Gröbner basis of the radical ideal 𝔄 relative to the lexicographic order X1>X2>⋯>Xn. For a different proof of the Shape Lemma 5.2 see [11, Theorem 3.7.25]; a detailed recipe for solving systems of polynomial equations efficiently using the Shape Lemma 5.2 is given in [11, Theorem 3.7.26]. The Shape Lemma 5.2 also appeared in [4, Ex. 16, § 4, Ch. 2].
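The interpolation step in the proof of the Shape Lemma can be made concrete when the finite point set is known. The following is a minimal sketch, assuming all points are K-rational with K=\mathdsQ (so the Galois action is trivial) and in general Xn-position; the sample points and all function names are illustrative assumptions.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists [c_0, c_1, ...]."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_eval(p, t):
    return sum(c * t ** k for k, c in enumerate(p))

def lagrange(xs, ys):
    """Coefficients of the polynomial g with deg g < len(xs) and g(xs[i]) = ys[i]."""
    n = len(xs)
    g = [Fraction(0)] * n
    for i in range(n):
        term = [Fraction(ys[i])]
        for j in range(n):
            if j != i:
                term = poly_mul(term, [Fraction(-xs[j]), Fraction(1)])
                term = [c / (xs[i] - xs[j]) for c in term]
        g = [a + b for a, b in zip(g, term)]
    return g

def shape_basis(points):
    """Return ([g_1, ..., g_{n-1}], g_n) for points with pairwise distinct last coordinates."""
    last = [p[-1] for p in points]
    if len(set(last)) != len(last):
        raise ValueError("points are not in general X_n-position")
    gs = [lagrange(last, [p[i] for p in points]) for i in range(len(points[0]) - 1)]
    gn = [Fraction(1)]
    for a in last:
        gn = poly_mul(gn, [Fraction(-a), Fraction(1)])
    return gs, gn

points = [(1, 0), (4, 1), (-2, 3)]  # hypothetical points in K^2
gs, gn = shape_basis(points)
print(gs[0], gn)  # g_1 = 1 + 5T - 2T^2, g_2 = T^3 - 4T^2 + 3T
```

Each sample point (a1,a2) then satisfies a1=g1(a2) and g2(a2)=0, which is the shape described in the remark.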
5.4
Consequence and identification
Let K be a real closed field, \mathdsK:=\mathdsCK=K[i], i²=−1, the algebraic closure of K (see 3.1) and let 𝔄⊆K[X1,…,Xn] be a radical ideal. Suppose that A:=K[X1,…,Xn]/𝔄 is a finite dimensional K-vector space.
Let g1,…,gn−1,g:=gn(X)∈K[X] be the polynomials as in the statement of the Shape Lemma 5.2 and let $\varphi:A\xrightarrow{\ \sim\ }K[X]/\langle g\rangle$ be the K-algebra isomorphism as in the proof of the Shape Lemma 5.2. Then, since g is square-free and K is a real closed field (see Footnote 2), g=(X−a1)⋯(X−ar)π1⋯πs, ai∈K, i=1,…,r, and $\pi_j=(X-z_j)(X-\overline{z}_j)\in K[X]$, zj∈\mathdsK∖K, j=1,…,s, where r, s and m=r+2s are as in 5.1.a, since φ is a K-algebra isomorphism.
We use the above K-algebra isomorphism φ to identify 𝔄 and A with ⟨g⟩ and K[X]/⟨g⟩, respectively. With this, $\mathbcal{x}:=\{1,x,\ldots,x^{m-1}\}$ is a K-basis of A, where x is the image of X in A, and
\[ V_K(\mathfrak{A})=V_K(g)=\{a_1,\ldots,a_r\}\subseteq V_{\mathds{K}}(\mathfrak{A})=V_{\mathds{K}}(g)=\{a_1,\ldots,a_r,z_1,\overline{z}_1,\ldots,z_s,\overline{z}_s\},\qquad r+2s=m. \]
Further, for H∈K[X1,…,Xn], H≠0, we put
h(X):=H(g1(X),…,gn−1(X),X)∈K[X]. Then using the above identifications, we have
h(x)∈A, and the values H(ai)∈K, i=1,…,r, and $H(a_{r+j}),\,H(\overline{a}_{r+j})\in\mathds{K}$,
j=1,…,s, are identified with the values h(ai)∈K, i=1,…,r, and $h(z_j),\,h(\overline{z}_j)\in\mathds{K}$, j=1,…,s, respectively.
5.5 Theorem
With the notation as in 5.1 and 5.4, let
H ∈ K[X_1,…,X_n], H ≠ 0, let h be the image of H in A and let
Φ_h : A×A → K, (f,f′) ↦ Tr^A_K(hff′), be the generalized trace form associated with h ∈ A. Then :
(a)
The Gram’s matrix G_{Φ_h}(𝒙) of Φ_h with respect to the K-basis 𝒙 is a symmetric matrix in M_m(K). Moreover, G_{Φ_h}(𝒙) = V D ᵗV, where
V ∈ GL_m(𝕂) is the Vandermonde’s matrix (see Footnote 9)
of the elements a_1,…,a_r, z_1,…,z_s, z̄_1,…,z̄_s ∈ 𝕂 and D ∈ M_m(𝕂) is the diagonal matrix with diagonal entries h(a_1),…,h(a_r), h(z_1),…,h(z_s), h(z̄_1),…,h(z̄_s).
Footnote 9 (Vandermonde’s matrix). For elements a_1,…,a_m in a field K, the matrix V(a_1,…,a_m) := (a_i^j), 0≤j≤m−1, 1≤i≤m, in M_m(K) is called the Vandermonde’s matrix of the elements a_1,…,a_m. The elements a_1,…,a_m are pairwise distinct if and only if V(a_1,…,a_m) ∈ GL_m(K).
(b)
Let
p_H := #{a ∈ V_K(𝔄) ∣ H(a) > 0} and q_H := #{a ∈ V_K(𝔄) ∣ H(a) < 0}.
Then type Φ_h = (p_H+s, q_H+s), where s = ½·#(V_𝕂(𝔄)∖V_K(𝔄)), and rank Φ_h = #{a ∈ V_𝕂(𝔄) ∣ H(a) ≠ 0}. In particular, sign Φ_h = p_H − q_H.
Proof
Recall that
V_K(𝔄) = {a_1,…,a_r} ⊆ V_𝕂(𝔄) = {a_1,…,a_r, a_{r+1}, ā_{r+1}, …, a_{r+s}, ā_{r+s}},
where r := #V_K(𝔄), r+s = #Spm A and m = r+2s = Dim_K A = Dim_𝕂 A_𝕂 = #V_𝕂(𝔄), and that
V_𝕂(𝔄) is in general X_n-position, see 5.1.a and 5.1.b.
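(a) From the identifications in 5.4, it follows that
for 0≤k,ℓ≤m−1, the (k,ℓ)-entry in the Gram’s matrix G_{Φ_h}(1,x,…,x^{m−1}) is :
Tr^A_K(h x^{k+ℓ}) = Σ_{i=1}^{r} h(a_i) a_i^{k+ℓ} + Σ_{j=1}^{s} ( h(z_j) z_j^{k+ℓ} + h(z̄_j) z̄_j^{k+ℓ} ). (5.5.1)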
Now, by the Fundamental Theorem on Symmetric Polynomials (see [17, Theorem 54.13]), the right hand side of 5.5.1 is a polynomial in the coefficients of h(X) and g(X) (with coefficients in ℤ) and hence belongs to K. Therefore
G_{Φ_h}(1,x,…,x^{m−1}) is a symmetric matrix in M_m(K).
Furthermore, using the equation 5.5.1, the equality G_{Φ_h}(1,x,…,x^{m−1}) = V D ᵗV, where V and D are as in the statement of (a), can be easily verified.
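The factorization of the Gram’s matrix in (a) can be checked numerically on a small case. The sketch below assumes K = ℚ ⊂ ℝ, g = (X−1)(X+2)(X²+1) = X⁴ + X³ − X² + X − 2 and h = x; these choices and all function names are hypothetical, not taken from the paper. The trace is computed from the companion matrix of g (the matrix of multiplication by x in the basis 1, x, …, x^{m−1}), and V is stored with the k-th powers of the roots in row k, so that the product takes the form V D ᵗV.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def companion(b):
    """Matrix of multiplication by x on K[X]/<g>, g = b[0] + ... + b[m-1] X^(m-1) + X^m."""
    m = len(b)
    C = [[0] * m for _ in range(m)]
    for j in range(m - 1):
        C[j + 1][j] = 1           # x * x^j = x^(j+1)
    for i in range(m):
        C[i][m - 1] = -b[i]       # x * x^(m-1) = -(b[0] + ... + b[m-1] x^(m-1))
    return C


def gram_via_trace(b, h):
    """Gram matrix (Tr(h x^(k+l))) of Phi_h in the basis 1, x, ..., x^(m-1)."""
    m = len(b)
    C = companion(b)
    powers = [[[int(i == j) for j in range(m)] for i in range(m)]]
    for _ in range(2 * m - 2 + len(h) - 1):
        powers.append(mat_mul(powers[-1], C))

    def trace_h_x(n):
        # Tr(h(x) x^n) = sum_i h[i] * Tr(C^(n+i)), by linearity of the trace
        return sum(c * sum(powers[n + i][d][d] for d in range(m))
                   for i, c in enumerate(h))

    return [[trace_h_x(k + l) for l in range(m)] for k in range(m)]


def gram_via_roots(roots, h):
    """The product V D tV, with V[k][i] = roots[i]^k and D = diag(h(roots[i]))."""
    m = len(roots)
    V = [[z ** k for z in roots] for k in range(m)]
    D = [sum(c * z ** i for i, c in enumerate(h)) for z in roots]
    return [[sum(V[k][i] * D[i] * V[l][i] for i in range(m)) for l in range(m)]
            for k in range(m)]


# Assumed example data (not from the paper).
b = [-2, 1, -1, 1]                # g = X^4 + X^3 - X^2 + X - 2
h = [0, 1]                        # h = x
roots = [1, -2, 1j, -1j]          # a_1, a_2, z_1 and its conjugate

G = gram_via_trace(b, h)
G_roots = gram_via_roots(roots, h)
max_gap = max(abs(G[k][l] - G_roots[k][l]) for k in range(4) for l in range(4))
```

Here G has integer entries although V and D have entries in 𝕂, in line with the first assertion of (a), and max_gap is zero up to rounding.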
(b) The assertion about the rank follows from the equality rank Φ_h = rank G_{Φ_h}(𝒙) = rank D, since V ∈ GL_m(𝕂).
Further, note that the local decomposition A ≅ K^r × 𝕂^s (see 5.1.c) yields the orthogonal decomposition (see 4.2 and 2.13)
Φ_h = (Φ_h)^K_1 ⊥ ⋯ ⊥ (Φ_h)^K_r ⊥ (Φ_h)^𝕂_1 ⊥ ⋯ ⊥ (Φ_h)^𝕂_s,
where (Φ_h)^K_i = Φ_h∣K is the restriction of Φ_h to the real component at a_i ∈ K, with Gram’s matrix G_{(Φ_h)^K_i}(1) = (h(a_i)) ∈ M_1(K), i=1,…,r, and (Φ_h)^𝕂_j = Φ_h∣𝕂
is the restriction of Φ_h to the non-real component K[X]/⟨π_j⟩ ≅ 𝕂
at M_j = ⟨π_j⟩ ∈ Spm A ∖ K-Spec A, j=1,…,s. Furthermore, clearly, sign (Φ_h)^K_i = sign h(a_i) = sign H(a_i) for all i=1,…,r, and by Example 3.10 (since π_j = (X−z_j)(X−z̄_j), z_j ∈ 𝕂∖K), we have type (Φ_h)^𝕂_j = (1,1) for all j=1,…,s. Therefore, by Corollary 4.4 :
type Φ_h = Σ_{i=1}^{r} type (Φ_h)^K_i + Σ_{j=1}^{s} type (Φ_h)^𝕂_j = (p_H, q_H) + (s, s) = (p_H+s, q_H+s). ∎
5.6 Corollary (Hermite)
Let K be an arbitrary real closed field and let g = b_0 + b_1X + ⋯ + b_{m−1}X^{m−1} + X^m ∈ K[X], A := K[X]/⟨g⟩. Then type Tr^A_K = (r+s, s), where Tr^A_K : A×A → K, (f,f′) ↦ Tr^A_K(ff′), is the trace form on A, r = #V_K(g) is the number of zeros of g in K and s is half of the number of zeros of g in the algebraic closure 𝕂 of K which are not in K. In particular,
sign Tr^A_K = r = #V_K(g).
Proof
Using the notation of Theorem 5.5, note that Tr^A_K = Φ_1, where 1 ∈ K[X] denotes the constant polynomial. Therefore p_1 = r = #V_K(g), q_1 = 0 and
sign Tr^A_K = p_1 − q_1 = r = #V_K(g)
by 5.5 (b). Of course, the assertion also follows directly from Theorem 4.5. ∎
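Hermite’s corollary yields a root-counting procedure that never computes a root: the Gram’s matrix of the trace form in the basis 1, x, …, x^{m−1} has (k,ℓ)-entry Tr(x^{k+ℓ}), the (k+ℓ)-th power sum of the zeros of g, and these power sums follow from the coefficients of g by Newton’s identities. The sketch below works over ℚ with exact arithmetic, as a stand-in for a real closed field K, and reads off the type by symmetric Gaussian elimination. The function names and test polynomials are hypothetical and not from the paper.

```python
from fractions import Fraction


def power_sums(b, count):
    """Power sums p_0, ..., p_(count-1) of the zeros of g = b[0] + ... + b[m-1] X^(m-1) + X^m."""
    m = len(b)
    c = [None] + [Fraction(b[m - j]) for j in range(1, m + 1)]  # c[j]: coefficient of X^(m-j)
    p = [Fraction(m)]
    for k in range(1, count):
        s = sum(c[j] * p[k - j] for j in range(1, min(k - 1, m) + 1))
        if k <= m:
            s += k * c[k]          # extra term in Newton's identity for k <= m
        p.append(-s)
    return p


def form_type(G):
    """Type (p, q) of a symmetric matrix with rational entries."""
    A = [[Fraction(x) for x in row] for row in G]
    n = len(A)
    pos = neg = 0
    for i in range(n):
        if A[i][i] == 0:
            j = next((j for j in range(i + 1, n) if A[i][j] != 0), None)
            if j is None:
                continue           # e_i is orthogonal to the remaining block
            # Replace e_i by e_i + t*e_j; one of t = 1, -1 gives a nonzero pivot.
            t = 1 if 2 * A[i][j] + A[j][j] != 0 else -1
            for col in range(n):
                A[i][col] += t * A[j][col]
            for row in range(n):
                A[row][i] += t * A[row][j]
        pivot = A[i][i]
        if pivot > 0:
            pos += 1
        else:
            neg += 1
        for k in range(i + 1, n):  # pass to the Schur complement of the pivot
            for l in range(i + 1, n):
                A[k][l] -= A[k][i] * A[i][l] / pivot
    return pos, neg


def count_real_roots(b):
    """Signature of the trace form of Q[X]/<g>, i.e. the number of distinct real zeros of g."""
    m = len(b)
    p = power_sums(b, 2 * m - 1)
    pos, neg = form_type([[p[k + l] for l in range(m)] for k in range(m)])
    return pos - neg
```

For instance, count_real_roots([-2, 1, -1, 1]) treats g = X⁴ + X³ − X² + X − 2 = (X−1)(X+2)(X²+1): the type is (3, 1) = (r+s, s) with r = 2 and s = 1, so the signature is 2.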
With the notation and assumptions as in 5.1, our main goal is to relate the cardinality #V_K(𝔄) with the signatures of the generalized trace forms on the finite K-algebra A.
5.7
Notation
With the notation and assumptions as in 5.1 and 5.4, let H ∈ K[X_1,…,X_n], H ≠ 0, and let V_K(H) := {a ∈ K^n ∣ H(a) = 0} be the hypersurface (an (n−1)-dimensional affine algebraic set in K^n) defined by H. Then the complement of V_K(H) in K^n is the union of line-connected subsets (in the strong topology on K^n, see Footnote 5) on which H takes either all positive values or all negative values, i.e. K^n∖V_K(H) = H^+ ⊎ H^−,
where H^+ := {a ∈ K^n ∣ H(a) > 0} and H^− := {a ∈ K^n ∣ H(a) < 0}.
Further, since
V_K(𝔄) = (V_K(𝔄)∩H^+) ⨄ (V_K(𝔄)∩H^−) ⨄ V_K(⟨𝔄,H⟩), we have :
#V_K(𝔄) = #(V_K(𝔄)∩H^+) + #(V_K(𝔄)∩H^−) + #V_K(⟨𝔄,H⟩), (5.7.a)
and hence
to compute #V_K(𝔄), we can use an arbitrary polynomial H ∈ K[X_1,…,X_n] and compute the cardinalities #(V_K(𝔄)∩H^+), #(V_K(𝔄)∩H^−) and #V_K(⟨𝔄,H⟩).
More precisely, we have :
5.8 Theorem
With the notation and assumptions as in 5.1 and 5.7, for H ∈ K[X_1,…,X_n], H ≠ 0, let
p_H := #(V_K(𝔄)∩H^+) and q_H := #(V_K(𝔄)∩H^−). Further, let h denote the image of H in A = K[X_1,…,X_n]/𝔄 and let Φ_h : A×A → K, (f,g) ↦ Tr^A_K(hfg) (resp. Φ_{h²} : A×A → K,
(f,g) ↦ Tr^A_K(h²fg)) be the generalized trace forms defined
by h (resp. by h²) on A. Then :
(a)
sign Φ_h = p_H − q_H and rank Φ_h = #(V_𝕂(𝔄)∖V_𝕂(H)).
(b)
sign Φ_{h²} = p_H + q_H and rank Φ_{h²} = #(V_𝕂(𝔄)∖V_𝕂(H)).
(c)
Let 𝔅 := ⟨𝔄,H⟩ be the ideal (in K[X_1,…,X_n]) generated by 𝔄 and H. Then the K-algebra B := K[X_1,…,X_n]/𝔅 is finite over K and sign Tr^B_K = #V_K(𝔅).
(d)
The three signatures sign Φ_h, sign Φ_{h²} and sign Tr^B_K uniquely determine the natural numbers p_H, q_H and #V_K(𝔅) = #(V_K(𝔄)∩V_K(H)). In particular, they determine the cardinality #V_K(𝔄) = p_H + q_H + #V_K(𝔅).
Proof
(a) : Immediate from Theorem 5.5 (b).
(b) : Since H²(a) = H(a)H(a) > 0 for every a ∈ H^+ ∪ H^−, V_K(H²) = V_K(H) and V_𝕂(H²) = V_𝕂(H), from Theorem 5.5 (b) it follows that
sign Φ_{h²} = p_H + q_H and
rank Φ_{h²} = #(V_𝕂(𝔄)∖V_𝕂(H)).
(c) : Since the K-algebra B is a homomorphic image of the K-algebra A, B is also finite over K. The equality sign Tr^B_K = #V_K(𝔅) is immediate from Theorem 5.5 (b) (with H = 1) or Theorem 4.5.
(d) : By (a) and (b), p_H = ½(sign Φ_{h²} + sign Φ_h) and q_H = ½(sign Φ_{h²} − sign Φ_h), and by (c), #V_K(𝔅) = sign Tr^B_K. The cardinality #V_K(𝔄) is then given by the formula 5.7.a in 5.7. ∎
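The bookkeeping in (d) amounts to solving two linear equations. As a minimal sketch, the hypothetical helper below (its name and argument names are ours, not the paper’s) recovers p_H, q_H, #V_K(𝔅) and #V_K(𝔄) from the three signatures and rejects inputs that cannot arise from Theorem 5.8.

```python
def recover_counts(sign_phi_h, sign_phi_h2, sign_tr_B):
    """Return (p_H, q_H, #V_K(B), #V_K(A)) from the three signatures of Theorem 5.8."""
    p_H, rem_p = divmod(sign_phi_h2 + sign_phi_h, 2)   # p_H = (sign Phi_h2 + sign Phi_h) / 2
    q_H, rem_q = divmod(sign_phi_h2 - sign_phi_h, 2)   # q_H = (sign Phi_h2 - sign Phi_h) / 2
    if rem_p or rem_q or p_H < 0 or q_H < 0 or sign_tr_B < 0:
        raise ValueError("these signatures cannot come from Theorem 5.8")
    return p_H, q_H, sign_tr_B, p_H + q_H + sign_tr_B


# Assumed example: n = 1, g = X(X-1)(X+2)(X^2+1), H = X.
# By hand: H > 0 at 1, H < 0 at -2, H = 0 at 0, so sign Phi_h = 0,
# sign Phi_h2 = 2, and B = K[X]/<X> = K has sign Tr = 1.
counts = recover_counts(0, 2, 1)
```

In this example the call returns (1, 1, 1, 3), in agreement with the three real zeros 0, 1 and −2 of g.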
Bibliography
[1] Artin, M. : Algebra. Prentice Hall of India, New Delhi (1994), xviii+618 pp.
[2] Atiyah, M. F. and Macdonald, I. G. : Introduction to commutative algebra. Addison-Wesley, Reading, Mass. (1969), x+128 pp.
[3] Böttger, S. and Storch, U. : On Euler’s Proof of the Fundamental Theorem of Algebra, Journal of Indian Institute of Science, Vol. 91, No. 1 (2011), 69-91.
[4] Cox, David A., Little, John and O’Shea, Donald : Using Algebraic Geometry, Graduate Texts in Mathematics 185 (2nd edition), Springer-Verlag, New York (2005).
[5] Gianni, P. and Mora, T. : Algebraic solutions of systems of polynomial equations using Gröbner bases, Proc. AAECC 5, LNCS 356 (1989), 247-257.
[6] Goel, Kriti, Patil, Dilip P. and Verma, Jugal : Nullstellensätze and Applications, Preprint, IIT Bombay 2018.
[7] Hermite, C. : Sur L’Extension du Théorème de M. Sturm a un Système D’Équations Simultanées, C. R. Acad. Sci., Paris 35 (1852).
[8] Hermite, C. : Sur L’Extension du Théorème de M. Sturm a un Système D’Équations Simultanées, Oeuvres de Charles Hermite, Tome III., 1-34.