Certification for Polynomial Systems via Square Subsystems

Timothy Duff; Nickolas Hein; and Frank Sottile

arXiv:1812.02851·math.AG·July 7, 2020·J. Symb. Comput.

TL;DR

This paper introduces methods for certifying solutions to overdetermined polynomial systems by leveraging square subsystems and advanced algebraic techniques, enabling reliable solution verification and completeness checks.

Contribution

It proposes novel certification approaches for overdetermined polynomial systems using square subsystems and algebraic tools like liaison and Newton-Okounkov bodies.

Findings

01. Effective certification of solutions to overdetermined systems.

02. Methods to reject nonsolutions and verify completeness.

03. Use of algebraic geometry tools for solution certification.

Abstract

We consider numerical certification of approximate solutions to a system of polynomial equations with more equations than unknowns by first certifying solutions to a square subsystem. We give several approaches that certifiably select which are solutions to the original overdetermined system. These approaches each use different additional information for this certification, such as liaison, Newton-Okounkov bodies, or intersection theory. They may be used to certify individual solutions, reject nonsolutions, or certify that we have found all solutions.

Equations

$$N_{f}(z)\ :=\ z-Df(z)^{-1}f(z),$$

$$N_{f,\dagger}(z)\ :=\ z-Df(z)^{\dagger}f(z),$$

$$g_{i}(z)\ :=\ \partial_{z_{i}}\bigl(f_{1}^{2}+\dotsb+f_{N}^{2}\bigr)\ =\ 0\qquad(1\leq i\leq n).$$

$$V_{f}\ =\ \{(z,p)\in\mathbb{C}^{n}\times\mathbb{C}^{m}\mid f(z,p)=0\}.$$

$$\|(D^{k}g)_{\zeta}\|\ :=\ \max_{\substack{w\in S^{k}\mathbb{C}^{n}\\ \|w\|=1}}\|(D^{k}g)_{\zeta}(w)\|.$$

$$|f(\hat{\zeta})|-\sum_{k=1}^{\deg f}\frac{\|(D^{k}f)_{\hat{\zeta}}\|}{k!}\cdot\rho^{k}\ >\ 0,$$

$$\|\hat{\zeta_{1}}-\hat{\zeta_{2}}\|\ >\ \rho_{1}+\rho_{2}.$$

$$\alpha(g,\hat{\zeta})\ :=\ \beta(g,\hat{\zeta})\,\gamma(g,\hat{\zeta}),\qquad\beta(g,\hat{\zeta})\ :=\ \|Dg(\hat{\zeta})^{-1}g(\hat{\zeta})\|,\qquad\gamma(g,\hat{\zeta})\ :=\ \sup_{k\geq 2}\Bigl\|\frac{Dg(\hat{\zeta})^{-1}(D^{k}g)_{\hat{\zeta}}}{k!}\Bigr\|^{\frac{1}{k-1}}$$

$$\alpha(g,\hat{\zeta})\ <\ \frac{13-3\sqrt{17}}{4}\ \approx\ 0.15767078,$$

$$\|\hat{\zeta}-\hat{\zeta}^{\prime}\|\ <\ \frac{1}{20\,\gamma(g,\hat{\zeta})},$$

$$d\ =\ \#\bigl(\mathcal{V}(g)\smallsetminus\mathcal{V}(f)\bigr).$$

$$g\ :=\ \begin{pmatrix}g_{1}(z)\\ \vdots\\ g_{n}(z)\end{pmatrix}\ =\ A\begin{pmatrix}f_{1}(z)\\ \vdots\\ f_{N}(z)\end{pmatrix}\ =\ 0\,.$$

$$\rho^{\prime}+\rho_{j}\ >\ \|\hat{\zeta_{j}}-\hat{\zeta}^{\prime}\|,$$

$$U_{L}\ :=\ \{z\in X\smallsetminus X_{\mathrm{sing}}\mid L_{i}\subset\mathcal{O}_{X,z}\ \mbox{ for }\ i=1,\dotsc,n\},$$

$$Z_{L}\ :=\ \bigcup_{i=1}^{n}\{z\in U_{L}\mid f(z)=0\ \ \forall f\in L_{i}\},$$

$$(\alpha_{1},k_{1})\prec_{t}(\alpha_{2},k_{2})\quad\mbox{if}\quad k_{1}>k_{2}\ \mbox{ or }\ k_{2}=k_{1}\ \mbox{ and }\ \alpha_{1}\prec\alpha_{2}.$$

$$\Psi_{L}\ \colon\ X\ \dashrightarrow\ \mathbb{P}(L^{*})\qquad z\ \mapsto\ \bigl[f\mapsto f(z)\bigr]\,,$$

$$d_{L}\ =\ \frac{n!\,\deg\Psi_{L}}{\operatorname{ind}(A_{L},\nu)}\cdot\operatorname{Vol}\Delta(A_{L},\nu).$$

$$\begin{pmatrix}f_{1}(z_{1},z_{2},z_{3})\\ f_{2}(z_{1},z_{2},z_{3})\\ f_{3}(z_{1},z_{2},z_{3})\\ f_{4}(z_{1},z_{2},z_{3})\end{pmatrix}=\begin{pmatrix}z_{1}^{2}+z_{2}^{2}-1,\\ -16\,z_{2}^{2}+8\,z_{1}+17,\\ -z_{2}^{2}+z_{1}-z_{3}-1,\\ 64\,z_{1}z_{2}+16\,z_{2}\end{pmatrix}$$

$$512t^{2}z_{1}^{2}z_{3}+6656t^{2}z_{1}z_{3}-6400t^{2}z_{3}^{2}+14000t^{2}z_{1}-26368t^{2}z_{3}-27125t^{2}\ =\ t^{2}\bigl(64f_{1}f_{2}-21f_{2}^{2}-512f_{1}f_{3}+768f_{2}f_{3}-6400f_{3}^{2}+\tfrac{1}{8}f_{4}^{2}\bigr)\in A_{L},$$

$$x:=\begin{pmatrix}2\\ 0\\ 1\\ 2\end{pmatrix}\in S(A_{L}).$$

$$\Delta(A_{L},<_{t})=\operatorname{conv}\Biggl(\begin{pmatrix}1\\ 0\\ 0\\ 1\end{pmatrix},\begin{pmatrix}0\\ 2\\ 0\\ 1\end{pmatrix},\begin{pmatrix}1\\ 1\\ 0\\ 1\end{pmatrix},\begin{pmatrix}2\\ 0\\ 0\\ 1\end{pmatrix},\begin{pmatrix}1\\ 0\\ 1/2\\ 1\end{pmatrix}\Biggr)$$

$$X=\mathcal{V}(f),\qquad X\cup Y=\mathcal{V}(g),\qquad\mbox{and}\qquad Y=\mathcal{V}(h).$$

$$\mathcal{V}(g_{1},\dotsc,g_{n})\ =\ \bigl(X\cap\mathcal{V}(g_{r+1},\dotsc,g_{n})\bigr)\ \cup\ \bigl(Y\cap\mathcal{V}(g_{r+1},\dotsc,g_{n})\bigr).$$

$$\mathcal{V}(h_{1},\dotsc,h_{r},g_{r+1},\dotsc,g_{n})\ =\ Y\cap\mathcal{V}(g_{r+1},\dotsc,g_{n}).$$

$$(-\tfrac{1}{3},-\tfrac{1}{3},-\tfrac{1}{3})\ \mbox{ on }\ \ell\ \mbox{ and }\ (-1,1,-1)\ \mbox{ and }\ (\pm\sqrt{-1},-1,\mp\sqrt{-1})\ \mbox{ on }\ C.$$

$$f=(f_{1},\dotsc,f_{s},g_{r+1},\dotsc,g_{n})$$

$$X_{1}\cap X_{2}\cap\dotsb\cap X_{m}$$

$$f_{1}=f_{2}=\dotsb=f_{n}=0.$$

Full text

Certification for Polynomial Systems

via Square Subsystems

Timothy Duff

School of Mathematics

Georgia Institute of Technology

686 Cherry Street

Atlanta, GA 30332-0160

USA

http://people.math.gatech.edu/~tduff3/

Nickolas Hein

Department of Mathematics and Computer Science

Benedictine College

1020 N. 2nd St

Atchison, KS 66002

USA

https://www.benedictine.edu/faculty-staff/hein-nickolas

Frank Sottile

Department of Mathematics

Texas A&M University

College Station

Texas 77843

USA

www.math.tamu.edu/~sottile

Key words and phrases:

certified solutions, alpha theory, polynomial system, numerical algebraic geometry, Newton-Okounkov bodies, Schubert calculus

2010 Mathematics Subject Classification:

65G20, 65H10

Work of Sottile supported in part by the National Science Foundation under grant DMS-1501370

Work of Duff supported in part by the National Science Foundation under grant DMS-1719968

Duff and Sottile supported by the ICERM

1. Introduction

Given polynomials $f=(f_{1},\dotsc,f_{N})$ with $f_{i}\in\mathbb{C}[z_{1},\dotsc,z_{n}]$ , an approximate solution to the system $f_{1}(z)=\dotsb=f_{N}(z)=0$ is an estimate $\hat{\zeta}$ of some point $\zeta$ where the polynomials all vanish ( $\zeta$ is a solution to $f$ ), such that the approximation error $\|\zeta-\hat{\zeta}\|$ can be refined efficiently as a function of the input size and desired precision. Numerical certification seeks criteria and algorithms that guarantee that a computed estimate $\hat{\zeta}$ of a solution $\zeta$ to $f$ is an approximate solution in this sense.

Many existing certification methods [21, 35] are for square systems, where $N=n$ . These exploit that the isolated, nonsingular solutions to the system are exactly the fixed points of the Newton operator $N_{f}\colon\mathbb{C}^{n}\to\mathbb{C}^{n}$ given (where defined) by

$$N_{f}(z)\ :=\ z-Df(z)^{-1}f(z)\,,\tag{1}$$

where $Df(z)$ is the Jacobian matrix of the system $f$ evaluated at $z$ . A Newton-based certificate establishes that the sequence of Newton iterates $(N_{f}^{k}(\hat{\zeta})\mid k\in\mathbb{N})$ converges to a solution $\zeta$ to $f$ . Examples include both Smale’s $\alpha$ -test [34, 35] (typically performed in rational arithmetic) and Krawczyk’s method [21] (based on interval arithmetic).
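The Newton operator can be made concrete with a minimal sketch in exact rational arithmetic. The two-variable system below, the starting point, and the names `f`, `Df`, and `newton_step` are illustrative assumptions and do not come from the paper; the $2\times 2$ linear solve uses Cramer's rule only to keep the example self-contained.

```python
from fractions import Fraction

def f(z):
    """Hypothetical square system (n = N = 2): f(x, y) = (x^2 + y^2 - 5, x - y - 1)."""
    x, y = z
    return (x * x + y * y - 5, x - y - 1)

def Df(z):
    """Jacobian matrix of f at z, stored as a pair of rows."""
    x, y = z
    return ((2 * x, 2 * y), (Fraction(1), Fraction(-1)))

def newton_step(z):
    """One application of N_f(z) = z - Df(z)^{-1} f(z) for a 2x2 Jacobian."""
    (a, b), (c, d) = Df(z)
    det = a * d - b * c
    if det == 0:
        raise ZeroDivisionError("N_f is undefined where Df(z) is singular")
    u, v = f(z)
    # Solve Df(z) * delta = f(z) by Cramer's rule.
    dx = (d * u - b * v) / det
    dy = (a * v - c * u) / det
    return (z[0] - dx, z[1] - dy)

# Iterate from a rational starting point; the iterates approach the solution (2, 1).
z = (Fraction(3), Fraction(1))
for _ in range(5):
    z = newton_step(z)
```

Running the loop alone certifies nothing: a Newton-based certificate is a proof that the iterates converge, which is what the tests discussed next provide.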

Once such a certificate is in hand, we say that $\hat{\zeta}$ is an approximate solution to $f$ with associated solution $\zeta$ . Further refinements bound the distance to the associated solution $\|\zeta-\hat{\zeta}\|$ , decide if two approximate solutions are associated to the same solution, and, in the case of real systems, decide if the associated solution is real [12].

Certification in the overdetermined case, where $N>n$ , poses challenges not encountered in the square case. A detailed study of the “least-squares” Newton operator,

$$N_{f,\dagger}(z)\ :=\ z-Df(z)^{\dagger}f(z)\,,\tag{2}$$

was undertaken by Dedieu and Shub [5]. Here, “ $\dagger$ ” indicates the Moore-Penrose pseudoinverse. Among the contributions in their work is a generalization of Smale’s $\alpha$ -theorem, Theorem 3 in [5], giving sufficient conditions for the convergence of $(N_{f,\dagger}^{k}(\hat{\zeta})\mid k\in\mathbb{N})$ to a fixed point. Since there may be many fixed points of $N_{f,\dagger}$ which are not solutions to $f,$ this criterion is generally not sufficient to certify that $\hat{\zeta}$ is an approximate solution to $f.$ In fact, the fixed points of $N_{f,\dagger}$ are precisely the solutions to the square system defined by

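$$g_{i}(z)\ :=\ \partial_{z_{i}}\bigl(f_{1}^{2}+\dotsb+f_{N}^{2}\bigr)\ =\ 0\qquad(1\leq i\leq n)\,.\tag{3}$$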

Clearly the solutions to $g$ include the solutions to $f$ : we thus say that $g$ is a square subsystem of $f.$ Our point of view in this paper is that certification of approximate solutions may be possible if we are given a square subsystem of $f$ —typically not the system in Equation 3—together with some global information about the excess solutions.
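A small worked instance shows how excess solutions arise. The polynomials are chosen here purely for illustration and do not appear in the paper. Take $n=1$, $N=2$, and form the derivative of the sum of squares:

```latex
\begin{align*}
f_1 &= z^2-1 = (z-1)(z+1), \qquad f_2 = z^2+z-2 = (z-1)(z+2),\\
g(z) &= \partial_z\bigl(f_1^2+f_2^2\bigr) = 2f_1f_1' + 2f_2f_2'
      = 2\,(4z^3+3z^2-5z-2) = 2\,(z-1)(4z^2+7z+2).
\end{align*}
```

The only common zero of $f_1$ and $f_2$ is $z=1$, while $g$ also vanishes at $z=(-7\pm\sqrt{17})/8$. In this toy case the square subsystem has two excess solutions, so $\#\bigl(\mathcal{V}(g)\smallsetminus\mathcal{V}(f)\bigr)=2$.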

Alternate approaches to certification in the overdetermined case have been considered in previous work. In [1] a hybrid symbolic-numeric approach is used when the polynomials in $f$ have rational coefficients. This requires computing an exact rational univariate representation [32] and using that to certify approximate solutions. Alternatively, one may attempt to lift the approximation $\hat{\zeta}$ to an approximate solution to a square system in more variables. This is the approach taken for Schubert problems in [11, 13].

Though reduction to the case of a square subsystem is a natural idea, a suitable square subsystem $g$ and the requisite global information are not easily obtained in general. Rather than prescribing a single approach, we follow the pattern of this reduction through a series of algorithms and examples. We highlight how abstract tools such as liaison theory and Newton-Okounkov bodies may be brought to bear on certification, and illustrate certification for problems of interest in the Schubert calculus and computer vision.

Remark 1.1.

It is well-understood that perturbing an overdetermined system, e.g., by adding generic constants, will generally produce an inconsistent system. Nevertheless, there are many families of overdetermined polynomial systems such that a generic member of the family has finitely many isolated solutions. In other words, a family of overdetermined systems need not be over-constrained. In general, a family of polynomial systems specified by some parameters $p\in\mathbb{C}^{m}$ may be understood via the incidence variety

$$V_{f}\ =\ \{(z,p)\in\mathbb{C}^{n}\times\mathbb{C}^{m}\mid f(z,p)=0\}\,.$$

The family is well-constrained if the projection onto $\mathbb{C}^{m}$ is dominant and $\dim V_{f}=m.$ The examples in Sections 5.2 and 5.3 fit naturally into the category of well-constrained families of overdetermined systems, where it is reasonable to seek certificates even for generic $p.$

The algorithms in our paper address the problems below.

Problem 1. How may we certify that a point $\zeta\in\mathbb{C}^{n}$ is an approximate solution to $f$ ?

Problem 2. Suppose it is known that $f$ has $e$ solutions. How may we certify that a set $Z\subset\mathbb{C}^{n}$ of $e$ points consists of approximate solutions to $f$ ?

In Section 2, we recall various notions of “approximate solution” that have been considered in the literature; our Definitions 2.1 and 2.5 encompass these various notions and apply just as well to the overdetermined case. We also explain a simple exclusion criterion that indirectly certifies particular approximate solutions to a subsystem $g$ of $f$ as non-solutions to $f$. In Section 3, we explain how global information about the excess solutions to $g$ may be used to certify approximate solutions to $f$ in the sense of Section 2, thus solving Problems 1 and 2. We also give an alternative approach for Problem 2 that incorporates global information about $f$, and explain how the global information about $g$ can be calculated in terms of Khovanskii bases and their associated Newton-Okounkov bodies in Section 3.2. Section 4 discusses one further approach to Problem 1 that is based on liaison theory. In Section 5, we give three examples illustrating our algorithms. One involves a finite Khovanskii basis, another is from the Schubert calculus, and a third is from computer vision.

2. Approximate solutions

Throughout, we will fix positive integers $n\leq N$ . All polynomials will lie in the ring $\mathbb{C}[x_{1},\dotsc,x_{n}]$ . We will write $f$ for a system $f_{1},\dotsc,f_{N}$ of $N$ polynomials. The system $f$ is square when $N=n$ .

Definition 2.1.

A $\rho$ -approximate solution to a (possibly overdetermined) polynomial system $f$ is a triple $(\hat{\zeta},\rho,\mathcal{N}_{f})$ , where $\hat{\zeta}\in\mathbb{C}^{n}$ , $\rho\in\mathbb{R}_{>0}$ , and $\mathcal{N}_{f}\colon U\to\mathbb{C}^{n}$ is a map defined on some $U\subset\mathbb{C}^{n}$ such that

1) there exists $\zeta\in\mathcal{V}(f)$ such that $\lVert\zeta-\hat{\zeta}\rVert<\rho$, and

2) all iterates $\mathcal{N}_{f}^{k}(\hat{\zeta})$ are defined and the sequence $\mathcal{N}_{f}^{k}(\hat{\zeta})$ converges to $\zeta$ as $k\to\infty$.

Here $\lVert\cdot\rVert$ indicates the usual Hermitian norm on $\mathbb{C}^{n}$ . We will refer to $\hat{\zeta}$ as an approximate solution when the procedure $\mathcal{N}_{f}$ and constant $\rho$ are understood. We call the point $\zeta\in\mathcal{V}(f)$ in (1) the solution to $f$ associated to $\hat{\zeta}$ . We sometimes refer to $\mathcal{N}_{f}$ as a refinement operator. This is typically some incarnation of Newton’s method.

Remark 2.2.

In our examples, the system $f$ and the approximate solution $\hat{\zeta}$ are defined over the rationals $\mathbb{Q}$ or the Gaussian rationals $\mathbb{Q}[\sqrt{-1}]$ . Since numerical solvers typically output floating point results, care must be taken to control rounding errors when computing certificates. One option for certification is to perform all subsequent operations in rational arithmetic. Interval and ball arithmetic give yet another approach (discussed in Subsection 2.2). A “certificate” obtained without controlling rounding errors may still be of practical value. Following [12], we call this a soft certificate.

Remark 2.3.

In practice, the map $\mathcal{N}_{f}$ in Definition 2.1 should restrict to a computable function $\mathcal{N}_{f,\mathbb{Q}}\colon U\cap\mathbb{Q}[\sqrt{-1}]^{n}\to\mathbb{Q}[\sqrt{-1}]^{n}$ . Our algorithms assume an oracle for $\mathcal{N}_{f},$ and we generally take a naive approach to questions of computability and complexity. However, we do not rely on special features of nonstandard models of computation such as the Blum-Shub-Smale machine [3].

Let $g$ be a square system and $(\hat{\zeta},\rho,\mathcal{N}_{g})$ be an approximate solution to $g$ with associated solution $\zeta$. Our main concern is to certify that $\zeta\in\mathcal{V}(f)$ when $g$ is a square subsystem of $f$, a seemingly difficult task a priori. It is, however, relatively simple to certify that $\zeta$ is not a solution to a single polynomial $f$, provided that $\rho$ is sufficiently small. For $k\in\mathbb{N}$, let $S^{k}\mathbb{C}^{n}$ be the $k$th symmetric power of $\mathbb{C}^{n}$. This has a norm $\lVert\cdot\rVert$ dual to the standard unitarily invariant norm on homogeneous polynomials, which satisfies $\|z^{k}\|\leq\|z\|^{k}$ for $z\in\mathbb{C}^{n}$. The $k$th derivative of $g$ at $\zeta$ is a linear map $(D^{k}\,g)_{\zeta}:S^{k}(\mathbb{C}^{n})\to\mathbb{C}^{n}$ with operator norm

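$$\|(D^{k}g)_{\zeta}\|\ :=\ \max_{\substack{w\in S^{k}\mathbb{C}^{n}\\ \|w\|=1}}\|(D^{k}g)_{\zeta}(w)\|\,.$$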

Proposition 2.4.

Suppose that $(\hat{\zeta},\rho,\mathcal{N}_{g})$ is an approximate solution to a square polynomial system $g$ with associated solution $\zeta$ . For any polynomial $f$ , if

$$|f(\hat{\zeta})|-\sum_{k=1}^{\deg f}\frac{\|(D^{k}f)_{\hat{\zeta}}\|}{k!}\cdot\rho^{k}\ >\ 0\,,\tag{5}$$

then $f(\zeta)\neq 0$ .

Proof.

By Taylor expansion, it follows that $f(z)\neq 0$ for any $z\in B(\hat{\zeta},\rho)$ . ∎
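The Taylor expansion step can be spelled out as follows; this is a sketch of the standard estimate, not a quotation from the paper. Write $z=\hat{\zeta}+w$ with $\|w\|<\rho$ and use the bound $\|w^{k}\|\leq\|w\|^{k}$ on symmetric powers together with the hypothesis of Proposition 2.4:

```latex
\begin{align*}
f(z) &= f(\hat{\zeta}) + \sum_{k=1}^{\deg f} \frac{(D^{k}f)_{\hat{\zeta}}(w^{k})}{k!},\\
|f(z)| &\geq |f(\hat{\zeta})| - \sum_{k=1}^{\deg f} \frac{\|(D^{k}f)_{\hat{\zeta}}\|}{k!}\,\|w\|^{k}
 \;\geq\; |f(\hat{\zeta})| - \sum_{k=1}^{\deg f} \frac{\|(D^{k}f)_{\hat{\zeta}}\|}{k!}\,\rho^{k} \;>\; 0 .
\end{align*}
```

Since $\|\zeta-\hat{\zeta}\|<\rho$ by Definition 2.1, the associated solution $\zeta$ lies in $B(\hat{\zeta},\rho)$, and therefore $f(\zeta)\neq 0$.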

Let us write $\delta(f,g,\hat{\zeta})$ for the difference in the inequality (5), which we will call a Taylor residual. Note the implicit dependence on $\rho$ in Definition 2.1. If $f=(f_{1},\dotsc,f_{N})$ is a polynomial system, then we define its Taylor residual $\delta(f,g,\hat{\zeta})$ to be the maximum of the Taylor residuals $\delta(f_{i},g,\hat{\zeta})$, for $i=1,\dotsc,N$. For this test of nonvanishing using Taylor residuals to be practical, we need to estimate the operator norms of the higher derivatives. One possible bounding strategy, as explained in [34, §I-3] and [12, §1.1], uses the first derivative alone. Another option, less suitable for polynomials of high degree, is to bound with the entry-wise $\ell_{2}$ or $\ell_{1}$ norms of these tensors.
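In the univariate case the operator norm of the $k$th derivative is just $|f^{(k)}(\hat{\zeta})|$, so the Taylor residual can be computed exactly in rational arithmetic. The sketch below assumes this specialization; the function names and the coefficient-list encoding (`coeffs[i]` is the coefficient of $z^{i}$) are illustrative choices, and the radius $\rho$ is passed explicitly rather than obtained from a certificate for $g$.

```python
from fractions import Fraction
from math import factorial

def derivatives_at(coeffs, x):
    """Return [f(x), f'(x), ..., f^(deg)(x)] for f = sum(coeffs[i] * x**i)."""
    values = []
    c = list(coeffs)
    while c:
        values.append(sum(a * x**i for i, a in enumerate(c)))
        # Differentiate: d/dx sum a_i x^i = sum i * a_i x^(i-1).
        c = [i * a for i, a in enumerate(c)][1:]
    return values

def taylor_residual(coeffs, zeta_hat, rho):
    """|f(zeta_hat)| - sum_{k>=1} |f^(k)(zeta_hat)| / k! * rho^k for a univariate f."""
    d = derivatives_at(coeffs, zeta_hat)
    tail = sum(abs(d[k]) / factorial(k) * rho**k for k in range(1, len(d)))
    return abs(d[0]) - tail

# f(z) = z^2 - 1 near a point far from its zeros: the residual is positive.
far = taylor_residual([-1, 0, 1], Fraction(-3, 2), Fraction(1, 10))
# Near the zero z = 1 the residual is negative, so the test is inconclusive there.
near = taylor_residual([-1, 0, 1], Fraction(101, 100), Fraction(1, 10))
```

A positive residual excludes a zero of $f$ in the ball; a non-positive residual says nothing either way, which is why the test can only reject candidates.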

A consequence of Definition 2.1 is that each iterate $\mathcal{N}_{f}^{k}(\hat{\zeta})$ is an approximate solution, as $\mathcal{N}_{f}^{k}(\hat{\zeta})\to\zeta$ . We wish to quantify this rate of convergence. The triangle inequality gives a test for when approximate solutions $(\hat{\zeta_{1}},\rho_{1},\mathcal{N}_{f})$ and $(\hat{\zeta_{2}},\rho_{2},\mathcal{N}_{f})$ have distinct associated solutions, namely if

$$\|\hat{\zeta_{1}}-\hat{\zeta_{2}}\|\ >\ \rho_{1}+\rho_{2}\,.\tag{6}$$

It is useful to have an additional criterion for when two approximate solutions have the same associated solution; that is, we wish to certify uniqueness of the associated solution in a sufficiently small region. This motivates our next definition.

Definition 2.5.

An effective approximate solution $(\hat{\zeta},\mathcal{N}_{f},\rho,k_{*})$ to a system $f$ consists of a weakly decreasing rate function $\rho\colon\mathbb{N}\to\mathbb{R}_{>0}$ with $\lim_{k\to\infty}\rho(k)=0$ , and an integer $k_{*}$ such that

1) $\hat{\zeta}$ is a $\rho(0)$-approximate solution to $f$ with associated solution $\zeta$,

2) $\lVert\mathcal{N}_{f}^{j}(\hat{\zeta})-\zeta\rVert<\rho(k)$ for all $j\geq k$, and

3) for some iterate $k_{*}$, $\zeta$ is the unique solution in the ball $B\left(\mathcal{N}_{f}^{(k_{*})}(\hat{\zeta}),\,2\,\rho(k_{*})\right)$.

We say the rate of convergence for the effective approximate solution has order $\rho(k)$ .

The rate of convergence is quadratic when $\rho(k)=2^{-2^{O(k)}}\,\lVert\zeta-\hat{\zeta}\rVert$ . This implies that each application of $\mathcal{N}_{f}(\cdot)$ roughly doubles the number of significant digits in $\hat{\zeta}$ . We generalize the method for certifying distinct solutions in [12, §I-2].

Proposition 2.6.

Given a set of effective approximate solutions $S^{\prime}=\{(\hat{\zeta_{i}},\mathcal{N}_{f},\rho_{i},k_{*}^{i})\}$ to a system $f$ , we may compute a set $S$ of refined approximate solutions with distinct associated solutions comprising all solutions associated to the set $S^{\prime}$ .

Proof.

We need only replace each $\hat{\zeta_{i}}$ with its refinement $\mathcal{N}_{f}^{k_{*}^{i}}(\hat{\zeta_{i}})$ . After refinement, the solutions associated to $\hat{\zeta_{i}}$ and $\hat{\zeta_{j}}$ are distinct if and only if inequality (6) holds. ∎
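The selection step in this proof can be sketched as follows. The sketch assumes the points have already been refined to their $k_{*}$ iterates and, for brevity, takes $n=1$ so that points are complex numbers; the function name and the greedy order are illustrative assumptions rather than part of the paper.

```python
def distinct_representatives(points, radii):
    """Keep one refined point per associated solution.

    Assumes radii[i] = rho_i(k_*) for the refined point points[i], as in
    Definition 2.5. Two points are treated as having distinct associated
    solutions exactly when |z1 - z2| > r1 + r2 holds.
    """
    kept = []
    for z, r in zip(points, radii):
        # Keep z only if it is separated from every representative kept so far.
        if all(abs(z - w) > r + s for w, s in kept):
            kept.append((z, r))
    return kept

# Two refinements of the same solution near 1, and one near -2.
reps = distinct_representatives([1 + 0j, 1.001 + 0j, -2 + 0j], [0.01, 0.01, 0.01])
```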

Proposition 2.4 may fail to certify that an extraneous solution $\zeta$ is not a solution to $f$ . However, if $\hat{\zeta}$ is an effective approximate solution for $\mathcal{N}_{g},$ then this test will succeed after sufficiently many refinements.

Corollary 2.7.

Let $f=(f_{1},\dotsc,f_{N})$ be a system of polynomials, and suppose $(\hat{\zeta},\mathcal{N}_{g},\rho,k_{*})$ is an effective approximate solution such that the associated solution $\zeta\not\in\mathcal{V}(f)$. There is a $k\geq 0$ such that the Taylor residuals $\delta\left(f,g,\mathcal{N}_{g}^{i}(\hat{\zeta})\right)$ are positive for all $i\geq k$.

Proof.

Since $\mathcal{N}_{g}^{j}(\hat{\zeta})\to\zeta,$ we may argue that $\delta\left(f,g,\mathcal{N}_{g}^{k}(\hat{\zeta})\right)>0$ for some $k$ by Taylor expansion as in Proposition 2.4. Since $\rho(\cdot)$ is weakly decreasing, it follows from 2) in Definition 2.5 that the Taylor residuals remain positive for all $i\geq k.$ ∎

Certificates for square systems are generally based on Newton’s method. We now observe that Definitions 2.1 and 2.5 encapsulate several existing certification paradigms for square systems.

2.1. Smale’s $\alpha$ -theory

The central quantities of Smale’s $\alpha$ -theory are defined as follows. With $g$ as above and $\hat{\zeta}\in\mathbb{C}^{n}$ a point where $Dg(\hat{\zeta})$ is invertible,

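$$\alpha(g,\hat{\zeta})\ :=\ \beta(g,\hat{\zeta})\,\gamma(g,\hat{\zeta}),\qquad\beta(g,\hat{\zeta})\ :=\ \|Dg(\hat{\zeta})^{-1}g(\hat{\zeta})\|,\qquad\gamma(g,\hat{\zeta})\ :=\ \sup_{k\geq 2}\Bigl\|\frac{Dg(\hat{\zeta})^{-1}(D^{k}g)_{\hat{\zeta}}}{k!}\Bigr\|^{\frac{1}{k-1}}.$$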

Note that $\beta(g,\hat{\zeta})$ is the length of a Newton step at $\hat{\zeta}$ . The following proposition gives a criterion for approximate solutions in the sense of Definition 2.1.

Proposition 2.8 ([3, p. 160]).

Let $g$ be a square polynomial system and $\hat{\zeta}\in\mathbb{C}^{n}$ . If

$$\alpha(g,\hat{\zeta})\ <\ \frac{13-3\sqrt{17}}{4}\ \approx\ 0.15767078\,,$$

then $\hat{\zeta}$ is a $2\beta(g,\hat{\zeta})$-approximate solution to $g$, and the Newton iterates $N_{g}^{k}(\hat{\zeta})$ converge quadratically.
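For a univariate polynomial the quantities $\alpha$, $\beta$, and $\gamma$ reduce to absolute values of derivatives, and the $\alpha$-test takes a few lines. The sketch below uses floating point, so in the terminology of Remark 2.2 it gives only a soft certificate; the function name, the coefficient-list encoding, and the example $g(z)=z^{2}-2$ are illustrative assumptions. It requires $\deg g\geq 2$ and $g'(z)\neq 0$.

```python
from math import factorial, sqrt

ALPHA_BOUND = (13 - 3 * sqrt(17)) / 4  # approximately 0.15767078

def smale_constants(coeffs, z):
    """Return (alpha, beta, gamma) for the polynomial sum(coeffs[i] * z**i) at z."""
    derivs = []
    c = list(coeffs)
    while c:
        derivs.append(sum(a * z**i for i, a in enumerate(c)))
        c = [i * a for i, a in enumerate(c)][1:]  # coefficients of the derivative
    g, dg = derivs[0], derivs[1]
    beta = abs(g / dg)  # length of the Newton step at z
    gamma = max(
        (abs(derivs[k] / dg) / factorial(k)) ** (1.0 / (k - 1))
        for k in range(2, len(derivs))
    )
    return beta * gamma, beta, gamma

# g(z) = z^2 - 2 at the estimate 1.5 of sqrt(2).
alpha, beta, gamma = smale_constants([-2, 0, 1], 1.5)
passes_alpha_test = alpha < ALPHA_BOUND
```

Here $\beta=1/12$, $\gamma=1/3$, and $\alpha=1/36$, so the test passes, and the true error $|\sqrt{2}-1.5|\approx 0.0858$ is indeed below $2\beta\approx 0.1667$. A hard certificate would carry out the same comparison in exact or interval arithmetic.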

Criteria for quadratically convergent effective approximate solutions in the sense of Definition 2.5 can also be given in terms of $\alpha(g,\hat{\zeta})$ . The analysis amounts to showing that $N_{g}$ is a contraction mapping in a suitable neighborhood of $\zeta$ . This is given by the “robust” $\alpha$ -theorem (Theorem 6 and Remark 9 of [3, Ch. 8]).

Proposition 2.9.

Let $g$ be a square polynomial system and $\hat{\zeta}\in\mathbb{C}^{n}$ an approximate solution to $g$ with associated solution $\zeta$ and suppose that $\alpha(g,\hat{\zeta})<0.03$ . If $\hat{\zeta}^{\prime}\in\mathbb{C}^{n}$ satisfies

$$\|\hat{\zeta}-\hat{\zeta}^{\prime}\|\ <\ \frac{1}{20\,\gamma(g,\hat{\zeta})}\,,$$

then $\hat{\zeta}^{\prime}$ is an approximate solution to $g$ with associated solution $\zeta$ .

It follows that, for $\rho(k)=2^{-2^{k-1}}\beta(g,\hat{\zeta})$, we have that $(\hat{\zeta},N_{g},\rho,0)$ is an effective approximate solution in the sense of Definition 2.5.

2.2. Other approaches

The classical analysis of Newton’s method is due to Kantorovich [14]. Several variations of Kantorovich’s theorem exist, typically assuming some local Lipschitz condition on the Jacobian $D_{g}$ and boundedness conditions on $D_{g}(\hat{\zeta})^{-1}$. Certificates based on Kantorovich’s theorem thus rely on a priori bounds in a region containing $\hat{\zeta}$. Explicit bounds on the rate of convergence in terms of the Lipschitz and bounding constants are given in various works [40, 9, 7]. We refer to [24] for a survey of variants and an explanation of the relationship between Kantorovich’s theorem and $\alpha$-theory.

Approximate solutions may also be understood within the general program of interval and ball arithmetics. Both paradigms rely on defining arithmetic operations on intervals or balls and are definable in either exact or floating point arithmetic. In general, operations on intervals represent enclosures. In exact interval arithmetic, we define the sum by $[a,b]+[c,d]=[a+c,b+d]$. For floating point arithmetic, we may either accept a soft certificate or control rounding errors when defining arithmetic operations so as to obtain a rigorous certificate. We refer to [22, 29, 41] for a more comprehensive treatment of these notions. A variety of interval/ball-valued Newton iterations have been studied. A popular variant is the Krawczyk method; see [29, Chapter 6] for an introduction, [28] for quadratic convergence, and [4] for extensions to complex analytic functions. Once a Newton-like iteration is in place, we get criteria for approximate solutions in the sense of Definitions 2.1 and 2.5 by taking $\hat{\zeta}$ to be the midpoint/center of the enclosing interval/ball.
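Exact interval arithmetic is straightforward to sketch with rational endpoints. The class below is an illustrative toy, not any particular library's API: it implements the sum from the text together with the analogous difference and product, each of which encloses every possible result of the operation on points of the operands.

```python
from fractions import Fraction

class Interval:
    """Closed interval [lo, hi] with exact rational endpoints."""

    def __init__(self, lo, hi):
        self.lo, self.hi = Fraction(lo), Fraction(hi)
        if self.lo > self.hi:
            raise ValueError("empty interval")

    def __add__(self, other):
        # [a, b] + [c, d] = [a + c, b + d]
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # [a, b] - [c, d] = [a - d, b - c]
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        # The product range is spanned by the four endpoint products.
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))

    def __contains__(self, x):
        return self.lo <= x <= self.hi

x = Interval(1, 2)
y = Interval(-1, 3)
total, diff, prod = x + y, x - y, x * y
```

A floating point version would additionally round lower endpoints down and upper endpoints up, which is the rounding control needed for a rigorous certificate.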

3. Certification via nonsolutions

In this section we consider certification in the setting where we have an overdetermined system given by $f_{1},\ldots,f_{N}\in\mathbb{C}[x_{1},\ldots,x_{n}],$ a full set of approximate solutions to some square subsystem $g,$ and prior knowledge of an integer $d$ such that

$$d\ =\ \#\bigl(\mathcal{V}(g)\smallsetminus\mathcal{V}(f)\bigr)\,.\tag{8}$$

From this information, the Newton operator $N_{g}$ can be used to give an approximate solution to $f$ in the sense of Definition 2.1. We make this precise in Section 3.1; Algorithm 1 provides one possible solution to Problem 1 from the introduction. For Problem 2, we give an essentially different approach (Algorithm 2) that assumes knowledge of the number of solutions to $f$ .

Knowledge of $d$ may come from rigorous mathematical proof (e.g., the examples in Section 5.2) or from some form of certified computation. If all points in $\mathcal{V}(g)$ are isolated, then $d$ is simply the degree of the saturated ideal $\langle g\rangle:\langle f\rangle^{\infty}$. Thus, if we can compute Gröbner bases for the ideals generated by both polynomial systems, then we obtain a value of $d$ that is an admissible input for Algorithms 1 and 2, which are given in this section.

Aside from addressing Problems 1 and 2, we explain another approach to computing $d$ in the special case where $g$ is obtained by “squaring up”, or randomization [42]. This means we have a suitably generic $n\times N$ matrix $A\in\mathbb{C}^{n\times N}$ such that

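$$g\ :=\ \begin{pmatrix}g_{1}(z)\\ \vdots\\ g_{n}(z)\end{pmatrix}\ =\ A\begin{pmatrix}f_{1}(z)\\ \vdots\\ f_{N}(z)\end{pmatrix}\ =\ 0\,.\tag{9}$$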

For such $g,$ the number $d$ can sometimes be computed from Khovanskii bases (a generalization of SAGBI bases) for the polynomial algebra in $n{+}1$ variables given by $\mathbb{C}[tf_{1},\ldots,tf_{N}]$ . We give an overview of this theory in Section 3.2. An attractive feature of Khovanskii bases is that we may, in principle, work with the algebra $\mathbb{C}[tf_{1},\ldots,tf_{N}]$ itself rather than some presentation $\mathbb{C}[x_{1},\ldots,x_{N}]\to\mathbb{C}[t\,f_{1},\ldots,t\,f_{N}]$ . There is, however, a significant trade-off, which is that finite Khovanskii bases need not exist. Nevertheless, we feel that computation of Khovanskii bases deserves to be more thoroughly explored. Certification is a particular application which may benefit from more efficient and robust computational tools for Khovanskii bases.
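Squaring up amounts to taking $n$ linear combinations of the $f_{i}$. The sketch below shows the construction for a toy system with $n=1$ and $N=2$; the polynomials, the fixed matrix `A` (standing in for a suitably generic one), and the name `square_up` are illustrative assumptions.

```python
from fractions import Fraction

def square_up(A, f):
    """Return g = A f as a list of callables, one per row of the n x N matrix A."""
    def make_row(row):
        return lambda z: sum(a * fi(z) for a, fi in zip(row, f))
    return [make_row(row) for row in A]

# Hypothetical overdetermined system with common zero z = 1.
f = [lambda z: z * z - 1, lambda z: z * z + z - 2]
A = [[Fraction(1), Fraction(1)]]   # a fixed choice standing in for a generic A
g = square_up(A, f)                # g_1 = f_1 + f_2 = (z - 1)(2z + 3)
```

In this toy case $g_{1}$ vanishes at the common zero $z=1$ and at the excess solution $z=-3/2$, where $f_{1}\neq 0$, so $d=1$.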

3.1. Certification algorithms

We now formulate our first algorithm for solving Problem 1. This is the content of Theorem 3.2, based on Definition 2.1. Note that Algorithms 1 and 2 assume pairwise distinct approximate solutions and explicit separating balls, respectively. By Proposition 2.6, it is enough to require that the $\hat{\zeta_{i}}$ are effective approximate zeros and apply these algorithms after refinement.

Algorithm 1 (Certifying individual solutions).

Input: $(f,g,d,S)$

$f$: a polynomial system
$g$: a square subsystem of $f$
$d\in\mathbb{N}$: an integer satisfying (8)
$S=\{\hat{\zeta_{1}},\ldots,\hat{\zeta_{m}}\}$: pairwise distinct approximate solutions to $g$

Output: $T\subset S$, a set of approximate solutions to $f$

Initialize $R\leftarrow\emptyset$

for $j=1,\dotsc,m$: if $\delta(f,g,\hat{\zeta_{j}})>0$ then $R\leftarrow R\cup\{\hat{\zeta_{j}}\}$

if $\#R=d$ then $T\leftarrow S\smallsetminus R$, else $T\leftarrow\emptyset$

return $T$

Remark 3.1.

A priori, we only need to know that $d\geq\#\left(\mathcal{V}(g)\setminus\mathcal{V}(f)\right)$ —if the inequality is strict, we necessarily return an empty set.

Theorem 3.2.

Suppose that $f,g,d,S$ are valid input for Algorithm 1. Then its output consists of approximate solutions to $f$ .

Proof.

If $T$ is empty there is nothing to prove. Otherwise, there are $d$ distinct solutions to $g$ associated to points of $R$ —by Proposition 2.4, these are not solutions to $f$ . Since the solutions associated to points of $T$ are disjoint from those associated to points of $R$ , by assumption and (8) they associate to solutions to $f$ . ∎
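Algorithm 1 can be sketched in a few lines once a Taylor residual oracle is available. The example below is a toy instance under stated assumptions: the system $f=(z^{2}-1,\ z^{2}+z-2)$, the subsystem $g=f_{1}+f_{2}=(z-1)(2z+3)$ with $d=1$, the radius $\rho=1/100$, and all function names are hypothetical, and the residual is the univariate specialization computed in exact rational arithmetic.

```python
from fractions import Fraction
from math import factorial

def taylor_residual(coeffs, z, rho):
    """Univariate Taylor residual |f(z)| - sum_{k>=1} |f^(k)(z)| / k! * rho^k."""
    values, c = [], list(coeffs)
    while c:
        values.append(sum(a * z**i for i, a in enumerate(c)))
        c = [i * a for i, a in enumerate(c)][1:]   # coefficients of the derivative
    return abs(values[0]) - sum(
        abs(v) / factorial(k) * rho**k for k, v in enumerate(values) if k >= 1)

def certify_individual(S, residual, d):
    """Sketch of Algorithm 1.

    S        -- pairwise distinct approximate solutions to g
    residual -- oracle returning the Taylor residual delta(f, g, z)
    d        -- the number of solutions to g that are not solutions to f
    """
    R = [z for z in S if residual(z) > 0]    # certified non-solutions to f
    if len(R) == d:
        return [z for z in S if z not in R]  # T = S \ R
    return []                                # inconclusive: return the empty set

f_coeffs = [[-1, 0, 1], [-2, 1, 1]]          # f_1 = z^2 - 1, f_2 = z^2 + z - 2
rho = Fraction(1, 100)
S = [Fraction(1001, 1000), Fraction(-1499, 1000)]
residual = lambda z: max(taylor_residual(c, z, rho) for c in f_coeffs)
T = certify_individual(S, residual, d=1)
```

The point near $-3/2$ is rejected by a positive residual, the count of rejected points matches $d$, and so the remaining point near $1$ is returned. Calling the same function with an overestimate such as `d=2` returns the empty list, in line with Remark 3.1.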

We now give a second algorithm, which uses $\alpha$-theory to certify solutions to an overdetermined system $f$ and thereby solves Problem 2. Suppose that $f$ is known to have $e$ solutions and that its general square subsystems are known to have $d$ solutions. While we could apply Algorithm 1 to certify approximate solutions to $f$, we propose an alternative method.

Algorithm 2 (Certifying a set of solutions).

Input: $(d,e,f,g,g^{\prime},S,S^{\prime},B)$

$e\leq d$: integers
$f$: a polynomial system with $e$ solutions
$g,g^{\prime}$: two square subsystems of $f$
$S=\{\hat{\zeta_{1}},\ldots,\hat{\zeta_{d}}\}$: a set of $d$ distinct approximate solutions to $g$
$B=\{B(\hat{\zeta_{1}},\rho_{1}),\,\ldots,\,B(\hat{\zeta_{d}},\rho_{d})\}$: disjoint balls separating the elements of $S$
$S^{\prime}$: a set of $d$ distinct approximate solutions to $g^{\prime}$

Output: $T\subset S$, a set of approximate solutions to $f$

1: Initialize $T\leftarrow\emptyset$

2: $r\leftarrow\displaystyle\min_{1\leq i<j\leq d}\,\Big(\|\hat{\zeta}_{i}-\hat{\zeta}_{j}\|-(\rho_{i}+\rho_{j})\,\Big)$

3: for $\hat{\zeta}^{\prime}\in S^{\prime}$ do

4: repeat $\hat{\zeta}^{\prime}\leftarrow N_{g^{\prime}}(\hat{\zeta}^{\prime})$ until $2\,\beta(g^{\prime},\hat{\zeta}^{\prime})<r/3$

5: $\rho^{\prime}\leftarrow 2\,\beta(g^{\prime},\hat{\zeta}^{\prime})$

6: for $j=1,\,\ldots,d$: if $B(\hat{\zeta_{j}},\rho_{j})\cap B(\hat{\zeta}^{\prime},\rho^{\prime})\neq\emptyset$ then $T\leftarrow T\cup\{\hat{\zeta_{j}}\}$

7: end for

8: if $\#T=e$ then return $T$, else return FAIL

Note that the intersection of balls in line 6 is non-empty if and only if

$$\rho^{\prime}+\rho_{j}\ >\ \|\hat{\zeta_{j}}-\hat{\zeta}^{\prime}\|\,,$$

so this condition may be decided in rational arithmetic if a hard certificate is desired.
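The criterion $\rho^{\prime}+\rho_{j}>\|\hat{\zeta_{j}}-\hat{\zeta}^{\prime}\|$ involves a square root, but both sides are nonnegative, so it is equivalent to the comparison of their squares. The sketch below decides it exactly for points of $\mathbb{Q}[\sqrt{-1}]^{n}$; the function name and the encoding of each coordinate as a (real, imaginary) pair of rationals are illustrative assumptions.

```python
from fractions import Fraction

def balls_meet(center1, rho1, center2, rho2):
    """Decide rho1 + rho2 > ||center1 - center2|| exactly, without square roots.

    Centers are sequences of (real, imaginary) pairs of rationals; radii are
    positive rationals.
    """
    dist_sq = sum((a - c) ** 2 + (b - d) ** 2
                  for (a, b), (c, d) in zip(center1, center2))
    # Both sides are nonnegative, so comparing squares is equivalent.
    return (rho1 + rho2) ** 2 > dist_sq

origin = [(Fraction(0), Fraction(0))]
one = [(Fraction(1), Fraction(0))]
tangent = balls_meet(origin, Fraction(1, 2), one, Fraction(1, 2))
overlap = balls_meet(origin, Fraction(3, 5), one, Fraction(3, 5))
```

Open balls that are merely tangent do not meet, so the first call returns `False` and the second returns `True`.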

Theorem 3.3.

Let $f$ be a system of polynomials having $e$ solutions whose general square subsystems have $d$ solutions. Then Algorithm 2 either returns FAIL or it returns a set $T$ of approximate solutions to $f$ whose associated solutions are all the solutions to $f$ .

As with Algorithm 1, while the hypotheses appear restrictive, they are natural from an intersection-theoretic perspective, and are satisfied by a large class of systems of equations. We explain one such family coming from Schubert calculus in Section 5.2.

Proof.

Since the balls $B(\hat{\zeta}_{i},\rho_{i})$ are pairwise disjoint, the quantity $r$ is positive. Thus the refinement of each approximate solution $\hat{\zeta}^{\prime}$ on line 4 terminates. Having refined each $\hat{\zeta}^{\prime}\in S^{\prime},$ note that $B(\hat{\zeta}^{\prime},\rho^{\prime})$ can intersect at most one ball from $B$ . Now, if $\zeta_{1},\ldots,\zeta_{e}$ are the solutions to $f$ , then we must have that some $\hat{\zeta_{i_{j}}}$ is associated to each $\zeta_{j}$ for some indices $1\leq i_{1}<i_{2}<\cdots<i_{e}\leq d$ . Thus, if $T$ has $e$ elements, then the only solutions to $g$ associated to $T$ are also solutions to $f$ . ∎

Remark 3.4.

If $g^{\prime}$ is a general square subsystem of $f$ , then it will have $d$ solutions and the only common solutions to $g$ and to $g^{\prime}$ are solutions to $f$ . In this case, if Algorithm 2 returns FAIL, then $\#T>e$ , so that some pair of balls in Step 6 meet, but their intersection does not contain a common solution to $g$ and to $g^{\prime}$ . In this case, we may then further refine the solutions in $S,S^{\prime}$ , and the corresponding balls until no such extraneous pair of balls meet.

3.2. Newton-Okounkov bodies and Khovanskii bases

Perhaps the main difficulty in applying Algorithm 1 is obtaining the correct number $d$ beforehand. As noted in the beginning of this section, Gröbner bases give a general recipe for calculating this number. Here, we sketch a less well-developed approach in the case of a square subsystem defined as in (9). In this case, $d$ is given by a birationally-invariant intersection index over $\mathbb{C}^{n}.$ We summarize the basic tenets of this theory as developed in [17, 18].

Definition 3.5.

([18, Def. 4.5]) Let $X$ be an $n$ -dimensional irreducible variety over $\mathbb{C}$ with singular locus $X_{sing}$ . For an $n$ -tuple $(L_{1},L_{2},\dotsc,L_{n})$ of finite-dimensional complex subspaces of the function field $\mathbb{C}(X)$ , let $\boldsymbol{L}=L_{1}\times L_{2}\times\cdots\times L_{n}$ , and define

[TABLE]

the set of smooth points where every function in each subspace $L_{i}$ is regular, and

[TABLE]

the set of basepoints of $\boldsymbol{L}$. For generic $g=(g_{1},\dotsc,g_{n})\in\boldsymbol{L}$, all solutions to the system $g_{1}(z)=\cdots=g_{n}(z)=0$ on $U_{\boldsymbol{L}}\smallsetminus Z_{\boldsymbol{L}}$ are nonsingular and their number is independent of the choice of $g$. The common number is the birationally invariant intersection index $[L_{1},L_{2},\ldots,L_{n}]$.

These claims are proven in [17, Sections 4 & 5]. For our purposes, $X=\mathbb{C}^{n}$ and $\boldsymbol{L}=L\times\cdots\times L$, where $L\subset\mathbb{C}[z_{1},\ldots,z_{n}]$ is the linear space spanned by the polynomials in our system $f$. Write $d_{L}$ for this self-intersection index, and note that $U_{\boldsymbol{L}}=\mathbb{C}^{n}$, while $Z_{\boldsymbol{L}}=\mathcal{V}(f)$. Thus (8) holds for general square subsystems of $f$, taking $d=d_{L}$.

Let $\nu\colon\mathbb{C}(X)^{\times}\to(\mathbb{Z}^{n},\prec)$ be a surjective valuation where $\prec$ is some fixed total order on $\mathbb{Z}^{n}$ . For example, $\nu$ could restrict to the exponent of the leading monomial in a term order $\prec$ on $\mathbb{C}[x_{1},\dotsc,x_{n}]$ . We attach to $(L,\nu)$ the following data:

• $A_{L}=\displaystyle\bigoplus_{k=0}^{\infty}t^{k}L^{k}$: a graded subalgebra of $\mathbb{C}(X)[t]$.

• $S(A_{L},\nu)=\{(\nu(f),k)\mid f\in L^{k}\text{ for some }k\in\mathbb{N}\}$: a sub-monoid of $\mathbb{Z}^{n}\oplus\mathbb{N}$ associated to the pair $(L,\nu)$, where $L^{k}$ is the $\mathbb{C}$-span of $k$-fold products from $L$. This is the initial algebra of $A_{L}$ with respect to the extended valuation $\nu_{t}\colon\mathbb{C}(X)(t)^{\times}\to(\mathbb{Z}^{n}\oplus\mathbb{Z},\prec_{t})$ defined by $\nu_{t}(f_{k}\,t^{k}+\cdots+f_{0})\mapsto\left(\nu(f_{k}),k\right)$, where $\prec_{t}$ is the levelwise order defined by

[TABLE]

• $\operatorname{\rm ind}(A_{L},\nu)$: the index of $\mathbb{Z}\,S(A_{L},\nu)\cap\left(\mathbb{Z}^{n}\times\{0\}\right)$ as a subgroup of $\mathbb{Z}^{n}\times\{0\}$.

• $\overline{\operatorname{\rm Cone}(A_{L},\nu)}$: the Euclidean closure of all $\mathbb{R}_{\geq 0}$-linear combinations from $S(A_{L},\nu)$.

• $\Delta(A_{L},\nu)=\overline{\operatorname{\rm Cone}(A_{L},\nu)}\cap(\mathbb{R}^{n}\times\{1\})$: the Newton-Okounkov body.

The linear space $L$ induces a rational Kodaira map

[TABLE]

with the section ring $A_{L}$ the projective coordinate ring of the image.

Proposition 3.6 ([18, Thm. 4.9]).

Let $L$ be a finite-dimensional subspace of $\mathbb{C}(X)$ . Then

[TABLE]

Here, $\operatorname{\rm Vol}$ denotes the $n$ -dimensional Euclidean volume in the slice $\mathbb{R}^{n}\times\{1\}$ .

In our setting, where $X=\mathbb{C}^{n}$ and $L=\operatorname{\rm span}_{\mathbb{C}}\{f_{1},\ldots,f_{N}\},$ the Kodaira map $\Psi_{L}$ is $z\mapsto[f_{1}(z):f_{2}(z):\cdots:f_{N}(z)]$ . Thus, if need be, $\deg\Psi_{L}$ may be computed symbolically. The main difficulty in applying Proposition 3.6 is that it may be hard to determine the Newton-Okounkov body, as the monoid $S(A_{L},\nu)$ need not be finitely generated. This leads us to the notion of a finite Khovanskii basis [16].

Definition 3.7.

A Khovanskii basis for $(L,\nu)$ is a set $\{a_{i}\mid i\in I\}$ of generators for the algebra $A_{L}$ whose values $\{\nu_{t}(a_{i})\mid i\in I\}$ generate the monoid $S(A_{L},\nu)$. If $<$ is a global monomial order on $k[z_{1},\ldots,z_{n}]$, taking lead monomials defines a valuation $\nu\colon k[z_{1},\ldots,z_{n}]\to(\mathbb{Z}^{n},\prec)$, where $\prec$ is the reverse of $<$. A Khovanskii basis with respect to this valuation is commonly known as a SAGBI basis [15, 31].

When the monoid $S(A_{L},\nu)$ is finitely generated, there is a finite Khovanskii basis for $(L,\nu)$ . When this occurs, we may compute the Khovanskii basis via a binomial-lifting/subduction algorithm such as described in [31] or [39, Ch. 11].

Example 3.8.

We consider an “illustrative example” of an overdetermined system from [1]:

[TABLE]

The square subsystem defined by $f_{1}=f_{2}=f_{3}=0$ has two singular solutions, and $f_{4}$ is the Jacobian determinant of this subsystem. Let $<$ be the graded reverse lexicographic ordering with $z_{1}>z_{2}>z_{3}$ and $L=\operatorname{\rm span}_{\mathbb{C}}\{f_{1},f_{2},f_{3},f_{4}\}.$ We observe that the initial terms of $tf_{1},\ldots,tf_{4}\in A_{L}$ under the induced order $<_{t}$ are given by $t\,{z}_{1}^{2},\,{-{16}\,t\,{z}_{2}^{2}},\,{-t\,{z}_{2}^{2}},$ and $64t\,{z}_{1}{z}_{2}.$ The lattice points corresponding to these monomials comprise the first level $S(A_{L},\nu)\cap(x_{4}=1)$ and lie in the linear subspace of $\mathbb{R}^{3}\times\{1\}$ defined by $x_{3}=0.$ We see that the inner approximation to the Newton-Okounkov body $\Delta(A_{L},\nu)$ given by the first level has $3$-dimensional volume $0.$ However, there exists $x\in S(A_{L},\nu)$ with $x_{3}\neq 0$:

[TABLE]

giving

[TABLE]

This element of $A_{L}$ was obtained by the previously mentioned binomial-lifting/subduction algorithm—carrying this out further, we can verify that this new element together with the original generators give a finite Khovanskii basis for $A_{L}.$ It follows that

[TABLE]

and $\operatorname{\rm Vol}(A_{L})=1.$ We also have that $\deg\Psi_{L}=2$ and $\operatorname{\rm ind}(A_{L})=1.$ Thus, we have $d_{L}=2,$ giving a total root count of $4$ after squaring up $f.$

Remark 3.9.

We note that the total root count for a system $g=(g_{1},g_{2},g_{3})$ obtained by randomizing $f$ in the previous example is equal to the normalized volume of the common Newton polytope of $g_{1},g_{2},g_{3}$ , which is four. This is equal to the “expected” polyhedral root count [23, 2]. However, this information does not give us $d_{L}=2$ as needed for Algorithms 1 and 2.

4. Certification via liaison pruning

Suppose that we have an overdetermined system $f$ with a square subsystem $g$ , so that $\mathcal{V}(f)\subset\mathcal{V}(g)$ . Suppose further that we have a square system $h$ with $\mathcal{V}(h)=\mathcal{V}(g)\smallsetminus\mathcal{V}(f)$ . Given this, we may certify all approximate solutions to $g$ and then certify the subset of those that are approximate solutions to $h$ , so that the solutions in $\mathcal{V}(g)\smallsetminus\mathcal{V}(h)$ which remain are certifiably approximate solutions to $f$ . This solves Problem 1. When this occurs, we say that $\mathcal{V}(f)$ is in liaison with the complete intersection $\mathcal{V}(h)$ . The basic scheme for certification via liaison is Algorithm 3. We also give a generalized version, Algorithm 4, which we later apply to the Schubert calculus in Section 5.2.

Let us begin with some definitions. A system $g_{1},\dotsc,g_{r}$ of $r$ polynomials is a complete intersection if the variety $\mathcal{V}(g_{1},\dotsc,g_{r})\subset\mathbb{C}^{n}$ they define has dimension $n-r$ , equivalently if it has codimension $r$ . A square system is a zero-dimensional complete intersection.

More generally, varieties $X,Y\subset\mathbb{C}^{n}$ of codimension $r$ are in liaison if there are polynomials $g_{1},\dotsc,g_{r}\in\mathbb{C}[x_{1},\dotsc,x_{n}]$ such that $\mathcal{V}(g_{1},\dotsc,g_{r})=X\cup Y$ . This relation has been deeply studied (see [20] and the references therein). Of particular interest is when one of the varieties, say $Y$ , is itself a (different) complete intersection, so that $X$ is in liaison with a complete intersection. (This is a special case of the licci equivalence relation.)

Example 4.1.

(The twisted cubic.) The closure of the set $\{[1,t,t^{2},t^{3}]\mid t\in\mathbb{C}\}$ is the rational normal curve $C\subset\mathbb{P}^{3}$ . It is defined by three quadrics, $wy-x^{2},wz-xy,xz-y^{2}$ , and is thus not a complete intersection. In the affine patch $\mathbb{C}^{3}$ defined by $w=1$ , if we use the difference of the first two generators and the last generator, then $\mathcal{V}(z-y+x^{2}-xy,xz-y^{2})=C\cup\ell$ , where $\ell=\mathcal{V}(x-y,x-z)$ is the line $\{(t,t,t)\mid t\in\mathbb{C}\}$ .

[TABLE]

Let $X\subset\mathbb{C}^{n}$ be a variety of codimension $r$ that is in liaison with a complete intersection $Y$ . There are polynomials $f=(f_{1},\dotsc,f_{s})$ , $g=(g_{1},\dotsc,g_{r})$ , and $h=(h_{1},\dotsc,h_{r})$ such that

[TABLE]

We generalize the notion of a square system of polynomials. A square system on $X$ consists of polynomials $g_{r+1},\dotsc,g_{n}$ that are sufficiently general in that $X\cap\mathcal{V}(g_{r+1},\dotsc,g_{n})$ is a finite set and the intersection is transverse. Then

[TABLE]

Thus the square system $X\cap\mathcal{V}(g_{r+1},\dotsc,g_{n})$ on $X$ is the set-theoretic difference of two square systems of polynomials, $\mathcal{V}(g_{1},\dotsc,g_{n})$ and

[TABLE]

For example, let $C$ be the rational normal curve of Example 4.1 in $\mathbb{C}^{3}$ , which has codimension 2, so that $C\cap\mathcal{V}(x+y+z+1)$ is a square system on $C$ . Manipulating the polynomials in $\mathcal{V}(z-y+x^{2}-xy,xz-y^{2},x+y+z+1)$ leads to the solutions

[TABLE]
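As a sanity check on this example, the sketch below uses the parametrizations $(t,t^{2},t^{3})$ of $C$ and $(t,t,t)$ of $\ell$ from Example 4.1. Substituting into $x+y+z+1=0$ gives $(t+1)(t^{2}+1)=0$ on $C$ and $3t+1=0$ on $\ell$. The script checks that the resulting four points solve the square system, and that the linear forms $x-y$ and $x-z$ defining $\ell$ separate the three points on $C$ from the one on $\ell$. The point lists are derived here by hand and are an assumption of this sketch, not a quotation of the display above.

```python
from fractions import Fraction

def g(p):
    """The square system (z - y + x^2 - x*y, x*z - y^2, x + y + z + 1)."""
    x, y, z = p
    return (z - y + x * x - x * y, x * z - y * y, x + y + z + 1)

def h(p):
    """Linear forms (x - y, x - z) cutting out the line ell."""
    x, y, z = p
    return (x - y, x - z)

# Points on the twisted cubic: (t, t^2, t^3) for t = -1, i, -i.
on_cubic = [(t, t * t, t * t * t) for t in (-1, 1j, -1j)]
# Point on the line: (t, t, t) for t = -1/3.
t = Fraction(-1, 3)
on_line = [(t, t, t)]

for p in on_cubic + on_line:
    assert all(v == 0 for v in g(p))   # every point solves g
for p in on_cubic:
    assert any(v != 0 for v in h(p))   # not on ell, so it is kept
for p in on_line:
    assert all(v == 0 for v in h(p))   # on ell, so it is pruned
print(len(on_cubic), len(on_line))     # prints 3 1
```

The four points account for the Bézout bound $2\cdot 2\cdot 1=4$ of the square system, and discarding the one that also solves $h$ leaves the three points of $C\cap\mathcal{V}(x+y+z+1)$.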

If $f_{1},\dotsc,f_{s}$ generate the ideal of $X$ , then $X\cap\mathcal{V}(g_{r+1},\dotsc,g_{n})$ is the overdetermined system

[TABLE]

Thus an algorithm to certify points on $X\cap\mathcal{V}(g_{r+1},\dotsc,g_{n})$ solves Problem 1 for $f$. As we may certify solutions and nonsolutions to systems (10) and (11), this discussion leads to the following certification algorithm, when a variety $X$ is in liaison with a complete intersection $Y$. This uses the test of Proposition 2.4, the Taylor residual (5), and Smale’s $\alpha$-theory for the system (11).

Algorithm 3 (Certifying approximate solutions to a square system on a variety $X$ ).

Input: $(r,g,h,S)$

$r\in\mathbb{N}$
$g=(g_{1},\dotsc,g_{n})$: a square polynomial system such that $\mathcal{V}(g_{1},\dotsc,g_{r})=X\cup Y$, with both $X$ and $Y$ of codimension $r$
$h=(h_{1},\dotsc,h_{r})$: polynomials such that $\mathcal{V}(h)=Y$
$S=\{\hat{\zeta}_{1},\ldots,\hat{\zeta}_{m}\}$: pairwise distinct approximate solutions to $g$ with refinement operator $\mathcal{N}_{g}$

Output: $T,U\subset S$ with $S=T\sqcup U$, where $T$ consists of approximate solutions to $X\cap\mathcal{V}(g_{r+1},\dotsc,g_{n})$ and $U$ consists of approximate solutions to $Y\cap\mathcal{V}(g_{r+1},\dotsc,g_{n})$.

1: Set ${f}:=(h_{1},\dotsc,h_{r}\,,\,g_{r+1},\dotsc,g_{n})$, a square system on $Y$.

2: Initialize $T\leftarrow\emptyset$, $U\leftarrow\emptyset$

3: **for** $\hat{\zeta}\in S$ **do**

4: $\zeta^{\prime}\leftarrow\hat{\zeta}$

5: **if** $\alpha(f,\zeta^{\prime})<\frac{13-3\sqrt{17}}{4}$ **then** $U\leftarrow U\cup\{\hat{\zeta}\}$

6: **else if** $\delta(f,g,\zeta^{\prime})>0$ **then** $T\leftarrow T\cup\{\hat{\zeta}\}$

7: **else** $\zeta^{\prime}\leftarrow\mathcal{N}_{g}(\zeta^{\prime})$ and return to line 5.

8: **end if**

9: **end for**

Remark 4.2.

As in all subsequent algorithms, we assume distinct approximate solutions to $g$ with refinement operator $\mathcal{N}_{g}$ as part of the input. We could have just as easily assumed effective approximate solutions. The test in line 5 could be replaced by testing that $\zeta^{\prime}$ is an approximate solution to the square system $f$ by some criterion other than $\alpha$ -theory—for simplicity, we do not assume this criterion is part of the input.

Proof of correctness.

As $\hat{\zeta}\in S$ , it is an approximate solution to the square system $g$ with an associated nonsingular solution $\zeta\in\mathcal{V}(g)\subset X\cup Y$ . Since $\zeta$ is nonsingular, $\zeta\not\in X\cap Y$ , as $X\cup Y$ is singular along $X\cap Y$ . Thus $\zeta\in X$ if and only if $\zeta\not\in Y$ . Let $\{\hat{\zeta}_{i}\mid i\in\mathbb{N}\}$ be the sequence of iterates using $\mathcal{N}_{g}$ starting at $\hat{\zeta}$ . This converges to $\zeta$ .

If $\zeta\in Y$, then $\zeta\in\mathcal{V}(f)$, and the sequence $\{\hat{\zeta}_{i}\}$ will eventually lie in the basin of quadratic convergence for Newton iterations $N_{f}$ and $\beta(f,\hat{\zeta}_{i})$ converges to $0$. As $\gamma(f,\hat{\zeta}_{i})$ is bounded, $\alpha(f,\hat{\zeta}_{i})=\gamma(f,\hat{\zeta}_{i})\cdot\beta(f,\hat{\zeta}_{i})$ converges to $0$. Thus the condition in line 5 will eventually hold and $\hat{\zeta}$ will be placed in $U$.

If $\zeta\not\in Y$, then $\zeta\not\in\mathcal{V}(f)$. By Corollary 2.7, the Taylor residuals $\delta(f,g,\hat{\zeta}_{j})$ are positive for $j$ large enough. Thus the condition in line 6 eventually holds, and $\hat{\zeta}$ will be placed into $T$. ∎
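The control flow of Algorithm 3 can be sketched as follows. This is only an illustration: the callables `alpha_f`, `delta_fg`, and `newton_g` are hypothetical stand-ins for $\alpha(f,\cdot)$, $\delta(f,g,\cdot)$, and $\mathcal{N}_{g}$, which a real implementation would have to supply (for instance from a certification package), and the iteration cap `max_iter` is an addition of this sketch that is not part of the algorithm.

```python
import math

ALPHA_THRESHOLD = (13 - 3 * math.sqrt(17)) / 4  # about 0.157671

def liaison_prune(S, alpha_f, delta_fg, newton_g, max_iter=50):
    """Split S into (T, U): T collects points on X, U collects points on Y.

    alpha_f, delta_fg, newton_g are user-supplied (hypothetical) callables."""
    T, U = [], []
    for zeta_hat in S:
        z = zeta_hat
        for _ in range(max_iter):
            if alpha_f(z) < ALPHA_THRESHOLD:  # line 5: approximate solution to f
                U.append(zeta_hat)
                break
            if delta_fg(z) > 0:               # line 6: certified nonsolution to f
                T.append(zeta_hat)
                break
            z = newton_g(z)                   # line 7: refine with N_g and retry
        else:
            # Not in the algorithm: give up instead of looping forever.
            raise RuntimeError("undecided after max_iter refinements")
    return T, U
```

As in the algorithm, the original point $\hat{\zeta}$ is recorded in $T$ or $U$, while the refined point $\zeta^{\prime}$ is used only for the two tests.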

We describe a more involved application of this idea. Write ${\operatorname{{\rm codim}}X}$ for the codimension, $n{-}\dim X$ , of a variety $X\subset\mathbb{C}^{n}$ . Suppose that $X_{1},\dotsc,X_{m}\subset\mathbb{C}^{n}$ are in general position and $\sum\operatorname{{\rm codim}}X_{i}=n$ , then Bertini’s Theorem [19] implies that

[TABLE]

is a transverse intersection consisting of finitely many points. When $n=m$ , so that each $X_{i}=\mathcal{V}(f_{i})$ is a hypersurface, then (12) is equivalent to the square polynomial system

[TABLE]

As a variety need not be a complete intersection, a square system of varieties (12) with $m<n$ does not necessarily have a formulation as a square system of polynomials. However, the points of (12) are the solutions to an overdetermined system of polynomials given by the generators of the ideals of each of $X_{1},\dotsc,X_{m}$ .

Suppose now that $X_{1},\dotsc,X_{m}\subset\mathbb{C}^{n}$ form a square system of varieties (12), each $X_{i}$ is in liaison with a complete intersection $Y_{i}$ , and these are all in sufficiently general position. Then there are square systems $g_{1},\dotsc,g_{n}$ and $h_{1},\dotsc,h_{n}$ of polynomials such that if ${a_{\bullet}}\colon 0=a_{0}<a_{1}<\dotsb<a_{m}=n$ is defined by $a_{i}-a_{i-1}=\operatorname{{\rm codim}}X_{i}(=\operatorname{{\rm codim}}Y_{i})$ for each $i$ , then

[TABLE]

are complete intersections for each $i=1,\dotsc,m$ . Thus

[TABLE]

We give a more general version of Algorithm 3 that will certify solutions to the square system (12) of varieties, given solutions (14) to the square system $g$ .

Algorithm 4 (Certifying solutions to a square system of varieties).

Input: $(a_{\bullet},g,h,S)$

$a_{\bullet}\colon 0=a_{0}<a_{1}<\dotsb<a_{m}=n$
$g=(g_{1},\dotsc,g_{n})$ and $h=(h_{1},\dotsc,h_{n})$: square polynomial systems such that, for each $i=1,\dotsc,m$, (13) are complete intersections.
$S=\{\hat{\zeta}_{1},\ldots,\hat{\zeta}_{s}\}$: pairwise distinct approximate solutions to $g$

Output: $T\subset S$ consisting of approximate solutions to $X_{1}\cap X_{2}\cap\dotsb\cap X_{m}$.

1: **for** $i=1,\dotsc,m$ **do**

2: Set $f:=(g_{1},\dotsc,g_{a_{i-1}}\,,\,h_{1+a_{i-1}},\dotsc,h_{a_{i}}\,,\,g_{1+a_{i}},\dotsc,g_{n})$.

3: Initialize $T\leftarrow\emptyset$

4: **for** $\hat{\zeta}\in S$ **do**

5: $\zeta^{\prime}\leftarrow\hat{\zeta}$

6: **if** $\alpha(f,\zeta^{\prime})<\frac{13-3\sqrt{17}}{4}$ **then** discard $\hat{\zeta}$

7: **else if** $\delta(f,g,\zeta^{\prime})>0$ **then** $T\leftarrow T\cup\{\hat{\zeta}\}$

8: **else** $\zeta^{\prime}\leftarrow\mathcal{N}_{g}(\zeta^{\prime})$ and return to line 6.

9: **end if**

10: **end for**

11: $S\leftarrow T$

12: **end for**

Proof of correctness.

By the correctness of Algorithm 3, in each iteration $i=1,\dotsc,m$ of the outer loop, the algorithm constructs the set $T$ of elements of the input $S$ that do not lie in $Y_{1}\cup\dotsb\cup Y_{i}$. As $S\cap X_{1}\cap\dotsb\cap X_{m}=S\smallsetminus(Y_{1}\cup\dotsb\cup Y_{m})$, we see that the algorithm performs as claimed. ∎

5. Examples

We give three further examples that illustrate our certification via square subsystems. All computations were carried out using the computer algebra system Macaulay2 [10]. For each example, we found complex floating-point solutions to square subsystems via homotopy continuation, as implemented in the package NumericalAlgebraicGeometry [26]. Tests from $\alpha$ -theory were supplied by the package NumericalCertification [25].

5.1. Plane quartics through four points

Consider the overdetermined system $f=(f_{1},\dotsc,f_{11}),$ where the $f_{i}$ are given as follows:

[TABLE]

These give a basis for the space of quartics passing through the four points:

[TABLE]

As an illustration of Algorithm 1 and the techniques based on Khovanskii bases described in Section 3.2, we show how to certify that numerical approximations of these points represent true solutions to $f$ .

Letting $L=\operatorname{\rm span}_{\mathbb{C}}\{f_{1},\ldots,f_{11}\},$ we consider the algebra $A_{L}.$ Letting $<$ be the graded-reverse lex order with $z_{1}>z_{2},$ the algebra $A_{L}$ has a finite Khovanskii basis with respect to the $\mathbb{Z}^{2}$ -valuation associated to $<$ . It is given by $S=\{t\,f_{1},t\,f_{2},\ldots,t\,f_{11},t^{2}\,g,t^{3}\,h\},$ where

[TABLE]

The Newton-Okounkov body, depicted below, has normalized volume $12$ . The integer points correspond to $f_{1},\dotsc,f_{11}$ . The fractional vertices corresponding to $t^{2}g$ and $t^{3}h$ demonstrate that these elements are essential in forming the Khovanskii basis.

Using the procedure of [33], we may express $g$ and $h$ as homogeneous polynomials in the algebra generators $f_{1},\ldots,f_{11}$ :

[TABLE]

The Khovanskii basis was computed using the Macaulay2 package SubalgebraBases, based on the work in [38]. We checked this computation against our own top-level implementation of the binomial-lifting / subduction algorithm.

For certification, we squared up $f$ with a random matrix, $g=Af$ , and found $16$ complex approximate solutions to $g$ using homotopy continuation. Each solution was softly certified distinct via $\alpha$ -theory. Computing values $\delta(f,g,\cdot)$ as in Algorithm 1, we softly certified $12$ of these as nonsolutions to $f$ , hence associating the four remaining solutions to $f$ . Observe that $d_{L}=12\,\deg\Psi_{L}$ by Proposition 3.6. Also, we have $d_{L}\leq 16$ by Bézout’s theorem. This implies that $\deg\Psi_{L}=1$ and hence $d_{L}=12.$

5.2. Example from Schubert calculus

We describe a family of examples from Schubert calculus to which Algorithms 1, 2, and 4 all apply. For more on the Grassmannian and Schubert calculus, see [8]. Let $m\geq 2$ be an integer and set $n:=m{+}2$ . Consider the geometric problem of the 2-planes $H$ in $\mathbb{C}^{n}$ that meet $m$ general codimension $3$ planes nontrivially. The number of such 2-planes is the Kostka number $K_{m^{2},2^{m}}$ , the first few values of which are shown below.

[TABLE]

This may be computed recursively. Let $\kappa_{m,i}$ be the coefficient of the Schur function $S_{(m+i,m-i)}$ in the product $(S_{(2,0)})^{m}$. Then $K_{m^{2},2^{m}}=\kappa_{m,0}$. For the recursion, set $\kappa_{1,1}:=1$, $\kappa_{1,0}:=0$, and $\kappa_{m,j}:=0$ when $j>m$. Then, for $m>1$, we set $\kappa_{m,0}:=\kappa_{m-1,1}$ and, for $j>0$, $\kappa_{m,j}:=\kappa_{m-1,j-1}+\kappa_{m-1,j}+\kappa_{m-1,j+1}$.
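The recursion is short enough to run directly. The sketch below is a plain transcription into Python, with the hypothetical helper names `kappa_row` and `kostka`; it stores the row $(\kappa_{m,0},\dotsc,\kappa_{m,m})$ and builds each row from the previous one.

```python
def kappa_row(m):
    """Return [kappa_{m,0}, ..., kappa_{m,m}] from the recursion in the text."""
    row = [0, 1]  # kappa_{1,0} = 0, kappa_{1,1} = 1
    for k in range(2, m + 1):
        prev = row + [0, 0]  # kappa_{k-1,j} = 0 for j > k-1
        row = [prev[1]] + [prev[j - 1] + prev[j] + prev[j + 1]
                           for j in range(1, k + 1)]
    return row

def kostka(m):
    """K_{m^2, 2^m} = kappa_{m,0}."""
    return kappa_row(m)[0]

print([kostka(m) for m in range(2, 8)])  # prints [1, 1, 3, 6, 15, 36]
```

Each row is computed in time linear in $m$, so tabulating these numbers for the values of $m$ considered here is immediate.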

We express this geometric problem in local coordinates. Write $I_{2}$ for the $2\times 2$ identity matrix, let $Z$ be a $2\times m$ matrix of indeterminates, and set ${H}:=(Z|I_{2})^{\top}$, which has $n$ rows and $2$ columns. For any choice of $Z\in\mbox{Mat}_{2\times m}(\mathbb{C})$, the column span of $H$, also written $H$, is a 2-plane in $\mathbb{C}^{n}=\mathbb{C}^{m}\oplus\mathbb{C}^{2}$ that does not meet the coordinate plane $\mathbb{C}^{m}\oplus\{0\}$, and $\mbox{Mat}_{2\times m}(\mathbb{C})$ parametrizes the set of such $2$-planes. For $k=1,\dotsc,m$, let $K_{k}$ be a general $n\times(m{-}1)$-matrix whose column span (also written $K_{k}$) is a general $(m{-}1)$-plane. Then $\dim H\cap K_{k}\geq 1$ if and only if the matrix $(H|K_{k})$ has rank at most $m$. This condition is given by the $n$ maximal minors $f_{k,1},\dotsc,f_{k,n}$ of $(H|K_{k})$, each of which is the determinant of the square $(n{-}1)\times(n{-}1)$-matrix obtained by deleting a row from $(H|K_{k})$. This gives a system ${f}=(f_{k,j}\mid k=1,\dotsc,m\mbox{ and }j=1,\dotsc,n)$ of $mn$ quadratic equations in $2m$ variables which define the solutions to our geometric problem.

Any polynomial $g$ that is a linear combination of the $f_{k,j}$ has the form $g=\det(H|K_{k}|\ell)$, where the entries of $\ell$ are the coefficients of $(-1)^{j}f_{k,j}$ in that linear combination. This justifies the following scheme to obtain a square subsystem of $f$. For each $k=1,\dotsc,m$ and $i=1,2$, let $L_{k,i}\supset K_{k}$ be an $m$-plane that is general given that it contains $K_{k}$. We obtain the matrix of $L_{k,i}$ by appending a general column vector to the matrix of $K_{k}$. Let $g_{k,i}$ be the determinant of the matrix $(H|L_{k,i})$; this vanishes when $\dim H\cap L_{k,i}\geq 1$. We claim that the subsystem $g=(g_{1,1},g_{1,2},\dotsc,g_{m,1},g_{m,2})$ of $f$ is square.

For this, let us investigate the corresponding geometric loci in the Grassmannian $G(2,n)$, where subscripts are the partitions (Young diagrams) indexing Schubert varieties. Write $\Omega_{(2)}K_{k}$ for the set of all 2-planes which meet $K_{k}$ nontrivially, and $\Omega_{(1)}L_{k,i}$ for those that meet $L_{k,i}$ nontrivially. Let $\Lambda_{k}$ be the hyperplane containing both $L_{k,1}$ and $L_{k,2}$, and let $\Omega_{(1,1)}\Lambda_{k}$ be the set of all 2-planes that are contained in $\Lambda_{k}$. Since $L_{k,1}\cap L_{k,2}=K_{k}$ and $L_{k,1}+L_{k,2}=\Lambda_{k}$, it was shown in [37] that

[TABLE]

is a (generically) transverse intersection.

It is natural to analyze this geometric problem in the context of liaison theory discussed in Section 4: specifically, we can use Algorithm 4. On the other hand, the algorithms from Section 3 work just as well. For each approach, we explain the details needed in order to certify. We note that the main bottleneck, solving $g,$ is well within the capabilities of modern homotopy continuation software, say, for $m$ in the single digits [27].

Algorithm 4. In the local coordinates $H=(Z|I_{2})^{\top}$, we have that $\Omega_{(1)}L_{k,i}=\mathcal{V}(g_{k,i})$, so that (15) is a complete intersection and $\Omega_{(2)}K_{k}=\mathcal{V}(f_{k,1},\dotsc,f_{k,n})$ is in liaison with $\Omega_{(1,1)}\Lambda_{k}$, which we show is a complete intersection. Let $\lambda_{k}$ be the linear form (a row vector) whose kernel is $\Lambda_{k}$. Then $H\in\Omega_{(1,1)}\Lambda_{k}$ if and only if $H\subset\Lambda_{k}$, so that $\lambda_{k}H=\left(\begin{smallmatrix}0\\ 0\end{smallmatrix}\right)$. If $h_{k,1}$ and $h_{k,2}$ are the two rows of $\lambda_{k}H$, then $\Omega_{(1,1)}\Lambda_{k}=\mathcal{V}(h_{k,1},h_{k,2})$, showing that it is a complete intersection.

Our geometric problem of the 2-planes $H$ that meet each of $K_{1},\dotsc,K_{m}$ is equivalent to the intersection

[TABLE]

which is a square system of varieties (12). As each is in liaison with a complete intersection, Algorithm 4 applies and may be used to certify the solutions to our geometric problem. Its input is the set $\mathcal{V}(g)$ , which consists of the points in the intersection

[TABLE]

While each pair $\Omega_{(1)}L_{k,1}\cap\Omega_{(1)}L_{k,2}$ is not in general position, this intersection is generically transverse, and the different pairs are in general position, so the intersection (16) is transverse. Consequently, the number of points in the intersection (16) is the expected number, which is the Catalan number $C_{m}:=\frac{1}{m+1}\binom{2m}{m}$.

Algorithm 1. For this algorithm, the number $d$ of excess solutions is $\frac{1}{m+1}\binom{2m}{m}-K_{m^{2},2^{m}}$ . It starts with the set $S=\mathcal{V}(g)$ of $\frac{1}{m+1}\binom{2m}{m}$ points in the intersection (16). We run Algorithm 1, and if it finds that $\#R=d$ , so that we have rejected all nonsolutions, then those that remain are certified solutions to our geometric problem $\mathcal{V}(f)$ . Otherwise, we may refine the approximate solutions $\hat{\zeta}$ in $S$ so that the Newton steps $\beta(g,\hat{\zeta})$ become small enough to reject $d$ nonsolutions.

This algorithm is particularly easy in this case as the Taylor residual (5) of a linear function $\phi$ is $|\phi(\hat{\zeta})|-\|\phi^{\prime}\|\delta$ , where the derivative $\phi^{\prime}$ of $\phi$ is a vector.
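A minimal sketch of this residual, assuming the linear function is written as $\phi(z)=c+\sum_{i}a_{i}z_{i}$ and that the radius (called $\delta$ above) bounds the distance from $\hat{\zeta}$ to its associated solution $\zeta$. By the Cauchy-Schwarz inequality, $|\phi(\zeta)|\geq|\phi(\hat{\zeta})|-\|\phi^{\prime}\|\,\|\zeta-\hat{\zeta}\|$, so a positive value shows that $\phi(\zeta)\neq 0$. The function name is hypothetical, and evaluating in floating point as done here gives only a soft certificate.

```python
import math

def linear_taylor_residual(a, c, zeta_hat, radius):
    """Return |phi(zeta_hat)| - ||a|| * radius for phi(z) = c + sum(a_i * z_i).

    a and zeta_hat are sequences of real or complex numbers of equal length."""
    value = c + sum(ai * zi for ai, zi in zip(a, zeta_hat))
    grad_norm = math.sqrt(sum(abs(ai) ** 2 for ai in a))
    return abs(value) - grad_norm * radius

# phi(z) = 3*z1 + 4*z2 at the point (1, 0) with radius 0.1
print(linear_taylor_residual([3, 4], 0, [1, 0], 0.1) > 0)  # prints True
```

For a hard certificate one could replace the square root by a rational upper bound on $\|\phi^{\prime}\|$ and evaluate everything in exact arithmetic.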

Algorithm 2. Here, we simply observe that $d=\frac{1}{m+1}\binom{2m}{m}$ is the number of solutions to the square system $g$ and $e=K_{m^{2},2^{m}}$ is the number of solutions to $f$ . These data together with a full set of approximate solutions to $g,$ are all that is needed for certification.

5.3. Essential matrix estimation

A fundamental object of study in geometric computer vision is the essential variety

[TABLE]

This is an irreducible variety of dimension $5$ and degree $10.$ Elements of $V_{\operatorname{{\it ess}}}$ are called essential matrices. The ten polynomials defining $V_{\operatorname{{\it ess}}}$ are known as the Demazure cubics [6]. They minimally generate the homogeneous ideal of $V_{\operatorname{{\it ess}}}$ . It is possible to recover an essential matrix given five generic point correspondence constraints of the form

[TABLE]

where $x_{1},\ldots,x_{5},y_{1},\ldots,y_{5}\in\mathbb{P}(\mathbb{C}^{3}).$ Although the overdetermined family given by equations (17) and (18) is fairly simple, it is notable for its appearance in applications. State-of-the-art algorithms for solving these equations run on the order of microseconds [30] and have been successfully employed in large-scale 3D reconstruction pipelines [36]. This motivates the problem of developing certification techniques for this problem with comparable efficiency. For concreteness, we consider an instance of this problem in which the data are given by

[TABLE]

Truncating to $6$ decimal places, we may regard each of the $x_{i}$ and $y_{i}$ as rational vectors and seek a certificate. A candidate approximate solution is given to 6 places by

[TABLE]

To certify $\hat{E},$ we consider the square system $g$ given by equations (18), the chart $e_{1,1}=1,$ and the first three Demazure cubics: namely

[TABLE]

The number of solutions to $g$ is bounded a priori by $27,$ and we can easily certify that this bound is attained. It follows that we may apply the methods described in Section 3. However, there is an even simpler procedure based on the exclusion criteria of Proposition 2.4 and Corollary 2.7, as well as the following result obtained by symbolic computation.

Proposition 5.1.

Let $V_{sq}$ be the subvariety of $\mathbb{P}(\mathbb{C}^{3\times 3})$ defined by equations (18) and (19). Consider the polynomials

$f_{1}:=e_{1,1}$
$f_{2}:={e}_{1,1}{e}_{3,1}+{e}_{1,2}{e}_{3,2}+{e}_{1,3}{e}_{3,3}$
$f_{3}:={e}_{1,1}^{2}+{e}_{1,2}^{2}+{e}_{1,3}^{2}$

and define projective varieties as the Zariski closures of the indicated quasiprojective varieties,

$V_{1}:=\overline{V_{sq}\setminus\mathcal{V}(f_{1})}$
$V_{2}:=\overline{V_{1}\setminus\mathcal{V}(f_{2})}$
$V_{3}:=\overline{V_{2}\setminus\mathcal{V}(f_{3})}$

We have that $V_{3}=V_{\operatorname{{\it ess}}}.$

Proof.

This follows by symbolically computing ideal quotients: letting $I_{0}$ be the ideal generated by (18) and (19) and $I_{k}=I_{k-1}:f_{k}$ for $k=1,2,3,$ we may establish that $I_{3}$ equals the ideal defining $V_{\operatorname{{\it ess}}}$, for instance by showing that they have the same reduced Gröbner basis for a given term order. This is easily accomplished by Macaulay2. ∎

For generic data $(x_{i},y_{i}),$ we expect the exclusion criteria of Proposition 2.4 and Corollary 2.7 for $f_{1},f_{2}$ and $f_{3}$ to be satisfied for candidate solutions to the overdetermined problem: we may easily verify this in rational arithmetic for our given $\hat{E}$. Moreover, we estimate that $\alpha(g,\hat{E})\approx 0.00059,$ thus giving that $\hat{E}$ is an approximate solution using the refinement operator $N_{g}.$

We remark that $\alpha$-theory does not furnish a certificate if we use the Newton fixed point system given by equation (3) in the introduction. For this system, whose defining equations are of higher degree, we estimate that $\alpha(g,\hat{E})\approx 9.23.$ We also found, using Gröbner bases over a finite field, that the number of excess solutions to this fixed point system is $24,$ for a total of $34$ critical points overall for the least-squares Newton operator. By contrast, Proposition 5.1 furnishes a certificate that requires no excess solutions whatsoever. Thus, even for this toy example, we see how the flexibility of square subsystems may enhance the prospects of obtaining rigorous mathematical proofs from the output of numerical computations. Although certification for overdetermined systems remains a challenge in general, similar techniques may be worth considering for problems of a larger scale.

Bibliography

[1] Tulay Ayyildiz Akoglu, Jonathan D. Hauenstein, and Agnes Szanto. Certifying solutions to overdetermined and singular polynomial systems over $\mathbb{Q}$. J. Symbolic Comput., 84:147–171, 2018.
[2] David N. Bernshtein. The number of roots of a system of equations. Functional Analysis and its Applications, 9(3):183–185, 1975.
[3] Lenore Blum, Felipe Cucker, Michael Shub, and Stephen Smale. Complexity and real computation. Springer-Verlag, New York, 1998.
[4] Michael Burr, Kisun Lee, and Anton Leykin. Effective certification of approximate solutions to systems of equations involving analytic functions. In Proceedings of the 2019 ACM on International Symposium on Symbolic and Algebraic Computation, pages 267–274. ACM, 2019.
[5] J.-P. Dedieu and M. Shub. Newton’s method for overdetermined systems of equations. Math. Comp., 69(231):1099–1115, 2000.
[6] Michel Demazure. Sur deux problemes de reconstruction. Technical Report 882, INRIA, 1988.
[7] Peter Deuflhard. Newton methods for nonlinear problems: affine invariance and adaptive algorithms, volume 35. Springer Science & Business Media, 2011.
[8] William Fulton. Young tableaux. London Mathematical Society Student Texts, 35. Cambridge University Press, Cambridge, 1997.
