Error localization of best L1 polynomial approximants

Yuji Nakatsukasa; Alex Townsend

arXiv:1902.02664·math.NA·June 23, 2020·SIAM J. Numer. Anal.

Open Access

TL;DR

This paper establishes a connection between best L0 and L1 polynomial approximants for corrupted polynomials, demonstrating an error localization property and proposing an improved approximation algorithm.

Contribution

It introduces a continuous analogue of compressed sensing principles for polynomial approximation and develops an enhanced method for computing best L1 polynomial approximants.

Findings

1. Best L0 and L1 polynomial approximants are nearly equal for corrupted polynomials.
2. Error localization property of best L1 polynomial approximants is demonstrated.
3. An improved algorithm for computing best L1 polynomial approximants is proposed.

Abstract

An important observation in compressed sensing is that the $ℓ_{0}$ minimizer of an underdetermined linear system is equal to the $ℓ_{1}$ minimizer when there exists a sparse solution vector and a certain restricted isometry property holds. Here, we develop a continuous analogue of this observation and show that the best $L_{0}$ and $L_{1}$ polynomial approximants of a polynomial that is corrupted on a set of small measure are nearly equal. We go on to demonstrate an error localization property of best $L_{1}$ polynomial approximants and use our observations to develop an improved algorithm for computing best $L_{1}$ polynomial approximants to continuous functions.

Equations

$$\|f-p_{n}^{L_{1}}\|_{1}=\min_{p\in\mathcal{P}_{n}}\|f-p\|_{1},\qquad\|f-p\|_{1}=\int_{-1}^{1}|f(x)-p(x)|\,dx,$$

$$\Omega_{n}=\left\{x\in[-1,1]:|f(x)-p_{n}^{L_{1}}(x)|\geq\tfrac{1}{2}\|f-p_{n}^{L_{\infty}}\|_{\infty}\right\}.$$

$$f(x)=g(x)+\omega(x),$$

$$\|f\|_{\infty}$$

$$\|f\|_{2}$$

$$x_{j}=\cos\left(\frac{(N+1-j)\pi}{N+2}\right),\qquad 0\leq j\leq N.$$

$$p_{n}^{\rm cheb}(x)=\sum_{j=0}^{n}f(x_{j})\ell_{j}(x),\qquad\ell_{j}(x)=\frac{\prod_{i=0,i\neq j}^{n}(x-x_{i})}{\prod_{i=0,i\neq j}^{n}(x_{j}-x_{i})},$$

$$p_{n}^{L_{\infty}}$$

$$p_{n}^{L_{2}}$$

$$\|f-q\|_{\ell_{0}}\leq\|f-p_{m}\|_{\ell_{0}}=k,$$

$$\underline{y}=\underbrace{\begin{bmatrix}U_{0}(x_{0})&\cdots&U_{n}(x_{0})\\U_{0}(x_{1})&\cdots&U_{n}(x_{1})\\\vdots&\ddots&\vdots\\U_{0}(x_{N})&\cdots&U_{n}(x_{N})\end{bmatrix}}_{=\Phi}\begin{bmatrix}c_{0}\\\vdots\\c_{n}\end{bmatrix}-\begin{bmatrix}f(x_{0})\\f(x_{1})\\\vdots\\f(x_{N})\end{bmatrix},\qquad q_{n}(x)=\sum_{i=0}^{n}c_{i}U_{i}(x)$$

$$\min_{\underline{c}\in\mathbb{R}^{n+1}}\|\Phi\underline{c}-\underline{f}\|_{\ell_{0}},\qquad\underline{f}=[f(x_{0})\;\cdots\;f(x_{N})]^{\top},$$

$$\min_{\underline{c}\in\mathbb{R}^{n+1}}\|D\Phi\underline{c}-D\underline{f}\|_{\ell_{0}},\qquad D=\sqrt{2/(N+2)}\,{\rm diag}\left(\sqrt{1-x_{0}^{2}},\ldots,\sqrt{1-x_{N}^{2}}\right).$$

$$\min_{\underline{z}\in\mathbb{R}^{N+1}}\|\underline{z}\|_{\ell_{0}},\quad\text{subject to}\quad V^{\top}\underline{z}=-V^{\top}D\underline{f},$$

$$(1-\delta_{k})\|\underline{x}\|_{2}^{2}\leq\|A\underline{x}\|_{2}^{2}\leq(1+\delta_{k})\|\underline{x}\|_{2}^{2},\qquad\|\underline{x}\|_{2}^{2}=\sum_{i=1}^{r}|x_{i}|^{2},$$

$$\min_{\underline{z}\in\mathbb{R}^{N+1}}\|\underline{z}\|_{\ell_{1}},\quad\text{subject to}\quad V^{\top}\underline{z}=-V^{\top}D\underline{f}.$$

$$\min_{\underline{c}\in\mathbb{R}^{n+1}}\|D(\Phi\underline{c}-\underline{f})\|_{\ell_{1}},$$

$$\min_{q\in\mathcal{P}_{n}}\|f-q\|_{\ell_{1}}.$$

$$p_{n}^{\ell_{1}}(x)=\sum_{j=0}^{n}c_{j}^{*}U_{j}(x)$$

$$V=D\begin{bmatrix}U_{n+1}(x_{0})&\cdots&U_{N}(x_{0})\\\vdots&\ddots&\vdots\\U_{n+1}(x_{N})&\cdots&U_{N}(x_{N})\end{bmatrix}\in\mathbb{R}^{(N+1)\times(N-n)}.$$

$$A^{\top}D=\begin{bmatrix}\Phi^{\top}D\\V^{\top}\end{bmatrix},$$

$$\left\|\begin{bmatrix}\Phi^{\top}D\\V^{\top}\end{bmatrix}\underline{z}\right\|_{2}^{2}=\|\Phi^{\top}D\underline{z}\|_{2}^{2}+\|V^{\top}\underline{z}\|_{2}^{2}=\|\underline{z}\|_{2}^{2},\qquad\underline{z}\in\mathbb{C}^{N+1}.$$

$$\|\Phi^{\top}D\underline{z}\|_{2}^{2}\leq\frac{2(n+1)k}{N+2}\|\underline{z}\|_{2}^{2}.$$

$$\left(1-\frac{2(n+1)k}{N+2}\right)\|\underline{z}\|_{2}^{2}\leq\|V^{\top}\underline{z}\|_{2}^{2}\leq\|\underline{z}\|_{2}^{2}$$

$$\int_{\Omega_{s}}|p(x)|\,dx\leq\frac{s(n+1)^{2}}{2}\int_{-1}^{1}|p(x)|\,dx$$

$$\int_{\Omega_{s}}|p(x)|\,dx\leq\int_{[-1,1]\setminus\Omega_{s}}|p(x)|\,dx,\qquad p\in\mathcal{P}_{n},$$

$$\begin{aligned}\|f-p_{m}-\delta p\|_{1}&\geq\int_{\Omega_{s}}|f(x)-p_{m}(x)|\,dx-\int_{\Omega_{s}}|\delta p(x)|\,dx+\int_{[-1,1]\setminus\Omega_{s}}|\delta p(x)|\,dx\\&\geq\|f-p_{m}\|_{1},\end{aligned}$$

$$\|p_{n}^{L_{1}}-p_{n}^{*}\|_{1}\leq\frac{4}{2-s(n+1)^{2}}\|f_{0}-p_{n}^{*}\|_{1},$$

$$\begin{aligned}\|f-p_{n}^{*}-\delta p\|_{1}&\geq\int_{\Omega_{s}}\left(|f(x)-p_{n}^{*}(x)|-|\delta p(x)|\right)dx+\int_{[-1,1]\setminus\Omega_{s}}\left(|\delta p(x)|-|f(x)-p_{n}^{*}(x)|\right)dx\\&\geq\|f-p_{n}^{*}\|_{1}-2\|f_{0}-p_{n}^{*}\|_{1}+\int_{[-1,1]\setminus\Omega_{s}}|\delta p(x)|\,dx-\int_{\Omega_{s}}|\delta p(x)|\,dx,\end{aligned}$$

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Digital Filter Design and Implementation · Image and Signal Denoising Methods

Full text

Error localization of best L1 polynomial approximants

Funding: The National Institute of Informatics in Tokyo partially funded an extended collaboration visit between the authors in December 2018, where the majority of this research took place. The first author is supported by the JSPS grants no. 17H01699 and 18H05837. The second author is supported by the National Science Foundation grant no. 1818757.

Yuji Nakatsukasa

Mathematical Institute, University of Oxford, Oxford, OX2 6GG, UK.

Alex Townsend

Department of Mathematics, Cornell University, Ithaca, NY 14853.

Keywords: polynomial approximation, best $L_{1}$, compressed sensing, best $L_{0}$, restricted isometry property, error localization

AMS subject classifications: 65F15, 15A18, 15A22

1 Introduction

In compressed sensing, the $\ell_{0}$ minimizer of an underdetermined linear system $Ax=b$ can be exactly recovered by the $\ell_{1}$ minimizer when the $\ell_{0}$ minimizer is sufficiently sparse and $A$ satisfies some regularity conditions [11, 15, 18]. Similarly, when an acquired signal is sparsely corrupted, one can exactly recover the original signal by minimizing the $\ell_{1}$ error, under suitable assumptions [13]. In this paper, we investigate a continuous analogue of this phenomenon and show that the best $L_{0}$ and $L_{1}$ polynomial approximants of corrupted polynomials (see Definition 1.1) are equal, under suitable assumptions (see Section 2). We also make precise a related observation that the best $L_{1}$ error can be concentrated on intervals of small measure, which shows that best $L_{1}$ approximants can be advantageous compared to minimax approximants for certain applications (see [30]).

Let $f:[-1,1]\rightarrow\mathbb{R}$ be a continuous function and $n\geq 0$ an integer. The best $L_{1}$ polynomial approximant, $\smash{p_{n}^{L_{1}}\!}$ , of degree $\leq n$ to $f$ exists, is unique [27, Thm. 14.3], and satisfies

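$$\|f-p_{n}^{L_{1}}\|_{1}=\min_{p\in\mathcal{P}_{n}}\|f-p\|_{1},\qquad\|f-p\|_{1}=\int_{-1}^{1}|f(x)-p(x)|\,dx,\tag{1}$$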

where $\mathcal{P}_{n}$ is the space of polynomials of degree $\leq n$ . While the minimax approximant, $\smash{p_{n}^{L_{\infty}}\!}$ , is the best approximant in the sense that $\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}=\min_{p\in\mathcal{P}_{n}}\|f-p\|_{\infty}$ , where $\|\cdot\|_{\infty}$ is the maximum norm, we know by the equioscillation theorem that the maximum deviation is attained at least $n+2$ times [27, Thm. 7.2]. On the other hand, it can frequently be observed that $|f(x)-\smash{p_{n}^{L_{1}}\!}(x)|\ll\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}$ for most, but not all, $x\in[-1,1]$ (see Fig. 1 and Section 4). To make this observation precise, we define the set

$$\Omega_{n}=\left\{x\in[-1,1]:|f(x)-p_{n}^{L_{1}}(x)|\geq\tfrac{1}{2}\|f-p_{n}^{L_{\infty}}\|_{\infty}\right\}.\tag{2}$$

(Footnote 1: The constant of $1/2$ in the definition of $\Omega_{n}$ in Eq. 2 is an arbitrary choice. Any constant in $(0,1)$ would do, with very minor changes to the results that we derive.)

For any $x\in[-1,1]\setminus\Omega_{n}$ we know that $\smash{p_{n}^{L_{1}}\!}(x)$ is a better approximation to $f(x)$ than $\smash{p_{n}^{L_{\infty}}\!}(x)$ . By the definition of $\smash{p_{n}^{L_{\infty}}\!}$ , $\Omega_{n}$ is not the empty set, but we often observe that $|\Omega_{n}|\rightarrow 0$ as $n\rightarrow\infty$ (see Section 4). For example, in Section 4 we prove that $|\Omega_{n}|=\mathcal{O}(n^{-2}\log n)$ for $f(x)=\sqrt{1-x^{2}}$ and $|\Omega_{n}|=\mathcal{O}(n^{-1})$ for $f(x)=|x|$ . In such cases we say that the error $f-\smash{p_{n}^{L_{1}}\!}$ is “highly localized”. This property of best $L_{1}$ approximation seems to be underappreciated and is related to observations from compressed sensing.

The highly localized nature of $f-\smash{p_{n}^{L_{1}}\!}$ means that best $L_{1}$ polynomial approximation is ideal for recovering functions that have been arbitrarily corrupted on a set of small measure.

Definition 1.1.

For $0\leq s<1$ , we say that a function $f:[-1,1]\rightarrow\mathbb{R}$ is an $s$-corrupted function if $f$ can be written as

$$f(x)=g(x)+\omega(x),$$

where $g:[-1,1]\rightarrow\mathbb{R}$ is a continuous function, $\omega(x)$ is a measurable function with $|{\rm supp}(\omega)|\leq s$ , and $|{\rm supp}(\omega)|$ denotes the Lebesgue measure of the support of $\omega$ on $[-1,1]$ . Note that the support of $\omega$ , denoted by ${\rm supp}(\omega)$ , is a closed subset of $[-1,1]$ .

If $g=p_{m}$ is a polynomial of degree $\leq m$ in Definition 1.1, then we say that $f$ is a corrupted polynomial. If, in addition, $s<1/(n+1)^{2}$ for some integer $n\geq m$ , then one finds that the best $L_{1}$ polynomial approximant of degree $\leq n$ to $f$ is unique and $\smash{p_{n}^{L_{1}}\!}=p_{m}$ (see Corollary 2.6). This means that best $L_{1}$ approximation exactly recovers a corrupted polynomial with arbitrary corruption, provided that the corruption has small enough support.

Figure 2 illustrates the four regimes that one typically observes with best $L_{1}$ approximants of degree $\leq n$ of $f=p_{m}+\omega$ : (a) if $n<m$ , then $\smash{p_{n}^{L_{1}}\!}\neq p_{m}$ , but $\smash{p_{n}^{L_{1}}\!}$ is a near-best approximant to $p_{m}$ (see Section 3); (b) if $n$ is small and $n\geq m$ , then one gets exact recovery as $\smash{p_{n}^{L_{1}}\!}=p_{m}$ (see Corollary 2.6); (c) if $n$ is a little larger, then $\smash{p_{n}^{L_{1}}\!}$ tries to fit corruptions near $\pm 1$ but not the corruptions away from $\pm 1$ ; and (d) when $n$ is large, $\smash{p_{n}^{L_{1}}\!}$ tries to fit all the corruption, resulting in an overfit.

We go on to derive an efficient algorithm for the recovery of $p_{m}$ from $f$ by showing that the continuous optimization problem in Eq. 1 for $\smash{p_{n}^{L_{1}}\!}$ can be reduced to a linear programming problem, provided that a sampling condition is satisfied (see Theorem 2.1). This observation results in a computationally efficient algorithm for the exact recovery of corrupted polynomials (see Section 2.3).

It is worth emphasizing that the Lebesgue measure of the support of the corruption must be extremely small. For example, our theory only guarantees that a corrupted polynomial of degree $100$ can be exactly recovered if it is corrupted on a set of measure $<1/101^{2}\approx 9.8\times 10^{-5}$ (see Corollary 2.6). Nevertheless, in practice, we observe that exact recovery is usually still possible when the corruption occurs on sets that have a much larger measure. Moreover, the distribution of the corruption in $[-1,1]$ does matter. In particular, larger regions of corruption are allowed away from $\pm 1$ and we present an initial result in this direction (see Theorem A.1). For example, when $n=100$ exact recovery is still guaranteed with any corruption interval of the form $[-s/2,s/2]$ with $s\leq 4\times 10^{-4}$ .

The error localization properties of best $L_{1}$ approximants lead to an iterative algorithm for computing $\smash{p_{n}^{L_{1}}\!}$ given a continuous function $f:[-1,1]\rightarrow\mathbb{R}$ , based on a combination of linear programming and Newton’s method (see Section 5). This can be seen as an improvement on Watson’s algorithm [20, 34]. Our algorithm allows for the zero set of $f-\smash{p_{n}^{L_{1}}\!}$ to have positive measure and heavily employs algorithmic advances over the last decade in polynomial rootfinding and adaptive Chebyshev interpolants [4, 25]. In particular, our implementation greatly benefits from the adaptive and robust algorithms for computing with functions in Chebfun (see Footnote 2). It is able to accurately compute best $L_{1}$ approximants of degrees in the thousands (see Section 5).

(Footnote 2: Chebfun is an object-oriented software system written in MATLAB that provides an environment to compute with piecewise smooth functions [25]. It represents univariate functions defined on a finite interval by piecewise Chebyshev interpolants of adaptively selected degrees that are accurate to essentially machine precision [16].)

In addition to the $L_{1}$ -norm (see Eq. 1), we also define the following for continuous functions $f:[-1,1]\rightarrow\mathbb{R}$ :

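$$\|f\|_{\infty}=\max_{x\in[-1,1]}|f(x)|,\qquad\|f\|_{2}=\left(\int_{-1}^{1}|f(x)|^{2}\,dx\right)^{1/2},\qquad\|f\|_{\ell_{1}}=\sum_{j=0}^{N}w_{j}|f(x_{j})|,\qquad\|f\|_{\ell_{0}}=\#\{j:f(x_{j})\neq 0\},\tag{3}$$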

where $w_{j}\geq 0$ are weights so that $\sum_{j=0}^{N}w_{j}\left|f(x_{j})\right|\rightarrow\int_{-1}^{1}|f(x)|dx$ as $N\rightarrow\infty$ . Despite the notation, $\|\cdot\|_{\ell_{0}}$ is not a norm. For completeness, we also define $\|f\|_{0}=|{\rm supp}(f)|$ as the Lebesgue measure of the support of $f$ . We always take $x_{0},\ldots,x_{N}$ in the discrete norms $\|f\|_{\ell_{1}}$ and $\|f\|_{\ell_{0}}$ to be the roots of the degree $N+1$ Chebyshev polynomial of the second kind $U_{N+1}$ [24, Tab. 18.3.1]. That is,

$$x_{j}=\cos\left(\frac{(N+1-j)\pi}{N+2}\right),\qquad 0\leq j\leq N.\tag{4}$$

Accordingly, we take $\smash{w_{j}=\pi\sqrt{1-x_{j}^{2}}/(N+2)}$ in Eq. 3 so that the corresponding quadrature rule is related to the Gauss–Chebyshev rule. The Chebyshev polynomials of the second kind and their roots in Eq. 4 play a special role in best $L_{1}$ approximation [27, Ch. 14]. In particular, when $N=n$ , the polynomial interpolant of $f$ at the points in Eq. 4, i.e.,

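$$p_{n}^{\rm cheb}(x)=\sum_{j=0}^{n}f(x_{j})\ell_{j}(x),\qquad\ell_{j}(x)=\frac{\prod_{i=0,i\neq j}^{n}(x-x_{i})}{\prod_{i=0,i\neq j}^{n}(x_{j}-x_{i})},\tag{5}$$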

is the best $L_{1}$ polynomial approximation of degree $\leq n$ to $f$ if $f-\smash{p_{n}^{{\rm cheb}}}$ has exactly $n+1$ distinct zeros in $[-1,1]$ [8, 26].
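The sample points in Eq. 4 and the weights $w_{j}$ are easy to experiment with. The following is a minimal sketch, not the authors' implementation: the helper names `second_kind_points`, `quadrature_weights`, and `discrete_l1_norm` are hypothetical, and the test function $f(x)=|x|$ is chosen only because $\int_{-1}^{1}|x|\,dx=1$ is known in closed form.

```python
import math

def second_kind_points(N):
    """Roots of U_{N+1}: x_j = cos((N + 1 - j) * pi / (N + 2)), j = 0..N, in increasing order."""
    return [math.cos((N + 1 - j) * math.pi / (N + 2)) for j in range(N + 1)]

def quadrature_weights(N):
    """Weights w_j = pi * sqrt(1 - x_j^2) / (N + 2) used in the discrete l1 norm."""
    return [math.pi * math.sqrt(1.0 - x * x) / (N + 2) for x in second_kind_points(N)]

def discrete_l1_norm(f, N):
    """Discrete l1 norm sum_j w_j * |f(x_j)|, an approximation of the integral of |f| on [-1, 1]."""
    xs = second_kind_points(N)
    ws = quadrature_weights(N)
    return sum(w * abs(f(x)) for w, x in zip(ws, xs))

if __name__ == "__main__":
    for N in (10, 100, 1000):
        # The printed values approach 1 as N grows.
        print(N, discrete_l1_norm(abs, N))
```

Because $x_{j}=\cos\theta_{j}$ with equally spaced $\theta_{j}$, the sum is a trapezoidal-type rule in the variable $\theta$, which is one way to see why it converges to the integral.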

For an integer $n\geq 0$ , we denote by $\smash{p_{n}^{L_{\infty}}\!}$ , $\smash{p_{n}^{L_{2}}}$ , $\smash{p_{n}^{\ell_{1}}\!}$ , and $\smash{p_{n}^{\ell_{0}}\!}$ any best $L_{\infty}$ , $L_{2}$ , $\ell_{1}$ , and $\ell_{0}$ polynomial approximant of degree $\leq n$ to $f$ , respectively. These polynomials are solutions to the following optimization problems:

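$$p_{n}^{L_{\infty}}=\arg\min_{q\in\mathcal{P}_{n}}\|f-q\|_{\infty},\qquad p_{n}^{L_{2}}=\arg\min_{q\in\mathcal{P}_{n}}\|f-q\|_{2},\qquad p_{n}^{\ell_{1}}=\arg\min_{q\in\mathcal{P}_{n}}\|f-q\|_{\ell_{1}},\qquad p_{n}^{\ell_{0}}=\arg\min_{q\in\mathcal{P}_{n}}\|f-q\|_{\ell_{0}}.\tag{6}$$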

We also define $p_{n}^{L_{0}}=\arg\min_{q\in\mathcal{P}_{n}}\|f-q\|_{0}$ , when a best polynomial in this sense exists.

The paper is structured as follows. In Section 2, we show that the exact recovery of an arbitrarily corrupted polynomial is possible provided that the support of the corruption has small enough measure. This leads to an efficient algorithm to achieve recovery. In Section 3, we extend these ideas to the near-recovery of corrupted smooth functions. In Section 4, we show that $|\Omega_{n}|$ is small precisely when $\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}\rightarrow 0$ faster than $\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}\rightarrow 0$ as $n\rightarrow\infty$ and carefully consider two worked examples with error localization. Finally, in Section 5, we present our iterative algorithm for computing best $L_{1}$ polynomial approximants of continuous functions.

2 Exact recovery of corrupted polynomials

In this section we suppose that $f:[-1,1]\rightarrow\mathbb{R}$ is formed by an arbitrarily corrupted polynomial, i.e., $f=p_{m}+\omega$ , where $p_{m}$ is a polynomial of degree $\leq m$ and $\omega$ is a function with small support. We investigate the question: When is it possible to exactly recover $p_{m}$ from knowledge of $f$ ?

We show that for corrupted polynomials, we have $\smash{p_{n}^{L_{0}}\!}=\smash{p_{n}^{\ell_{0}}\!}=\smash{p_{n}^{\ell_{1}}\!}=\smash{p_{n}^{L_{1}}\!}=p_{m}$ provided that the support of $\omega$ is sufficiently small, $n\geq m$ , and enough of the samples $x_{0},\ldots,x_{N}$ in Eq. 4 lie outside of the support of $\omega$ .

Theorem 2.1.

Let $f=p_{m}+\omega$ be an $s$-corrupted polynomial of degree $\leq m$ . Then, the following statements hold when $n\geq m$ :

1. If $s<1$ , then $\smash{p_{n}^{L_{0}}\!}=p_{m}$ .

2. If $(N-n)/2$ , or fewer, of the samples $x_{0},\ldots,x_{N}$ are in ${\rm supp}(\omega)$ , then $\smash{p_{n}^{\ell_{0}}\!}=p_{m}$ .

3. If $k$ of $x_{0},\ldots,x_{N}$ are in ${\rm supp}(\omega)$ , $N+1>6(n+1)k-1$ , and $N\geq n$ , then $\smash{p_{n}^{\ell_{1}}\!}=p_{m}$ .

4. If $s<1/(n+1)^{2}$ , then $\smash{p_{n}^{L_{1}}\!}=p_{m}$ .

We prove the four statements in the theorem, in turn, in the next four subsections.
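The four sufficient conditions of Theorem 2.1 are simple to evaluate for given parameters. The sketch below is only a bookkeeping aid under the notation of the theorem; the function names `recovery_guarantees` and `min_samples_for_l1` are hypothetical. A value of `False` means that the theorem gives no guarantee, not that recovery fails.

```python
def recovery_guarantees(n, m, N, k, s):
    """Evaluate the sufficient conditions of Theorem 2.1.

    n: approximation degree, m: degree of the uncorrupted polynomial,
    N: index of the last sample (there are N + 1 samples),
    k: number of samples inside supp(omega),
    s: bound on the Lebesgue measure of supp(omega).
    """
    if n < m:
        # The theorem only covers n >= m.
        return {"L0": False, "l0": False, "l1": False, "L1": False}
    return {
        "L0": s < 1.0,                                      # statement 1
        "l0": 2 * k <= N - n,                               # statement 2: k <= (N - n) / 2
        "l1": (N + 1 > 6 * (n + 1) * k - 1) and (N >= n),   # statement 3
        "L1": s * (n + 1) ** 2 < 1.0,                       # statement 4: s < 1 / (n + 1)^2
    }

def min_samples_for_l1(n, k):
    """Smallest integer N with N + 1 > 6 (n + 1) k - 1 and N >= n."""
    return max(n, 6 * (n + 1) * k - 1)

if __name__ == "__main__":
    print(min_samples_for_l1(5, 2))
    print(recovery_guarantees(n=5, m=5, N=71, k=2, s=0.02))
```

The helper `min_samples_for_l1` makes the cost of the oversampling condition in statement 3 visible: the required number of samples grows like the product of the degree and the number of corrupted samples.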

2.1 Exact recovery with best $\mathbf{L_{0}}$ approximation

Intuitively, best $L_{0}$ polynomial approximation is ideally suited to recovering a corrupted function because it minimizes the Lebesgue measure of ${\rm supp}(f-\smash{p_{n}^{L_{0}}\!})$ . The polynomial approximant $\smash{p_{n}^{L_{0}}\!}$ is so good at recovery that when $f=p_{m}+\omega$ we have $\smash{p_{n}^{L_{0}}\!}=p_{m}$ provided that the measure of ${\rm supp}(\omega)$ is less than half the length of the interval and $n\geq m$ .

To see this, note that $|{\rm supp}(f-p_{m})|=|{\rm supp}(\omega)|\leq s<1$ . Suppose there is a polynomial $q$ of degree $\leq n$ such that $|{\rm supp}(f-q)|\leq|{\rm supp}(f-p_{m})|$ . Then, $|{\rm supp}(q-p_{m})|\leq 2s<2$ so $q$ and $p_{m}$ must coincide on a set of positive measure in $[-1,1]$ . Since $q$ and $p_{m}$ are polynomials and $n\geq m$ , we have that $q=p_{m}$ . We conclude that $\smash{p_{n}^{L_{0}}\!}=p_{m}$ provided that $|{\rm supp}(\omega)|<1$ and $n\geq m$ . This proves the first statement of Theorem 2.1.

2.2 Exact recovery with best $\mathbf{\ell_{0}}$ approximation

It can be algorithmically challenging to compute $\smash{p_{n}^{L_{0}}\!}$ and it is reasonable to attempt recovery from $\smash{p_{n}^{\ell_{0}}\!}$ instead, which involves a discrete optimization problem. The polynomial approximant $\smash{p_{n}^{\ell_{0}}\!}$ is also well suited to recovering polynomials under the mild assumption that enough of the samples $x_{0},\ldots,x_{N}$ (see Eq. 4) lie outside of ${\rm supp}(\omega)$ .

To see this, suppose that $f=p_{m}+\omega$ and there is a polynomial $q$ of degree $\leq n$ such that

$$\|f-q\|_{\ell_{0}}\leq\|f-p_{m}\|_{\ell_{0}}=k,\tag{7}$$

where $n\geq m$ and $k$ is the number of samples $x_{0},\ldots,x_{N}$ in ${\rm supp}(\omega)$ . If $k\leq(N-n)/2$ , then $q-p_{m}\in\mathcal{P}_{n}$ is zero on at least $N+1-2k\geq n+1$ distinct points and hence $q=p_{m}$ [27, p. 34]. By definition of $\smash{p_{n}^{\ell_{0}}\!}$ , we must have $\smash{p_{n}^{\ell_{0}}\!}=p_{m}$ . This proves the second statement of Theorem 2.1.

2.3 Exact recovery with best $\mathbf{\ell_{1}}$ approximation

The polynomial $\smash{p_{n}^{\ell_{0}}\!}$ can be prohibitively expensive to compute if $m$ is large. Fortunately, by using the restricted isometry property (RIP) from compressed sensing, one finds that $\smash{p_{n}^{\ell_{0}}\!}=\smash{p_{n}^{\ell_{1}}\!}$ when an oversampling condition is satisfied, along with some regularity assumptions. This means that $\smash{p_{n}^{\ell_{1}}\!}$ , which can be computed efficiently, can often be used for exact recovery [3].

First, we know that $\|f-q_{n}\|_{\ell_{0}}=k$ for $q_{n}\in\mathcal{P}_{n}$ is equivalent to a vector $\underline{y}$ having precisely $k$ nonzero entries, where

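$$\underline{y}=\underbrace{\begin{bmatrix}U_{0}(x_{0})&\cdots&U_{n}(x_{0})\\U_{0}(x_{1})&\cdots&U_{n}(x_{1})\\\vdots&\ddots&\vdots\\U_{0}(x_{N})&\cdots&U_{n}(x_{N})\end{bmatrix}}_{=\Phi}\begin{bmatrix}c_{0}\\\vdots\\c_{n}\end{bmatrix}-\begin{bmatrix}f(x_{0})\\f(x_{1})\\\vdots\\f(x_{N})\end{bmatrix},\qquad q_{n}(x)=\sum_{i=0}^{n}c_{i}U_{i}(x),\tag{8}$$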

and $U_{i}(x)$ is the Chebyshev polynomial of the second kind of degree $i$ [24, Tab. 18.3.1]. The problem of minimizing $\|\underline{y}\|_{\ell_{0}}$ over $\mathcal{P}_{n}$ in Eq. 8 is solved by $\smash{p_{n}^{\ell_{0}}\!}$ and can be written as

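$$\min_{\underline{c}\in\mathbb{R}^{n+1}}\|\Phi\underline{c}-\underline{f}\|_{\ell_{0}},\qquad\underline{f}=[f(x_{0})\;\cdots\;f(x_{N})]^{\top},\tag{9}$$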

which is equivalent to the following diagonally-scaled problem:

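$$\min_{\underline{c}\in\mathbb{R}^{n+1}}\|D\Phi\underline{c}-D\underline{f}\|_{\ell_{0}},\qquad D=\sqrt{2/(N+2)}\,{\rm diag}\left(\sqrt{1-x_{0}^{2}},\ldots,\sqrt{1-x_{N}^{2}}\right).\tag{10}$$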

By a technique described in [13, p. 4204], if $V\in\mathbb{R}^{(N+1)\times(N-n)}$ is a matrix whose columns form a basis for the left null space of $D\Phi$ so that $V^{\top}(D\Phi)=0$ , then Eq. 9 is also a constrained $\ell_{0}$ minimization problem:

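$$\min_{\underline{z}\in\mathbb{R}^{N+1}}\|\underline{z}\|_{\ell_{0}},\quad\text{subject to}\quad V^{\top}\underline{z}=-V^{\top}D\underline{f},\tag{11}$$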

where $\underline{z}=D\Phi\underline{c}-D\underline{f}$ . This problem is precisely the task of interest in the compressed sensing literature with a short-fat matrix $V^{\top}$ and an unknown sparse vector $\underline{z}$ .

The $\ell_{0}$ minimization problem (11) is known to be NP-hard [18, Sec. 2.3]. A practical remedy is to replace the $\ell_{0}$ norm with the $\ell_{1}$ norm. To understand when this gives the solution to the $\ell_{0}$ problem, an important concept in compressed sensing is the RIP. We say that a matrix $A\in\mathbb{C}^{m\times r}$ satisfies the RIP if there exists a constant $0<\delta_{k}<1$ such that

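$$(1-\delta_{k})\|\underline{x}\|_{2}^{2}\leq\|A\underline{x}\|_{2}^{2}\leq(1+\delta_{k})\|\underline{x}\|_{2}^{2},\qquad\|\underline{x}\|_{2}^{2}=\sum_{i=1}^{r}|x_{i}|^{2},\tag{12}$$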

for every vector $\underline{x}\in\mathbb{C}^{r}$ that has at most $k$ nonzero entries [13]. It is known that if $V^{\top}$ satisfies the RIP with $\delta_{k}<\frac{1}{3}$ , then the solution to Eq. 11 is exactly recovered (under the assumption that the $\ell_{0}$ -minimizer has at most $k$ nonzero entries) by solving the $\ell_{1}$ minimization problem [9]

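$$\min_{\underline{z}\in\mathbb{R}^{N+1}}\|\underline{z}\|_{\ell_{1}},\quad\text{subject to}\quad V^{\top}\underline{z}=-V^{\top}D\underline{f}.\tag{13}$$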

Here, Eq. 13 can be efficiently solved as a basis pursuit problem via the spectral projected-gradient $L_{1}$ (SPGL1) algorithm [32]. This is especially efficient because there is a fast matrix-vector product for $V^{\top}$ based on the discrete sine transform (see Eq. 16).

Note that unlike $\|f\|_{\ell_{1}}$ in Eq. 3 for functions, the $\ell_{1}$ norm for vectors is simply the sum of the absolute values of the vector entries. The problem in Eq. 13 is equivalent to

$$\min_{\underline{c}\in\mathbb{R}^{n+1}}\|D(\Phi\underline{c}-\underline{f})\|_{\ell_{1}},\tag{14}$$

which in turn can be written as (recalling Eq. 3) the best $\ell_{1}$ approximation problem:

$$\min_{q\in\mathcal{P}_{n}}\|f-q\|_{\ell_{1}}.$$

We conclude that if the matrix $V^{\top}$ satisfies the RIP with $\delta_{k}<\frac{1}{3}$ then we have $\smash{p_{n}^{\ell_{1}}\!}=\smash{p_{n}^{\ell_{0}}\!}$ , where

$$p_{n}^{\ell_{1}}(x)=\sum_{j=0}^{n}c_{j}^{*}U_{j}(x)$$

and the vector $\underline{c}^{*}$ is the solution to Eq. 14.

We are left with the task of studying when the matrix $V^{\top}$ in Eq. 11 satisfies the RIP with $\delta_{k}<\frac{1}{3}$ . For the samples $x_{0},\ldots,x_{N}$ that are given in Eq. 4, we have the discrete orthogonality condition $\smash{\sum_{\ell=0}^{N}U_{i}(x_{\ell})U_{j}(x_{\ell})(1-x_{\ell}^{2})=0}$ for $i\neq j$ [23, Sec. 4.6.1] so that we can write down an explicit basis for the left null space of $D\Phi$ in Eq. 8. That is,

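$$V=D\begin{bmatrix}U_{n+1}(x_{0})&\cdots&U_{N}(x_{0})\\\vdots&\ddots&\vdots\\U_{n+1}(x_{N})&\cdots&U_{N}(x_{N})\end{bmatrix}\in\mathbb{R}^{(N+1)\times(N-n)}.\tag{16}$$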

It turns out that due to the choice of the diagonal matrix $D$ in Eq. 10, the matrix $V$ in Eq. 16 is formed from a subset of columns of an orthogonal matrix. Furthermore, the size of $V^{\top}$ need not be extremely short-fat, as often required in compressed sensing. It is therefore possible to show that $V^{\top}$ satisfies the RIP under a mild oversampling condition.

Proposition 2.2.

If $N+2>2(n+1)k$ for some integer $k\geq 1$ , then $V^{\top}$ in Eq. 16 satisfies the RIP with $\delta_{k}=(2(n+1)/(N+2))k$ .

Proof 2.3.

Let $A$ be the $(N+1)\times(N+1)$ Chebyshev–Vandermonde matrix, i.e., $A_{ij}=U_{j}(x_{i})$ for $0\leq i,j\leq N$ , where $x_{i}$ is given in Eq. 4. Let $D$ be a diagonal matrix with $D_{i,i}=\sqrt{2/(N+2)}\sqrt{1-x_{i}^{2}}$ for $0\leq i\leq N$ . By the discrete orthogonality properties of Chebyshev polynomials of the second kind [23, Sec. 4.6.1], $DA$ is an orthogonal matrix with

$$A^{\top}D=\begin{bmatrix}\Phi^{\top}D\\V^{\top}\end{bmatrix},$$

where $\Phi$ and $V$ are given in Eq. 8 and Eq. 16, respectively. Since $A^{\top}D$ has orthonormal columns, we find that

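$$\left\|\begin{bmatrix}\Phi^{\top}D\\V^{\top}\end{bmatrix}\underline{z}\right\|_{2}^{2}=\|\Phi^{\top}D\underline{z}\|_{2}^{2}+\|V^{\top}\underline{z}\|_{2}^{2}=\|\underline{z}\|_{2}^{2},\qquad\underline{z}\in\mathbb{C}^{N+1}.\tag{17}$$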

Since $\sqrt{1-x^{2}}|U_{i}(x)|\leq 1$ for $x\in[-1,1]$ [24, (18.14.7)], each entry of $A^{\top}D$ has absolute value $\leq\sqrt{2/(N+2)}$ . It follows by the Cauchy–Schwarz inequality that each of the $n+1$ entries of $\Phi^{\top}D\underline{z}$ is bounded in absolute value by $\sqrt{\frac{2k}{N+2}}\,\|\underline{z}\|_{2}$ , where $k$ is the number of nonzero entries in $\underline{z}$ , so we have

$$\|\Phi^{\top}D\underline{z}\|_{2}^{2}\leq\frac{2(n+1)k}{N+2}\|\underline{z}\|_{2}^{2}.$$

Therefore, from Eq. 17 and the trivial bound of $\|V^{\top}\underline{z}\|_{2}^{2}\leq\|\underline{z}\|_{2}^{2}$ , we conclude that

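$$\left(1-\frac{2(n+1)k}{N+2}\right)\|\underline{z}\|_{2}^{2}\leq\|V^{\top}\underline{z}\|_{2}^{2}\leq\|\underline{z}\|_{2}^{2}$$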

for any vector $\underline{z}\in\mathbb{C}^{N+1}$ with at most $k$ nonzero entries. The statement immediately follows from the definition of the RIP (see Eq. 12).

Proposition 2.2 tells us that $V^{\top}$ in Eq. 16 satisfies the RIP with $\delta_{k}<1/3$ if $N+1>6(n+1)k-1$ . Since $k$ is the number of samples $x_{0},\ldots,x_{N}$ that lie in ${\rm supp}(\omega)$ , it means that $\smash{p_{n}^{\ell_{0}}\!}=\smash{p_{n}^{\ell_{1}}\!}$ provided that the discrete problem is sufficiently oversampled. Since $k<(N+2)/(6(n+1))$ implies that $k\leq(N-n)/2$ when $k\geq 1$ , and since we need $N\geq n$ when $k=0$ , we conclude from Section 2.2 that if $N+1>6(n+1)k-1$ and $N\geq n$ , then $\smash{p_{n}^{\ell_{1}}\!}=\smash{p_{n}^{\ell_{0}}\!}=p_{m}$ when $n\geq m$ . This proves the third statement of Theorem 2.1.
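The two ingredients of the proof of Proposition 2.2, the orthogonality of $DA$ and the resulting bounds on $\|V^{\top}\underline{z}\|_{2}^{2}$, can be checked numerically. The sketch below is an illustration only, with hypothetical function names. It builds $DA$ from the identity $\sqrt{1-x^{2}}\,U_{j}(x)=\sin((j+1)\theta)$ for $x=\cos\theta$, and it draws one random $k$-sparse vector per seed rather than testing every sparse vector.

```python
import math
import random

def scaled_chebyshev_matrix(N):
    """Rows of DA, with (DA)[i][j] = sqrt(2/(N+2)) * sqrt(1 - x_i^2) * U_j(x_i).

    Uses sqrt(1 - x^2) * U_j(x) = sin((j + 1) * theta) for x = cos(theta),
    with theta_i = (N + 1 - i) * pi / (N + 2) as in Eq. 4.
    """
    scale = math.sqrt(2.0 / (N + 2))
    thetas = [(N + 1 - i) * math.pi / (N + 2) for i in range(N + 1)]
    return [[scale * math.sin((j + 1) * t) for j in range(N + 1)] for t in thetas]

def orthogonality_error(N):
    """Largest entry of |(DA)^T (DA) - I|."""
    Q = scaled_chebyshev_matrix(N)
    err = 0.0
    for a in range(N + 1):
        for b in range(N + 1):
            dot = sum(Q[i][a] * Q[i][b] for i in range(N + 1))
            err = max(err, abs(dot - (1.0 if a == b else 0.0)))
    return err

def rip_ratio(N, n, k, seed=0):
    """||V^T z||_2^2 / ||z||_2^2 for one random k-sparse z; V holds columns n+1..N of DA."""
    rng = random.Random(seed)
    Q = scaled_chebyshev_matrix(N)
    support = rng.sample(range(N + 1), k)
    z = [0.0] * (N + 1)
    for i in support:
        # Nonzero entries with random sign, bounded away from zero.
        z[i] = rng.choice((-1.0, 1.0)) * rng.uniform(0.5, 1.0)
    vz = [sum(Q[i][j] * z[i] for i in support) for j in range(n + 1, N + 1)]
    return sum(v * v for v in vz) / sum(zi * zi for zi in z)

if __name__ == "__main__":
    N, n, k = 40, 3, 2
    delta = 2 * (n + 1) * k / (N + 2)
    print("orthogonality error:", orthogonality_error(N))
    print("ratio:", rip_ratio(N, n, k), "lower bound:", 1 - delta)
```

The dense loops are used only to keep the sketch free of dependencies; the matrix is a discrete sine transform, so a fast transform would be the natural choice for large $N$.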

The polynomial $\smash{p_{n}^{\ell_{1}}\!}$ can be computed by solving the basis pursuit problem in Eq. 13. This means that Proposition 2.2 gives us a practical and efficient algorithm for the exact recovery of corrupted polynomials with degrees in the thousands. Often it is the case that one does not know the degree of the corrupted polynomial or $k$ . Since the oversampling condition $N+1>6(n+1)k-1$ penalizes taking unnecessarily large $n$ , we recommend slowly increasing $n$ , computing the error $f-\smash{p_{n}^{\ell_{1}}\!}$ , and stopping at the smallest $n$ for which $|{\rm supp}(f-\smash{p_{n}^{\ell_{1}}\!})|<2$ .

2.4 Exact recovery with best $\mathbf{L_{1}}$ approximation

To begin to highlight the importance of error localization of best $L_{1}$ polynomial approximants, we now show that $\smash{p_{n}^{L_{1}}\!}$ can also be used for exact recovery of corrupted polynomials when the corruption has sufficiently small support. One can achieve this by demonstrating that a polynomial of degree $\leq n$ is not too concentrated in any small subset of $[-1,1]$ .

Lemma 2.4.

Let $\Omega_{s}\subseteq[-1,1]$ be a set of Lebesgue measure $s\geq 0$ . For any $n\geq 0$ , we have

$$\int_{\Omega_{s}}|p(x)|\,dx\leq\frac{s(n+1)^{2}}{2}\int_{-1}^{1}|p(x)|\,dx\tag{19}$$

for any polynomial $p$ of degree $\leq n$ .

Proof 2.5.

This statement is proved in [5, Sec. 4.2, Exercise 6].
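A quick numerical check of the inequality in Lemma 2.4 for one polynomial and one set can make the statement concrete. The sketch below is illustrative only: the choice $p(x)=x^{5}$ with $\Omega_{s}=[1-s,1]$ and $s=0.02$ is an assumption made for the example, the integrals are approximated by a plain midpoint rule, and the function names are hypothetical.

```python
def integral_abs(p, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of |p(x)| over [a, b]."""
    h = (b - a) / steps
    return h * sum(abs(p(a + (i + 0.5) * h)) for i in range(steps))

def concentration_ratio(p, a, b):
    """Fraction of the L1 norm of p on [-1, 1] that lies in the subinterval [a, b]."""
    return integral_abs(p, a, b) / integral_abs(p, -1.0, 1.0)

def lemma_bound(s, n):
    """The factor s * (n + 1)^2 / 2 on the right-hand side of Lemma 2.4."""
    return s * (n + 1) ** 2 / 2.0

if __name__ == "__main__":
    n, s = 5, 0.02
    p = lambda x: x ** n  # illustrative polynomial, largest near the endpoints
    print(concentration_ratio(p, 1.0 - s, 1.0), "<=", lemma_bound(s, n))
```

For this example the ratio can also be computed by hand as $(1-0.98^{6})/2$, which is well below the bound $0.36$; a single example does not test sharpness.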

Lemma 2.4 tells us that polynomials of degree $\leq n$ cannot be too localized in a set of small measure. In particular, if $0\leq|\Omega_{s}|<1/(n+1)^{2}$ , then

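$$\int_{\Omega_{s}}|p(x)|\,dx\leq\int_{[-1,1]\setminus\Omega_{s}}|p(x)|\,dx,\qquad p\in\mathcal{P}_{n},\tag{20}$$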

with equality if and only if $p$ is the zero polynomial. A consequence of Eq. 20 is that a corrupted polynomial can be exactly recovered by best $L_{1}$ polynomial approximation.
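The step from Lemma 2.4 to Eq. 20 is short. Spelled out for $p\in\mathcal{P}_{n}$ under the assumption $0\leq s<1/(n+1)^{2}$, and splitting the integral over $[-1,1]$ into the parts on $\Omega_{s}$ and on its complement, it reads:

```latex
\begin{align*}
\int_{\Omega_s}|p(x)|\,dx
  &\le \frac{s(n+1)^2}{2}\int_{-1}^{1}|p(x)|\,dx
  && \text{(Lemma 2.4)}\\
  &\le \frac{1}{2}\left(\int_{\Omega_s}|p(x)|\,dx
      +\int_{[-1,1]\setminus\Omega_s}|p(x)|\,dx\right)
  && \text{(since } s(n+1)^2<1\text{)}.
\end{align*}
```

Subtracting $\frac{1}{2}\int_{\Omega_{s}}|p(x)|\,dx$ from both sides and multiplying by $2$ gives Eq. 20. The second inequality is strict unless $\int_{-1}^{1}|p(x)|\,dx=0$, which is why equality forces $p$ to be the zero polynomial.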

Corollary 2.6.

Let $f=p_{m}+\omega$ be an $s$-corrupted polynomial of degree $\leq m$ on $[-1,1]$ . Then, the best $L_{1}$ polynomial approximant of degree $\leq n$ to $f$ is $p_{m}$ if $n\geq m$ and $s<1/(n+1)^{2}$ .

Proof 2.7.

Let $\delta p\in\mathcal{P}_{n}$ and let $\Omega_{s}\subset[-1,1]$ be the support of $\omega$ . Since $[-1,1]=\Omega_{s}\cup([-1,1]\setminus\Omega_{s})$ , we have by the triangle inequality

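$$\begin{aligned}\|f-p_{m}-\delta p\|_{1}&\geq\int_{\Omega_{s}}|f(x)-p_{m}(x)|\,dx-\int_{\Omega_{s}}|\delta p(x)|\,dx+\int_{[-1,1]\setminus\Omega_{s}}|\delta p(x)|\,dx\\&\geq\|f-p_{m}\|_{1},\end{aligned}\tag{21}$$

where the last inequality follows from Eq. 20 as well as the fact that $f(x)-p_{m}(x)=0$ for $x\in[-1,1]\setminus\Omega_{s}$ . Equality holds in Eq. 21 if and only if $\delta p=0$ . We conclude that $p_{m}$ is the unique best $L_{1}$ polynomial approximant to $f$ of degree $\leq n$ .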

This proves the fourth and final statement of Theorem 2.1 and explains regime (b) in Fig. 2. It tells us that if a polynomial is corrupted on a subset of $[-1,1]$ that has small enough Lebesgue measure, then the best $L_{1}$ polynomial approximant exactly recovers the polynomial. Figure 3 illustrates Corollary 2.6 for the corrupted polynomial $f=T_{5}+\omega$ , where $T_{5}$ is the degree $5$ Chebyshev polynomial of the first kind and ${\rm supp}(\omega)=[-.7,-.67]\cup[.9,.903]$ . Using the fact that $\smash{p_{n}^{L_{1}}\!}=\smash{p_{n}^{\ell_{1}}\!}$ , one can efficiently recover $T_{5}$ to within essentially machine precision. Numerically, we find that $\|\smash{p_{n}^{L_{1}}\!}-T_{5}\|_{\infty}\approx 1.22\times 10^{-15}$ .

To highlight the importance of the $L_{1}$ -norm for Corollary 2.6, we consider the best polynomial approximants of degree $\leq 5$ to $f$ in the $L_{2}$ - and $L_{\infty}$ -norm (see Fig. 3 (right)). One finds that any corruption of arbitrarily small support prevents the best $L_{2}$ and $L_{\infty}$ polynomial approximants from recovering the uncorrupted polynomial.
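The contrast between the norms can already be seen in the simplest discrete setting. The sketch below is not the experiment of Fig. 3. It assumes a degree $0$ target ($p_{m}\equiv 2$ and $n=m=0$), hypothetical corruption values at three sample indices, and the samples and weights of Eq. 3 and Eq. 4. It relies on three standard facts: a weighted median minimizes a weighted sum of absolute deviations, a weighted mean minimizes a weighted sum of squares, and the midrange minimizes the maximum deviation. All function names are illustrative.

```python
import math

def sample_points_and_weights(N):
    """Points x_j from Eq. 4 and weights w_j = pi * sqrt(1 - x_j^2) / (N + 2)."""
    xs = [math.cos((N + 1 - j) * math.pi / (N + 2)) for j in range(N + 1)]
    ws = [math.pi * math.sqrt(1.0 - x * x) / (N + 2) for x in xs]
    return xs, ws

def best_l1_constant(values, weights):
    """Weighted median: a minimizer of sum_j w_j * |v_j - c| over constants c."""
    pairs = sorted(zip(values, weights))
    half = 0.5 * sum(weights)
    running = 0.0
    for v, w in pairs:
        running += w
        if running >= half:
            return v
    return pairs[-1][0]

def best_l2_constant(values, weights):
    """Weighted mean: the minimizer of sum_j w_j * (v_j - c)^2."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def best_linf_constant(values):
    """Midrange: the minimizer of max_j |v_j - c|."""
    return 0.5 * (min(values) + max(values))

def corrupted_constant_samples(N, corrupted=(10, 11, 12), level=2.0, jump=5.0):
    """Samples of f = level + omega, where omega equals `jump` at the listed sample indices."""
    _, ws = sample_points_and_weights(N)
    vals = [level + (jump if j in corrupted else 0.0) for j in range(N + 1)]
    return vals, ws

if __name__ == "__main__":
    vals, ws = corrupted_constant_samples(50)
    print("l1  :", best_l1_constant(vals, ws))
    print("l2  :", best_l2_constant(vals, ws))
    print("linf:", best_linf_constant(vals))
```

With $N=50$, $n=0$, and $k=3$ corrupted samples, the oversampling condition $N+1>6(n+1)k-1$ of Theorem 2.1 holds, and the weighted median returns the uncorrupted constant, while the mean and the midrange are both pulled toward the corruption.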

The bound on $s$ of $s<1/(n+1)^{2}$ in Corollary 2.6 is probably not sharp, though we know that it cannot be increased above $\pi^{2}/(2(n+2)^{2})$ [5, Sec. 4.2]. This means that the $\mathcal{O}(n^{-2})$ algebraic scaling with respect to $n$ cannot be improved. In Appendix A, we extend Corollary 2.6 by demonstrating that the location of the support of the corruption in $[-1,1]$ is important, and more is allowed provided that the corruption occurs away from $\pm 1$ .

For concreteness, we have assumed that the sample points are the Chebyshev points given in Eq. 4. This choice is recommended when the samples can be taken at arbitrary points in $[-1,1]$ . However, in some cases, the sample points may be given a priori and cannot be chosen. Most of our results carry over to such cases with minor modifications and assumptions on the distribution of sample points.

3 Near-recovery of corrupted smooth functions

When recovering a corrupted polynomial $f=p_{m}+\omega$, the degree of $p_{m}$ is usually unknown, so we compute best $L_{1}$ polynomial approximants to $f$ of degree $\leq n$ for a slowly increasing sequence of $n$, stopping when the support of the error has measure less than the full interval, i.e., $|{\rm supp}(f-\smash{p_{n}^{L_{1}}\!})|<2$. For the majority of this process $n<m$, and one may wonder what $\smash{p_{n}^{L_{1}}\!}$ is achieving in this regime (see Fig. 2 (a)). Similarly, if $f$ is a corrupted smooth function $f=f_{0}+\omega$, where $f_{0}$ is a continuous function (not necessarily a polynomial) on $[-1,1]$, then one cannot hope for exact recovery using best $L_{1}$ polynomial approximation. Instead, we find that $\smash{p_{n}^{L_{1}}\!}$ delivers a near-recovery of $f_{0}$ in the sense that $\smash{p_{n}^{L_{1}}\!}$ is a near-best $L_{1}$ approximation to $f_{0}$, provided that the support of the corruption is small and $f_{0}$ can be well-approximated by a degree $\leq n$ polynomial. We first show that the best $L_{1}$ approximations for $f$ and $f_{0}$ are relatively close to each other.

Theorem 3.1.

Let $f=f_{0}+\omega$ be an $s$-corrupted function on $[-1,1]$, where $f_{0}:[-1,1]\rightarrow\mathbb{R}$ is continuous, and let $\smash{p_{n}^{L_{1}}\!}$ be a best $L_{1}$ polynomial approximant of degree $\leq n$ to $f$. If $s<1/(n+1)^{2}$, then

[TABLE]

where $p_{n}^{*}$ is the best $L_{1}$ approximant of degree $\leq n$ to $f_{0}$ on $[-1,1]$.

Proof 3.2.

Let $\delta p\in\mathcal{P}_{n}$ and $\Omega_{s}={\rm supp}(\omega)$ . Since $[-1,1]=\Omega_{s}\cup([-1,1]\setminus\Omega_{s})$ , and by the triangle inequality, we have

[TABLE]

where the last inequality holds since $[-1,1]=\Omega_{s}\cup([-1,1]\setminus\Omega_{s})$ and $f(x)=f_{0}(x)$ for $x\not\in\Omega_{s}$ . From Eq. 19, we find that

[TABLE]

Hence, for any $\delta p\in\mathcal{P}_{n}$ we have the inequality

[TABLE]

Finally, by setting $\delta p=\smash{p_{n}^{L_{1}}\!}-p_{n}^{*}$ and noting that $\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}\leq\|f-p_{n}^{*}\|_{1}$ we conclude that

[TABLE]

The result follows by rearranging this inequality.

Theorem 3.1 shows that best $L_{1}$ polynomial approximation is useful for near-recovery of a corrupted smooth function. More precisely, when $s<1/(n+1)^{2}$ we have

[TABLE]

and we conclude that a best $L_{1}$ approximant of $f$ recovers $f_{0}$ as best it can, up to a factor that depends on $n$ and $s$ .

The inequality in Eq. 22 also partially explains regime (a) in Fig. 2. It provides theoretical justification that $p_{5}^{L_{1}}\!$ is a near-best polynomial approximant to $P_{8}$ in Fig. 2. For the example in Fig. 2, we observe this near-recovery phenomenon since

[TABLE]

where $p_{5}^{L_{1}}\!$ is the best $L_{1}$ approximant of degree $\leq 5$ to the corrupted function.

Unlike corrupted polynomials (see Section 2), $f_{0}$ cannot be exactly recovered by $\smash{p_{n}^{\ell_{1}}\!}$ . Nonetheless, we find that $\smash{p_{n}^{\ell_{1}}\!}$ is often still a near-best approximant to $f_{0}$ , i.e., $\smash{p_{n}^{\ell_{1}}\!}\approx p_{n}^{*}$ . By interpreting $f_{0}-\smash{p_{n}^{\ell_{1}}\!}$ as noise, we observe that $\ell_{1}$ minimization gives a stable signal recovery in the presence of noise, a phenomenon that is appreciated in the classical compressed sensing context [12]. Making this observation precise in our setting is left as an open problem. Since by Theorem 3.1 we also have $\smash{p_{n}^{L_{1}}\!}\approx p_{n}^{*}$ , it follows that $\smash{p_{n}^{\ell_{1}}\!}\approx\smash{p_{n}^{L_{1}}\!}$ and $\smash{p_{n}^{\ell_{1}}\!}$ is an excellent initial guess for Newton’s method for computing $\smash{p_{n}^{L_{1}}\!}$ (see Section 5.3).

3.1 Related studies

The contents of Sections 2 and 3 can be regarded as contributions in compressed sensing, and a number of related studies are available in the literature. For (exact and near-exact) recovery of corrupted functions with $\ell_{1}$ minimization, examples include the papers by Adcock, Bao and Brugiapaglia [1], and Shin and Xiu [29]. Unlike this work, these papers consider recovering high-dimensional functions, describing probabilistic methods by taking random samples. Here we focus on univariate polynomials and reveal connections between $L_{0},L_{1},\ell_{0}$ and $\ell_{1}$ minimizers, and derive a deterministic recovery algorithm (under assumptions on the size of sampled corruption $k$) with $\ell_{1}$ minimization. Few of the results in this paper appear to be trivially generalizable to the higher-dimensional setting; this is left as an interesting open problem.

In the more classical setting of recovering a discrete signal (rather than a function) from a corrupted vector of observations, numerous contributions are available in the literature. See for example [10, 13, 21, 35] and the references therein. Ideas in compressed sensing have also been applied for general high-dimensional function approximation [2, 14].

4 Error localization of best $\mathbf{L_{1}}$ polynomial approximants

In Sections 2 and 3 we saw that $\smash{p_{n}^{L_{1}}\!}$ can be used for recovering corrupted polynomials and smooth functions. This is fundamentally due to the error localization properties of best $L_{1}$ polynomial approximation. The error localization properties of $\smash{p_{n}^{L_{1}}\!}$ are also important when approximating continuous functions $f:[-1,1]\rightarrow\mathbb{R}$ that one might not necessarily view as corrupted functions. We observe that continuous functions with singularities often have $|f(x)-\smash{p_{n}^{L_{1}}\!}(x)|\ll\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}$ for most $x\in[-1,1]$ .

To make this precise, recall the definition of $\Omega_{n}$ in Eq. 2. By definition of $\Omega_{n}$ , we find that $\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}\geq\tfrac{|\Omega_{n}|}{2}\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}$ and thus,

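$$|\Omega_{n}|\leq\frac{2\,\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}}{\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}}.\tag{23}$$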

Therefore, the measure of $\Omega_{n}$ is bounded above by the disparity between the magnitude of $\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}$ and $\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}$ . If $\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}\rightarrow 0$ asymptotically faster than $\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}\rightarrow 0$ as $n\rightarrow\infty$ , then the error $f(x)-\smash{p_{n}^{L_{1}}\!}(x)$ must be highly localized for sufficiently large $n$ . An upper bound on $|\Omega_{n}|$ follows from an upper bound on $\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}$ and a lower bound on $\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}$ .
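The bound $|\Omega_{n}|\leq 2\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}/\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}$ can be checked on a synthetic example. The sketch below is illustrative only: the error profile `toy_error`, the value used for the $L_{\infty}$ error, and the function names are hypothetical, and the cap at $2$ is added here simply because $|\Omega_{n}|$ cannot exceed the length of $[-1,1]$.

```python
def omega_measure_bound(l1_error, linf_error):
    """Upper bound on |Omega_n| from the L1 and L-infinity errors."""
    if linf_error <= 0:
        raise ValueError("linf_error must be positive")
    return min(2.0, 2.0 * l1_error / linf_error)

def omega_measure_estimate(err_values, linf_error):
    """Estimate |Omega_n| from errors sampled on a uniform midpoint grid of [-1, 1]."""
    hits = sum(1 for e in err_values if abs(e) >= 0.5 * linf_error)
    return 2.0 * hits / len(err_values)

# Hypothetical error profile: large only within 0.01 of the endpoints.
def toy_error(x):
    return 1.0 if abs(x) > 0.99 else 0.01

cells = 2000
h = 2.0 / cells
grid = [-1.0 + (i + 0.5) * h for i in range(cells)]
errs = [toy_error(x) for x in grid]

l1_error = h * sum(abs(e) for e in errs)  # midpoint rule for the L1 norm
linf_error = 1.0                          # assumed best L-infinity error
estimate = omega_measure_estimate(errs, linf_error)
bound = omega_measure_bound(l1_error, linf_error)
print(estimate, bound)
```

In this toy setting the measured set is smaller than the bound, as it must be, and both are small because the error is concentrated near the endpoints.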

4.1 Error localization of best $\mathbf{L_{1}}$ approximants to $\mathbf{\sqrt{1-x^{2}}}$

Consider the function $f(x)=\sqrt{1-x^{2}}$ , which is continuous on $[-1,1]$ with square root singularities at $\pm 1$ . Here, we show that $|\Omega_{n}|=\mathcal{O}(n^{-2}\log n)$ proving that $\smash{p_{n}^{L_{1}}\!}(x)$ is a better pointwise estimate to $f(x)$ than $\smash{p_{n}^{L_{\infty}}\!}(x)$ for all $x\in[-1,1]$ except for a set of measure $\mathcal{O}(n^{-2}\log n)$ .

By [17, Lem. 4], we know that when $n$ is an even integer we have $\smash{p_{n}^{L_{1}}\!}=\smash{p_{n}^{{\rm cheb}}}$ for $\sqrt{1-x^{2}}$ , where $\smash{p_{n}^{{\rm cheb}}}$ is the degree $n$ Chebyshev interpolant of $\sqrt{1-x^{2}}$ (see Eq. 5). This allows us to derive an explicit expression for $\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}$ by using an explicit formula for $\|f-\smash{p_{n}^{{\rm cheb}}}\|_{1}$ [8]. By applying the formula in [8] to $\sqrt{1-x^{2}}$ , we find that

[TABLE]

Here, the values of $b_{j}$ are derived as the expansion coefficients of $\sqrt{1-x^{2}}$ in a Chebyshev series of the second kind. That is,

[TABLE]

Since $|b_{j}|\leq 16(j+1)^{-3}/\pi$ for $j>0$ , we can bound $\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}$ by

[TABLE]

where the last inequality uses the crude bounds of $\sum_{\nu=0}^{\infty}(2\nu+1)^{-1}(\nu+1)^{-3}\leq 2$ and $n+2\geq n+1$ .

We now seek a lower bound on $\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}$ . Let $p_{n}^{\rm proj}(x)=\sum_{j=0}^{n}a_{j}T_{j}(x)$ be the Chebyshev expansion of the first kind for $\sqrt{1-x^{2}}$ that is truncated after $n+1$ terms. The values of $a_{j}$ are simple to calculate: $a_{2j-1}=0$ for all integers $j$ , and

[TABLE]

Assuming $n$ is an even integer, we find that

[TABLE]

Thus, $\|f-p_{n}^{\rm proj}\|_{\infty}\geq 2/(\pi(n+1))$ for an even integer $n$ . By [22, Cor. 4.1], we know that

[TABLE]

We conclude from Eq. 23 that for $f(x)=\sqrt{1-x^{2}}$ we have

[TABLE]

where the final equality holds since it is known that $\sigma_{n}\sim 4\pi^{-2}\log n$ [22, Eq. 20].

Figure 4 (left) shows the error $|f(x)-\smash{p_{n}^{L_{1}}\!}(x)|$ for $x\in[0,1)$, demonstrating that it is localized near $x=\pm 1$. The measure $|\Omega_{n}|$ is shown in Fig. 4 (right), where it is numerically observed that $|\Omega_{n}|=\mathcal{O}(n^{-2})$. When $n=1000$, we find that $\smash{|f(x)-\smash{p_{n}^{L_{1}}\!}(x)|<\tfrac{1}{2}\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}}$ for all $x\in[-1,1]$ except for a set of measure $<10^{-5}$.

4.2 Error localization of best $\mathbf{L_{1}}$ approximants to $\mathbf{|x|}$

As a second example of error localization, consider $f(x)=|x|$ on $[-1,1]$ , which is continuously differentiable except at $x=0$ . The error formula for $\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}$ with $f(x)=|x|$ is calculated in [8] and simplifies to

[TABLE]

Moreover, it is known that $\|f-\smash{p_{n}^{L_{\infty}}\!}\|_{\infty}\!\sim\frac{\beta}{2n}$ for some $0.28016<\beta<0.28018$ [33]. We conclude from Eq. 23 that $|\Omega_{n}|\lesssim\frac{\pi^{2}}{\beta n}$ as $n\rightarrow\infty$ . Figure 5 (left) shows the error $|f(x)-\smash{p_{n}^{L_{1}}\!}(x)|$ for $x\in[0,1)$ demonstrating that it is highly localized and Fig. 5 numerically confirms that $|\Omega_{n}|=\mathcal{O}(n^{-1})$ .

5 A globally convergent algorithm for computing best $\mathbf{L_{1}}$ polynomial approximants

We now turn to the algorithmic aspects of computing $\smash{p_{n}^{L_{1}}\!}$ . We integrate our findings on exact recovery of corrupted polynomials and error localization into Watson’s algorithm based on Newton’s method [34]. An algorithm to compute best $L_{1}$ approximants with degrees in the thousands is developed based on recent advances in approximation theory such as stable polynomial interpolation, fast domain subdivision, and robust rootfinding implemented in Chebfun [16]. Figure 6 gives an overview of our algorithm.

5.1 Initial attempt: The Chebyshev interpolant

The polynomial interpolant $\smash{p_{n}^{{\rm cheb}}}$ in Eq. 5 with $N=n$ can be computed in $\mathcal{O}(n\log n)$ operations [19] and the roots of $f-\smash{p_{n}^{{\rm cheb}}}$ on $[-1,1]$ can be computed efficiently when $f$ is a smooth function [7]. Since $\smash{p_{n}^{L_{1}}\!}=\smash{p_{n}^{{\rm cheb}}}$ when $f-\smash{p_{n}^{{\rm cheb}}}$ has exactly $n+1$ roots in $[-1,1]$ [26], we recommend that Eq. 5 is always computed to see if $\smash{p_{n}^{L_{1}}\!}=\smash{p_{n}^{{\rm cheb}}}$ . When it is, $\smash{p_{n}^{L_{1}}\!}$ is efficient to compute and from practical experience it is relatively common for $\smash{p_{n}^{L_{1}}\!}=\smash{p_{n}^{{\rm cheb}}}$ (for example, see [17, Lem. 4]). This can happen also when $f$ is a corrupted polynomial.

5.2 Test for corrupted polynomials and initial guess: Compute $\mathbf{\ell_{1}}$ minimizer

When $f-\smash{p_{n}^{{\rm cheb}}}$ has more than $n+1$ zeros in $[-1,1]$, computing $\smash{p_{n}^{L_{1}}\!}$ is more involved and, in general, requires an iterative procedure. In this case, we first solve the discrete $\ell_{1}$ problem in Eq. 15 to obtain $\smash{p_{n}^{\ell_{1}}\!}$. This has two purposes: (i) if $f$ is a corrupted polynomial $f=p_{m}+\omega$ (see Section 2), then $\smash{p_{n}^{\ell_{1}}\!}=p_{m}=\smash{p_{n}^{L_{1}}\!}$, and (ii) if $f$ is not a corrupted polynomial, then $\smash{p_{n}^{\ell_{1}}\!}\approx\smash{p_{n}^{L_{1}}\!}$ [28, Thm. 3.9], and $\smash{p_{n}^{\ell_{1}}\!}$ is then used as the initial guess for Newton’s method (see Section 5.3).

Specifically, we solve the LP in Eq. 25 with a large number of samples $N+1$, taking $x_{0},\ldots,x_{N}$ and $\smash{w_{j}=\pi\sqrt{1-x_{j}^{2}}/(N+2)}$ as in Eq. 4. In our implementation we select $N+1=\min(1000+50n,5000)$. (This is an engineering choice that assumes the corruption $k$ is small. Recall from Theorem 2.1 that we want $N+1>6(n+1)k-1$.) The maximum value 5000 is set to keep the LP size $2(N+1)+n+1$ manageable.
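The sample-count rule and the resulting LP dimensions can be sketched as follows. This is a minimal illustration that reads 5000 as a cap on $N+1$, as the text describes it; the function names and keyword arguments are hypothetical, and the constraint count $4(N+1)$ is the one quoted for the LP in Eq. 25.

```python
def lp_sample_count(n, base=1000, per_degree=50, cap=5000):
    """Number of samples N+1 for degree n, capped to keep the LP manageable."""
    return min(base + per_degree * n, cap)

def lp_size(n):
    """Dimensions of the LP for degree n."""
    samples = lp_sample_count(n)
    return {
        "samples": samples,
        "variables": 2 * samples + n + 1,  # 2(N+1) + n + 1 decision variables
        "inequalities": 4 * samples,       # 4(N+1) inequality constraints
    }

for n in (10, 50, 200):
    print(n, lp_size(n))
```

With these defaults the cap becomes active once $n\geq 80$, after which the number of samples no longer grows with the degree.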

Once $\smash{p_{n}^{\ell_{1}}\!}$ is computed, we check whether $f$ is a corrupted polynomial. This can be done by testing if $f(x_{j})=\smash{p_{n}^{\ell_{1}}\!}(x_{j})$ holds at most of the sample points to within working precision. If not, then we improve the estimate $\smash{p_{n}^{\ell_{1}}\!}\approx\smash{p_{n}^{L_{1}}\!}$ by refining the LP mesh, and then proceed to Newton’s method.

5.2.1 Refinement: Reducing the discretization error

Underlying the minimization problem Eq. 13 is an approximate integration of a non-differentiable function. Specifically,

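$$\|f-p_{n}\|_{1}=\int_{-1}^{1}|f(x)-p_{n}(x)|\,dx\approx\sum_{j=0}^{N}w_{j}\,|f(x_{j})-p_{n}(x_{j})|.\tag{24}$$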

Since $|f(x)-p_{n}(x)|$ is expected to be continuous but non-differentiable at $\geq n+2$ points, one expects the integration error in Eq. 24 to be large, and there is little benefit from using a high-order quadrature rule. Indeed, using $N+1$ sample points, we find that the LP solution has accuracy $\|\smash{p_{n}^{\ell_{1}}\!}-\smash{p_{n}^{L_{1}}\!}\|_{1}=\mathcal{O}(N^{-1})$, whether a high-order method (e.g., Clenshaw-Curtis) or a low-order method (such as the midpoint rule) is used. In more detail, the quadrature error in Eq. 24 is $\mathcal{O}(N^{-2})$, so the objective function value $\|f-\smash{p_{n}^{\ell_{1}}\!}\|_{1}$ is within $\mathcal{O}(N^{-2})$ of optimal: $\|f-\smash{p_{n}^{\ell_{1}}\!}\|_{1}=\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}+\mathcal{O}(N^{-2})$. However, this only implies $\|\smash{p_{n}^{L_{1}}\!}-\smash{p_{n}^{\ell_{1}}\!}\|_{1}=\mathcal{O}(N^{-1})$, which is a common phenomenon in optimization: at a global (or local) minimum, an $\epsilon$-perturbation in the solution results in an $O(\epsilon^{2})$ perturbation in the objective value. This low accuracy of $\smash{p_{n}^{\ell_{1}}\!}$ can cause convergence issues for Newton’s method when it is used as an initial guess.
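The $\mathcal{O}(N^{-2})$ quadrature error caused by a kink can be seen on the simplest possible integrand. The sketch below applies the midpoint rule to $|x-r|$ on $[-1,1]$, whose exact integral is $1+r^{2}$; the kink location `r` is an arbitrary, hypothetical choice. Away from the kink the integrand is linear and the midpoint rule is exact, so the whole error comes from the single cell containing $r$ and is at most $h^{2}/4$ with $h=2/N$.

```python
def midpoint_rule(g, a, b, cells):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / cells
    return h * sum(g(a + (i + 0.5) * h) for i in range(cells))

r = 0.3137  # hypothetical kink location, standing in for a root of f - p_n

def kinked(x):
    return abs(x - r)

exact = 1.0 + r * r  # integral of |x - r| over [-1, 1]

errors = {}
for N in (100, 1000, 10000):
    errors[N] = abs(midpoint_rule(kinked, -1.0, 1.0, N) - exact)
    print(N, errors[N], (2.0 / N) ** 2 / 4)
```

A higher-order rule would not help here, since the kink limits the accuracy; this is the motivation for refining the mesh only near the roots.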

To improve the discretization error in Eq. 24, we follow a three-step procedure: (1) We use the initial LP solution with $N$ points to obtain an $\mathcal{O}(N^{-1})$ approximation to $\smash{p_{n}^{L_{1}}\!}$, which we denote by $\tilde{p}_{n}$. (2) We compute the roots $\{r_{i}\}_{i=1}^{K}$ of $f-\tilde{p}_{n}$ in $[-1,1]$, which we expect to be $\mathcal{O}(N^{-1})$ approximations to the roots of $f-\smash{p_{n}^{L_{1}}\!}$. (3) We solve another LP to obtain $\smash{p_{n}^{\ell_{1}}\!}$, which is a better approximant to $\smash{p_{n}^{L_{1}}\!}$ than $\tilde{p}_{n}$, using a discretization that forms a finer mesh near the roots: we take $\approx N/2$ equispaced points on $\cup_{i=1}^{K}[r_{i}-\delta,r_{i}+\delta]$, where $\delta=4/N$, and $\approx N/2$ more equispaced points on the rest of $[-1,1]$, where the grid is therefore much coarser (see Fig. 7 (right)). The weights $w_{j}$ are chosen according to the midpoint rule. The mesh size is thus $O(1/N^{2})$ near the roots rather than $O(1/N)$, while remaining $O(1/N)$ elsewhere. This refinement of the quadrature rule is observed to improve the accuracy to $\|\smash{p_{n}^{\ell_{1}}\!}-\smash{p_{n}^{L_{1}}\!}\|_{1}=\mathcal{O}(N^{-2})$, as the quadrature error at the roots has been improved from $\mathcal{O}(N^{-2})$ to $\mathcal{O}(N^{-4})$. We then solve Eq. 13 by the standard technique of casting it as a linear program (LP) [13], namely

[TABLE]

Note that we do not use SPGL1 or the Chebyshev points from Eq. 8 in the refinement stage. This is because SPGL1 requires the computation of the null space $V^{\top}$ , which can be more expensive. Due to the sparsity structure of LP, we find that the MOSEK optimization toolbox [3] (using its MATLAB interface) offers an efficient solver.

In Fig. 7 (left) we show the error $\|\smash{p_{n}^{L_{1}}\!}-\smash{p_{n}^{\ell_{1}}\!}\|_{1}$ with the LP solution for $10^{2}\leq N\leq 10^{4}$ , with and without the refinement. Note that the number of decision variables in LP Eq. 25 is $2(N+1)+n+1$ , with $4(N+1)$ inequality constraints.
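The refined mesh used in step (3) of the refinement procedure can be sketched as follows. This is one possible construction under stated assumptions, not the authors' implementation: overlapping subintervals are merged, subintervals are clipped to $[-1,1]$, and the point budget is split between subintervals in proportion to their length. All function names are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping (a, b) intervals and return them sorted."""
    merged = []
    for a, b in sorted(intervals):
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def complement(intervals, lo=-1.0, hi=1.0):
    """Gaps of [lo, hi] not covered by the sorted, merged intervals."""
    gaps, left = [], lo
    for a, b in intervals:
        if a > left:
            gaps.append((left, a))
        left = max(left, b)
    if left < hi:
        gaps.append((left, hi))
    return gaps

def midpoint_cells(group, budget):
    """Equispaced midpoint nodes and weights on each interval of the group."""
    total = sum(b - a for a, b in group)
    cells = []
    for a, b in group:
        m = max(1, round(budget * (b - a) / total))  # share of the budget
        h = (b - a) / m
        cells.extend((a + (i + 0.5) * h, h) for i in range(m))
    return cells

def refined_mesh(roots, N):
    """About N/2 points within delta = 4/N of the roots, about N/2 elsewhere."""
    delta = 4.0 / N
    fine = merge_intervals(
        [(max(-1.0, r - delta), min(1.0, r + delta)) for r in roots]
    )
    coarse = complement(fine)
    cells = sorted(midpoint_cells(fine, N / 2) + midpoint_cells(coarse, N / 2))
    nodes = [x for x, _ in cells]
    weights = [w for _, w in cells]
    return nodes, weights

# Hypothetical approximate roots of f - p_tilde.
nodes, weights = refined_mesh([-0.5, 0.0, 0.7], N=1000)
print(len(nodes), min(weights), max(weights))
```

The midpoint weights sum to the length of the interval, and the smallest weight (near the roots) is of order $N^{-2}$ while the largest is of order $N^{-1}$.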

5.3 Iterative procedure: Newton’s method

To improve the initial guess obtained in Section 5.2 we employ Newton’s method based on the ideas in Watson’s algorithm [34, Sec. 4], which is a globally convergent (under mild assumptions) iterative method for computing $\smash{p_{n}^{L_{1}}\!}$ when the set $S=\{x\in[-1,1]:f(x)=\smash{p_{n}^{L_{1}}\!}(x)\}$ has zero Lebesgue measure. We assume this below; otherwise $f$ was a corrupted polynomial, which would be detected by Eq. 13 if the corruption is small.

When the set $S$ has zero Lebesgue measure, an alternative characterization of $\smash{p_{n}^{L_{1}}\!}$ is [27, Thm. 14.1]

$$\int_{-1}^{1}q(x)\,{\rm sign}\big(f(x)-\smash{p_{n}^{L_{1}}\!}(x)\big)\,dx=0\tag{26}$$

for all $q\in\mathcal{P}_{n}$. We propose to apply Newton’s method to Eq. 26. By using the Chebyshev polynomials of the second kind as a basis for $\mathcal{P}_{n}$, we define a vector-valued operator $L:\mathbb{R}^{n+1}\rightarrow\mathbb{R}^{n+1}$ given by

[TABLE]

We note that $L[(c_{0}^{*},\ldots,c_{n}^{*})^{\top}]=\underline{0}$ if and only if $\smash{p_{n}^{L_{1}}\!}=\sum_{j=0}^{n}c_{j}^{*}U_{j}$ from Eq. 26, and we propose to use Newton’s method on $L$ to find it.
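As an illustration of the zero condition, the sketch below evaluates the moments $\mu_{j}=\int_{-1}^{1}U_{j}(x)\,s(x)\,dx$ for a sign function $s$ that is piecewise constant and flips at given points. It assumes that the components of $L$ have this form with $s={\rm sign}(f-\sum_{t}c_{t}U_{t})$, which is a reading of the characterization above in the $U_{j}$ basis rather than a transcription of Eq. 27. The integral is exact because $\int U_{j}\,dx=T_{j+1}/(j+1)$. The function names are hypothetical.

```python
import math

def cheb_t(k, x):
    """Chebyshev polynomial of the first kind T_k(x) for x in [-1, 1]."""
    return math.cos(k * math.acos(max(-1.0, min(1.0, x))))

def sign_moments(breaks, n, first_sign=1.0):
    """mu_j = integral of U_j(x) s(x) over [-1, 1] for j = 0..n, where s is
    piecewise constant, equals first_sign on the leftmost piece, and flips
    sign at each break point."""
    pts = [-1.0] + sorted(breaks) + [1.0]
    moments = []
    for j in range(n + 1):
        total, s = 0.0, first_sign
        for a, b in zip(pts, pts[1:]):
            # integral of U_j over [a, b] is (T_{j+1}(b) - T_{j+1}(a)) / (j + 1)
            total += s * (cheb_t(j + 1, b) - cheb_t(j + 1, a)) / (j + 1)
            s = -s
        moments.append(total)
    return moments

n = 6
# Sign changes at the n+1 zeros of U_{n+1}, i.e. cos(k*pi/(n+2)), k = 1..n+1.
canonical = [math.cos(k * math.pi / (n + 2)) for k in range(1, n + 2)]
residual = sign_moments(canonical, n)
print(max(abs(v) for v in residual))
```

When the sign changes sit exactly at the zeros of $U_{n+1}$, all $n+1$ moments vanish to rounding error, which is the classical situation in which an interpolant at those points is already a best $L_{1}$ approximant. Moving a break point makes some moment nonzero, and Newton's method acts on that residual.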

Newton’s method tells us to perform the following iteration:

[TABLE]

Moreover, it can be shown that $J_{k}$ can be expressed as [34]

[TABLE]

where $r_{1},\ldots,r_{K}$ are the roots of $e_{k}(x)$ and $V_{k}$ is the Chebyshev–Vandermonde matrix at $r_{1},\ldots,r_{K}$, i.e., $(V_{k})_{i,j}=U_{j}(r_{i})$.

At the $k$th Newton iteration, we must calculate the roots of $e_{k}(x)=f(x)-\smash{\sum_{t=0}^{n}c_{t}^{(k)}U_{t}(x)}$, evaluate $\mu_{j}$ for $0\leq j\leq n$ and $e_{k}^{\prime}(x)$ at $r_{1},\ldots,r_{K}$, form $J_{k}$ using Eq. 29, and then solve an $(n+1)\times(n+1)$ dense linear system where the righthand side is $L[\underline{c}]$. All these operations can be performed conveniently and robustly in Chebfun to an accuracy of essentially machine precision [16]. The dominant computation in each Newton step lies either in the evaluation of $\mu_{j}$ in Eq. 27, which costs $\mathcal{O}(n^{2}m)$ where $m$ is the Chebfun degree of $f$, or in the solution of the linear system, which costs $\mathcal{O}(n^{3})$, for a total complexity of $\mathcal{O}(n^{2}(m+n))$. Newton’s method typically converges within a handful of iterations.

As Watson notes [34], a small modification for the formula for $J_{k}$ in Eq. 29 is required when $e_{k}^{\prime}(r_{j})=0$ for some $r_{j}$ , e.g., set $J=I$ , or when $V$ is rank-deficient, e.g., set $J:=J+\delta I$ for some small $\delta>0$ . Under mild restrictions, this modified Newton’s method generically converges to $\smash{p_{n}^{L_{1}}\!}$ at a quadratic rate [34].

5.4 Stopping criterion: Near-best condition

It is important to have a stopping criterion to determine when Newton’s method in Eq. 28 should be terminated. The simplest criterion could be to stop computing iterates as soon as $\|\underline{c}^{(k+1)}-\underline{c}^{(k)}\|_{2}<\epsilon\|\underline{c}^{(k)}\|_{2}$ , where $\epsilon>0$ is a small parameter. However, we prefer to stop Newton’s method as soon as $\max_{0\leq i\leq n}\left|(L[\underline{c}^{(k)}])_{i}\right|<\epsilon\|f\|_{1}$ because it leads to a near-best guarantee.

Theorem 5.1.

Let $f:[-1,1]\rightarrow\mathbb{R}$ be a continuous function and $\underline{c}\in\mathbb{R}^{n+1}$ . If $\tfrac{2}{\pi}(n+2)^{2}\max_{0\leq i\leq n}\!\left|(L[\underline{c}])_{i}\right|\!<\!1$ , then

[TABLE]

where $U_{j}$ is the degree $j$ Chebyshev polynomial of the second kind.

Proof 5.2.

Let $p_{n}\in\mathcal{P}_{n}$ and define $s_{p}(x)=\pm{\rm sign}(f(x)-p_{n}(x))$ so that $\|f-p_{n}\|_{1}=\int_{-1}^{1}s_{p}(x)(f(x)-p_{n}(x))dx$. Then,

[TABLE]

Therefore, we find that $\|f-p_{n}\|_{1}\leq\|f-\smash{p_{n}^{L_{1}}\!}\|_{1}+\int_{-1}^{1}s_{p}(x)(\smash{p_{n}^{L_{1}}\!}(x)-p_{n}(x))dx$ . Expanding $\smash{p_{n}^{L_{1}}\!}-p_{n}$ in a Chebyshev series, we find that

[TABLE]

Since $|U_{i}(x)|\leq(i+1)$ for $x\in[-1,1]$ [24, (18.14.4) & (18.7.4)], we have

[TABLE]

where the last inequality comes from the fact that $\|\smash{p_{n}^{L_{1}}\!}-p_{n}\|_{1}\leq\|\smash{p_{n}^{L_{1}}\!}-f\|_{1}+\|f-p_{n}\|_{1}\leq 2\|f-p_{n}\|_{1}$ . It follows that

[TABLE]

where the inequality holds since $\sum_{i=0}^{n}(i+1)=(n+1)(n+2)/2\leq(n+2)^{2}/2$. By using Eq. 32 to bound the righthand side of Eq. 31, the result follows by rearranging.

Theorem 5.1 shows that one can track the quantity $\max_{0\leq i\leq n}\left|(L[\underline{c}^{(k)}])_{i}\right|$ for $k\geq 0$ to estimate how close the current Newton iterate is to computing $\smash{p_{n}^{L_{1}}\!}$. In practice, we terminate Newton’s method as soon as $\max_{0\leq i\leq n}\left|(L[\underline{c}^{(k)}])_{i}\right|<10^{-14}\|f\|_{1}$. It can happen that the initial guess in Section 5.2 already satisfies the stopping criterion, in which case no Newton iterations are computed.
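The two checks involved in the stopping rule can be sketched as follows: the practical test $\max_{i}|(L[\underline{c}^{(k)}])_{i}|<\epsilon\|f\|_{1}$ and the hypothesis $\tfrac{2}{\pi}(n+2)^{2}\max_{i}|(L[\underline{c}])_{i}|<1$ of Theorem 5.1. The function names and the sample residuals are hypothetical; the residual vector is assumed to hold the $n+1$ components of $L[\underline{c}]$.

```python
import math

def newton_should_stop(residual, f_l1_norm, eps=1e-14):
    """Practical stopping test: max |L_i| < eps * ||f||_1."""
    return max(abs(v) for v in residual) < eps * f_l1_norm

def near_best_hypothesis_holds(residual):
    """Hypothesis of Theorem 5.1: (2/pi) (n+2)^2 max |L_i| < 1."""
    n = len(residual) - 1
    worst = max(abs(v) for v in residual)
    return (2.0 / math.pi) * (n + 2) ** 2 * worst < 1.0

# Hypothetical residuals for a degree-4 iterate (n + 1 = 5 components).
early_iterate = [1e-2, -3e-2, 5e-3, 2e-2, -1e-2]
late_iterate = [1e-16, -2e-16, 0.0, 3e-16, 1e-16]

print(near_best_hypothesis_holds(early_iterate), newton_should_stop(early_iterate, 1.0))
print(near_best_hypothesis_holds(late_iterate), newton_should_stop(late_iterate, 1.0))
```

The hypothesis of the theorem is much weaker than the practical tolerance, so the near-best guarantee typically becomes available several iterations before the method is terminated.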

Acknowledgments

We thank Laurent Demanet for discussing the implications of the Remez inequality with us. We also thank Vanni Noferini who was present during the initial discussions of this work. We thank Nick Trefethen and Heather Wilber for reading a draft of this manuscript and improving the text.

Appendix A Corruption away from the endpoints

Lemma 2.4 shows that polynomials of degree $\leq n$ cannot be too concentrated in a set of measure $<\min(1,1/(4n^{2}))$ , which is a consequence of the fact that $|p^{\prime}(x)|\leq n^{2}\|p\|_{\infty}$ for any $p\in\mathcal{P}_{n}$ . An alternative bound on the derivative of a polynomial is [6, Ch. 5]

[TABLE]

for any $p\in\mathcal{P}_{n}$ . This inequality is better when $x$ is away from $\pm 1$ , and suggests that polynomials of degree $\leq n$ are less concentrated in the middle of $[-1,1]$ compared to near $\pm 1$ . This turns out to be the case.

Theorem A.1.

Let $\Omega_{s}\subseteq[-1,1]$ with Lebesgue measure $s\geq 0$ and suppose that $\zeta=\max\{|x|:x\in\Omega_{s}\}$ is such that $1-\zeta\geq 1/n$ . For $n\geq 1$ , we have

[TABLE]

for any polynomial $p\in\mathcal{P}_{n}$.

Proof A.2.

Let $p\in\mathcal{P}_{n}$ and let $\|p\|_{\Omega_{s}}$ denote its absolute maximum in $\Omega_{s}$. By Bernstein’s inequality [6, Ch. 5] we have that $|p^{\prime}(x)|\leq n\|p\|_{\infty}/\sqrt{1-\zeta^{2}}$ for $x\in\Omega_{s}$ and $|p^{\prime}(x)|\leq n^{2}\|p\|_{\infty}$ for $x\in[-1,1]$. Let $x^{*}\in[-1,1]$ be such that $|p(x^{*})|=\|p\|_{\infty}$. Using these two inequalities, we observe that there is an interval $\mathcal{I}\subset[-1,1]$ containing $x^{*}$ of width at least $1/n^{2}$ for which $p(x)$ is of the same sign as $p(x^{*})$. The area of the triangle of width $1/n^{2}$ and height $|p(x^{*})|$ is $I_{1}=|p(x^{*})|/(2n^{2})$. Next, use the same argument for $x^{*,\Omega}\in\Omega_{s}$ such that $|p(x^{*,\Omega})|=\|p\|_{\Omega_{s}}$, to obtain a triangle with area $I_{2}=\|p\|_{\Omega_{s}}^{2}\sqrt{1-\zeta^{2}}/(2\|p\|_{\infty}n)$. Note that since $1-\zeta\geq 1/n$, the two triangles can be chosen to not overlap. We can thus write $\int_{-1}^{1}|p(x)|dx\geq I_{1}+I_{2}$.

Since $\int_{\Omega_{s}}|p(x)|dx\leq s\|p\|_{\Omega_{s}}$ , we find that

[TABLE]

The function $g(x)=x/(2n^{2})+\sqrt{1-\zeta^{2}}/(2nx)$ on $x>0$ is minimized at $x_{*}=\sqrt{n}(1-\zeta^{2})^{1/4}$. The bound in Eq. 33 holds since $g(x)\geq g(x_{*})=(1-\zeta^{2})^{1/4}n^{-3/2}$ for any $x>0$.
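For completeness, the minimization of $g$ used at the end of the proof can be worked out explicitly. The following derivation uses only the definition of $g$ given above.

```latex
\begin{aligned}
g(x) &= \frac{x}{2n^{2}} + \frac{\sqrt{1-\zeta^{2}}}{2nx}, \qquad x>0,\\[2pt]
g'(x) &= \frac{1}{2n^{2}} - \frac{\sqrt{1-\zeta^{2}}}{2nx^{2}} = 0
  \;\Longleftrightarrow\; x^{2} = n\sqrt{1-\zeta^{2}}
  \;\Longleftrightarrow\; x_{*} = \sqrt{n}\,(1-\zeta^{2})^{1/4},\\[2pt]
g''(x) &= \frac{\sqrt{1-\zeta^{2}}}{nx^{3}} > 0
  \quad\text{so } x_{*} \text{ is the global minimizer on } x>0,\\[2pt]
g(x_{*}) &= \frac{(1-\zeta^{2})^{1/4}}{2n^{3/2}}
  + \frac{(1-\zeta^{2})^{1/2}}{2n^{3/2}(1-\zeta^{2})^{1/4}}
  = (1-\zeta^{2})^{1/4}\,n^{-3/2}.
\end{aligned}
```

The two terms of $g$ are equal at $x_{*}$, which is the balance point one also obtains from the arithmetic-geometric mean inequality.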

Arguing as in Corollary 2.6, Theorem A.1 means that a corrupted polynomial of degree $n\geq 1$ can be exactly recovered by $\smash{p_{n}^{L_{1}}\!}$ when $s<(1-\zeta^{2})^{1/4}n^{-3/2}/2$ and $1-\zeta\geq 1/n$ . For sufficiently large $n$ , this is a relaxation of the requirements for exact recovery in Section 2 when the corruption is away from $\pm 1$ (see Fig. 2 (c) and the localized error near $x=\pm 1$ in Fig. 5). Other results in Section 3 can be relaxed by using Theorem A.1 under the restriction that the corruption occurs away from $\pm 1$ . In particular, one can show that if $\Omega_{s}=[-s/2,s/2]$ with $s=n^{-3/2}/32$ , then

[TABLE]

where $p_{n}^{*}$ is the best $L_{1}$ polynomial approximation of $f_{0}$ on $[-1,1]$ .

Theorem A.1 also encourages us to wildly speculate (recalling the derivation of Corollary 2.6) that the error localization of $f-\smash{p_{n}^{L_{1}}\!}$ is usually more concentrated for functions with endpoint singularities, i.e., $|\Omega_{n}|=\mathcal{O}(n^{-2})$ , and less concentrated for functions with singularities away from $\pm 1$ , i.e., $|\Omega_{n}|=\mathcal{O}(n^{-1.5})$ or even $\mathcal{O}(n^{-1})$ .

Bibliography

[1] B. Adcock, A. Bao, and S. Brugiapaglia. Correcting for unknown errors in sparse high-dimensional function approximation. Numerische Mathematik, 142(3):667–711, 2019.
[2] B. Adcock, S. Brugiapaglia, and C. G. Webster. Compressed sensing approaches for polynomial approximation of high-dimensional functions. In Compressed Sensing and its Applications, pages 93–124. Springer, 2017.
[3] MOSEK ApS. The MOSEK optimization toolbox for MATLAB manual. Version 8.1, 2017.
[4] Z. Battles and L. N. Trefethen. An extension of MATLAB to continuous functions and operators. SIAM J. Sci. Comp., 25(5):1743–1770, 2004.
[5] Y. Benyamini, A. Kroó, and A. Pinkus. $l^{1}$-approximation and finding solutions with small support. Constructive Approximation, 36(3):399–431, 2012.
[6] P. Borwein and T. Erdélyi. Polynomials and Polynomial Inequalities, volume 161. Springer Science & Business Media, 2012.
[7] J. P. Boyd. Computing zeros on a real interval through Chebyshev expansion and polynomial rootfinding. SIAM J. Numer. Anal., 40(5):1666–1682, 2002.
[8] H. Brass. A remark on best $L^{1}$-approximation by polynomials. J. Approx. Theory, 52:359–361, 1988.
