Higher-Order Total Directional Variation: Imaging Applications

Simone Parisotto; Jan Lellmann; Simon Masnou; Carola-Bibiane Schönlieb

arXiv:1812.05023·math.NA·July 10, 2020·SIAM J. Imaging Sci.

TL;DR

This paper introduces higher-order anisotropic total variation regularisers that extend TGV, enabling better preservation of anisotropic features in images across various applications like denoising and surface reconstruction.

Contribution

It presents a new class of regularisers for imaging that generalize TGV to anisotropic, inhomogeneous cases, with a numerical approach for gradient flow approximation.

Findings

01 Enhanced anisotropic feature preservation in images

02 Effective application to denoising, zooming, and surface reconstruction

03 Numerical method demonstrates practical viability

Abstract

We introduce a class of higher-order anisotropic total variation regularisers, defined for possibly inhomogeneous, smooth elliptic anisotropies, that extends the Total Generalized Variation (TGV) regulariser and its variants. We propose a primal-dual hybrid gradient approach to numerically approximate the associated gradient flow. This choice of regularisers makes it possible to preserve and enhance intrinsic anisotropic features in images. This is illustrated on various examples from different imaging applications: image denoising, wavelet-based image zooming, and reconstruction of surfaces from scattered height measurements.

Equations

\mathrm{TDV}_{{\bm{\alpha}}}^{q}(u,\pazocal{M}):=\sup_{{\bm{\Psi}}}\left\{\int_{\Omega}u\operatorname{div}_{\pazocal{M}}^{q}{\bm{\Psi}}\mathop{}\mathrm{d}{\bm{x}}\,\Big\lvert\,\text{for all }{\bm{\Psi}}\in\pazocal{Y}_{\pazocal{M},{\bm{\alpha}}}^{q}\right\},

\pazocal{Y}_{\pazocal{M},{\bm{\alpha}}}^{q}=\left\{{\bm{\Psi}}\,:\,{\bm{\Psi}}\in\mathrm{C}_{c}^{q}(\Omega,\pazocal{T}^{q}(\mathbb{R}^{d})),\ \lVert\operatorname{div}_{\pazocal{M}}^{j}{\bm{\Psi}}\rVert_{\infty}\leq\alpha_{j},\ \forall\,j=0,\dots,q-1\right\},

\mathbf{J}_{\rho}(u):=K_{\rho}\ast(\operatorname{\bm{\nabla}}u_{\sigma}\otimes\operatorname{\bm{\nabla}}u_{\sigma}),

\int_{\Omega}w({\bm{x}})\,\mathbf{J}_{\rho}(u)\mathop{}\mathrm{d}{\bm{x}},

\min_{\substack{u_{1},u_{2}\\ u_{1}+u_{2}=u}}\alpha\left(\lVert{\bm{v}}_{1}^{\mathrm{T}}\operatorname{\bm{\nabla}}u_{1}\rVert_{1}+\lVert{\bm{v}}_{2}^{\mathrm{T}}\operatorname{\bm{\nabla}}u_{2}\rVert_{1}\right)+\frac{1}{2}\lVert u-u^{\diamond}\rVert_{2}^{2}.

\mathrm{STV}_{p}(u)=\lVert(\lambda_{1},\lambda_{2})\rVert_{p}.

\pazocal R_{J} (u) = g_{γ} (S \nabla u), S = max (γ, 4 λ_{1} + λ_{2} Λ^{- 0.5} Q^{T}),

\mathrm{TV}^{\alpha}(u)=\int_{\Omega}\lvert\mathbf{M}_{\alpha}\operatorname{\bm{\nabla}}u\rvert\mathop{}\mathrm{d}{\bm{x}},

\mathrm{TV}_{a,\theta}(u)=\sum_{i,j}\sup_{{\bm{\Psi}}\in E^{a,\theta}}\langle(\operatorname{\bm{\nabla}}u)_{i,j},\,{\bm{\Psi}}\rangle.

(\cos\theta,\sin\theta)=\frac{(\operatorname{\bm{\nabla}}u_{\sigma})^{\perp}}{\lVert\operatorname{\bm{\nabla}}u_{\sigma}\rVert_{2}},

\mathrm{EADTV}_{\alpha,\theta}(u)=\sum_{i,j}\sup_{{\bm{\Psi}}\in E^{\alpha,\theta_{ij}}}\langle(\operatorname{\bm{\nabla}}u)_{i,j},\,{\bm{\Psi}}\rangle.

\mathrm{dTV}(u)=\sum_{n=1}^{N}\lVert\pazocal{P}_{{\bm{\xi}}_{n}}\operatorname{\bm{\nabla}}u_{n}\rVert,\quad\text{with }{\bm{\xi}}_{n}:=\operatorname{\bm{\nabla}}u_{n}/(K_{\sigma}\ast\lvert\operatorname{\bm{\nabla}}u_{n}\rvert),

\mathrm{DTV}(u)

\mathrm{DTGV}_{{\bm{\alpha}}}^{q}(u)

u^{\star}\in\operatorname*{arg\,min}_{u}\ \sum_{q=1}^{\mathrm{Q}}\mathrm{TDV}_{{\bm{\alpha}}_{q}}^{q}(u,\pazocal{M}_{q})+\frac{\eta}{2}\lVert\pazocal{S}u-u^{\diamond}\rVert_{2}^{2},

\text{either }\mathbf{M}_{j}=\mathbf{I}\ \text{ or }\ \mathbf{M}_{j}={\bm{\Lambda}}_{\bm{b}}(\mathbf{R}_{{\bm{\theta}}})^{\mathrm{T}},

Λ_{b} = (b_{1} (x) 0 0 b_{2} (x)) and R_{θ} = (cos θ (x) sin θ (x) - sin θ (x) cos θ (x)),

\mathbf{M}_{1}\operatorname{\bm{\nabla}}\otimes u={\bm{\Lambda}}_{\bm{b}}(\mathbf{R}_{{\bm{\theta}}})^{\mathrm{T}}\operatorname{\bm{\nabla}}\otimes u=\begin{pmatrix}b_{1}\operatorname{\bm{\nabla}}_{\bm{v}}u\\ b_{2}\operatorname{\bm{\nabla}}_{{\bm{v}}_{\perp}}u\end{pmatrix}.

\mathbf{M}_{1}\operatorname{\bm{\nabla}}u\cdot{\bm{\Psi}}=\operatorname{\bm{\nabla}}u\cdot\mathbf{M}_{1}^{\mathrm{T}}{\bm{\Psi}},

\pazocal{T}^{\ell}(\mathbb{R}^{d})

({\bm{\xi}}_{1}\otimes{\bm{\xi}}_{2})({\bm{a}}_{1},\dots,{\bm{a}}_{\ell_{1}+\ell_{2}})={\bm{\xi}}_{1}({\bm{a}}_{1},\dots,{\bm{a}}_{\ell_{1}})\,{\bm{\xi}}_{2}({\bm{a}}_{\ell_{1}+1},\dots,{\bm{a}}_{\ell_{1}+\ell_{2}});

\operatorname{trace}({\bm{\xi}})({\bm{a}}_{1},\dots,{\bm{a}}_{\ell-2})=\sum_{i=1}^{d}{\bm{\xi}}({\bm{e}}_{i},{\bm{a}}_{1},\dots,{\bm{a}}_{\ell-2},{\bm{e}}_{i}),

{\bm{\xi}}\cdot{\bm{\eta}}=\sum_{p\in\{1,\dots,d\}^{\ell}}{\bm{\xi}}_{p_{1},\dots,p_{\ell}}\,{\bm{\eta}}_{p_{1},\dots,p_{\ell}}.

\operatorname{\bm{\nabla}}\otimes{\bm{\xi}}:=\left(\partial_{j}{\bm{\xi}}_{i_{1},\dots,i_{\ell}}\right)_{j,i_{1},\dots,i_{\ell}}.

{\bm{\eta}}\operatorname{\bm{\nabla}}\otimes{\bm{\xi}}:=\left(\sum_{k=1}^{d}{\bm{\eta}}_{j,k}\,\partial_{k}{\bm{\xi}}_{i_{1},\dots,i_{\ell}}\right)_{j,i_{1},\dots,i_{\ell}}.

\lVert u\rVert_{\infty,q}=\max_{\ell=0,\dots,q}\ \sup_{{\bm{x}}\in\Omega}\lvert\operatorname{\bm{\nabla}}^{\ell}\otimes u({\bm{x}})\rvert,

\int_{\Omega}(\mathbf{M}\operatorname{\bm{\nabla}}\otimes\mathbf{A})\cdot{\bm{\Psi}}\mathop{}\mathrm{d}{\bm{x}},

\int_{\Omega}(\mathbf{M}\operatorname{\bm{\nabla}}\otimes\mathbf{A})\cdot{\bm{\Psi}}\mathop{}\mathrm{d}{\bm{x}}=\int_{\Omega}(\operatorname{\bm{\nabla}}\otimes\mathbf{A})\cdot\operatorname{trace}(\mathbf{M}\otimes{\bm{\Psi}}^{\sim})\mathop{}\mathrm{d}{\bm{x}},\quad\text{for all }\mathbf{M},\mathbf{A},{\bm{\Psi}}.

\int_{\Omega}(\mathbf{M}\operatorname{\bm{\nabla}}\otimes\mathbf{A})\cdot{\bm{\Psi}}\mathop{}\mathrm{d}{\bm{x}}=-\int_{\Omega}\mathbf{A}\cdot\operatorname{div}_{\mathbf{M}}{\bm{\Psi}}\mathop{}\mathrm{d}{\bm{x}},\quad\text{for all }\mathbf{M},\mathbf{A},{\bm{\Psi}},

Code & Models

Repositories

simoneparisotto/TDV-for-image-denoising

Full text

Funding: SP acknowledges UK EPSRC grant EP/L016516/1 for the University of Cambridge, Cambridge Centre for Analysis DTC. CBS acknowledges support from the EPSRC grants Nr. EP/M00483X/1, EP/K009745/1, the EPSRC centre EP/N014588/1, the Leverhulme Trust project 'Breaking the non-convexity barrier', the Alan Turing Institute TU/B/000071, the CHiPS (Horizon 2020 RISE project grant), the Isaac Newton Institute and the Cantab Capital Institute for the Mathematics of Information. SM acknowledges support from the Labex MILYON/ANR-10-LABX-0070 and the ANR-14-CE27-0019 MIRIAM project grant.

Simone Parisotto, CCA, University of Cambridge, Wilberforce Road, Cambridge CB3 0WA, UK

Jan Lellmann, MIC, University of Lübeck, Maria-Goeppert-Straße 3, 23562 Lübeck, DE

Simon Masnou, Université Claude Bernard Lyon 1, Institut Camille Jordan, Lyon, France

Carola-Bibiane Schönlieb, DAMTP, University of Cambridge, Wilberforce Road, Cambridge CB3 0WA, UK

Keywords: Total directional variation, Anisotropy, Denoising, Wavelet-based zooming, Digital Elevation Map

AMS subject classifications: 47A52, 49M30, 49N45, 65J22, 94A08

1 Introduction

In the last decades, total variation ( $\mathrm{{TV}}$ ) regularisation has been successfully applied to a variety of imaging problems. In particular, since [45], $\mathrm{{TV}}$ has played a crucial role in variational image denoising, deblurring, inpainting, segmentation, magnetic resonance image (MRI) reconstruction and many others, see [12]. While the $\mathrm{{TV}}$ regulariser successfully eliminates noise and at the same time preserves characteristic image features like edges, it still has some shortcomings. A major one is the staircasing effect, which results in blocky images [13, 38]. One approach to mitigate this effect is based on higher-order total variation regularisers, see e.g. [17, 18, 40, 49, 60], which aim to eliminate the staircasing effect by enforcing higher regularity in homogeneous regions of the image while still allowing for discontinuities in the presence of edges. The total generalized variation ( $\mathrm{{TGV}}_{{\bm{\alpha}}}^{q}$ ) regulariser has been proposed in [11] to balance the first $q$ derivatives of $u$ with a regularisation parameter vector ${\bm{\alpha}}$ . Another modification of the $\mathrm{{TV}}$ regulariser has been the introduction of directional information in the regularisation, which makes it possible to smooth images anisotropically along preferred directions, e.g. [5, 62, 7, 22, 51, 27, 35, 33, 24, 23]. A recent combination of directional $\mathrm{{TV}}$ and higher-order derivatives is the directional total generalized variation [20], which equips the $\mathrm{{TGV}}_{{\bm{\alpha}}}^{q}$ regulariser with one constant preferred smoothing direction.

In this paper we extend the directional total generalized variation introduced in [20] and the third-order directional total variation regulariser introduced in [34] to a new class of directional total variation regularisers that can feature a combination of orders of derivatives as well as spatially-varying directional information by means of weighting the derivatives in the $\mathrm{{TGV}}_{{\bm{\alpha}}}^{q}$ regulariser with $2$ -tensors. We introduce this class of total generalized variation regularisers, discuss its numerical solution by a tailored primal-dual hybrid gradient scheme (which replaces the commercial CVX+Mosek solver used in [34]), and showcase its performance and regularisation properties for a range of imaging problems. For the latter, we show the effect of this generalized class of regularisers in different imaging applications where the introduced anisotropy plays a crucial role: image denoising, wavelet-based zooming and digital elevation map (DEM) interpolation with applications to atomic force microscopy (AFM) data, see Fig. 1. For the application of our regularisers to video denoising, we refer to [43] and in general to [41]. The theoretical foundation for this general class of directional, higher-order regularisers is presented in our companion work [42].

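Let us go into more detail. Let $\Omega\subset\mathbb{R}^{d}$ be a bounded Lipschitz domain for $d\geq 1$ and $u:\Omega\to\mathbb{R}$ a function. We define the higher-order directional total variation of $u$ as

$$\mathrm{TDV}_{{\bm{\alpha}}}^{q}(u,\pazocal{M}):=\sup_{{\bm{\Psi}}}\left\{\int_{\Omega}u\operatorname{div}_{\pazocal{M}}^{q}{\bm{\Psi}}\mathop{}\mathrm{d}{\bm{x}}\,\Big\lvert\,\text{for all }{\bm{\Psi}}\in\pazocal{Y}_{\pazocal{M},{\bm{\alpha}}}^{q}\right\},\tag{1}$$

where we call $q$ the order of the regularisation, $\pazocal{M}$ is a collection of weighting fields, and

$$\pazocal{Y}_{\pazocal{M},{\bm{\alpha}}}^{q}=\left\{{\bm{\Psi}}\,:\,{\bm{\Psi}}\in\mathrm{C}_{c}^{q}(\Omega,\pazocal{T}^{q}(\mathbb{R}^{d})),\ \lVert\operatorname{div}_{\pazocal{M}}^{j}{\bm{\Psi}}\rVert_{\infty}\leq\alpha_{j},\ \forall\,j=0,\dots,q-1\right\},$$

where $\pazocal{T}^{q}(\mathbb{R}^{d})$ is the vector space of $q$ -tensors in $\mathbb{R}^{d}$ and ${\bm{\alpha}}$ is a vector of regularisation parameters. We will provide the rigorous definition for Eq. 1 in Section 2.2. We comment for now that the regulariser in Eq. 1 is designed for introducing weighted directional derivatives in the classical definition of $\mathrm{{TGV}}_{{\bm{\alpha}}}^{q}$ . The anisotropy is introduced by a family of weights $\pazocal{M}$ and a correspondingly weighted divergence $\operatorname{div}_{\pazocal{M}}^{q}$ of order $q$ , defined in Eq. 21.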

1.1 Related work

In what follows we review the state of the art that is most relevant for the proposed higher-order directional total variation regulariser. We focus in particular on functional regularisers but it is worth mentioning that there is a rich literature on fairly general anisotropic PDE models, mainly of first and second order, see for instance [46, 53, 57, 31, 9, 14, 3, 61] and the references therein. Our model handles a more limited class of anisotropies, but it can do so at any order of derivatives, which is particularly useful in various applications.

The idea of anisotropic smoothing for imaging has certainly been popularized by the book of Weickert on anisotropic diffusion equations [57] based on the key notion of structure tensor to encode directional information, see also [29, 26, 30]. Weickert's structure tensor for a continuous imaging function $u:\Omega\rightarrow\mathbb{R}$ and non-negative parameters $\sigma,\rho$ is defined as

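$$\mathbf{J}_{\rho}(u):=K_{\rho}\ast(\operatorname{\bm{\nabla}}u_{\sigma}\otimes\operatorname{\bm{\nabla}}u_{\sigma}),\tag{3}$$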

where $u_{\sigma}=K_{\sigma}\ast u$ and $K_{\sigma},K_{\rho}$ are Gaussian kernels with standard deviations $\sigma,\rho$ , respectively. For $\operatorname{\bm{\nabla}}u_{\sigma}\neq 0$ the structure tensor $\mathbf{J}_{\rho}(u)$ has two orthogonal eigenvectors ${\bm{v}}_{1}$ and ${\bm{v}}_{2}$ with corresponding non-zero real eigenvalues $\lambda_{1}({\bm{x}})$ and $\lambda_{2}({\bm{x}})$ . Here ${\bm{v}}_{1}$ and ${\bm{v}}_{2}$ approximately point in the direction $\operatorname{\bm{\nabla}}u_{\sigma}$ and $\operatorname{\bm{\nabla}}^{\perp}u_{\sigma}$ . From this, diffusion tensors can be constructed inheriting ${\bm{v}}_{1}$ and ${\bm{v}}_{2}$ as eigenvectors but whose eigenvalues are expressions of $\lambda_{1}$ and $\lambda_{2}$ so as to increase or reduce smoothing in these directions, compare for instance coherence-enhancing diffusion [58]. The concept of structure tensor is used for variational regularisation in [51] in the framework of a single orientation estimation approach. More precisely, the authors consider a regulariser of the type

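$$\int_{\Omega}w({\bm{x}})\,\mathbf{J}_{\rho}(u)\mathop{}\mathrm{d}{\bm{x}},$$

for a non-negative weight function $w:\Omega\to\mathbb{R}$ and a continuous imaging function $u:\Omega\to\mathbb{R}$ , for smoothing an image along a single dominant direction. To smooth noisy data $u^{\diamond}$ in two directions, the authors propose to estimate directions ${\bm{v}}_{1}$ and ${\bm{v}}_{2}$ as in [1] and, in their double orientation estimation approach, decompose $u({\bm{x}})=u_{1}({\bm{x}})+u_{2}({\bm{x}})$ via

$$\min_{\substack{u_{1},u_{2}\\ u_{1}+u_{2}=u}}\alpha\left(\lVert{\bm{v}}_{1}^{\mathrm{T}}\operatorname{\bm{\nabla}}u_{1}\rVert_{1}+\lVert{\bm{v}}_{2}^{\mathrm{T}}\operatorname{\bm{\nabla}}u_{2}\rVert_{1}\right)+\frac{1}{2}\lVert u-u^{\diamond}\rVert_{2}^{2}.$$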

An approach based on the analysis of eigenvalues and eigenvectors of the structure tensors can be found in [27], while in [35] the admissible set of test functions is locally adapted to the geometry of ${\bm{u}}$ via the support function regulariser. Furthermore, in [33], the structure tensor total variation (STV) focuses on the nuclear norm of the structure tensor Eq. 3 in order to measure the local image variation:

$$\mathrm{STV}_{p}(u)=\lVert(\lambda_{1},\lambda_{2})\rVert_{p}.$$

Also in [24], a regulariser is proposed whose smoothing directions vary according to the image content, leading to the analysis of

[TABLE]

where $g_{\gamma}$ is the Huber regularisation with parameter $\gamma>0$ and the structure tensor $\mathbf{J}_{\rho}$ is eigen-decomposed as $\mathbf{J}_{\rho}=\mathbf{Q}{\bm{\Lambda}}\mathbf{Q}^{\mathrm{T}}$ , with eigenvectors stored in the matrix $\mathbf{Q}$ and the eigenvalues in the diagonal matrix ${\bm{\Lambda}}$ .
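The structure tensor of Eq. 3 and its eigen-decomposition $\mathbf{J}_{\rho}=\mathbf{Q}{\bm{\Lambda}}\mathbf{Q}^{\mathrm{T}}$ can be illustrated with a minimal NumPy sketch. It is not taken from any of the cited works: the function names, the zero-padded separable Gaussian blur and the central differences of `np.gradient` are all assumptions made for illustration.

```python
import numpy as np

def gaussian_smooth(img, sigma):
    """Separable Gaussian blur with zero padding (a simplifying assumption)."""
    img = np.asarray(img, dtype=float)
    if sigma <= 0:
        return img
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    k /= k.sum()
    blur = lambda r: np.convolve(r, k, mode="same")
    out = np.apply_along_axis(blur, 0, img)
    return np.apply_along_axis(blur, 1, out)

def structure_tensor(u, sigma, rho):
    """J_rho(u) = K_rho * (grad u_sigma  (x)  grad u_sigma), stored as (H, W, 2, 2)."""
    u_sigma = gaussian_smooth(u, sigma)
    g0, g1 = np.gradient(u_sigma)            # derivatives along axis 0 and axis 1
    J = np.empty(u_sigma.shape + (2, 2))
    J[..., 0, 0] = gaussian_smooth(g0 * g0, rho)
    J[..., 0, 1] = J[..., 1, 0] = gaussian_smooth(g0 * g1, rho)
    J[..., 1, 1] = gaussian_smooth(g1 * g1, rho)
    return J

def eigen_decomposition(J):
    """Pixel-wise J = Q diag(lam) Q^T; eigenvalues ascending, eigenvectors in columns of Q."""
    lam, Q = np.linalg.eigh(J)
    return lam, Q

if __name__ == "__main__":
    # Stripes that vary only along axis 1: the dominant eigenvector should be (0, +-1).
    u = np.tile(np.sin(0.3 * np.arange(64)), (64, 1))
    lam, Q = eigen_decomposition(structure_tensor(u, sigma=1.0, rho=2.0))
    print(lam[32, 32], Q[32, 32, :, 1])
```

The eigenvector of the largest eigenvalue points across the structures (the gradient direction), and the other one along them; this is the directional information that the regularisers reviewed here exploit in different ways.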

Let us also mention that early works where the gradient is weighted date back to [7], where the oriented local image structure is extracted from images by the regulariser

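$$\mathrm{TV}^{\alpha}(u)=\int_{\Omega}\lvert\mathbf{M}_{\alpha}\operatorname{\bm{\nabla}}u\rvert\mathop{}\mathrm{d}{\bm{x}},$$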

with $\mathbf{M}_{\alpha}$ being the orthogonal rotation matrix for an angle $\alpha>0$ .

Further, in [5], a discrete directional total variation ( $\mathrm{{TV}}_{a,\theta}$ ) regulariser for denoising discrete images ${\bm{u}}$ with a single dominant direction (directional images) is introduced via affine transformations of test functions: the circular unit ball generated by the $\mathrm{{L}}^{2}$ -norm is transformed into an ellipse $E^{a,\theta}$ , with major semiaxis $a>1$ rotated by $\theta$ , penalizing variations for large $a$ along $\theta$ :

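$$\mathrm{TV}_{a,\theta}(u)=\sum_{i,j}\sup_{{\bm{\Psi}}\in E^{a,\theta}}\langle(\operatorname{\bm{\nabla}}u)_{i,j},\,{\bm{\Psi}}\rangle.\tag{7}$$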

In a straightforward generalization of Eq. 7, inhomogeneous fields $\theta$ are allowed, namely ${\bm{\theta}}:=\theta({\bm{x}})$ . In [62] the authors propose to adapt ${\bm{\theta}}$ to the edge directions,

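$$(\cos\theta,\sin\theta)=\frac{(\operatorname{\bm{\nabla}}u_{\sigma})^{\perp}}{\lVert\operatorname{\bm{\nabla}}u_{\sigma}\rVert_{2}},$$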

so as to associate at each pixel position $(i,j)$ a specific ellipsoid ball $E^{a,\theta_{ij}}$ for the test functions, leading to the discrete edge adaptive directional total variation (EADTV) regulariser:

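$$\mathrm{EADTV}_{\alpha,\theta}(u)=\sum_{i,j}\sup_{{\bm{\Psi}}\in E^{\alpha,\theta_{ij}}}\langle(\operatorname{\bm{\nabla}}u)_{i,j},\,{\bm{\Psi}}\rangle.$$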

In [23], a discrete weighted directional Total Variation (dTV) regulariser is introduced as

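$$\mathrm{dTV}(u)=\sum_{n=1}^{N}\lVert\pazocal{P}_{{\bm{\xi}}_{n}}\operatorname{\bm{\nabla}}u_{n}\rVert,\quad\text{with }{\bm{\xi}}_{n}:=\operatorname{\bm{\nabla}}u_{n}/(K_{\sigma}\ast\lvert\operatorname{\bm{\nabla}}u_{n}\rvert),$$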

by projecting onto the complementary part of a vector field $\pazocal{P}_{{\bm{\xi}}_{n}}{\bm{x}}={\bm{x}}-\langle{\bm{\xi}}_{n},\,{\bm{x}}\rangle{\bm{\xi}}_{n}$ .

In [20], the continuous directional total variation ( $\mathrm{{DTV}}$ ) and directional total generalized variation ( $\mathrm{{DTGV}}$ ) are analysed for a single homogeneously fixed angle $\theta$ and for the minor semi-axis (now denoted with $a$ ) of the ellipse $E^{a,\theta}$ . There, $\mathrm{{DTV}}$ and $\mathrm{{DTGV}}$ are built upon test functions ${\bm{\Psi}}$ in the isotropic ball $B_{1}$ , so as to constrain $\widetilde{{\bm{\Psi}}}$ to the ellipsoid ball, i.e. $\widetilde{{\bm{\Psi}}}=\mathbf{R}_{\theta}{\bm{\Lambda}}_{a}{\bm{\Psi}}\in E^{a,\theta}$ , where $\mathbf{R}_{\theta}$ and ${\bm{\Lambda}}_{a}$ are rotation and contraction matrices, respectively, similarly to our setting explained later, see Eq. 14:

[TABLE]

By comparing Eq. 12 with Eq. 1, we immediately note that our proposed setting deals with non-symmetric test functions and directional information inhomogeneously varying in $\Omega$ , encoded in a weighted divergence term.

1.2 Our proposal

Our work extends the regularisers in Eq. 11-Eq. 12 for handling spatially varying directions ${\bm{\theta}}$ in $\Omega\subset\mathbb{R}^{2}$ instead of a fixed scalar direction $\theta$ . We investigate the directional total variation regulariser of Eq. 1 and study its performance for a variety of image processing problems by solving

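$$u^{\star}\in\operatorname*{arg\,min}_{u}\ \sum_{q=1}^{\mathrm{Q}}\mathrm{TDV}_{{\bm{\alpha}}_{q}}^{q}(u,\pazocal{M}_{q})+\frac{\eta}{2}\lVert\pazocal{S}u-u^{\diamond}\rVert_{2}^{2},\tag{13}$$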

where $u^{\diamond}\in\mathrm{{L}}^{2}(\Omega)$ is given, imperfect and possibly incomplete imaging data, and $\pazocal{S}:\mathrm{{L}}^{2}(\Omega)\to\mathrm{{L}}^{2}(\Omega)$ is a linear operator. We consider the cases for which $\mathbf{M}_{j}$ in $\pazocal{M}_{q}=(\mathbf{M}_{j})_{j=1}^{q}$ from Eq. 1 is

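$$\text{either }\mathbf{M}_{j}=\mathbf{I}\ \text{ or }\ \mathbf{M}_{j}={\bm{\Lambda}}_{\bm{b}}(\mathbf{R}_{{\bm{\theta}}})^{\mathrm{T}},$$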

with $\mathbf{I}$ the identity, ${\bm{\Lambda}}_{\bm{b}}$ the contraction and $\mathbf{R}_{{\bm{\theta}}}$ the rotation matrices, defined as:

[TABLE]

and with ${\bm{b}}:=(b_{1}({\bm{x}}),b_{2}({\bm{x}}))^{\mathrm{T}}\in[0,1]^{2}$ , ${\bm{\theta}}:=\theta({\bm{x}})\in[0,2\pi)$ . Occasionally, we will use the vector field ${\bm{v}}=(\cos{\bm{\theta}},\sin{\bm{\theta}})^{\mathrm{T}}$ and its orthogonal ${\bm{v}}_{\perp}=(-\sin{\bm{\theta}},\cos{\bm{\theta}})^{\mathrm{T}}$ . Thus, we interpret the core operation of the dual version of the regulariser in Eq. 1, $\mathbf{M}_{1}\operatorname{\bm{\nabla}}\otimes u$ , as weighted directional derivatives of $u$ along ${\bm{v}}$ and ${\bm{v}}_{\perp}$ since

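$$\mathbf{M}_{1}\operatorname{\bm{\nabla}}\otimes u={\bm{\Lambda}}_{\bm{b}}(\mathbf{R}_{{\bm{\theta}}})^{\mathrm{T}}\operatorname{\bm{\nabla}}\otimes u=\begin{pmatrix}b_{1}\operatorname{\bm{\nabla}}_{\bm{v}}u\\ b_{2}\operatorname{\bm{\nabla}}_{{\bm{v}}_{\perp}}u\end{pmatrix}.\tag{15}$$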

We will in particular focus on the case ${\bm{b}}=(1,\beta({\bm{x}}))$ for $\beta({\bm{x}})\in[0,1]$ being either inhomogeneous or constant in $\Omega$ , see Remark 1.1 for the geometrical interpretation when different choices are made for the constant.
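As a short worked expansion of the weighted directional derivatives above, the product $\mathbf{M}_{1}\operatorname{\bm{\nabla}}u$ can be written out componentwise. The expansion assumes that $\mathbf{R}_{{\bm{\theta}}}$ is the rotation matrix whose columns are ${\bm{v}}$ and ${\bm{v}}_{\perp}$ , so that the rows of $(\mathbf{R}_{{\bm{\theta}}})^{\mathrm{T}}$ are ${\bm{v}}^{\mathrm{T}}$ and ${\bm{v}}_{\perp}^{\mathrm{T}}$ ; this convention is an assumption chosen to agree with the stated result.

```latex
\mathbf{M}_{1}\nabla u
= \begin{pmatrix} b_{1} & 0 \\ 0 & b_{2} \end{pmatrix}
  \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
  \begin{pmatrix} \partial_{1}u \\ \partial_{2}u \end{pmatrix}
= \begin{pmatrix}
    b_{1}\,(\cos\theta\,\partial_{1}u + \sin\theta\,\partial_{2}u) \\
    b_{2}\,(-\sin\theta\,\partial_{1}u + \cos\theta\,\partial_{2}u)
  \end{pmatrix}
= \begin{pmatrix}
    b_{1}\,\bm{v}\cdot\nabla u \\
    b_{2}\,\bm{v}_{\perp}\cdot\nabla u
  \end{pmatrix},
\qquad
\lvert\mathbf{M}_{1}\nabla u\rvert
= \sqrt{b_{1}^{2}\,(\bm{v}\cdot\nabla u)^{2} + b_{2}^{2}\,(\bm{v}_{\perp}\cdot\nabla u)^{2}} .
```

Under this convention, ${\bm{b}}=(1,1)$ gives $\lvert\mathbf{M}_{1}\operatorname{\bm{\nabla}}u\rvert=\lvert\operatorname{\bm{\nabla}}u\rvert$ because a rotation preserves the Euclidean norm, while ${\bm{b}}=(1,0)$ keeps only the derivative along ${\bm{v}}$ .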

Remark 1.1.

In Fig. 2 we simulate the two dimensional behaviour of Eq. 15 for different choices of $b_{2}$ . More precisely, for a continuous imaging function $u:\Omega\to\mathbb{R}$ we represent a possible situation at the position ${\bm{x}}\in\Omega$ of the vectors ${\bm{p}}=(p_{1},p_{2})=\operatorname{\bm{\nabla}}u$ and ${\bm{v}}=(v_{1},v_{2})$ , depicted with red and blue arrows, respectively. We also represent the components ${\bm{r}}=(r_{1},r_{2})=(p_{1}v_{1},p_{2}v_{2})$ of $\operatorname{\bm{\nabla}}_{\bm{v}}u=r_{1}+r_{2}=p_{1}v_{1}+p_{2}v_{2}$ by a green arrow. The vectors and the corresponding arrows are the same in all Fig. 2(a) to Fig. 2(f). Moreover, the test functions ${\bm{\Psi}}=(\Psi^{1},\Psi^{2})$ lie on the black circle due to the constraint $\left\lVert{\bm{\Psi}}\right\rVert_{2}\leq 1$ . Note that in the 2D domain we have

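$$\mathbf{M}_{1}\operatorname{\bm{\nabla}}u\cdot{\bm{\Psi}}=\operatorname{\bm{\nabla}}u\cdot\mathbf{M}_{1}^{\mathrm{T}}{\bm{\Psi}},$$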

which allows us to change the metric space of the test functions into an elliptic ball, shown in magenta. With $b_{1}=1$ fixed, each figure corresponds to a particular choice of $b_{2}=\beta$ between 0 and 1. Finally, the magenta arrow corresponds to the direction of ${\bm{\Psi}}$ which realizes the supremum of the regulariser $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{1}$ in Eq. 1. We observe in Fig. 2(f) the limit case ${\bm{b}}=(1,0)$ where $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{1}({u,\pazocal{M}})$ penalizes the rate of change of $u$ only along ${\bm{v}}$ , without any contribution from the orthogonal direction ${\bm{v}}_{\perp}$ . In all the other circumstances, ${\bm{v}}_{\perp}$ acts as quality estimation of ${\bm{v}}$ , leading to a fully isotropic approach in the case ${\bm{b}}=(1,1)$ of Fig. 2(a), since the magenta arrow is bent in the direction of the gradient $\operatorname{\bm{\nabla}}u$ rather than the direction of $\operatorname{\bm{\nabla}}_{\bm{v}}u$ .
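The geometric picture of Remark 1.1 can be checked numerically at a single point. The sketch below is illustrative only: the helper names are hypothetical, and the rotation convention (rows of $(\mathbf{R}_{{\bm{\theta}}})^{\mathrm{T}}$ equal to ${\bm{v}}$ and ${\bm{v}}_{\perp}$ ) is an assumption. It samples unit test vectors ${\bm{\Psi}}$ by brute force and compares the supremum of $\mathbf{M}{\bm{p}}\cdot{\bm{\Psi}}$ with the Euclidean norm of $\mathbf{M}{\bm{p}}$ , where ${\bm{p}}$ plays the role of $\operatorname{\bm{\nabla}}u$ .

```python
import numpy as np

def directional_matrix(theta, b1, b2):
    """M = diag(b1, b2) @ R_theta^T, with rows of R_theta^T equal to v and v_perp (assumed)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.diag([b1, b2]) @ np.array([[c, s], [-s, c]])

def sup_over_unit_circle(M, p, n=3600):
    """Brute-force sup of (M p) . Psi over n unit vectors Psi; returns (value, maximiser)."""
    phi = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    Psi = np.stack([np.cos(phi), np.sin(phi)])   # shape (2, n), one test vector per column
    values = (M @ p) @ Psi                       # identical to p @ (M.T @ Psi)
    k = np.argmax(values)
    return values[k], Psi[:, k]

if __name__ == "__main__":
    p = np.array([0.8, -0.5])                    # stands for the gradient of u at one point
    for beta in (1.0, 0.5, 0.0):
        M = directional_matrix(0.6, 1.0, beta)
        value, psi = sup_over_unit_circle(M, p)
        print(beta, value, np.linalg.norm(M @ p), psi)
```

The maximiser is aligned with $\mathbf{M}{\bm{p}}$ : for $\beta=1$ this is a rotated copy of the full gradient, and for $\beta=0$ only the component along ${\bm{v}}$ survives, matching the two limit cases described in the remark.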

1.3 Contribution of the paper

In what follows we will derive:

•

a rigorous definition of the total directional regulariser Eq. 1;

•

a characterisation of Eq. 1 that turns Eq. 13 into a form that is amenable for numerical solution. For this we propose a primal-dual algorithm and present certain instances for different combinations of orders $q=1,\dots,\mathrm{Q}$ , up to $\mathrm{Q}=3$ in Eq. 1;

•

a number of numerical experiments with this new regulariser for image denoising, image zooming and interpolation of two-dimensional surfaces from a sparse number of given height values.
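To give a flavour of the primal-dual approach mentioned in the list above, here is a minimal sketch for the lowest-order case only: $q=1$ , $\pazocal{S}$ the identity, and $\mathbf{M}_{1}={\bm{\Lambda}}_{\bm{b}}(\mathbf{R}_{{\bm{\theta}}})^{\mathrm{T}}$ with ${\bm{b}}=(1,\beta)$ . It is a standard primal-dual hybrid gradient iteration written under simplifying assumptions, not the scheme of the paper: it uses plain forward differences on the pixel grid instead of staggered grids, applies $\mathbf{M}_{1}$ pixel-wise, and all function names and step sizes are illustrative choices.

```python
import numpy as np

def grad(u):
    """Forward differences with Neumann boundary; output shape (2, H, W)."""
    g = np.zeros((2,) + u.shape)
    g[0, :-1, :] = u[1:, :] - u[:-1, :]
    g[1, :, :-1] = u[:, 1:] - u[:, :-1]
    return g

def div(p):
    """Discrete divergence, the negative adjoint of grad."""
    d = np.zeros(p.shape[1:])
    d[:-1, :] += p[0, :-1, :]
    d[1:, :] -= p[0, :-1, :]
    d[:, :-1] += p[1, :, :-1]
    d[:, 1:] -= p[1, :, :-1]
    return d

def K(u, theta, beta):
    """K u = M grad u with M = diag(1, beta) R_theta^T (rows v and beta * v_perp)."""
    c, s = np.cos(theta), np.sin(theta)
    g = grad(u)
    return np.stack([c * g[0] + s * g[1], beta * (-s * g[0] + c * g[1])])

def Kt(y, theta, beta):
    """Adjoint of K: K^T y = -div(M^T y)."""
    c, s = np.cos(theta), np.sin(theta)
    return -div(np.stack([c * y[0] - beta * s * y[1], s * y[0] + beta * c * y[1]]))

def energy(u, f, theta, beta, alpha, eta):
    """alpha * sum |M grad u| + eta/2 * ||u - f||^2."""
    w = K(u, theta, beta)
    return alpha * np.sum(np.sqrt(w[0] ** 2 + w[1] ** 2)) + 0.5 * eta * np.sum((u - f) ** 2)

def tdv1_denoise(f, theta, beta, alpha, eta, n_iter=300):
    """PDHG iteration for min_u alpha * |M grad u|_1 + eta/2 * ||u - f||^2."""
    # ||grad||^2 <= 8 and ||M|| <= 1 for beta in [0, 1], so tau * sigma * ||K||^2 <= 1.
    tau = sigma = 1.0 / np.sqrt(8.0)
    u, u_bar = f.copy(), f.copy()
    y = np.zeros((2,) + f.shape)
    for _ in range(n_iter):
        # Dual ascent step, then pixel-wise projection onto the ball of radius alpha.
        y = y + sigma * K(u_bar, theta, beta)
        y /= np.maximum(1.0, np.sqrt(y[0] ** 2 + y[1] ** 2) / alpha)
        # Primal descent step: closed-form proximal map of the quadratic fidelity.
        u_new = (u - tau * Kt(y, theta, beta) + tau * eta * f) / (1.0 + tau * eta)
        u_bar = 2.0 * u_new - u
        u = u_new
    return u

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.tile(np.sin(0.3 * np.arange(64)), (64, 1))   # constant along axis 0
    noisy = clean + 0.2 * rng.standard_normal(clean.shape)
    # theta = 0 puts v along axis 0, i.e. along the stripes, where variation is penalised fully.
    u = tdv1_denoise(noisy, theta=0.0, beta=0.1, alpha=0.5, eta=1.0)
    print(energy(noisy, noisy, 0.0, 0.1, 0.5, 1.0), energy(u, noisy, 0.0, 0.1, 0.5, 1.0))
```

Here `theta` and `beta` may be scalars or per-pixel arrays, which corresponds to homogeneous or spatially varying anisotropy. Higher orders and a general operator $\pazocal{S}$ require additional dual variables and are not covered by this sketch.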

1.4 Organization of the paper

In Section 2 we discuss the higher-order total directional variation regularisers with anisotropy. The numerical details of the discretisation are introduced in Section 3, with the primal-dual algorithm and the numerical optimisation described in Section 4. Imaging applications to denoising, wavelet-based zooming and surface interpolation, e.g. in atomic force microscopy imaging, are discussed in Section 5 and Section 6.

2 Higher-order total directional variation

In this section we introduce the rigorous definition of Eq. 1. To do so, we first introduce the terminology of tensors and their mathematical manipulation.

2.1 Tensors

Following [11], let $\pazocal{T}^{\ell}(\mathbb{R}^{d})$ be the vector space of $\ell$ -tensors defined as

[TABLE]

On $\pazocal{T}^{\ell}(\mathbb{R}^{d})$ , we have the following operations:

•

let $\otimes$ be the tensor product for ${\bm{\xi}}_{1}\in\pazocal{T}^{\ell_{1}}(\mathbb{R}^{d})$ , ${\bm{\xi}}_{2}\in\pazocal{T}^{\ell_{2}}(\mathbb{R}^{d})$ , with ${\bm{\xi}}_{1}\otimes{\bm{\xi}}_{2}\in\pazocal{T}^{\ell_{1}+\ell_{2}}(\mathbb{R}^{d})$ :

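$$({\bm{\xi}}_{1}\otimes{\bm{\xi}}_{2})({\bm{a}}_{1},\dots,{\bm{a}}_{\ell_{1}+\ell_{2}})={\bm{\xi}}_{1}({\bm{a}}_{1},\dots,{\bm{a}}_{\ell_{1}})\,{\bm{\xi}}_{2}({\bm{a}}_{\ell_{1}+1},\dots,{\bm{a}}_{\ell_{1}+\ell_{2}});$$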

•

let $\operatorname{trace}({\bm{\xi}})\in\pazocal{T}^{\ell-2}(\mathbb{R}^{d})$ be the trace of ${\bm{\xi}}\in\pazocal{T}^{\ell}(\mathbb{R}^{d})$ , with $\ell\geq 2$ , defined by

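$$\operatorname{trace}({\bm{\xi}})({\bm{a}}_{1},\dots,{\bm{a}}_{\ell-2})=\sum_{i=1}^{d}{\bm{\xi}}({\bm{e}}_{i},{\bm{a}}_{1},\dots,{\bm{a}}_{\ell-2},{\bm{e}}_{i}),$$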

where ${\bm{e}}_{i}$ is the $i$ -th standard basis vector;

•

let $(\,{\cdot}\,)^{\sim}$ be such that if ${\bm{\xi}}\in\pazocal{T}^{\ell}(\mathbb{R}^{d})$ , then ${\bm{\xi}}^{\sim}({\bm{a}}_{1},\dots\bm{a}_{\ell})={\bm{\xi}}({\bm{a}}_{\ell},{\bm{a}}_{1},\dots,{\bm{a}}_{\ell-1})$ ;

•

let $\overline{(\,{\cdot}\,)}$ be such that if ${\bm{\xi}}\in\pazocal{T}^{\ell}(\mathbb{R}^{d})$ , then $\overline{{\bm{\xi}}}({\bm{a}}_{1},\dots\bm{a}_{\ell})={\bm{\xi}}({\bm{a}}_{\ell},\dots,{\bm{a}}_{1});$

•

let ${\bm{\xi}},{\bm{\eta}}\in\pazocal{T}^{\ell}(\mathbb{R}^{d})$ . The space $\pazocal{T}^{\ell}(\mathbb{R}^{d})$ is equipped with the scalar product defined as

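$${\bm{\xi}}\cdot{\bm{\eta}}=\sum_{p\in\{1,\dots,d\}^{\ell}}{\bm{\xi}}_{p_{1},\dots,p_{\ell}}\,{\bm{\eta}}_{p_{1},\dots,p_{\ell}}.$$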
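The tensor operations listed above have direct array counterparts once an $\ell$ -tensor is stored by its components as an array with $\ell$ axes of length $d$ . The following sketch is only an illustration in NumPy; the function names are hypothetical and the component storage is an assumption.

```python
import numpy as np

def tensor_product(xi1, xi2):
    """(xi1 (x) xi2)_{i..., j...} = xi1_{i...} * xi2_{j...}."""
    return np.multiply.outer(xi1, xi2)

def trace(xi):
    """Contract the first and the last index of a tensor of order >= 2."""
    return np.trace(xi, axis1=0, axis2=-1)

def cyclic(xi):
    """xi~ : components xi~_{i1,...,il} = xi_{il,i1,...,i(l-1)}."""
    return np.moveaxis(xi, 0, -1)

def reverse(xi):
    """Overline operation: components with all indices in reversed order."""
    return np.transpose(xi)

def scalar_product(xi, eta):
    """Sum over all components of the element-wise product."""
    return float(np.sum(xi * eta))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b = rng.standard_normal(2), rng.standard_normal(2)
    xi = rng.standard_normal((2, 2, 2))
    print(trace(tensor_product(a, b)), a @ b)      # trace of a (x) b is the dot product
    print(cyclic(xi)[0, 1, 1], xi[1, 0, 1])        # cyclic index shift
    print(reverse(xi)[0, 1, 1], xi[1, 1, 0])       # index reversal
```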

We now introduce the derivative operator for tensors and its weighted version.

Definition 2.1.

Let $\operatorname{\bm{\nabla}}=(\partial_{1},\dots,\partial_{d})^{\mathrm{T}}$ be the derivative operator and ${\bm{\xi}}\in\pazocal{T}^{\ell}(\mathbb{R}^{d})$ . The derivative of ${\bm{\xi}}$ is defined as $(\operatorname{\bm{\nabla}}\otimes{\bm{\xi}})\in\pazocal{T}^{\ell+1}(\mathbb{R}^{d})$ via the following:

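$$\operatorname{\bm{\nabla}}\otimes{\bm{\xi}}:=\left(\partial_{j}{\bm{\xi}}_{i_{1},\dots,i_{\ell}}\right)_{j,i_{1},\dots,i_{\ell}}.$$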

Let ${\bm{\eta}}\in\pazocal{T}^{2}(\mathbb{R}^{d})$ . The derivative operator weighted by ${\bm{\eta}}$ is defined as ${\bm{\eta}}\operatorname{\bm{\nabla}}\in\pazocal{T}^{1}(\mathbb{R}^{d})$ and the derivative of ${\bm{\xi}}\in\pazocal{T}^{\ell}(\mathbb{R}^{d})$ weighted by ${\bm{\eta}}$ is defined as $({\bm{\eta}}\operatorname{\bm{\nabla}}\otimes{\bm{\xi}})\in\pazocal{T}^{\ell+1}(\mathbb{R}^{d})$ via the following:

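$${\bm{\eta}}\operatorname{\bm{\nabla}}\otimes{\bm{\xi}}:=\left(\sum_{k=1}^{d}{\bm{\eta}}_{j,k}\,\partial_{k}{\bm{\xi}}_{i_{1},\dots,i_{\ell}}\right)_{j,i_{1},\dots,i_{\ell}}.\tag{16}$$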

Remark 2.2.

For notational purposes, the sum in Eq. 16 will be shortened using Einstein notation over the repeated subscript, meaning that each element of the tensor ${\bm{\eta}}\operatorname{\bm{\nabla}}\otimes{\bm{\xi}}$ is written as ${\bm{\eta}}_{j,k}\partial_{k}{\bm{\xi}}_{i_{1},\dots,i_{\ell}}$ .

In what follows, we will also denote the space of $q$ -times uniformly continuously differentiable $\pazocal{T}^{\ell}(\mathbb{R}^{d})$ -valued tensors as $\mathrm{{C}}^{q}(\overline{\Omega},\pazocal{T}^{\ell}(\mathbb{R}^{d}))$ which is a Banach space with the norm

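$$\lVert u\rVert_{\infty,q}=\max_{\ell=0,\dots,q}\ \sup_{{\bm{x}}\in\Omega}\lvert\operatorname{\bm{\nabla}}^{\ell}\otimes u({\bm{x}})\rvert,$$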

where $(\operatorname{\bm{\nabla}}^{q}\otimes u):\Omega\to\pazocal{T}^{q+\ell}(\mathbb{R}^{d})$ , and we will consider also the space $\mathrm{{C}}_{c}^{q}(\Omega,\pazocal{T}^{\ell}(\mathbb{R}^{d}))$ of $\pazocal{T}^{\ell}(\mathbb{R}^{d})$ -valued tensors which are $q$ -times continuously differentiable with compact support in $\Omega$ .

2.2 Definition of total directional variation

To make sense of the distributional formulation of higher-order directional variation in Eq. 1, we need an integration by parts formula for the weighted derivative of tensors in Definition 2.1. Namely, we consider

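$$\int_{\Omega}(\mathbf{M}\operatorname{\bm{\nabla}}\otimes\mathbf{A})\cdot{\bm{\Psi}}\mathop{}\mathrm{d}{\bm{x}},$$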

with $\Omega\subset\mathbb{R}^{d}$ being a bounded Lipschitz domain, $\mathbf{M}\in\mathrm{{C}}^{1}(\Omega,\pazocal{T}^{2}(\mathbb{R}^{d}))$ , $\mathbf{A}\in\mathrm{{C}}^{1}(\Omega,\pazocal{T}^{\ell}(\mathbb{R}^{d}))$ and ${\bm{\Psi}}\in\mathrm{{C}}_{c}^{1}(\Omega,\pazocal{T}^{\ell+1}(\mathbb{R}^{d}))$ . We report in this section the main results from the second part of our companion work [42], where detailed proofs can be found. First, we give an integration by parts formula where only $\mathbf{M}$ switches:

Lemma 2.3.

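Let $\Omega$ , $\mathbf{M}$ , $\mathbf{A}$ and ${\bm{\Psi}}$ be as above. Then:

$$\int_{\Omega}(\mathbf{M}\operatorname{\bm{\nabla}}\otimes\mathbf{A})\cdot{\bm{\Psi}}\mathop{}\mathrm{d}{\bm{x}}=\int_{\Omega}(\operatorname{\bm{\nabla}}\otimes\mathbf{A})\cdot\operatorname{trace}(\mathbf{M}\otimes{\bm{\Psi}}^{\sim})\mathop{}\mathrm{d}{\bm{x}},\quad\text{for all }\mathbf{M},\mathbf{A},{\bm{\Psi}}.$$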

Then a general adjoint property follows:

Lemma 2.4.

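Let $\Omega$ , $\mathbf{M}$ , $\mathbf{A}$ and ${\bm{\Psi}}$ be as above. Then:

$$\int_{\Omega}(\mathbf{M}\operatorname{\bm{\nabla}}\otimes\mathbf{A})\cdot{\bm{\Psi}}\mathop{}\mathrm{d}{\bm{x}}=-\int_{\Omega}\mathbf{A}\cdot\operatorname{div}_{\mathbf{M}}{\bm{\Psi}}\mathop{}\mathrm{d}{\bm{x}},\quad\text{for all }\mathbf{M},\mathbf{A},{\bm{\Psi}},$$

where $\operatorname{div}_{\mathbf{M}}{\bm{\Psi}}:=\operatorname{trace}\left(\operatorname{\bm{\nabla}}\otimes\left[\operatorname{trace}\left(\mathbf{M}\otimes{\bm{\Psi}}^{\sim}\right)\right]^{\sim}\right).$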
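The step in Lemma 2.3 where only $\mathbf{M}$ switches sides rests on a pointwise algebraic identity between tensors, which can be sanity-checked numerically. The sketch below is an illustration with random component arrays: `G` stands for the value of $\operatorname{\bm{\nabla}}\otimes\mathbf{A}$ at one point, and the array storage and helper names are assumptions. It does not test the integration by parts itself, which needs the compact support of ${\bm{\Psi}}$ .

```python
import numpy as np

def weighted(M, G):
    """(M G)_{j, i...} = sum_k M_{j,k} G_{k, i...}, the weighting of Definition 2.1."""
    return np.einsum("jk,k...->j...", M, G)

def trace_M_Psi(M, Psi):
    """trace(M (x) Psi~): contract first and last index of the tensor product."""
    Psi_tilde = np.moveaxis(Psi, 0, -1)            # Psi~_{i1..il} = Psi_{il,i1..i(l-1)}
    return np.trace(np.multiply.outer(M, Psi_tilde), axis1=0, axis2=-1)

def pointwise_sides(M, G, Psi):
    """Return ((M G) . Psi, G . trace(M (x) Psi~)) at a single point."""
    lhs = np.sum(weighted(M, G) * Psi)
    rhs = np.sum(G * trace_M_Psi(M, Psi))
    return lhs, rhs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 2
    M = rng.standard_normal((d, d))
    G = rng.standard_normal((d, d, d))             # derivative of a 2-tensor A at a point
    Psi = rng.standard_normal((d, d, d))           # test tensor of matching order
    print(pointwise_sides(M, G, Psi))              # both entries agree up to rounding
```

In components, `trace_M_Psi(M, Psi)` is the contraction of $\mathbf{M}^{\mathrm{T}}$ with the first index of ${\bm{\Psi}}$ , which is why the weight can be moved from the derivative onto the test tensor.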

We can now define the total directional variation of order $q$ with weights ${\bm{\alpha}}\in\mathbb{R}_{+}^{q}$ .

Definition 2.5.

Let $\Omega\subset\mathbb{R}^{d}$ , $u\in\mathrm{{L}}^{1}(\Omega,\mathbb{R})$ , $q\in\mathbb{N}$ , $\pazocal{M}:=(\mathbf{M}_{j})_{j=1}^{q}$ be a collection of fields in $\mathrm{{C}}^{\infty}(\Omega,\pazocal{T}^{2}(\mathbb{R}^{d}))$ and ${\bm{\alpha}}:=(\alpha_{0},\dots,\alpha_{q-1})$ be a positive weight vector. Then, the total directional variation of order $q$ , associated to $\pazocal{M}$ and ${\bm{\alpha}}$ , is defined as:

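$$\mathrm{TDV}_{{\bm{\alpha}}}^{q}(u,\pazocal{M}):=\sup_{{\bm{\Psi}}}\left\{\int_{\Omega}u\operatorname{div}_{\pazocal{M}}^{q}{\bm{\Psi}}\mathop{}\mathrm{d}{\bm{x}}\,\Big\lvert\,\text{for all }{\bm{\Psi}}\in\pazocal{Y}_{\pazocal{M},{\bm{\alpha}}}^{q}\right\},\tag{19}$$

where

$$\pazocal{Y}_{\pazocal{M},{\bm{\alpha}}}^{q}=\left\{{\bm{\Psi}}\,:\,{\bm{\Psi}}\in\mathrm{C}_{c}^{q}(\Omega,\pazocal{T}^{q}(\mathbb{R}^{d})),\ \lVert\operatorname{div}_{\pazocal{M}}^{j}{\bm{\Psi}}\rVert_{\infty}\leq\alpha_{j},\ \forall\,j=0,\dots,q-1\right\},$$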

and the weighted divergence of order $q$ is defined recursively, from Lemma 2.4, as:

[TABLE]

Remark 2.6.

For $\pazocal{M}=(\mathbf{I})_{j=1}^{q}$ , then $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q}({u,\pazocal{M}})\equiv\neg\mathrm{sym}\mathrm{{TGV}}_{{\bm{\alpha}}}^{q}(u)$ , see [42] and [11, Remark 3.10].

2.3 Directional matrices for applications

In what follows, we introduce a particular parametrisation of directional matrices for fields $\pazocal{M}$ in Eq. 19. For standard imaging applications, we will usually deal with grey-scale images $u:\Omega\to\mathbb{R}$ , $\Omega\subset\mathbb{R}^{2}$ , i.e. $d=2$ .

Definition 2.7 (Directional matrices).

Let $\left({\bm{b}}^{j}\right)_{j=1}^{q}$ , ${\bm{b}}^{j}:\Omega\to[0,1]^{2}$ , be a collection of so-called contraction weights (each element having modulus $\leq 1$ ), $({\bm{\theta}}^{j})_{j=1}^{q}$ , ${\bm{\theta}}^{j}:\Omega\to[0,2\pi)$ , be a collection of angles, and ${\bm{\Lambda}}_{{\bm{b}}}^{j}$ and $\mathbf{R}_{{\bm{\theta}}}^{j}$ the associated contraction and rotation matrices defined, respectively, as

[TABLE]

Then we define $\pazocal{M}:=(\mathbf{M}_{j})_{j=1}^{q}$ to be a collection of contraction-rotation matrices (in Einstein notation) as

[TABLE]

where $\lambda_{pk}^{j},r_{k\ell}^{j}$ are the element-wise entries of the matrices ${\bm{\Lambda}}_{\bm{b}}^{j}$ , $\mathbf{R}_{\bm{\theta}}^{j}$ , respectively.

Definition 2.8 (Weighted derivatives of order 1).

Let $\operatorname{\bm{\nabla}}=(\partial_{1},\partial_{2})^{\mathrm{T}}$ be the derivative operator. The gradient of a differentiable imaging function $u$ is given by $\operatorname{\bm{\nabla}}u:=\operatorname{\bm{\nabla}}\otimes u=\left(\partial_{1}u,\partial_{2}u\right)^{\mathrm{T}}$ and the weighted derivative operator of order 1 associated to the directional matrix $\mathbf{M}_{1}$ from Definition 2.7 is

[TABLE]

Remark 2.9.

If $\mathbf{M}_{1}=\mathbf{I}$ (i.e. $b_{1}^{1}\equiv b_{2}^{1}\equiv 1$ and ${\bm{\theta}}^{1}\equiv 0$ for all ${\bm{x}}\in\Omega$ ), then $\mathbf{M}_{1}\operatorname{\bm{\nabla}}u\equiv\operatorname{\bm{\nabla}}u$ .

Remark 2.10.

Given ${\bm{\theta}}^{1}$ , let ${\bm{v}}^{1}=(\cos{\bm{\theta}}^{1},\sin{\bm{\theta}}^{1})$ and ${\bm{v}}_{\perp}^{1}=(-\sin{\bm{\theta}}^{1},\cos{\bm{\theta}}^{1})$ . Then

[TABLE]

where $\operatorname{\bm{\nabla}}_{\bm{z}}u$ represents the directional derivative along a vector field ${\bm{z}}$ , defined as

[TABLE]

Definition 2.11 (Weighted derivatives of order $q$ ).

We define the derivative of order $q$ of $u$ using Definition 2.8 recursively as

[TABLE]

We define the weighted derivative of order $q$ of $u$ with respect to $\pazocal{M}$ recursively as

[TABLE]

2.4 Examples

We present some examples of the total directional variation of order $q$ for $q=1,2,3$ , ${\bm{\alpha}}=(\alpha_{j})_{j=0}^{q-1}$ and a collection of directional matrices $\pazocal{M}=(\mathbf{M}_{j})_{j=1}^{q}$ :

•

order $q=1$ and $\pazocal{M}=(\mathbf{M}_{1})$ :

[TABLE]

•

order $q=2$ and $\pazocal{M}=(\mathbf{M}_{1},\mathbf{M}_{2})$ :

[TABLE]

•

order $q=3$ and $\pazocal{M}=(\mathbf{M}_{1},\mathbf{M}_{2},\mathbf{M}_{3})$ :

[TABLE]

3 Numerical discretisation

The rest of the paper focuses on the discretised formulation of Eq. 13, its numerical solution and its performance on a number of variational image processing examples. We start by discretising the $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q}-\mathrm{{L}}^{2}$ problem in Eq. 13.

3.1 Staggered grids

The discretisation of the $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q}-\mathrm{{L}}^{2}$ in Eq. 13 is based on finite-difference schemes for derivatives on staggered regular Cartesian grids of width $h>0$ :

•

the grid of pixels $\Omega^{h}$ , of axes $x_{1}$ and $x_{2}$ for a 2-dimensional domain and size $M\times N$ , is defined as

[TABLE]

•

the grid of cell centres $\Gamma^{h}$ , of size $(M-1)\times(N-1)$ and used to perform the weighted derivative operation (i.e. for introducing the anisotropy, see the grid associated to the blue squares in Fig. 3), is defined as:

[TABLE]

•

the collection of grids $\left(\mathrm{X}^{j,h}_{{\bm{\iota}}}\right)_{j=1}^{q}$ associated to the differential operators involved, where ${\bm{\iota}}=(\iota_{1},\dots,\iota_{j})$ is a multi-index variable and $\iota_{s}\in\{1,2\}$ for each $s=1,\dots,j$ indicates the partial derivative involved ( $1$ for $\partial_{x_{1}}$ and $2$ for $\partial_{x_{2}}$ ). Every $\mathrm{X}^{j,h}_{{\bm{\iota}}}$ is a sub-collection of $2^{j}$ grids, each one of size $M\times N$ and denoted by $\mathrm{X}^{j,h}_{(\iota_{1},\dots,\iota_{j})}$ , each one associated to a fixed choice for the derivative operator $\left(\partial_{x_{\iota_{j}}}\cdots\partial_{x_{\iota_{1}}}\right)$ considered:

[TABLE]

where $\left\lvert I_{1}\right\rvert_{\#}$ and $\left\lvert I_{2}\right\rvert_{\#}$ are the cardinality of the sets $I_{1}$ and $I_{2}$ containing as many elements as the number of derivatives along the axes $x_{1}$ and $x_{2}$ , respectively. A visual representation of such grids is given in Fig. 3. For example:

–

with a slight abuse of notation, if $j=0$ then $\mathrm{X}^{0,h}_{(-)}$ coincides with $\Omega^{h}$ ;

–

$\mathrm{X}^{1,h}_{(1)},\mathrm{X}^{1,h}_{(2)}$ result in $\Omega^{h}$ shifted by $h/2$ along $x_{1}$ and $x_{2}$ axes, respectively;

–

if $q=3$ and ${\bm{\iota}}=(1,1,2)$ , then we are referring to the grid associated to one of the eight possible combinations for the third-order derivative $\operatorname{\bm{\nabla}}^{3}$ , namely $\partial_{x_{2}}\partial_{x_{1}}\partial_{x_{1}}$ , which is located on the grid identified by our notation $\mathrm{X}_{(1,1,2)}^{3,h}$ .
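The bookkeeping of the staggered grids can be sketched in a few lines of Python. The sketch enumerates the $2^{j}$ multi-indices of order $j$ and assigns to each the shift of its grid with respect to $\Omega^{h}$ . It assumes that every derivative along an axis shifts the grid by $h/2$ along that axis, which extrapolates the first-order example above; the function name `staggered_offsets` is hypothetical.

```python
from itertools import product

def staggered_offsets(j, h=1.0):
    """Map each multi-index iota in {1,2}^j to the assumed (x1, x2) shift of its grid."""
    offsets = {}
    for iota in product((1, 2), repeat=j):
        n1 = iota.count(1)  # number of derivatives along x1, i.e. |I_1|
        n2 = iota.count(2)  # number of derivatives along x2, i.e. |I_2|
        # Assumption: each derivative moves the grid by h/2 along its axis.
        offsets[iota] = (n1 * h / 2.0, n2 * h / 2.0)
    return offsets
```

For $j=3$ the dictionary has eight entries, one per combination of partial derivatives.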

3.2 Discretised objects

Let the order of derivatives $q>0$ be fixed. By means of the superscript $h$ , we define the finite-dimensional approximation of the following quantities, where $|\Omega^{h}|$ , $|\Gamma^{h}|$ and $|\mathrm{X}^{j,h}_{(\iota_{1},\dots,\iota_{j})}|$ are the number of grid points in $\Omega^{h}$ , $\Gamma^{h}$ and $\mathrm{X}^{j,h}_{(\iota_{1},\dots,\iota_{j})}$ , respectively:

•

${\bm{u}}^{h}\in\mathbb{R}^{|\Omega^{h}|}$ is the discretisation of the function $u$ ;

•

${\bm{u}}^{\diamond,h}$ is the discretisation of the observed imaging data $u^{\diamond}$ ;

•

${\bm{v}}^{h}=({\bm{v}}_{1}^{h},{\bm{v}}_{2}^{h})\in\mathbb{R}^{|\Gamma^{h}|\times|\Gamma^{h}|}$ is a discrete vector field;

•

${\bm{b}}^{h}=({\bm{b}}_{1}^{h},{\bm{b}}_{2}^{h})\in\mathbb{R}^{|\Gamma^{h}|\times|\Gamma^{h}|}$ are discrete contraction weights for $\Lambda_{\bm{b}}$ ;

•

$\mathbf{M}_{j}^{h}\in\mathbb{R}^{|\Gamma^{h}|\times|\Gamma^{h}|\times|\Gamma^{h}|\times|\Gamma^{h}|}$ discretises the weights $\mathbf{M}_{j}\in\pazocal{T}^{2}(\mathbb{R}^{2})$ , for each $j=1,\dots,q$ ;

•

$\pazocal{M}^{h}=(\mathbf{M}_{j}^{h})_{j=1}^{q}$ discretises the collection of weights $\pazocal{M}=(\mathbf{M}_{j})_{j=1}^{q}$ ;

•

${\bm{\Psi}}^{h}=({\bm{\Psi}}_{1}^{h},\dots,{\bm{\Psi}}_{2^{q}}^{h})\in\mathbb{R}^{|\Gamma^{h}|\times\dots\times|\Gamma^{h}|}$ discretises the test functions ${\bm{\Psi}}\in\pazocal{T}^{q}(\mathbb{R}^{2})$ ;

•

${\bm{z}}^{h}=({\bm{z}}_{j}^{h})_{j=0}^{q-1}$ discretises the primal variables ${\bm{z}}$ , with ${\bm{z}}_{0}^{h}={\bm{u}}^{h}\in\mathbb{R}^{|\Omega^{h}|}$ , ${\bm{z}}_{0}^{\diamond,h}={\bm{u}}^{\diamond,h}$ and each ${\bm{z}}_{j}^{h}\in\mathbb{R}^{|\mathrm{X}_{(1,\dots,1)}^{j,h}|\times\dots\times|\mathrm{X}_{(2,\dots,2)}^{j,h}|}$ , for $j=1,\dots,q-1$ ;

•

${\bm{w}}^{h}=({\bm{w}}_{j}^{h})_{j=1}^{q}$ discretises the dual variables ${\bm{w}}$ , with ${\bm{w}}_{j}^{h}\in\mathbb{R}^{|\mathrm{X}_{(1,\dots,1)}^{j,h}|\times\dots\times|\mathrm{X}_{(2,\dots,2)}^{j,h}|}$ for $j=1,\dots,q$ .

3.3 Isotropic differential operators

Here we discuss the discretization of the adjoint unweighted operators $\operatorname{\bm{\nabla}}$ and $\operatorname{div}$ . For ${\bm{u}}^{h}\in\mathbb{R}^{|\Omega^{h}|}$ , the discrete gradient operator is defined as

[TABLE]

where we use the central second-order finite difference scheme on the grids $\mathrm{X}_{(1)}^{1,h},\,\mathrm{X}_{(2)}^{1,h}$ :

[TABLE]

Let ${\bm{p}}^{h}=({\bm{p}}_{1}^{h},{\bm{p}}_{2}^{h})\in\mathbb{R}^{|\mathrm{X}_{(1)}^{1,h}|\times|\mathrm{X}_{(2)}^{1,h}|}$ and let the discrete divergence operator

[TABLE]

be defined for each pixel $(k,l)$ via the central second-order difference scheme on $\Omega^{h}$ :

[TABLE]

Thus, the isotropic discrete gradient and discrete divergence are designed to fulfil the discrete adjointness property, for every ${\bm{u}}^{h}\in\mathbb{R}^{|\Omega^{h}|}$ and ${\bm{p}}^{h}\in\mathbb{R}^{|\mathrm{X}_{(1)}^{1,h}|\times|\mathrm{X}_{(2)}^{1,h}|}$ :

[TABLE]

where $\langle\,{\cdot}\,,\,\,{\cdot}\,\rangle_{\Gamma^{h}}:\mathbb{R}^{|\mathrm{X}_{(1)}^{1,h}|\times|\mathrm{X}_{(2)}^{1,h}|}\times\mathbb{R}^{|\mathrm{X}_{(1)}^{1,h}|\times|\mathrm{X}_{(2)}^{1,h}|}\to\mathbb{R}$ and $\langle\,{\cdot}\,,\,\,{\cdot}\,\rangle_{\Omega^{h}}:\mathbb{R}^{|\Omega^{h}|}\times\mathbb{R}^{|\Omega^{h}|}\to\mathbb{R}$ .
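The adjointness property can be checked numerically. The sketch below uses forward differences with homogeneous Neumann boundary conditions on a single $M\times N$ array, a common simplification that is an assumption here and need not coincide with the staggered scheme of Eq. 24; the sign convention $\langle\operatorname{\bm{\nabla}}u,p\rangle=-\langle u,\operatorname{div}p\rangle$ is also assumed.

```python
import numpy as np

def grad(u, h=1.0):
    """Forward differences; the last row/column of each component is zero."""
    g1 = np.zeros_like(u)
    g2 = np.zeros_like(u)
    g1[:-1, :] = (u[1:, :] - u[:-1, :]) / h
    g2[:, :-1] = (u[:, 1:] - u[:, :-1]) / h
    return g1, g2

def div(p1, p2, h=1.0):
    """Negative adjoint of grad, built by transposing the difference stencil."""
    out = np.zeros_like(p1)
    out[:-1, :] += p1[:-1, :]
    out[1:, :] -= p1[:-1, :]
    out[:, :-1] += p2[:, :-1]
    out[:, 1:] -= p2[:, :-1]
    return out / h

def adjointness_defect(u, p1, p2, h=1.0):
    """Return <grad u, p> + <u, div p>, which should vanish up to rounding."""
    g1, g2 = grad(u, h)
    return float(np.sum(g1 * p1 + g2 * p2) + np.sum(u * div(p1, p2, h)))
```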

For higher-order derivatives of order $q$ we denote the isotropic discrete gradient and discrete divergence operator by $\operatorname{\bm{\nabla}}^{q,h}$ and $\operatorname{div}^{q,h}$ and write

[TABLE]

and

[TABLE]

The adjointness property is fulfilled for every ${\bm{u}}^{h}\in\mathbb{R}^{|\Omega^{h}|}$ and ${\bm{p}}^{h}\in\mathbb{R}^{|\mathrm{X}_{(1,\dots,1)}^{q,h}|\times\dots\times|\mathrm{X}_{(2,\dots,2)}^{q,h}|}$ :

[TABLE]

with $\langle\,{\cdot}\,,\,{\cdot}\,\rangle_{\Gamma^{h}}:\mathbb{R}^{|\mathrm{X}_{(1,\dots,1)}^{q,h}|\times\dots\times|\mathrm{X}_{(2,\dots,2)}^{q,h}|}\times\mathbb{R}^{|\mathrm{X}_{(1,\dots,1)}^{q,h}|\times\dots\times|\mathrm{X}_{(2,\dots,2)}^{q,h}|}\to\mathbb{R}$ and $\langle\,{\cdot}\,,\,{\cdot}\,\rangle_{\Omega^{h}}:\mathbb{R}^{|\Omega^{h}|}\times\mathbb{R}^{|\Omega^{h}|}\to\mathbb{R}$ .

3.4 Transfer operators

The offset in the location between ${\bm{z}}_{0}^{h},\,\operatorname{\bm{\nabla}}^{1,h}{\bm{z}}_{0}^{h},\,\dots,\,\operatorname{\bm{\nabla}}^{q,h}{\bm{z}}_{0}^{h}$ and the fields $\pazocal{M}^{h}$ associated to ${\bm{v}}^{h}$ requires the introduction of transfer operators, a concept from multigrid methods [52], so as to make the quantities computable in the same location. In what follows, we will provide some insights for the general case.

Let $\pazocal{W}:=(\pazocal{W}^{j})_{j=1}^{q}$ be a family of transfer operators $\pazocal{W}^{j}=(\mathbf{W}_{\bm{\iota}}^{j})$ , with $\mathbf{W}_{\bm{\iota}}^{j}:\mathbb{R}^{|\mathrm{X}_{\bm{\iota}}^{j,h}|}\to\mathbb{R}^{|\Gamma^{h}|}$ and ${\bm{\iota}}$ a multi-index variable with entries in $\{1,2\}$ , as for the staggered grids $\mathrm{X}_{{\bm{\iota}}}^{j,h}$ . The idea is that $\pazocal{W}^{j}$ interpolates the data from the grids of $j$ -th order derivatives $\mathrm{X}_{\bm{\iota}}^{j,h}$ to the grid of cell centres $\Gamma^{h}$ , e.g. $\mathbf{W}^{j}_{\bm{\iota}}$ is the operator made by partition of unity weights. Since it is an averaging matrix, its adjoint operation is denoted by $(\pazocal{W}^{j})^{\mathrm{T}}$ , where the extension from $\Gamma^{h}$ to the boundary of $\mathrm{X}_{\bm{\iota}}^{j,h}$ is made possible by mirroring the data as appropriate.

Example 3.1.

For ${\bm{z}}_{0}^{h}\in\mathbb{R}^{|\Omega^{h}|}$ and $q$ fixed, the derivatives of ${\bm{z}}_{0}^{h}$ (up to order $q$ ) are

[TABLE]

the transfer operators $\pazocal{W}=(\pazocal{W}^{j})_{j=1}^{q}$ are

[TABLE]

and each $\mathbf{W}^{j}_{(\iota_{1},\dots,\iota_{j})}$ is the interpolation matrix that interpolates the values of $(\partial_{\iota_{j}}\dots\partial_{\iota_{1}}{\bm{z}}_{0}^{h})\in\mathbb{R}^{|\mathrm{X}_{(\iota_{1},\dots,\iota_{j})}^{j,h}|}$ to $\mathbb{R}^{|\Gamma^{h}|}$ by an arithmetic mean. For example, for the first order derivatives we have

[TABLE]

where, for $k=1,\dots,M-1$ and $l=1,\dots,N-1$ ,

[TABLE]

As a further example for the second order derivative case and $\mathbf{M}_{1}^{h}=\mathbf{I}$ (which implies no averaging on the first derivatives for our construction), we have

[TABLE]

where $\pazocal{W}^{2}\operatorname{\bm{\nabla}}^{1,h}\otimes\operatorname{\bm{\nabla}}^{1,h}\otimes{\bm{u}}^{h}\in\mathbb{R}^{|\Gamma^{h}|\times|\Gamma^{h}|\times|\Gamma^{h}|\times|\Gamma^{h}|}$ and, for $k=1,\dots,M-1$ and $l=1,\dots,N-1$ ,

[TABLE]

*with $(\partial_{1}^{h}\partial_{1}^{h}{\bm{u}}^{h})_{0,l}=(\partial_{1}^{h}\partial_{1}^{h}{\bm{u}}^{h})_{1,l}$ and $(\partial_{2}^{h}\partial_{2}^{h}{\bm{u}}^{h})_{k,0}=(\partial_{2}^{h}\partial_{2}^{h}{\bm{u}}^{h})_{k,1}$ . *
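The first-order case of Example 3.1 can be sketched as follows. The code computes the two partial derivatives on their staggered grids and then moves them to the $(M-1)\times(N-1)$ grid of cell centres by a two-point arithmetic mean. The array layout (axis 0 for $x_{1}$ , axis 1 for $x_{2}$ ) and the function name `grad_to_cell_centres` are assumptions made for illustration.

```python
import numpy as np

def grad_to_cell_centres(u, h=1.0):
    """First-order derivatives of u, averaged onto the grid of cell centres."""
    d1 = (u[1:, :] - u[:-1, :]) / h       # lives on X_(1): shape (M-1, N)
    d2 = (u[:, 1:] - u[:, :-1]) / h       # lives on X_(2): shape (M, N-1)
    w1 = 0.5 * (d1[:, :-1] + d1[:, 1:])   # mean along x2 -> shape (M-1, N-1)
    w2 = 0.5 * (d2[:-1, :] + d2[1:, :])   # mean along x1 -> shape (M-1, N-1)
    return w1, w2
```

After the transfer, both components sit at the same locations as the weights $\mathbf{M}_{1}^{h}$ , so they can be multiplied pointwise.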

Remark 3.2.

*The choice of the staggered grid increases the accuracy of the solution and allows the inner products between gradients and vector fields to be computed on a single regular Cartesian reference grid, avoiding offsets. Moreover, the transfer operators $\pazocal{W}$ reduce the bandwidth of higher order finite difference matrices, improving the quality of the result and reducing the smoothing due to large stencils. *

Remark 3.3.

*Note that when $\mathbf{M}_{j}^{h}=\mathbf{I}$ and $\alpha_{j}=+\infty$ for every $j<q$ , as in the applications described in Section 6 of this paper, the use of transfer operators is needed only for the outer derivative, i.e. the one associated to the weighting field $\mathbf{M}_{q}^{h}$ . *

We report in Fig. 3 the positions of $\operatorname{\bm{\nabla}}^{q,h}$ , up to order $q=3$ , in order to illustrate how transfer operators $\pazocal{W}$ work in interpolating the data on $\Gamma^{h}$ .

3.5 Anisotropic differential operators

By construction, $\mathbf{M}_{1}^{h}\in\mathbb{R}^{|\Gamma^{h}|\times|\Gamma^{h}|\times|\Gamma^{h}|\times|\Gamma^{h}|}$ and $\operatorname{\bm{\nabla}}^{1,h}{\bm{u}}^{h}\in\mathbb{R}^{|\mathrm{X}_{(1)}^{1,h}|\times|\mathrm{X}_{(2)}^{1,h}|}$ so the grids $\mathrm{X}_{(1)}^{1,h},\mathrm{X}_{(2)}^{1,h}$ have an ( $h/2$ )-offset with respect to $\Gamma^{h}$ . In this case, locations of $\operatorname{\bm{\nabla}}^{1,h}{\bm{u}}^{h}$ and $\mathbf{M}_{1}^{h}$ are matched via the transfer operators $\pazocal{W}^{1}=(\mathbf{W}_{(1)}^{1},\mathbf{W}_{(2)}^{1})$ .

From Remark 2.10, $\mathbf{M}_{1}\operatorname{\bm{\nabla}}\otimes u$ can be discretised in the correct grid position by the operator

[TABLE]

and the discretisation reads as

[TABLE]

Therefore, the discrete weighted divergence $\operatorname{div}_{\mathbf{M}_{1}^{h},\pazocal{W}^{1}}^{h}{\bm{p}}^{h}:\mathbb{R}^{|\Gamma^{h}|\times|\Gamma^{h}|}\to\mathbb{R}^{|\Omega^{h}|}$ is

[TABLE]

This leads to the discrete adjointness property, for every ${\bm{u}}^{h}\in\mathbb{R}^{|\Omega^{h}|},{\bm{p}}^{h}\in\mathbb{R}^{|\Gamma^{h}|\times|\Gamma^{h}|}$ :

[TABLE]

where $\langle\,{\cdot}\,,\,\,{\cdot}\,\rangle_{\Gamma^{h}}:\mathbb{R}^{|\Gamma^{h}|\times|\Gamma^{h}|}\times\mathbb{R}^{|\Gamma^{h}|\times|\Gamma^{h}|}\to\mathbb{R}$ and $\langle\,{\cdot}\,,\,\,{\cdot}\,\rangle_{\Omega^{h}}:\mathbb{R}^{|\Omega^{h}|}\times\mathbb{R}^{|\Omega^{h}|}\to\mathbb{R}$ .
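A small dense-matrix sketch can make the weighted adjointness property tangible. It builds the first-order weighted gradient as "difference, transfer to cell centres, pointwise multiplication by $\mathbf{M}_{1}^{h}$ " and obtains the weighted divergence by transposing each factor. The matrix convention for $\mathbf{M}_{1}^{h}$ (rows ${\bm{v}}$ and ${\bm{v}}_{\perp}$ scaled by $b_{1},b_{2}$ ), the two-point averaging and all function names are assumptions for illustration only.

```python
import numpy as np

def diff_matrix(n, h=1.0):
    """(n-1) x n forward-difference matrix."""
    return (np.eye(n, k=1)[:-1] - np.eye(n)[:-1]) / h

def avg_matrix(n):
    """(n-1) x n two-point averaging matrix (partition of unity weights)."""
    return 0.5 * (np.eye(n, k=1)[:-1] + np.eye(n)[:-1])

def weighted_grad(u, b1, b2, theta, h=1.0):
    """M W grad u on the (M-1) x (N-1) grid of cell centres."""
    M, N = u.shape
    g1 = diff_matrix(M, h) @ u @ avg_matrix(N).T   # d/dx1 at cell centres
    g2 = avg_matrix(M) @ u @ diff_matrix(N, h).T   # d/dx2 at cell centres
    c, s = np.cos(theta), np.sin(theta)
    return b1 * (c * g1 + s * g2), b2 * (-s * g1 + c * g2)

def weighted_div(p1, p2, b1, b2, theta, shape, h=1.0):
    """Negative adjoint of weighted_grad, mapping cell centres back to pixels."""
    M, N = shape
    c, s = np.cos(theta), np.sin(theta)
    q1 = c * b1 * p1 - s * b2 * p2   # M^T applied pointwise, first component
    q2 = s * b1 * p1 + c * b2 * p2   # M^T applied pointwise, second component
    adj = (diff_matrix(M, h).T @ q1 @ avg_matrix(N)
           + avg_matrix(M).T @ q2 @ diff_matrix(N, h))
    return -adj
```

Dense matrices are used only to keep the transposes explicit; a practical implementation would use sparse operators or stencils.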

When considering higher order derivatives for a generic order $q$ , the adjoint formula Eq. 25 is slightly more complicated due to the recursive definition of the weighted gradient and the location of the nested multiplication. Indeed by Definition 2.11 we have for a fixed $q$ $\operatorname{\bm{\nabla}}^{q}_{\pazocal{M}}u:=\mathbf{M}_{q}\operatorname{\bm{\nabla}}\otimes\dots\otimes\mathbf{M}_{1}\operatorname{\bm{\nabla}}\otimes u$ , whose finite-dimensional approximation is formally written as $\operatorname{\bm{\nabla}}^{q,h}_{\pazocal{M}^{h},\pazocal{W}}{\bm{u}}^{h}\,:\,\mathbb{R}^{|\Omega^{h}|}\to\mathbb{R}^{|\Gamma^{h}|\times\dots\times|\Gamma^{h}|}$ via the recursion rule

[TABLE]

The finite-dimensional approximation of the adjoint $\operatorname{div}_{\pazocal{M}}^{q}$ is denoted with $\operatorname{div}_{\pazocal{M}^{h},\pazocal{W}}^{q,h}\,:\,\mathbb{R}^{|\Gamma^{h}|\times\dots\times|\Gamma^{h}|}\to\mathbb{R}^{|\Omega^{h}|}$ and defined via the recursion rule

[TABLE]

Remark 3.4.

*Note that in Eq. 26 for $j=q$ we omitted the inverse transfer operator $(\pazocal{W}^{q})^{\mathrm{T}}$ so as to force the highest derivative $\operatorname{\bm{\nabla}}^{q,h}_{\pazocal{M}^{h},\pazocal{W}}{\bm{u}}^{h}$ to be located in $\mathbb{R}^{|\Gamma^{h}|\times\dots\times|\Gamma^{h}|}$ and match the position of ${\bm{\Psi}}^{h}$ . For the same reason, in Eq. 27 we omitted the transfer operator for $j=q$ so as to match the quantity with ${\bm{u}}^{h}$ on the grid $\Omega^{h}$ . This operation is performed in view of the adjointness property stated below in Eq. 28. *

For every ${\bm{u}}^{h}\in\mathbb{R}^{|\Omega^{h}|},{\bm{p}}^{h}\in\mathbb{R}^{|\Gamma^{h}|\times\dots\times|\Gamma^{h}|}$ , the discrete adjointness property holds:

[TABLE]

where $\langle\,{\cdot}\,,\,\,{\cdot}\,\rangle_{\Gamma^{h}}:\mathbb{R}^{|\Gamma^{h}|\times\dots\times|\Gamma^{h}|}\times\mathbb{R}^{|\Gamma^{h}|\times\dots\times|\Gamma^{h}|}\to\mathbb{R}$ and $\langle\,{\cdot}\,,\,\,{\cdot}\,\rangle_{\Omega^{h}}:\mathbb{R}^{|\Omega^{h}|}\times\mathbb{R}^{|\Omega^{h}|}\to\mathbb{R}$ .

4 Numerical optimisation

In what follows, we first solve the single (fixed-order) version of Eq. 13: for a fixed $q>0$ , a fixed ${\bm{\alpha}}=(\alpha_{j})_{j=0}^{q-1}$ , a fixed collection of weighting matrices $\pazocal{M}=(\mathbf{M}_{j})_{j=1}^{q}$ and $\pazocal{S}$ the operator associated to the problem at hand, we aim to tackle the problem

[TABLE]

by means of a primal-dual hybrid gradient method [15, 16] and following [11, Equation 4.4]. With all discrete objects in place, we have

[TABLE]

where ${\bm{z}}_{0}^{\diamond,h}={\bm{u}}^{\diamond,h}\in\mathbb{R}^{|\Omega^{h}|}$ , $\operatorname{div}_{\pazocal{M}^{h},\pazocal{W}}^{q,h}$ is the discretized weighted divergence w.r.t. the weights $\pazocal{M}^{h}$ and the transfer operators $\pazocal{W}$ , $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q,h}$ is the discrete version of $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q}$ defined as

[TABLE]

and $\pazocal{Y}_{\pazocal{M}^{h},{\bm{\alpha}}}^{q,h}$ is the discretization of $\pazocal{Y}_{\pazocal{M},{\bm{\alpha}}}^{q}$ in Eq. 20, defined as

[TABLE]

4.1 Discrete characterisation of TDV

For a fixed $q>0$ , the regulariser $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q,h}$ can be characterised as follows. From the discrete version of $\mathrm{{TGV}}_{{\bm{\alpha}}}^{q}$ in [50, Section 4.1] and following the characterization of $\mathrm{{TGV}}_{{\bm{\alpha}}}^{q}$ in [11, Remark 3.8 and Remark 3.10], we can write the equivalent discrete definition of $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q}({u,\pazocal{M}})$ for ${\bm{u}}^{h}\in\mathbb{R}^{|\Omega^{h}|}$ and $\pazocal{M}^{h}=(\mathbf{M}_{j}^{h})_{j=1}^{q}$ as

[TABLE]

where

[TABLE]

Indeed, in the following let $j=1,\dots,q$ , ${\bm{w}}_{q-j+1}^{h}\in\mathbb{R}^{|\mathrm{X}_{\bm{\iota}}^{q-j+1,h}|}$ and ${\bm{z}}_{q-j}^{h}\in\mathbb{R}^{|\mathrm{X}_{\bm{\iota}}^{q-j,h}|}$ . We call

[TABLE]

where $\operatorname{div}_{\pazocal{M}^{h},\pazocal{W}}^{j-1}$ is as in Eq. 27 and $\pazocal{Y}_{\pazocal{M}^{h},{\bm{\alpha}}}^{q,h}$ is as in Eq. 30. Note that the supremum is finite by definition of $\pazocal{Y}_{\pazocal{M}^{h},{\bm{\alpha}}}^{q,h}$ . Thus $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q,h}({{\bm{u}}^{h},\pazocal{M}^{h}})=\mathrm{{DV}}_{{{\bm{\alpha}}}}^{q,h}(\operatorname{\bm{\nabla}}^{h}{\bm{u}}^{h},\pazocal{M}^{h})$ and we define

[TABLE]

With ${\bm{s}}_{q,j}^{h}=\mathbf{M}_{q}^{h}\pazocal{W}^{q}\operatorname{\bm{\nabla}}^{h}\otimes\dots\otimes(\pazocal{W}^{q-j+2})^{\mathrm{T}}\mathbf{M}_{q-j+2}^{h}\pazocal{W}^{q-j+2}\operatorname{\bm{\nabla}}^{h}$ and $(j-1)$ -times integration by parts, the functional becomes

[TABLE]

where $\delta_{K_{\ell}^{h}}$ is the characteristic function with values either $0$ or $+\infty$ , and where the adjoint $\ast$ is the same as in [11, Remark 3.9]. By Fenchel duality for the operator $\operatorname{div}_{\pazocal{M}^{h},\pazocal{W}}^{j-1,h}$ we have:

[TABLE]

Iterating the procedure for $j=q,\dots,2$ and by the identity

[TABLE]

we get

[TABLE]

and thus, with $\pazocal{K}^{h}_{j,j}$ as in Eq. 32, we conclude

[TABLE]

A continuous version of Eq. 31 also holds. This is proved in the second part of this work [42].

The characterisation of $\mathrm{{TDV}}_{{{\bm{\alpha}}_{q}}}^{q,h}$ in Eq. 31 is fundamental for writing a suitable primal-dual algorithm for the minimization problem in Eq. 13.

4.2 Discretised single minimization problems

Let ${\bm{u}}^{\diamond,h}$ be given discrete imaging data. For a fixed order $q>0$ , let $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q,h}({\bm{u}}^{h},\pazocal{M}^{h})$ be decomposed as in Eq. 31, with $\pazocal{M}^{h}=(\mathbf{M}_{j}^{h})_{j=1}^{q}$ , ${\bm{\alpha}}=(\alpha_{j})_{j=0}^{q-1}$ , and let the $\left\lVert\,{\cdot}\,\right\rVert_{1}$ norm be denoted in the discrete setting by $\left\lVert\mathbf{Z}\right\rVert_{2,1}=\sum_{i,j}\sqrt{\sum_{k=1}^{s}{(\mathbf{Z}_{k})_{i,j}^{2}}}$ for a generic tensor-valued object $\mathbf{Z}=(\mathbf{Z}_{k})_{k=1}^{s}$ , with each $\mathbf{Z}_{k}\in\pazocal{T}^{2}(\mathbb{R}^{M\times N})$ .

The discrete single minimization problems, for ${\bm{z}}^{h}$ defined as in Section 3.2, read as:

•

for order $q=1$ , $\pazocal{M}^{h}=(\mathbf{M}_{1}^{h})$ , ${\bm{\alpha}}=\alpha_{0}$ , ${\bm{z}}^{h}=({\bm{z}}_{0}^{h})$ :

[TABLE]

•

for order $q=2$ , $\pazocal{M}_{2}^{h}=(\mathbf{M}_{1}^{h},\mathbf{M}_{2}^{h})$ , ${\bm{\alpha}}=(\alpha_{0},\alpha_{1})$ , ${\bm{z}}^{h}=({\bm{z}}_{0}^{h},{\bm{z}}_{1}^{h})$ :

[TABLE]

•

for order $q=3$ , $\pazocal{M}_{3}^{h}=(\mathbf{M}_{1}^{h},\mathbf{M}_{2}^{h},\mathbf{M}_{3}^{h})$ , ${\bm{\alpha}}_{3}=(\alpha_{0},\alpha_{1},\alpha_{2})$ , ${\bm{z}}^{h}=({\bm{z}}_{0}^{h},{\bm{z}}_{1}^{h},{\bm{z}}_{2}^{h})$ :

[TABLE]

For a fixed $q>0$ , we aim to provide a more concise formulation of Eqs. 33, 34 and 35. Let ${\bm{z}}^{h}$ be as above and $\pazocal{K}^{h}=(\pazocal{K}_{j,\ell}^{h})_{j,\ell=1}^{q}$ be a matrix of operators associated to $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q,h}$ and defined as

[TABLE]

i.e. with $\pazocal{K}^{h}_{j,\ell}=\bm{0}$ if $\ell\notin\{j,j+1\}$ , $\pazocal{K}^{h}_{j,j+1}=-\mathbf{I}$ and $\pazocal{K}^{h}_{j,j}$ as in Eq. 32 for each $j=1,\dots,q$ . Then, solving Eq. 29 is equivalent to solving for ${\bm{z}}^{h}=({\bm{z}}_{0}^{h},\dots,{\bm{z}}_{q-1}^{h})$ , with ${\bm{z}}_{0}^{h}={\bm{u}}^{h}$ , the problem:

[TABLE]

By duality of the $\left\lVert\,{\cdot}\,\right\rVert_{2,1}$ norm and recalling that ${\bm{w}}^{h}$ is the dual vector defined in Section 3.2, we rewrite Eq. 37 into a saddle-point minimization problem:

[TABLE]

or, in short notation:

[TABLE]

where $G({\bm{z}}^{h})$ is a partially strongly convex term, since it can be seen as $G({\bm{z}}^{h})=G_{0}(\pazocal{P}{\bm{z}}^{h})$ for the projection $\pazocal{P}$ of ${\bm{z}}^{h}$ onto the subspace of ${\bm{z}}_{0}^{h}$ and with $G_{0}$ being strongly convex, and where

[TABLE]

and

[TABLE]

and $\pazocal{P}{\bm{z}}^{h}=\left(\pazocal{S}{\bm{z}}_{0}^{h},\bm{0},\cdots,\bm{0}\right)^{\mathrm{T}}$ .

4.3 Proximal operators

We aim to solve the saddle point problem Eq. 39 with a Primal-Dual Hybrid Gradient (PDHG) algorithm. We need the proximal operators of $F^{\ast}$ and $G$ .

The proximal map of $F^{\ast}$ evaluated at a point $\widehat{{\bm{w}}}^{h}$ is the sum of the projections onto the respective polar balls since $F^{\ast}$ is fully separable (meaning that $F^{\ast}$ can be written as a sum of functions in disjoint sets of variables),

[TABLE]

The proximal map of $G$ should be evaluated at a point $\widehat{{\bm{z}}}^{h}$ . Recalling that ${\bm{z}}_{0}^{h}={\bm{u}}^{h}$ and that ${\bm{z}}$ is as in Section 3.2, we have:

[TABLE]

Let us focus on the first component of $\operatorname{\mathbf{prox}}_{\tau G}$ , in Eq. 40: we have to solve

[TABLE]

whose minimum is achieved by a ${\bm{z}}_{0}^{h}$ that solves, for $\pazocal{S}^{\ast}$ adjoint of $\pazocal{S}$ ,

[TABLE]

Thus, the first component of $\operatorname{\mathbf{prox}}_{\tau G}(\widehat{{\bm{z}}}^{h})$ , that is $\operatorname{\mathbf{prox}}_{\tau G_{0}}(\pazocal{P}\widehat{{\bm{z}}}^{h})$ , is

[TABLE]

Note that for the Rudin-Osher-Fatemi denoising problem we have $\pazocal{S}=\mathbf{I}$ , thus $\pazocal{S}^{\ast}=\mathbf{I}$ , and the proximal map agrees with the one computed in [15, p. 133].
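For the denoising case $\pazocal{S}=\mathbf{I}$ , both proximal maps have closed forms that fit in a few lines. The sketch below assumes that the fidelity term is $G_{0}({\bm{z}}_{0})=\frac{\eta}{2}\lVert{\bm{z}}_{0}-{\bm{u}}^{\diamond}\rVert_{2}^{2}$ and that the dual constraint is a pointwise Euclidean ball of radius $\alpha$ ; both are assumptions about the elided formulas, and the function names are hypothetical.

```python
import numpy as np

def prox_dual(w, alpha):
    """Project each pixel's vector w[:, i, j] onto the Euclidean ball of radius alpha."""
    norm = np.sqrt(np.sum(w ** 2, axis=0, keepdims=True))
    return w / np.maximum(1.0, norm / alpha)

def prox_fidelity(z_hat, f, tau, eta):
    """Proximal map of tau * (eta / 2) * ||z - f||^2, i.e. the case S = I."""
    return (z_hat + tau * eta * f) / (1.0 + tau * eta)
```

The second function is the pointwise solution of $(\mathbf{I}+\tau\eta\mathbf{I}){\bm{z}}_{0}=\widehat{{\bm{z}}}_{0}+\tau\eta{\bm{u}}^{\diamond}$ ; for a general $\pazocal{S}$ a linear solve would be needed instead.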

4.4 Operator norm

Following the approach in [11, Section 4] and [15, Section 6.1], we estimate a bound on the norm of the linear operator $\operatorname{\bm{\nabla}}_{\pazocal{M}^{h},\pazocal{W}}^{q,h}$ in Eq. 26 in view of the implementation of a suitable primal-dual algorithm. Let $\left\lVert\,{\cdot}\,\right\rVert$ be the operator norm, then for each $q=1,\dots,\mathrm{Q}$ we have

[TABLE]

In the two-dimensional setting, when $q=1$ then $\operatorname{div}_{\pazocal{M}^{h},\pazocal{W}}^{1,h}{\bm{\Psi}}^{h}$ in Eq. 41 reduces to

[TABLE]

and by applying the finite difference scheme in Eq. 24, from $\left\lVert\operatorname{\bm{\nabla}}\right\rVert\leq\sqrt{8}h^{-1}$ we estimate:

[TABLE]

For a fixed $q$ , since it holds

[TABLE]

then the operator norm $L_{q}$ is estimated via

[TABLE]

Remark 4.1.

Since $\pazocal{W}$ is made by partition of unity transfer operators and $\left\lVert\mathbf{M}_{j}^{h}\right\rVert_{2}\leq 1$ by construction, we can estimate the right-hand side of Eq. 42 as:

[TABLE]

*which agrees with the classic isotropic setting given by the choices $\mathbf{M}_{j}^{h}=\mathbf{I}$ without the use of $\pazocal{W}^{j}$ for every $j=1,\dots,q$ . Indeed, we have $L^{2}\leq 8h^{-2}$ for $\mathrm{{TV}}{}$ and $L^{2}\leq 64h^{-4}$ for $\mathrm{{TGV}}_{{\bm{\alpha}}}^{2}$ . *
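The bound $L^{2}\leq 8h^{-2}$ for the first-order isotropic case can be checked numerically with a power iteration on $\operatorname{\bm{\nabla}}^{\mathrm{T}}\operatorname{\bm{\nabla}}$ . The sketch assumes forward differences with homogeneous Neumann boundary conditions rather than the staggered scheme of the paper, and the function names are hypothetical.

```python
import numpy as np

def grad(u, h=1.0):
    """Forward differences with zero last row/column."""
    g1 = np.zeros_like(u)
    g2 = np.zeros_like(u)
    g1[:-1, :] = (u[1:, :] - u[:-1, :]) / h
    g2[:, :-1] = (u[:, 1:] - u[:, :-1]) / h
    return g1, g2

def grad_adjoint(p1, p2, h=1.0):
    """Transpose of grad (equal to minus the discrete divergence)."""
    out = np.zeros_like(p1)
    out[:-1, :] -= p1[:-1, :]
    out[1:, :] += p1[:-1, :]
    out[:, :-1] -= p2[:, :-1]
    out[:, 1:] += p2[:, :-1]
    return out / h

def estimate_norm_sq(shape, h=1.0, iters=200, seed=0):
    """Rayleigh-quotient estimate of ||grad||^2 via power iteration."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)
    x /= np.linalg.norm(x)
    est = 0.0
    for _ in range(iters):
        y = grad_adjoint(*grad(x, h), h)
        est = float(np.sum(x * y))   # lower bound on the largest eigenvalue
        x = y / np.linalg.norm(y)
    return est
```

Because the Rayleigh quotient never exceeds the largest eigenvalue, the returned value approaches $8h^{-2}$ from below as the grid grows.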

4.5 Primal-Dual Hybrid Gradient algorithm

We are now ready to solve Eq. 39 with a Primal-Dual Hybrid Gradient (PDHG) algorithm following [15]. Let $q$ be fixed and $L_{q}=\left\lVert\pazocal{K}\right\rVert$ be the operator norm as in Section 4.4, i.e. $L_{q}:=\sup_{\bm{z}}\left\{\left\lVert\pazocal{K}{\bm{z}}\right\rVert_{2}\text{ s.t.\ }\left\lVert{\bm{z}}\right\rVert_{2}\leq 1\right\},$ and let $\tau,\sigma>0$ , $\omega\in[0,1]$ be fixed, such that $\tau\sigma L_{q}^{2}<1$ . Then the PDHG algorithm [15, Algorithm 1] reads as the iteration of

[TABLE]

where we denoted with an index $n$ the iterations, starting from admissible ${\bm{z}}^{0,h}$ and ${\bm{w}}^{0,h}$ . The final solution is achieved by ${\bm{u}}^{\star}={\bm{z}}_{0}^{n+1,h}$ . Compare Algorithm 1 for details.
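To illustrate the structure of the iteration, here is a minimal PDHG sketch for the simplest instance: $q=1$ , $\mathbf{M}_{1}^{h}=\mathbf{I}$ , $\pazocal{S}=\mathbf{I}$ and no transfer operators, i.e. the classical ROF model $\alpha\lVert\operatorname{\bm{\nabla}}u\rVert_{2,1}+\frac{\eta}{2}\lVert u-u^{\diamond}\rVert_{2}^{2}$ . It is not the authors' implementation; the discretisation (forward differences, $h=1$ ), the parameter defaults and the function names are assumptions.

```python
import numpy as np

def grad(u):
    g1 = np.zeros_like(u)
    g2 = np.zeros_like(u)
    g1[:-1, :] = u[1:, :] - u[:-1, :]
    g2[:, :-1] = u[:, 1:] - u[:, :-1]
    return g1, g2

def div(p1, p2):
    """Negative adjoint of grad."""
    out = np.zeros_like(p1)
    out[:-1, :] += p1[:-1, :]
    out[1:, :] -= p1[:-1, :]
    out[:, :-1] += p2[:, :-1]
    out[:, 1:] -= p2[:, :-1]
    return out

def pdhg_rof(f, alpha=0.2, eta=1.0, iters=300, omega=1.0):
    """PDHG for min_u alpha*||grad u||_{2,1} + eta/2*||u - f||^2."""
    tau = sigma = 0.99 / np.sqrt(8.0)   # ensures tau * sigma * L^2 < 1 with L^2 <= 8
    z, z_bar = f.copy(), f.copy()
    w1, w2 = np.zeros_like(f), np.zeros_like(f)
    for _ in range(iters):
        # Dual step: gradient ascent followed by projection onto the alpha-ball.
        g1, g2 = grad(z_bar)
        w1, w2 = w1 + sigma * g1, w2 + sigma * g2
        scale = np.maximum(1.0, np.sqrt(w1 ** 2 + w2 ** 2) / alpha)
        w1, w2 = w1 / scale, w2 / scale
        # Primal step: gradient descent followed by the prox of the fidelity.
        z_new = (z + tau * div(w1, w2) + tau * eta * f) / (1.0 + tau * eta)
        # Over-relaxation.
        z_bar = z_new + omega * (z_new - z)
        z = z_new
    return z

def rof_energy(u, f, alpha, eta):
    g1, g2 = grad(u)
    return alpha * np.sum(np.sqrt(g1 ** 2 + g2 ** 2)) + 0.5 * eta * np.sum((u - f) ** 2)
```

In the general $\mathrm{{TDV}}$ setting, `grad` and `div` would be replaced by the weighted operators of Section 3.5 and the primal variable would carry the additional components ${\bm{z}}_{1}^{h},\dots,{\bm{z}}_{q-1}^{h}$ .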

Acceleration

If $\pazocal{S}=\mathbf{I}$ and only the first order regulariser is involved ( $\mathrm{Q}=1$ ) then the fidelity term $G$ is strongly convex with convexity parameter $\eta$ (since it does not involve the terms of ${\bm{z}}^{h}$ related to the derivatives of order greater than 1) and the dual problem is smooth. Therefore, it is possible to accelerate the PDHG algorithm with [15, Algorithm 2]: we can take $\tau_{0}\sigma_{0}L_{q}^{2}\leq 1$ and update $\tau_{n},\sigma_{n},\omega_{n}$ by taking $L_{G}$ as the Lipschitz constant of $G$ , $\tau_{0}=L_{q}^{-1}$ and $\gamma=0.5L_{G}^{-1}$ , with the update rule in Eq. 43 before $\overline{{\bm{z}}}^{n+1,h}$ reading as

[TABLE]

When $\pazocal{S}=\mathbf{I}$ and $\mathrm{Q}>1$ then $G$ is only partially strongly convex and one can use either [15, Algorithm 1] or the acceleration proposed in [55]. In any case, when $\pazocal{S}$ makes $G$ not strongly convex, the use of [15, Algorithm 1] is recommended: in this case, $\sigma_{n}$ and $\tau_{n}$ are fixed a priori, e.g. the authors in [32] adopted the parameters $\sigma_{n}=\tau_{n}=1/\sqrt{12}$ for the second-order regulariser $\mathrm{{TGV}}^{2}$ on a grid of size $h=1$ .
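The step-size schedule of the accelerated variant can be sketched as below. The sketch assumes that the elided update rule coincides with the standard one of [15, Algorithm 2], namely $\omega_{n}=1/\sqrt{1+2\gamma\tau_{n}}$ , $\tau_{n+1}=\omega_{n}\tau_{n}$ , $\sigma_{n+1}=\sigma_{n}/\omega_{n}$ ; the function name is hypothetical.

```python
import math

def accelerated_steps(tau0, sigma0, gamma, n_iters):
    """Return the list of (tau, sigma, omega) produced by the accelerated schedule."""
    tau, sigma = tau0, sigma0
    history = []
    for _ in range(n_iters):
        omega = 1.0 / math.sqrt(1.0 + 2.0 * gamma * tau)
        tau, sigma = omega * tau, sigma / omega   # product tau * sigma is preserved
        history.append((tau, sigma, omega))
    return history
```

Since the product $\tau_{n}\sigma_{n}$ stays constant, the condition $\tau_{0}\sigma_{0}L_{q}^{2}\leq 1$ imposed at initialisation remains valid along the iterations.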

4.6 Primal-Dual Gap

As exit condition for the primal-dual algorithm of the $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q,h}-\mathrm{{L}}^{2}$ minimisation problem, one can either fix a maximum number of iterations or impose a threshold on the primal-dual gap, defined for the current solutions $\overline{{\bm{z}}}$ and $\overline{{\bm{w}}}$ as

[TABLE]
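For the simplest instance ( $q=1$ , $\mathbf{M}_{1}^{h}=\mathbf{I}$ , $\pazocal{S}=\mathbf{I}$ ) the gap has an explicit expression, which the following sketch evaluates. It assumes the primal energy $\alpha\lVert\operatorname{\bm{\nabla}}u\rVert_{2,1}+\frac{\eta}{2}\lVert u-f\rVert_{2}^{2}$ and the corresponding dual energy $-\langle f,\operatorname{div}w\rangle-\frac{1}{2\eta}\lVert\operatorname{div}w\rVert_{2}^{2}$ for $\lvert w\rvert\leq\alpha$ pointwise; the general formula of the paper is not reproduced, and the function names are hypothetical.

```python
import numpy as np

def grad(u):
    g1 = np.zeros_like(u)
    g2 = np.zeros_like(u)
    g1[:-1, :] = u[1:, :] - u[:-1, :]
    g2[:, :-1] = u[:, 1:] - u[:, :-1]
    return g1, g2

def div(p1, p2):
    """Negative adjoint of grad."""
    out = np.zeros_like(p1)
    out[:-1, :] += p1[:-1, :]
    out[1:, :] -= p1[:-1, :]
    out[:, :-1] += p2[:, :-1]
    out[:, 1:] -= p2[:, :-1]
    return out

def primal_dual_gap(u, w1, w2, f, alpha, eta):
    """Primal minus dual energy; non-negative whenever |w| <= alpha pointwise."""
    g1, g2 = grad(u)
    primal = alpha * np.sum(np.sqrt(g1 ** 2 + g2 ** 2)) + 0.5 * eta * np.sum((u - f) ** 2)
    d = div(w1, w2)
    dual = -np.sum(f * d) - np.sum(d ** 2) / (2.0 * eta)
    return float(primal - dual)
```

A stopping rule would compare this value, possibly normalised by the number of pixels, against a user-chosen tolerance.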

4.7 Joint minimisation problem

In the continuous setting and for any fixed $q>0$ , the particular choice of $\pazocal{M}=(\mathbf{I},\dots,\mathbf{I},\mathbf{M})$ and ${\bm{\alpha}}=(\alpha_{q,0},+\infty,\dots,+\infty)$ deserves a separate discussion. Indeed, the decomposition of $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q}$ in the continuous setting results as

[TABLE]

in which the sparsity of the inner orders of derivatives would not be fully exploited due to the weight $+\infty$ . The above decomposition is equivalent to imposing ${\bm{z}}_{j}=\operatorname{\bm{\nabla}}{\bm{z}}_{j-1}$ for any $j=1,\dots,q-1$ , resulting in $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q}({u,\pazocal{M}})\equiv\left\lVert\mathbf{M}\operatorname{\bm{\nabla}}^{q}u\right\rVert_{2,1}$ . As discussed in Remark 3.3, in this case the transfer operators are not needed in the discrete setting for the inner orders of derivatives, and we can therefore discretise the above regulariser as

[TABLE]

In our applications, we will mainly focus on the effect of weighting the highest order derivative by means of taking the exact inner derivatives, but combined jointly with regularisers $\mathrm{{TDV}}_{{{\bm{\alpha}}_{q}}}^{q}$ for different $q>0$ , i.e. we aim to solve in the discrete setting

[TABLE]

with $\pazocal{M}_{q}^{h}=(\mathbf{I},\dots,\mathbf{I},\mathbf{M}_{q}^{h})$ and ${\bm{\alpha}}_{q}=(\alpha_{q,0},+\infty,\dots,+\infty)$ for each $q=1,\dots,\mathrm{Q}$ , leading to

[TABLE]

Then it is possible to reduce Algorithm 1 to Algorithm 2, where the accelerated PDHG [15, Algorithm 2] can be used for any choice of $\pazocal{S}$ that makes $G$ strongly convex.

5 Imaging Denoising

In what follows we demonstrate the performance of the introduced regulariser $\mathrm{{TDV}}_{{{\bm{\alpha}}_{q}}}^{q}$ for the application of image denoising. We focus in particular on the cases $q=1,2,3$ for the single (Eq. 29) and joint (Eq. 45) minimisation models. Results are computed on a standard laptop (MATLAB R2019a, MacBook Pro 13'', 2.9 GHz Intel Core i5, 8 GB 1867 MHz DDR3). The code is freely available at the authors' webpage (https://github.com/simoneparisotto).

Let $\Omega\subset\mathbb{R}^{2}$ , $u:\Omega\to\mathbb{R}$ be a grey-scale image (colour images ${\bm{u}}:\Omega\to\mathbb{R}^{3}$ are considered one colour channel at a time), ${\bm{v}}:\Omega\to\mathbb{R}^{2}$ a field with $\left\lVert{\bm{v}}\right\rVert_{2}=1$ and $u^{\diamond}$ a given noisy image.

5.1 Estimation of vector field ${\bm{v}}$

For estimating ${\bm{v}}$ we use the following strategy. Let $\sigma,\rho>0$ . Let $\lambda_{1}({\bm{x}}),\lambda_{2}({\bm{x}})$ be such that $\lambda_{1}({\bm{x}})\geq\lambda_{2}({\bm{x}})$ , the ordered eigenvalues of

[TABLE]

and ${\bm{e}}_{1},{\bm{e}}_{2}\in\mathbb{R}^{2}$ the associated eigenvectors. Let $\widetilde{{\bm{v}}}({\bm{x}})={\bm{e}}_{2}({\bm{x}})$ be the local direction of the anisotropy, corresponding to an approximation of $(\operatorname{\bm{\nabla}}^{\perp}u)/\left\lVert\operatorname{\bm{\nabla}}^{\perp}u\right\rVert$ . In order to compute a vector field smoother than $\widetilde{{\bm{v}}}$ , we adopt a further regularisation step, similarly to [34].

Let $w({\bm{x}})\in[0,1]$ . We aim to smooth the vector field where the anisotropy weight $w({\bm{x}})$ is close to 0 while keeping the already computed vector field in regions with strong anisotropy. This is equivalent to solving the following problem:

[TABLE]

We use the local estimation of the anisotropy as weights $w({\bm{x}})$ , for $\varepsilon>0$ :

[TABLE]

We can use $w({\bm{x}})$ also to vary locally ${\bm{b}}=(1,\beta({\bm{x}}))$ : we have already seen that the process is more isotropic as $\beta({\bm{x}})$ is closer to 1. For this reason, a possible strategy to vary $\beta({\bm{x}})$ is:

•

first, estimate the anisotropy (values close to 1 correspond to isotropic regions) by:

[TABLE]

•

second, rescale in $[0,1]$ to define $\beta$ :

[TABLE]

With this strategy, the higher the image anisotropy the closer ${\bm{b}}$ is to $(1,0)$ : in such cases strong directional structures are emphasised by our directional regulariser. Conversely, when ${\bm{b}}=(1,1)$ , isotropic smoothing is performed in flat regions. In the following, we will require that ${\bm{b}}=(1,\beta({\bm{x}}))^{\mathrm{T}}$ for every ${\bm{x}}\in\Omega$ , thus

[TABLE]

where ${\bm{v}}$ is a given estimated field. Also, we may refine ${\bm{v}}$ by updating the parameters $(\sigma,\rho)$ so as to restart the denoising problem with a better estimation of the vector field.
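The first stage of this strategy, the structure-tensor estimate $\widetilde{{\bm{v}}}={\bm{e}}_{2}$ , can be sketched with NumPy alone. The sketch assumes the usual structure tensor $K_{\rho}\ast(\operatorname{\bm{\nabla}}u_{\sigma}\otimes\operatorname{\bm{\nabla}}u_{\sigma})$ with Gaussian kernels, central differences for the gradient and reflective boundary handling; the subsequent regularisation of the field and the formulas for $w({\bm{x}})$ and $\beta({\bm{x}})$ are not reproduced, and all function names are hypothetical.

```python
import numpy as np

def gaussian_blur(a, sigma):
    """Separable Gaussian blur with reflective padding (kernel radius about 3*sigma)."""
    r = max(1, int(round(3 * sigma)))
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    k /= k.sum()
    pad = np.pad(a, r, mode="reflect")
    tmp = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, pad)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, tmp)

def estimate_direction(u, sigma=1.0, rho=2.0):
    """Return (v, lam1, lam2): eigenvector of the smaller eigenvalue and both eigenvalues."""
    u_s = gaussian_blur(u, sigma)
    g1, g2 = np.gradient(u_s)
    J = np.empty(u.shape + (2, 2))
    J[..., 0, 0] = gaussian_blur(g1 * g1, rho)
    J[..., 0, 1] = J[..., 1, 0] = gaussian_blur(g1 * g2, rho)
    J[..., 1, 1] = gaussian_blur(g2 * g2, rho)
    evals, evecs = np.linalg.eigh(J)          # eigenvalues in ascending order
    lam2, lam1 = evals[..., 0], evals[..., 1]
    v = evecs[..., :, 0]                      # direction of least variation, e_2
    return v, lam1, lam2
```

The sign of each eigenvector is arbitrary, which is harmless here because only the direction enters the weighting matrices.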

5.2 Single minimisation model

Here we describe results with the single model $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q}-\mathrm{{L}}^{2}$ and fixed $q$ to be chosen between $1,2$ and $3$ . We will explore all the possible combinations for $(\mathbf{M}_{1},\dots,\mathbf{M}_{q})$ , where each $\mathbf{M}_{j}$ will be either the identity $\mathbf{I}$ or the anisotropic $\mathbf{M}$ weights.

Numerical results

In Fig. 4 we report the results from the single $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q}-\mathrm{{L}}^{2}$ problem in Eq. 29, with fixed $q=1,2,3$ and different choices for the anisotropy $\pazocal{M}_{q}$ . To compute $\mathbf{M}$ , we employ the spatially varying strategy based on the structure tensor eigen-decomposition described in Section 5.1, with $(\sigma,\rho)=(2,25)$ and ${\bm{b}}=(1,\beta({\bm{x}}))$ , where $\beta({\bm{x}})$ follows from Eqs. 48, 49 and 50. In these experiments, we keep $\eta=1$ , $\alpha_{0}=1$ and $\alpha_{j}=1.25$ for $j=1,\dots,q-1$ fixed and we run Algorithm 1 for $\texttt{maxiter}=1000$ iterations. We compare our results with $\mathrm{{TGV}}_{{\bm{\alpha}}}^{2}$ (with the same weights) computed with code from an online source (available at www.gipsa-lab.grenoble-inp.fr/~laurent.condat/download/TGVdenoise.m): our strategy is able to encode spatially varying directional information at all the derivative orders.

For the experiments of Fig. 5, we use the same anisotropy directions as in the experiments of Fig. 4, but we modulate the anisotropy weight $\beta$ to test its impact on the result. More precisely, starting from the same anisotropy matrix $\mathbf{M}$ of Fig. 4, we define two variants: $\mathbf{M}^{0.3}$ is obtained from $\mathbf{M}$ by replacing ${\bm{b}}=(1,\beta({\bm{x}}))$ with ${\bm{b}}_{0.3}=(1,0.3\beta({\bm{x}}))$ , and $\mathbf{M}^{0.7}$ is obtained from $\mathbf{M}$ by replacing ${\bm{b}}$ with ${\bm{b}}_{0.7}=(1,0.7\beta({\bm{x}}))$ . The results of Fig. 5 correspond to different anisotropic weighting of the first and second-order derivatives in the model $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{2}-\mathrm{{L}}^{2}$ , see the figure caption for details. Interestingly, a better PSNR is obtained with two orders of derivation and the following strategies: if the first derivative is anisotropically weighted, it is better to opt for an isotropic weight of the second-order derivative. And if the first derivative is isotropically weighted, it is better to opt for a strong anisotropic weight of the second-order derivative. This latter case seems the best in terms of PSNR but, of course, sharper results are obtained when anisotropic weights are chosen. Obviously, the PSNR metric does not capture all aspects of an image and a visual evaluation by the end user remains necessary for the choice of the most suitable combination of derivative orders and anisotropic weights.

Remark 5.1.

*In our experiments we observed some checkerboard artifacts when employing strong anisotropies. This phenomenon has been attributed to spectral properties of finite differences [59, 25] and a non-negative stencil avoiding these issues has been introduced in [25] for the case of the directional Hessian. We leave a generalization of [25] to our higher-order case for future work. The artifacts are not evident in the results of Section 5.3, where the joint model and a specific choice of the weights help in producing better quality results.*

Our approach has a reasonably large number of parameters depending on the noise level and the structural imaging information: the order of the derivatives, the spatially varying directional information from the structure tensor and the weights for the sparsity of the derivatives. This makes the model highly customisable, but also quite sensitive to parameters. In order to simplify the parameter choice, we detail in Section 5.3 a simplified joint model with a suggested choice of parameters, particularly suited for images with large dominant structures, as well as for the reconstruction of images from scattered data, see Section 6.2.

5.3 Joint minimisation model

Following the comments in Remark 5.1, we now consider the joint model in Eq. 13, described in Section 4.7, for $\mathrm{Q}=3$ , with $\pazocal{M}_{q}=(\mathbf{I},\dots,\mathbf{I},\mathbf{M})$ and ${\bm{\alpha}}_{q}=(\alpha_{q,0},+\infty,\dots,+\infty)$ for each $q=1,2,3$ . Since we consider the denoising problem, we use the identity operator $\pazocal{S}=\mathbf{I}$ in the fidelity term, aiming to solve:

[TABLE]

Thus, the denoising problem in Eq. 51 can be simplified as

[TABLE]

and solved via the primal-dual Algorithm 2 in which the computations for $u$ and ${\bm{v}}$ are performed alternatingly as described in Algorithm 3 so as to update the estimation of the directions ${\bm{v}}$ . This updating strategy will allow a refinement for the computation of the directionality in images.

Numerical Results

We discuss denoising results obtained with Algorithm 3 and with different denoising approaches (the non-local method BM3D with normal-complexity profile and prior knowledge of the standard deviation of the Gaussian noise [19] and the regularisers $\mathrm{{TV}}{}$ [45], $\mathrm{{TGV}}^{2}$ [11], $\mathrm{{DTV}}{}$ and $\text{DTGV}^{2}$ [20]) for images with strong directional features. We run the primal-dual Algorithm 2 for $500$ iterations and we restart Algorithm 3 once the first denoised image is computed, which serves as an oracle for a better estimation of ${\bm{v}}$ . We show results for grey-scale images in Fig. 8 and for colour images in Figs. 10 and 11, and we discuss their PSNR.

Bamboo image

The grey-scale image in Fig. 6(a) shows a strong directional structure. It has been corrupted by 20% of Gaussian noise using the same random seed as in [20], see Fig. 6(b). In Fig. 6 we report the results from state-of-the-art approaches, as reported in [20], where $\mathrm{{DTV}}{}$ and $\mathrm{{DTGV}}$ are considered with a single fixed choice of anisotropy direction and minor semi-axis (our parameter $\beta$ ). In Figs. 6(d) and 6(e) the staircasing effect is visible, as expected, while Fig. 6(f) and Fig. 6(g) seem more promising, even if obtained with a single fixed direction only.

In our approach we vary the spatial directions by estimating the vector field ${\bm{v}}$ as described in Eqs. 46–50, while we fix $\beta(\cdot)$ so as to fix the elliptic shape of the test functions. First, in Fig. 7 we report the sensitivity to the parameter $\eta\in[0.25,4]$ (with step-size increment of $0.25$ ) for the first order $\mathrm{{TDV}}$ regulariser: from our experiments $\eta=0.75$ produces a better result than the first order regularisers, $\mathrm{{TV}}{}$ in Fig. 6(d) and $\mathrm{{DTV}}{}$ in Fig. 6(f), and the second order $\mathrm{{TGV}}^{2}$ in Fig. 6(e), showing less staircasing and fewer directional artefacts. In this test, we fixed a-priori a number of parameters, i.e. ${\bm{b}}=(1,0.02)$ , $(\sigma_{1},\rho_{1})=(1.8,2.8)$ and $(\sigma_{2},\rho_{2})=(1,1)$ for the anisotropic structure in building $\mathbf{M}$ and $\texttt{maxiter}=2$ in Algorithm 3.

In Fig. 8 we report the best results obtained with the same fixed choice of parameters but now with all the possible combinations of first, second and third order regularisers, as well as a sketch of the streamlines of ${\bm{v}}$ in Fig. 8(a). We observed that the combination of the first and third order regularisers outperforms the results from Fig. 6. In the next paragraph we comment on further choices for the parameters.

Selection of parameters for the bamboo image

Let the choice of the regulariser orders ${\bm{a}}$ and the maxiter in Algorithm 3 be fixed. We would like to estimate good parameters that directly affect the image reconstruction, namely the fidelity $\eta$ , the directional information provided by $(\sigma_{1},\rho_{1})$ of the structure tensor and $\beta$ from ${\bm{b}}=(1,\beta)$ for the anisotropic shape of the test functions (assumed for now fixed all over the imaging domain). Unfortunately, the structure tensor depends on the intrinsic image content and standard deviation of the noise.

However, for $\eta=3.5$ (among the $\eta\in[0.25,4]$ tested) we obtained the best result for the combination of the first and third order regularisers, i.e. ${\bm{a}}=(1,0,1)$ . Therefore we further inspect the PSNR obtained by fixing $\eta,{\bm{a}}$ and by varying both $\beta\in\{0,0.01,0.02,0.03\}$ and $(\sigma_{1},\rho_{1})$ , each in the range between $1.5$ and $3.5$ : results in Fig. 9 show that the optimal parameters for this case are the ones producing Fig. 8(g) as output. Other strategies for estimating the parameters, including $\eta$ , would require solving a bilevel problem, e.g. as in [21], or updating the parameters with a greedy line-search approach over many directional images, so as to extract a rule of thumb for the choice, e.g. as in [43].
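The exhaustive search over $\beta$ and $(\sigma_{1},\rho_{1})$ described above can be sketched as follows. The callable `score` is a hypothetical stand-in for "run Algorithm 3 with these parameters and return the PSNR"; the $0.5$ spacing of the $\sigma_{1}$ and $\rho_{1}$ grids is an assumption, since the text gives only the range.

```python
import itertools


def grid_search(score, betas, sigmas, rhos):
    """Return (best_score, best_params) over the Cartesian parameter grid.

    `score(beta, sigma, rho)` is a user-supplied callable (hypothetical here)
    returning a quality measure such as PSNR, where larger is better.
    """
    best = None
    for beta, sigma, rho in itertools.product(betas, sigmas, rhos):
        value = score(beta, sigma, rho)
        if best is None or value > best[0]:
            best = (value, (beta, sigma, rho))
    return best


# Grids: beta values from the text; the 0.5 spacing for sigma and rho is assumed.
betas = [0.0, 0.01, 0.02, 0.03]
sigmas = [1.5 + 0.5 * i for i in range(5)]  # 1.5, 2.0, ..., 3.5
rhos = list(sigmas)
```

Such a search costs one full reconstruction per grid point, which is the reason bilevel or line-search strategies become attractive when more parameters are tuned.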

A natural question now is whether also allowing $\beta(\cdot)$ to vary in $\Omega$ improves the performance. We answer this question in the next two experiments. However, since the scale of the directional texture in the next experiments varies spatially, we fix $\eta=3.5$ and focus on the promising case of combination of first and third order regularisers, i.e. ${\bm{a}}=(1,0,1)$ , with a reasonable choice of $(\sigma_{1},\rho_{1})$ .

Rainbow image

The rainbow in Fig. 10(a) has been corrupted by $20\%$ of Gaussian noise in each color channel, see Fig. 10(b). Due to the particular structure of the image, an isotropic approach seems reasonable outside the rainbow and an anisotropic one inside. We therefore vary the $\beta$ parameter following Eqs. 49-50: in Fig. 10(d) the black pixels correspond to $\beta\approx 0$ and the white pixels to $\beta\approx 1$ . Indeed for $\beta\approx 0$ we expect to denoise the image following the anisotropy induced by ${\bm{v}}$ , while for $\beta\approx 1$ we expect to denoise the image isotropically in both ${\bm{v}}$ and ${\bm{v}}_{\perp}$ directions. In order to compute the vector field ${\bm{v}}$ as in Fig. 10(e), we did not apply the regularisation step Eq. 47 and we did not iterate Algorithm 3 since both the resulting ${\bm{v}}$ and ${\bm{b}}$ seem good enough for our purposes, performing better than BM3D in Fig. 10(c), with fewer wavy artefacts and a smoother global structure.

Desert image

The desert image in Fig. 11(a) is a mix of anisotropic and isotropic information. We denoised Fig. 11(b), corrupted again with $20\%$ of Gaussian noise in each color channel, with $(\sigma,\rho)=(3,1.5)$ and $\gamma=0.1$ to estimate ${\bm{v}}$ in Fig. 11(e) as described before. We also allowed $\beta(\cdot)$ to vary across the domain and we did not refine ${\bm{v}}$ with further iterations. Here, BM3D in Fig. 11(c) performed slightly better than our approach due to the wrong estimation of ${\bm{v}}$ along some dune waves: this is clearly visible in Figures 11(d) and 11(e) where both the wrong estimation of ${\bm{v}}$ and the isotropy requirement of $\beta$ result in a smoothing effect, as shown in the zooms of Figures 11(g) and 11(h). However, we recall that BM3D is a non-local method, and better performance than local methods is expected. Nevertheless, since images of this kind have patterns and structures at different scales that cannot be captured by a single global structure tensor, we expect that the reconstruction quality can be improved, e.g. by a structure tensor workflow with locally adapting parameters $(\sigma,\rho)$ .

6 Other imaging applications with the joint model

In what follows we focus on the joint minimisation model Eq. 13 for the applications of image zooming and surface interpolation.

6.1 Wavelet Zooming

In this section we apply our regularisation to wavelet-based image zooming as in [10]. Here, the data fidelity term is modelled by a wavelet transformation operator. Let $\phi\in\mathrm{{L}}^{2}(\mathbb{R})$ , $\psi\in\mathrm{{L}}^{2}(\Omega)$ be the scaling and mother wavelet function, respectively. Then, a Riesz basis of $\mathrm{{L}}^{2}(\Omega)$ is obtained from translations and dilations of $\phi$ and $\psi$ . Here, we will consider only functions $\phi$ with compact support, yielding compactly supported basis elements. Let $R\in\mathbb{Z}$ be a resolution level and $M_{R},(L_{j})_{j\leq R}$ be finite index sets in $\mathbb{Z}^{2}$ , then:

• a Riesz basis of $\mathrm{{L}}^{2}((0,1)\times(0,1))$ is $(\phi_{R,{\bm{k}}})_{{\bm{k}}\in M_{R}}$ , $(\psi_{j,{\bm{k}}})_{j\leq R,{\bm{k}}\in L_{j}}$ ;

• the dual Riesz basis of the above is defined as $(\widetilde{\phi}_{R,{\bm{k}}})_{{\bm{k}}\in M_{R}}$ , $(\widetilde{\psi}_{j,{\bm{k}}})_{j\leq R,{\bm{k}}\in L_{j}}$ .

Thus, the following decomposition holds:

[TABLE]

Let $u_{0}\in\mathrm{span}\left\{\widetilde{\phi}_{R,{\bm{k}}}\,\lvert\,{\bm{k}}\in M_{R}\right\}$ be a low resolution version of $u$ given by $((u_{0},\phi_{R,{\bm{k}}})_{2})_{{\bm{k}}\in M_{R}}$ , where the unknown $u$ is such that $(u,\phi_{R,{\bm{k}}})_{2}=(u_{0},\phi_{R,{\bm{k}}})_{2},\text{ for all }{\bm{k}}\in M_{R},$ and

[TABLE]

The wavelet-based zooming problem with higher order total directional regularisers reads as

[TABLE]

where $U_{D}=\left\{u\in\mathrm{{L}}^{2}(\Omega)\,\lvert\,(u,\phi_{R,{\bm{k}}})_{2}=(u_{0},\phi_{R,{\bm{k}}})_{2},\text{ for all }{\bm{k}}\in M_{R}\right\}$ and $\pazocal{I}_{U_{D}}$ is the convex indicator function w.r.t. $U_{D}$ , see [10] for more details. Since we did not downsample the original image, we avoided artefacts introduced by algorithms for reducing the image but at the same time no ground truth is available. Results based on CDF 9/7 wavelet are shown in Fig. 12 (grey-scale) and in Fig. 13 (colour).
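The role of the constraint set $U_{D}$ can be illustrated with a small sketch. It deliberately swaps the CDF 9/7 wavelet for the orthonormal Haar scaling function on a single resolution level, an assumption made only because the projection onto $U_{D}$ then has a closed form: the coefficients $(u,\phi_{R,{\bm{k}}})_{2}$ are proportional to block means, so projecting amounts to shifting each block by a constant. The function names and the zoom factor `f` are hypothetical.

```python
def block_means(u, f):
    """Mean of each f-by-f block of a 2D list `u` (sizes divisible by f)."""
    rows, cols = len(u) // f, len(u[0]) // f
    return [[sum(u[f * i + a][f * j + b] for a in range(f) for b in range(f)) / (f * f)
             for j in range(cols)] for i in range(rows)]


def project_onto_constraint(u, low_res, f):
    """Orthogonal projection of `u` onto {w : block_means(w, f) == low_res}.

    With Haar scaling functions the coefficients (w, phi_k) are proportional
    to block means, so the projection shifts every block by a constant.
    """
    means = block_means(u, f)
    return [[u[i][j] + low_res[i // f][j // f] - means[i // f][j // f]
             for j in range(len(u[0]))] for i in range(len(u))]
```

In a primal-dual iteration, a projection of this kind would play the role of the proximal map of $\pazocal{I}_{U_{D}}$ ; for biorthogonal wavelets such as CDF 9/7 the projection is no longer a simple block shift, see [10].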

6.2 Surface Interpolation

In this experiment, we aim to reconstruct a surface from scattered height data available in $\Omega$ . The available data lies on partially occluded isolines or on random points in $\Omega$ and the challenge is to interpolate them by preserving the anisotropic features via the reconstruction of a suitable vector field ${\bm{v}}$ . Before presenting our approach for this problem using $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{q}$ , we briefly review the state-of-the-art for surface interpolation.

Related works

The reconstruction of surfaces from scattered height values has been approached in two different ways in the literature: based on explicit and implicit models. Surface interpolation is sometimes also addressed as digital elevation map (DEM) problem.

In this paper we focus on implicit surface interpolation, which has the advantage of being independent of the parametrization. Here the surface is an implicit function of height values over the domain. Two prominent methods in this range are the Thin Plate Spline (TPS) [36] and the Absolutely Minimizing Lipschitz Extension (AMLE) [2] approach. TPS is a flexible approach since it can embed both grey values and gradient information. However, it has the drawback of being a fourth order isotropic method, and the resulting interpolated surface is isotropically smooth. AMLE, on the other hand, is able to interpolate data given in isolated points and on curves but it fails to interpolate slopes of a surface, resulting in $\mathrm{{C}}^{1}$ , see [47].

For interpolating surfaces with sharp features, e.g. strong creases, and possibly non-smooth features, e.g. corners in a pyramid, it seems promising therefore to consider (higher-order) total variation ( $\mathrm{{TV}}$ ) regularisers for surface interpolation.

Our main reference model here is [34], where a third-order directional total variation regulariser has been proposed that reads for a given vector field ${\bm{v}}$ as

[TABLE]

where $\operatorname{\bm{\nabla}}_{\bm{v}}^{3}u=\operatorname{\bm{\nabla}}(\operatorname{\bm{\nabla}}^{2}u)\cdot{\bm{v}}$ is the directional derivative of the Hessian of $u$ , along ${\bm{v}}$ . Note that this is a special case of $\mathrm{{TDV}}_{{{\bm{\alpha}}}}^{\mathrm{Q}}$ with $\mathrm{Q}=3$ , $a=1$ , ${\bm{b}}=(1,0)$ , ${\bm{v}}=(\cos{\bm{\theta}},\sin{\bm{\theta}})$ , i.e. $\pazocal{M}=(\mathbf{I},\mathbf{I},\mathbf{M})$ and $\mathbf{M}={\bm{\Lambda}}_{\bm{b}}(\mathbf{R}_{\bm{\theta}})^{\mathrm{T}}$ , leading to $\operatorname{\bm{\nabla}}_{\bm{v}}^{3}u\equiv\mathbf{M}\operatorname{\bm{\nabla}}\otimes(\operatorname{\bm{\nabla}}^{2}u)$ .
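To make this special case concrete, the following sketch builds $\mathbf{M}={\bm{\Lambda}}_{\bm{b}}(\mathbf{R}_{\bm{\theta}})^{\mathrm{T}}$ at a single point and applies it to a plain 2-vector. It assumes that $\mathbf{R}_{\bm{\theta}}$ is the counter-clockwise rotation and that ${\bm{\Lambda}}_{\bm{b}}=\mathrm{diag}(b_{1},b_{2})$ , which is our reading of the notation; in the third-order case the same contraction would act on the outermost derivative index. The helper names are hypothetical.

```python
import math


def directional_matrix(theta, b):
    """M = Lambda_b R_theta^T for an angle theta and weights b = (b1, b2).

    Assumes R_theta is the counter-clockwise rotation
    [[cos, -sin], [sin, cos]], so the rows of R_theta^T are v and v_perp.
    """
    c, s = math.cos(theta), math.sin(theta)
    return [[b[0] * c, b[0] * s],
            [-b[1] * s, b[1] * c]]


def matvec(M, g):
    """Matrix-vector product M g for a 2x2 matrix and a 2-vector."""
    return [M[0][0] * g[0] + M[0][1] * g[1],
            M[1][0] * g[0] + M[1][1] * g[1]]
```

With ${\bm{b}}=(1,0)$ the second row vanishes and only the component along ${\bm{v}}$ survives, while ${\bm{b}}=(1,1)$ gives a pure rotation, i.e. the isotropic case.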

The estimation of ${\bm{v}}$ is crucial to obtain a good quality result. In [34], ${\bm{v}}$ has been computed as a two step minimization-regularisation problem by solving firstly

[TABLE]

and then applying to $\widetilde{{\bm{v}}}$ the same regularisation step in Eq. 47, where $w({\bm{x}})$ is a weight chosen as the largest singular value of $K_{\sigma}\ast\operatorname{\bm{\nabla}}\left(\frac{\operatorname{\bm{\nabla}}u}{\left\lvert\operatorname{\bm{\nabla}}u\right\rvert}\right)$ and $\gamma$ is a regularising parameter smoothing the vector field where $u$ is almost planar and preserving the local values of $\widetilde{{\bm{v}}}$ for level lines of large curvature. As a last step, ${\bm{v}}$ is normalised to be unitary.

Another directional interpolation model for $u$ and ${\bm{v}}$ appears in [8]: differently from our approach in this paper, it requires knowledge of the vector field ${\bm{v}}$ prior to the interpolation.

In this section, we generalize the approach of [34] for the reconstruction of a surface, given scattered height values lying (possibly) on partial contour lines. Differently from Section 5, the unitary vector field ${\bm{v}}$ is reconstructed in the missing domain as follows.

Let $\Omega^{h}$ be a 2D domain ( $d=2$ ) and $u^{\diamond,h}$ be sparse sampled height values. In the following, the projection onto the data available $u^{\diamond,h}$ is identified by the operator $\pazocal{S}$ . We aim to find the interpolated surface $u\in\mathbb{R}^{|\Omega^{h}|}$ , driven by the unitary directions ${\bm{v}}\in\mathbb{R}^{|\Gamma^{h}|}$ . Let $\pazocal{M}=(\pazocal{M}_{1},\dots,\pazocal{M}_{\mathrm{Q}})$ be a collection of weighting fields, where for a fixed $q$ each collection $\pazocal{M}_{q}$ is defined as $\pazocal{M}_{q}=(\mathbf{I},\dots,\mathbf{I},\mathbf{M})$ with explicit dependence on ${\bm{v}}$ . We solve by Algorithm 4, alternatingly:

[TABLE]

with the primal-dual in Algorithm 2 for Eq. 54 and a classic primal-dual for Eq. 55. In particular, in Eq. 55 we identify $F({\bm{v}})=\mathrm{{TV}}({\bm{v}})$ for regularising the vector field ${\bm{v}}$ and $G({\bm{v}})=\left\lVert 1-{\bm{v}}\cdot\frac{\operatorname{\bm{\nabla}}u}{\left\lvert\operatorname{\bm{\nabla}}u\right\rvert}\right\rVert_{\mathrm{{L}}^{2}}^{2}$ for normalising ${\bm{v}}$ in the direction of the normalised gradient [4].

Minimization with respect to $u$

Fixing a unitary vector field ${\bm{v}}^{t}$ , the minimization problem Eq. 54 is convex with respect to $u$ and can be solved via the primal-dual Algorithm 2 without acceleration due to the lack of strong convexity of the projection map $\pazocal{S}$ , which results in a non-smooth dual problem.

Minimization with respect to ${\bm{v}}$

Fixing $u^{t+1}$ , the minimization problem Eq. 55 can be solved by the primal-dual algorithm with

[TABLE]

Let ${\bm{s}}=K{\bm{v}}$ , $K=\operatorname{\bm{\nabla}}$ and $K^{\ast}=-\operatorname{div}$ . Then, the proximal of $F^{\ast}$ , with $F({\bm{v}})=\mu\mathrm{{TV}}({\bm{v}})$ , is the projection onto the polar ball:

[TABLE]
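The projection onto the polar ball can be sketched as follows. The code uses the standard pointwise rescaling $s\mapsto s/\max(1,\lvert s\rvert/\mu)$ ; measuring each pixel's dual variable with the Euclidean norm of all its entries is an assumption (the isotropic choice), and the function name is hypothetical.

```python
import math


def project_polar_ball(s, mu):
    """Project each pixel's dual vector onto the Euclidean ball of radius mu.

    `s` is a list of per-pixel tuples (e.g. the four entries of the Jacobian
    of v); the pointwise Euclidean norm is an assumed, isotropic choice.
    """
    out = []
    for vec in s:
        scale = max(1.0, math.sqrt(sum(x * x for x in vec)) / mu)
        out.append(tuple(x / scale for x in vec))
    return out
```

Vectors already inside the ball are left untouched, so the map does not depend on the dual step size.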

The proximal map of $G$ at $\widehat{{\bm{v}}}=(\widehat{v}_{1},\widehat{v}_{2})$ , for ${\bm{p}}=\operatorname{\bm{\nabla}}u^{t+1}/\left\lvert\operatorname{\bm{\nabla}}u^{t+1}\right\rvert=(p_{1},p_{2})$ , reads as

[TABLE]

thus

[TABLE]

Since $G$ is strongly-convex, we use the accelerated scheme Eq. 43, with $K$ instead of $\pazocal{K}$ .
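As a hedged illustration of the proximal step for $G$ , the sketch below treats the unit-weight case $G({\bm{v}})=\sum_{{\bm{x}}}(1-{\bm{v}}\cdot{\bm{p}})^{2}$ pointwise and with step size `tau`; any extra weight in the paper's displayed formula is not reproduced here. Setting the gradient of $(1-{\bm{v}}\cdot{\bm{p}})^{2}+\lvert{\bm{v}}-\widehat{{\bm{v}}}\rvert^{2}/(2\tau)$ to zero gives $(\mathbf{I}+2\tau{\bm{p}}{\bm{p}}^{\mathrm{T}}){\bm{v}}=\widehat{{\bm{v}}}+2\tau{\bm{p}}$ , which the Sherman-Morrison formula inverts in closed form. The name `prox_G` is hypothetical.

```python
def prox_G(v_hat, p, tau):
    """Pointwise prox of tau * (1 - v.p)^2 at v_hat, for 2-vectors.

    Minimises (1 - v.p)^2 + |v - v_hat|^2 / (2 tau); the closed form follows
    from the Sherman-Morrison formula applied to (I + 2 tau p p^T).
    """
    pp = p[0] * p[0] + p[1] * p[1]
    pv = p[0] * v_hat[0] + p[1] * v_hat[1]
    step = 2.0 * tau * (1.0 - pv) / (1.0 + 2.0 * tau * pp)
    return (v_hat[0] + step * p[0], v_hat[1] + step * p[1])
```

The update moves $\widehat{{\bm{v}}}$ only along ${\bm{p}}$ , by an amount proportional to the misalignment $1-{\bm{p}}\cdot\widehat{{\bm{v}}}$ .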

Numerical Results

We tested Algorithm 4 in MATLAB on synthetic and real surfaces. Differently from [34], we did not use CVX or MOSEK, making our approach suitable for larger surfaces, beyond the variable size limit imposed by CVX. In what follows, we will use Eq. 52 for solving Eq. 54 and we will test both single and joint directional regularisers, namely ${\bm{a}}=(0,1,0)$ , ${\bm{a}}=(0,0,1)$ and ${\bm{a}}=(0,\alpha_{2,0},\alpha_{3,0})$ , with $\alpha_{2,0}$ and $\alpha_{3,0}$ to be chosen appropriately for the situation. For a better visualization of the results, a diverging RGB colormap in the range $([0.230,0.299,0.754],[0.706,0.016,0.150])$ has been applied.

Pyramid dataset from [34]

A pyramid with height data available on three contour lines and no extra information on the tip is given, so as to test whether our model can reconstruct it. We initialize $u^{0}$ and ${\bm{v}}^{0}$ randomly. In Fig. 14, we report in the first column the location of the available data (top) and the ground truth (bottom); in the second column the random initialization of ${\bm{v}}^{0}$ (top) and $u^{0}$ (bottom); in the third, fourth and fifth columns we report the results from different orders of directional regularisers, namely ${\bm{a}}=(0,1,0)$ , ${\bm{a}}=(0,0,1)$ and ${\bm{a}}=(0,1,1)$ , with one level of anisotropy $a=1$ . The similarity of the resulting vector fields in Fig. 14, despite the different derivative orders involved in the minimisation with respect to $u$ , shows the robustness of the computation of ${\bm{v}}$ for such a problem. Visual results suggest that a combination of second and third order directional regularisers, e.g. ${\bm{a}}=(0,\alpha_{2,0},\alpha_{3,0})$ , is desirable since it smooths the second-order result without losing its features.

SRTM dataset from [54]

This dataset is part of the Shuttle Radar Topography Mission (SRTM) [54], a NASA mission to obtain elevation data for most areas of the world. We downloaded .hgt ``height'' binary data files from [6], where by selecting latitude and longitude coordinates we get 1x1 degree tiles of 1-arc-second resolution (around $30\text{\,}\mathrm{m}$ per pixel). We selected some famous mountains within Italy in Fig. 15: Etna volcano (Sicily), Baldo Mountain (Verona), Vesuvio volcano (Naples), Brenner border (Sterzing) and Gran Sasso (L'Aquila), with image domain size of $250\times 250$ pixels. As input, we randomly sampled about $7\%$ of sparse data on level lines and isolated points, to be interpolated with $0.1\mathrm{{TDV}}^{2}+\mathrm{{TDV}}^{3}$ .
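For readers without MATLAB, the data preparation can be approximated by the sketch below; it is not the READHGT tool of [6]. It relies on the SRTM .hgt layout (big-endian signed 16-bit samples stored row by row on a square grid) and draws isolated sample points only, whereas the experiment above also samples level lines. The function names, the seed and the sampling scheme are assumptions.

```python
import math
import random
import struct


def parse_hgt(data):
    """Decode raw .hgt bytes into a square grid of heights in metres.

    SRTM .hgt files store big-endian signed 16-bit samples row by row;
    a 1-arc-second tile has 3601 x 3601 samples.
    """
    n = len(data) // 2
    side = math.isqrt(n)
    if side * side != n or len(data) % 2:
        raise ValueError("not a square grid of 16-bit samples")
    flat = struct.unpack(f">{n}h", data)
    return [list(flat[r * side:(r + 1) * side]) for r in range(side)]


def random_mask(rows, cols, fraction, seed=0):
    """Boolean mask keeping roughly `fraction` of the pixels (isolated points only)."""
    rng = random.Random(seed)
    return [[rng.random() < fraction for _ in range(cols)] for _ in range(rows)]
```

A $250\times 250$ crop of the parsed grid together with `random_mask(250, 250, 0.07)` would give an input of the same size and sparsity as described, up to the missing level-line samples.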

Atomic Force Microscopy dataset from [44]

Atomic force microscopy (AFM), or scanning probe microscopy (SPM), is a topography imaging technique commonly used in the detection of cancer cells in cellular biology: it scans objects at high resolution while recording their topographical information. In [39], the study of a compressed sensing approach on AFM images was motivated by the reduction of the image acquisition time for multiple reasons, e.g. to minimize the operator time spent at the equipment [28], to allow time-dependent dynamic processes [48] and to minimize the interaction of instruments with specimens so as to reduce potential risks of damage [37]. Therefore, the authors proposed to speed up the sampling procedure by scanning height data on spirals rather than exploring pixel by pixel, so as to reconstruct the missing data via compressed sensing. The authors define the under-sampling ratio as $\rho=L/L_{ref}$ , where $L$ is the length of the spiral path followed by the probe for acquiring the data and $L_{ref}$ is the distance travelled by the probe in pixels during the raster scan. Note that $L$ also counts the path outside the imaging domain due to the smooth movement requirement of the probe, while $L_{ref}$ is approximated by the value $2\cdot\#\text{pixels}$ , where the factor of two is due to the usual practice of acquiring two topography buffers, even if only one is used for the visualization.

In order to test our reconstruction method based on the directional regularisers, we downloaded the open source AFM .mi dataset of $512\times 512$ height values from [44], exported in ASCII text via the open-source software Gwyddion and imported in MATLAB. Our input are AFM surfaces of size $256\times 256$ obtained by slicing the original surface, for comparison purposes following [39]. We show the results in Fig. 16 for the ground truth image in Fig. 16(a), with different under-sampling ratio $\rho$ , see Figs. 16(c) and 16(d). In Fig. 16(b) we compare the structural similarity index (SSIM) [56] for our results (producing the black line of scores on the top of the graph) with [39, Figure 7], where iterative hard thresholding (IHT), iterative soft thresholding (IST), their weighted versions (w-IHT and w-IST) and Basis Pursuit Denoising (BPDN) were tested: we conclude that our approach is robust throughout different under-sampled data, with good quality surfaces in terms of SSIM.
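The under-sampling ratio defined above reduces to a short computation once the probe path is available as a list of points. The sketch below assumes the path is given as a polyline in pixel units; both helper names are hypothetical.

```python
import math


def polyline_length(points):
    """Total Euclidean length of a path given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def undersampling_ratio(path_length, n_pixels):
    """rho = L / L_ref with L_ref approximated by 2 * (number of pixels)."""
    return path_length / (2.0 * n_pixels)
```

For a $256\times 256$ surface, a path of length $13107.2$ pixels corresponds to $\rho=0.1$ .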

7 Conclusions

In this work, we have shown that embedding anisotropic directional information into higher order derivatives improves the performance of total variation regularisation in many imaging applications where anisotropy plays a crucial role. In particular, we presented results for image denoising, image zooming and interpolation of scattered measurements, with details on the numerical discretisation and the solution via a primal-dual hybrid gradient algorithm. Among the range of experiments provided, we emphasise that our approach is particularly suitable for the reconstruction task from scattered data, motivating the interest in studying the proposed energy. With this we provided a precise discrete framework which extends the works [34, 20, 11, 10], bringing higher-order total variation together with spatially-varying anisotropy. The continuous model is analysed in the companion paper [42].

Acknowledgements

The authors are grateful to Dr. Martin Holler, University of Graz (Austria) for the useful discussions and to Prof. Thomas Arildsen, Aalborg University (Denmark) for the AFM data.

Bibliography


[1] T. Aach, C. Mota, I. Stuke, M. Muhlich, and E. Barth, Analysis of Superimposed Oriented Patterns, IEEE Transactions on Image Processing, 15 (2006), pp. 3690–3700, https://doi.org/10.1109/TIP.2006.884921.
[2] A. Almansa, F. Cao, Y. Gousseau, and B. Rouge, Interpolation of digital elevation models using AMLE and related methods, IEEE Transactions on Geoscience and Remote Sensing, 40 (2002), pp. 314–325, https://doi.org/10.1109/36.992791.
[3] L. Alvarez, F. Guichard, P.-L. Lions, and J.-M. Morel, Axioms and fundamental equations of image processing, Archive for Rational Mechanics and Analysis, (1993).
[4] C. Ballester, M. Bertalmio, V. Caselles, G. Sapiro, and J. Verdera, Filling-in by joint interpolation of vector fields and gray levels, IEEE Transactions on Image Processing, 10 (2001), pp. 1200–1211, https://doi.org/10.1109/83.935036.
[5] I. Bayram and M. E. Kamasak, Directional total variation, IEEE Signal Processing Letters, 19 (2012), pp. 781–784, https://doi.org/10.1109/LSP.2012.2220349.
[6] F. Beauducel, READHGT: Import/download NASA SRTM data files (.HGT), 2012, https://www.mathworks.com/matlabcentral/fileexchange/36379.
[7] B. Berkels, M. Burger, M. Droske, O. Nemitz, and M. Rumpf, Cartoon Extraction Based on Anisotropic Image Classification, in Vision, Modeling, and Visualization Proceedings, 2006, pp. 293–300, http://numod.ins.uni-bonn.de/research/papers/public/Be Bu Dr 06.pdf.
[8] T. R. Bin Wu and X.-C. Tai, Sparse-data based 3D surface reconstruction for cartoon and map, Internal report, UCLA, (2017), ftp://ftp.math.ucla.edu/pub/camreport/cam 17-38.pdf.

