TL;DR
This paper introduces higher-order anisotropic total variation regularisers that extend TGV, enabling better preservation of anisotropic features in images across various applications like denoising and surface reconstruction.
Contribution
It presents a new class of regularisers for imaging that generalize TGV to anisotropic, inhomogeneous cases, with a numerical approach for gradient flow approximation.
Findings
Enhanced anisotropic feature preservation in images
Effective application to denoising, zooming, and surface reconstruction
Numerical method demonstrates practical viability
Abstract
We introduce a class of higher-order anisotropic total variation regularisers, which are defined for possibly inhomogeneous, smooth elliptic anisotropies, that extends the Total Generalized Variation (TGV) regulariser and its variants. We propose a primal-dual hybrid gradient approach to approximate numerically the associated gradient flow. This choice of regularisers allows to preserve and enhance intrinsic anisotropic features in images. This is illustrated on various examples from different imaging applications: image denoising, wavelet-based image zooming, and reconstruction of surfaces from scattered height measurements.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
\headers
Higher-Order Total Directional Variation: Imaging ApplicationsS. Parisotto, J. Lellmann, S. Masnou and C.-B. Schönlieb
\stackMath
\epstopdfDeclareGraphicsRule.tifpng.pngconvert #1 \OutputFile \AppendGraphicsExtensions.tif \epstopdfDeclareGraphicsRule.tifpng.pngconvert #1 \OutputFile \AppendGraphicsExtensions.tif
\newsiamremarkexampleExample \newsiamremarkremarkRemark
Higher-Order Total Directional Variation: Imaging Applications
††thanks: Submitted to the editors DATE. \funding SP acknowledges UK EPSRC grant EP/L016516/1 for the University of Cambridge, Cambridge Centre for Analysis DTC. CBS acknowledges support from the EPSRC grants Nr. EP/M00483X/1, EP/K009745/1, the EPSRC centre EP/N014588/1, the Leverhulme Trust project ’Breaking the non-convexity barrier’, the Alan Turing Institute TU/B/000071, the CHiPS (Horizon 2020 RISE project grant), the Isaac Newton Institute and the Cantab Capital Institute for the Mathematics of Information. SM acknowledges support from the Labex MILYON/ANR-10-LABX-0070 and the ANR-14-CE27-0019 MIRIAM project grant
Simone Parisotto CCA, University of Cambridge, Wilberforce Road, Cambridge CB3 0WA, UK () [email protected]
Jan Lellmann MIC, University of Lübeck, Maria-Goeppert-Straße 3, 23562 Lübeck, DE ([email protected])
Simon Masnou Université Claude Bernard Lyon 1, Institut Camille Jordan, Lyon, France () [email protected]
Carola-Bibiane Schönlieb DAMTP, University of Cambridge, Wilberforce Road, Cambridge CB3 0WA, UK ([email protected])
Abstract
We introduce a class of higher-order anisotropic total variation regularisers, which are defined for possibly inhomogeneous, smooth elliptic anisotropies, that extends the Total Generalized Variation (TGV) regulariser and its variants. We propose a primal-dual hybrid gradient approach to approximate numerically the associated gradient flow. This choice of regularisers allows to preserve and enhance intrinsic anisotropic features in images. This is illustrated on various examples from different imaging applications: image denoising, wavelet-based image zooming, and reconstruction of surfaces from scattered height measurements.
keywords:
Total directional variation, Anisotropy, Denoising, Wavelet-based zooming, Digital Elevation Map
{AMS}
47A52, 49M30, 49N45, 65J22, 94A08
1 Introduction
In the last decades, total variation () regularisation has been successfully applied to a variety of imaging problems. In particular since [45], plays a crucial role for variational image denoising, deblurring, inpainting, segmentation, magnetic resonance image (MRI) reconstruction and many others, see [12]. While the regulariser successfully eliminates noise and at the same time preserves characteristic image features like edges, it still has some shortcomings. A major one is the staircasing effect, resulting into blocky-like images [13, 38]. One approach to mitigate this effect is based on higher order total variation regularisers, see e.g. [17, 18, 40, 49, 60], aiming to eliminate the staircasing effect by higher regularity in homogeneous regions of the image while still allowing for discontinuities in the presence of edges. The total generalized variation () regulariser has been proposed in [11] to balance the first derivatives of with a regularisation parameter vector . Another modification of the regulariser has been the introduction of directional information in the regularisation, allowing to smooth images in an anisotropic fashion favouring preferred directions, e.g. [5, 62, 7, 22, 51, 27, 35, 33, 24, 23]. A recent combination of directional and higher-order derivatives is the directional total generalized variation [20] that equips the regulariser with one constant preferred smoothing direction.
In this paper we extend the directional total generalized variation introduced in [20] and the third-order directional total variation regulariser introduced in [34] to a new class of directional total variation regularisers that can feature a combination of orders of derivatives as well as spatially-varying directional information by means of weighting the derivatives in the regulariser with -tensors. We introduce this class of total generalized variation regularisers, discuss its numerical solution by a tailored primal-dual hybrid gradient scheme (which replaces the commercial CVX+Mosek solver used in [34]), and showcase its performance and regularisation properties for a range of imaging problems. For the latter, we show the effect of this generalized class of regularisers in different imaging applications where the introduced anisotropy plays a crucial role: image denoising, wavelet-based zooming and digital elevation map (DEM) interpolation with applications to atomic force microscopy (AFM) data, see Fig. 1. For the application of our regularisers to video denoising, we refer to [43] and in general to [41]. The theoretical foundation for this general class of directional, higher-order regularisers is presented in our companion work [42].
Let us go into more details. Let be a bounded Lipschitz domain for and a function, we define the the higher-order directional total variation of as
[TABLE]
where we call the order of the regularisation, is a collection of weighting fields, and
[TABLE]
where is the vector space of -tensors in and is a vector of regularisation parameters. We will provide the rigorous definition for Eq. 1 in Section 2.2. We comment for now that the regulariser in Eq. 1 is designed for introducing weighted directional derivatives in the classical definition of . The anisotropy is introduced by a family of weights and a thereby suitably weighted divergence of order , defined in Eq. 21.
1.1 Related work
In what follows we review the state-of-the-art that is most relevant for the proposed higher-order directional total variation regulariser. We focus in particular on functional regularisers but it is worth mentioning that there is a rich literature on fairly general anisotropic PDE models, mainly of first and second order, see for instance [46, 53, 57, 31, 9, 14, 3, 61] and the references therein. Our model handles a more limited class of anisotropies but it can do it at any order of derivatives, particularly useful in various applications.
The idea of anisotropic smoothing for imaging has certainly been popularized by the book of Weickert on anisotropic diffusion equations [57] based on the key notion of structure tensor to encode directional information, see also [29, 26, 30]. Weickert's structure tensor for a continuous imaging function and non-negative parameters is defined as
[TABLE]
where and are Gaussian kernels with standard deviations , respectively. For the structure tensor has two orthogonal eigenvectors and with corresponding non-zero real eigenvalues and . Here and approximately point in the direction and . From this, diffusion tensors can be constructed inheriting and as eigenvectors but whose eigenvalues are expressions of and so as to increase or reduce smoothing in these directions, compare for instance coherence-enhancing diffusion [58]. The concept of structure tensor is used for variational regularisation in [51] in the framework of a single orientation estimation approach. More precisely, the authors consider a regulariser of the type
[TABLE]
for a non-negative weight function and a continuous imaging function , for smoothing an image into a dominant single direction. For smoothing a noisy data in two directions, the authors propose to estimate directions and as in [1] and, in their double orientation estimation approach, decompose via
[TABLE]
An approach based on the analysis of eigenvalues and eigenvectors of the structure tensors can be found in [27] while in [35] the admissible set of test functions are locally adapted to the geometry of via the support function regulariser. Furthermore, in [33], the structure tensor total variation (STV) focuses on the nuclear norm of the structure tensor Eq. 3 in order to measure the local image variation:
[TABLE]
Also in [24], a regulariser is proposed whose smoothing directions vary according to the image content, leading to the analysis of
[TABLE]
where is the Huber regularisation with parameter and the structure tensor is eigen-decomposed as , with eigenvectors stored in the matrix and the eigenvalues in the diagonal matrix .
Let us also mention that early works where the gradient is weighted date back to [7], where the oriented local image structure is extracted from images by the regulariser
[TABLE]
with being the orthogonal rotation matrix for an angle .
Further, in [5], a discrete directional total variation () regulariser for denoising discrete images with a single dominant direction (directional images) is introduced via affine transformations of test functions: the circular unit ball generated by the -norm is transformed into an ellipse , with major semiaxis rotated by , penalizing variations for large along :
[TABLE]
In a straightforward generalization of Eq. 7, inhomogeneous fields are allowed, namely . In [62] the authors propose to adapt to the edge directions,
[TABLE]
so as to associate at each pixel position a specific ellipsoid ball for the test functions, leading to the discrete edge adaptive directional total variation (EADTV) regulariser:
[TABLE]
In [23], a discrete weighted directional Total Variation (dTV) regulariser is introduced as
[TABLE]
by projecting onto the complementary part of a vector field .
In [20], the continuous directional total variation () and directional total generalized variation () are analysed for a single homogeneously fixed angle and for the minor semi-axis (now denoted with ) of the ellipse . There, and are built upon test functions in the isotropic ball , so as to constrain to the ellipsoid ball, i.e. and where and are rotation and contraction matrices, respectively, similarly to our setting explained later, see Equation Eq. 14:
[TABLE]
By comparing Eq. 12 with Eq. 1, we immediately note that our proposed setting deals with non-symmetric test functions and directional information inhomogenously varying in , encoded in a weighted divergence term.
1.2 Our proposal
Our work extends the regularisers in Eq. 11-Eq. 12 for handling spatially varying directions in instead of a fixed scalar direction . We investigate the directional total variation regulariser of Eq. 1 and study its performance for a variety of image processing problems by solving
[TABLE]
where is a given, imperfect and possibly incomplete imaging data, and a linear operator. We consider the cases for which in from Eq. 1 is
[TABLE]
with the identity, the contraction and the rotation matrices, defined as:
[TABLE]
and with , . Occasionally, we will use the vector field and its orthogonal . Thus, we interpret the core operation of the dual version of the regulariser in Eq. 1, , as weighted directional derivatives of along and since
[TABLE]
We will in particular focus on the case for being either inhomogeneous or constant in , see Remark 1.1 for the geometrical interpretation when different choices are made for the constant.
Remark 1.1**.**
In Fig. 2 we simulate the two dimensional behaviour of Eq. 15 for different choices of . More precisely, for a continuous imaging function we represent a possible situation at the position of the vectors and , depicted with red and blue arrows, respectively. We also represent the components of by a green arrow. The vectors and the corresponding arrows are the same in all Fig. 2(a) to Fig. 2(f). Moreover, the test functions lie on the black circle due to the constraint . Note that in the 2D domain we have
[TABLE]
*which allows to change the metric space of the test functions into an elliptic ball in magenta. Being fixed , each figure corresponds to a particular choice of between 0 and 1. Finally, the magenta arrow corresponds to the direction of which realizes the supremum of the regulariser in Equation Eq. 1. We observe in Fig. 2(f) the limit case where penalizes the rate of change of only along without orthogonal contribution. In all the other circumstances, acts as quality estimation of , leading to a full isotropic approach in the case of Fig. 2(a), since the magenta arrow is bended in the direction of the gradient rather than the direction of . *
1.3 Contribution of the paper
In what follows we will derive:
- •
a rigorous definition of the total directional regulariser Eq. 1;
- •
a characterisation of Eq. 1 that turns Eq. 13 into a form that is amenable for numerical solution. For this we propose a primal-dual algorithm and present certain instances for different combinations of orders , up to in Eq. 1;
- •
a number of numerical experiments with this new regulariser for image denoising, image zooming and interpolation of two-dimensional surfaces from a sparse number of given height values.
1.4 Organization of the paper
In Section 2 we discuss the higher-order total directional variation regularisers with anisotropy. The numerical details of the discretisation are introduced in Section 3, with the primal-dual algorithm and the numerical optimisation described in Section 4. Imaging applications to denoising, wavelet-based zooming and surface interpolation, e.g. in atomic force microscopy imaging, are discussed in Section 5 and Section 6.
2 Higher-order total directional variation
In this section we introduce the rigorous definition of Eq. 1. To do so, we first introduce the terminology of tensors and their mathematical manipulation.
2.1 Tensors
Following [11], let be the vector space of -tensors defined as
[TABLE]
On , we have the following operations:
- •
let be the tensor product for , , with :
[TABLE]
- •
let be the trace of , with , defined by
[TABLE]
where is the -th standard basis vector;
- •
let be such that if , then ;
- •
let be such that if , then
- •
let . The space is equipped with the scalar product defined as
[TABLE]
We now introduce the derivative operator for tensors and its weighted version.
Definition 2.1**.**
Let be the derivative operator and . The derivative of is defined as via the following:
[TABLE]
Let . The derivative operator weighted by is defined as and the derivative of weighted by is defined as via the following:
[TABLE]
Remark 2.2**.**
*For notational purposes, the sum in Eq. 16 will be shortened using Einstein notation over the repeated subscript, meaning that each element of the tensor is written as . *
In what follows, we will also denote the space of -times uniformly continuously differentiable -valued tensors as which is a Banach space with the norm
[TABLE]
where , and we will consider also the space of -valued tensors which are -times continuously differentiable with compact support in .
2.2 Definition of total directional variation
For making sense of the distributional formulation of higher-order directional variation in Eq. 1 we need an integration by parts formula for the weighted derivative of tensors in Definition 2.1. Namely we consider
[TABLE]
with being a bounded Lipschitz domain, , and . We report in this section the main results from the second part of our companion work [42], where detailed proofs can be found. First, we give an integration by parts formula where only switches:
Lemma 2.3**.**
Let , , and as above. Then:
[TABLE]
Then a general adjoint property follows:
Lemma 2.4**.**
Let , , and as above. Then:
[TABLE]
*where *
We can now define the total directional variation of order with weights .
Definition 2.5**.**
Let , , , be a collection of fields in and be a positive weight vector. Then, the total directional variation of order , associated to and , is defined as:
[TABLE]
where
[TABLE]
and the weighted divergence of order is defined recursively, from Lemma 2.4, as:
[TABLE]
Remark 2.6**.**
*For , then , see [42] and [11, Remark 3.10]. *
2.3 Directional matrices for applications
In what follows, we introduce a particular parametrisation of directional matrices for fields in Eq. 19. For standard imaging applications, we will usually deal with grey-scale images , , i.e. .
Definition 2.7** (Directional matrices).**
Let , , be a collection of so-called contraction weights (being each element of modulus ), , , be a collection of angles, and and the associated contraction and rotation matrices defined, respectively, as
[TABLE]
Then we define to be a collection of contraction-rotation matrices (in Einstein notation) as
[TABLE]
*where are the element-wise entries of the matrices , , respectively. *
Definition 2.8** (Weighted derivatives of order 1).**
Let be the derivative operator. The gradient of a differentiable imaging function is given by and the weighted derivative operator of order 1 associated to the directional matrix from Definition 2.7 is
[TABLE]
Remark 2.9**.**
*If (i.e. and for all ), then . *
Remark 2.10**.**
Given , let and . Then
[TABLE]
where represents the directional derivative along a vector field , defined as
[TABLE]
Definition 2.11** (Weighted derivatives of order ).**
We define the derivative of order of using Definition 2.8 recursively as
[TABLE]
We define the weighted derivative of order of with respect to recursively as
[TABLE]
2.4 Examples
We present some examples of the total directional variation of order for , and a collection of directional matrices :
- •
order and :
[TABLE]
- •
order and :
[TABLE]
- •
order and :
[TABLE]
3 Numerical discretisation
The rest of the paper focuses on the discretised formulation of Eq. 13, and its numerical solutions and performances on a number of image processing variational examples. We start by discretising the problem in Eq. 13.
3.1 Staggered grids
The discretisation of the in Eq. 13 is based on finite-difference schemes for derivatives on staggered regular Cartesian grids of width :
- •
the grid of pixels , of axes and for a 2-dimensional domain and size , is defined as
[TABLE]
- •
the grid of cell centres , of size and used to perform the weighted derivative operation (i.e. for introducing the anisotropy, see the grid associated to the blue squares in Fig. 3), is defined as:
[TABLE]
- •
the collection of grids associated to the differential operators involved, where is a multi-index variable and for each indicates the partial derivative involved ( for and for ). Every is a sub-collection of grids, each one of size and denoted by , each one associated to a fixed choice for the derivative operator considered:
[TABLE]
where and are the cardinality of the sets and containing as many elements as the number of derivatives along the axes and , respectively. A visual representation of such grids is given in Fig. 3. For example:
- –
with a bit of abuse of notation, if then coincides with ;
- –
result in shifted by along and axes, respectively;
- –
if and , then we are referring to the grid associated to one out of the eight possible combinations for the third order derivative , namely , which is located on the grid identified by our notation .
3.2 Discretised objects
Let the order of derivatives be fixed. By means of the superscript , we define the finite-dimensional approximation of the following quantities, where , and are the number of grid points in , and , respectively:
- •
is the discretisation of the function ;
- •
is the discretisation of the observed imaging data ;
- •
is a discrete vector field;
- •
are discrete contraction weights for ;
- •
discretises the weights , for each ;
- •
discretises the collection of weights ;
- •
discretises the test functions ;
- •
discretises the primal variables , with , and each , for ;
- •
discretises the dual variables , with for .
3.3 Isotropic differential operators
Here we discuss the discretization of the adjoint unweighted operators and . For , the discrete gradient operator is defined as
[TABLE]
where we use the central second-order finite difference scheme on the grids :
[TABLE]
Let and let the discrete divergence operator
[TABLE]
be defined for each pixel via the central second-order difference scheme on :
[TABLE]
Thus, the isotropic discrete gradient and discrete divergence are designed to fulfil the discrete adjointness property, for every and :
[TABLE]
where and .
For higher-order derivatives of order we denote the isotropic discrete gradient and discrete divergence operator by and and write
[TABLE]
and
[TABLE]
The adjointness property is fulfilled for every and :
[TABLE]
with and .
3.4 Transfer operators
The offset in the location between and the fields associated to requires the introduction of transfer operators, a concept from multigrid methods [52], so as to make the quantities computable in the same location. In what follows, we will provide some insights for the general case.
Let be a family of transfer operators , with and a multi-index variable, with entries in similarly as for the staggered grids . The idea is that interpolates the data from the grids of -th order derivatives to the grid of cell centres , e.g. is the operator made by partition of unit weights. Since it is an averaging matrix, its adjoint operation is denoted by , where the extension from to the boundary of is made possible by mirroring the data as appropriate.
Example 3.1**.**
For and fixed, the derivatives of (up to order ) are
[TABLE]
the transfer operators are
[TABLE]
and each is the interpolation matrix that interpolates the values of to by an arithmetic mean. For example, for the first order derivatives we have
[TABLE]
where, for and ,
[TABLE]
As a further example for the second order derivative case and (which implies no averaging on the first derivatives for our construction), we have
[TABLE]
where and, for and ,
[TABLE]
*with and . *
Remark 3.2**.**
*The choice of the staggered grid increases the accuracy of the solution and allows to compute the inner products between gradients and the vector fields onto a unique regular Cartesian grid of reference, avoiding offsets. Moreover, the transfer operators reduce the bandwidth of higher order finite difference matrices, improving the quality of the result and reducing the smoothing due to large stencils. *
Remark 3.3**.**
*Note that when and for every , as in the applications described in Section 6 of this paper, the use of transfer operators is needed only for the outer derivative, i.e. the one associated to the weighting field . *
We report in Fig. 3 the positions of , up to order , in order to illustrate how transfer operators work in interpolating the data on .
3.5 Anisotropic differential operators
By construction, and so the grids have an ()-offset with respect to . In this case, locations of and are matched via the transfer operators .
From Remark 2.10, can be discretised in the correct grid position by the operator
[TABLE]
and the discretisation reads as
[TABLE]
Therefore, the discrete weighted divergence is
[TABLE]
This leads to the discrete adjointness property, for every :
[TABLE]
where and .
When considering higher order derivatives for a generic order , the adjoint formula Eq. 25 is slightly more complicated due to the recursive definition of the weighted gradient and the location of the nested multiplication. Indeed by Definition 2.11 we have for a fixed , whose finite-dimensional approximation is formally written as via the recursion rule
[TABLE]
The finite-dimensional approximation of the adjoint is denoted with and defined via the recursion rule
[TABLE]
Remark 3.4**.**
*Note that in Eq. 26 for we omitted the inverse transfer operator so as to force the highest derivative to be located in and match with the position of . For the same reason, in Eq. 27 we omitted the transfer operator for so as to match the quantity with in the grid . This operation is performed in view of the adjointness property stated below in Eq. 28. *
For every , the discrete adjointness property holds:
[TABLE]
where and .
4 Numerical optimisation
In what follows, we solve in the first instance the single line version of Eq. 13, namely for a fixed , a fixed , a fixed collection of weighting matrices and the operator associated to the problem to solve, we aim to tackle the problem
[TABLE]
by means of a primal-dual hybrid gradient method [15, 16] and following [11, Equation 4.4]. With all discrete objects in place, we have
[TABLE]
where , is the discretized weighted divergence w.r.t. the weights and the transfer operators , is the discrete version of defined as
[TABLE]
and is the discretization of in Eq. 20, defined as
[TABLE]
4.1 Discrete characterisation of TDV
For a fixed , the regulariser can be characterised as follows. From the discrete version of in [50, Section 4.1] and following the characterization of in [11, Remark 3.8 and Remark 3.10], we can write the equivalent discrete definition of for and as
[TABLE]
where
[TABLE]
Indeed, in the following let , and . We call
[TABLE]
where as in Eq. 27 and as in Eq. 30. Note that the sup is finite by definition of . Thus and we define
[TABLE]
With and -times integration by parts, the functional becomes
[TABLE]
where is the characteristic function with values either [math] or , and where the adjoint is the same as in [11, Remark 3.9]. By Fenchel duality for the operator we have:
[TABLE]
Iterating the procedure for and by the identity
[TABLE]
we get
[TABLE]
and thus, with as in Eq. 32, we conclude
[TABLE]
A continuous version of Eq. 31 also holds. This is proved in the second part of this work [42].
The characterisation of in Eq. 31 is fundamental for writing a suitable primal-dual algorithm for the minimization problem in Eq. 13.
4.2 Discretised single minimization problems
Let be a given discrete imaging data. For a fixed order , let be decomposed as in Eq. 31, with , and the denoted in the discrete setting by for a generic tensor-valued object , with each .
The discrete single minimization problems, for defined as in Section 3.2 read as:
- •
for order , , , :
[TABLE]
- •
for order , , , :
[TABLE]
- •
for order , , , :
[TABLE]
For a fixed , we aim to provide a more concise formulation of Eqs. 33, 34 and 35. Let be as above and be a matrix of operators associated to and defined as
[TABLE]
i.e. with if , and as in Eq. 32 for each . Then, solving Eq. 29 is equivalent to solving for , with , the problem:
[TABLE]
By duality of the norm and recalling that is the dual vector defined in Section 3.2, we rewrite Eq. 37 into a saddle-point minimization problem:
[TABLE]
or, in short notation:
[TABLE]
where is a partially strongly convex term, since it can be seen as for the projection of onto the subspace of and with being strongly convex, and where
[TABLE]
and
[TABLE]
and .
4.3 Proximal operators
We aim to solve the saddle point problem Eq. 39 with a Primal-Dual Hybrid Gradient (PDHG) algorithm. We need the proximal operators of and .
The proximal map of evaluated at a point is the sum of the projections onto the respective polar balls since is fully separable111meaning that is a function that can be written as a sum of functions in disjoint sets of variables.,
[TABLE]
The proximal map of should be evaluated at a point . Recalling that and that is as in Section 3.2, we have:
[TABLE]
Let us focus on the first component of , in Eq. 40: we have to solve
[TABLE]
whose minimum is achieved by a that solves, for adjoint of ,
[TABLE]
Thus, the first component of , that is , is
[TABLE]
Note that for the Rudin-Osher-Fatemi denoising problem we have , thus , and the proximal map agrees with the one computed in [15, pag. 133].
4.4 Operator norm
Following the approach in [11, Section 4] and [15, Section 6.1], we estimate a bound on the norm of the linear operator in Eq. 26 in view of the implementation of a suitable primal-dual algorithm. Let be the operator norm, then for each we have
[TABLE]
In the two-dimensional setting, when then in Eq. 41 reduces to
[TABLE]
and by applying the finite difference scheme in Eq. 24, from we estimate:
[TABLE]
For a fixed , since it holds
[TABLE]
then the operator norm is estimated via
[TABLE]
Remark 4.1**.**
Since is made by partition of unit transfer operators and by construction, we can estimate the right-hand side of Eq. 42 as:
[TABLE]
*which agrees with the classic isotropic setting given by the choices without the use of for every . Indeed, we have for and for . *
4.5 Primal-Dual Hybrid Gradient algorithm
Now we are ready for solving Eq. 39 with a Primal-Dual Hybrid Gradient (PDHG) algorithm following [15]. Let be fixed and be the operator norm as in Section 4.4, i.e. and let , be fixed, such that . Then the PDHG algorithm [15, Algorithm 1] reads as the iteration of
[TABLE]
where we denoted with an index the iterations, starting from admissible and . The final solution is achieved by . Compare Algorithm 1 for details.
Acceleration
If and only the first order regulariser is involved () then the fidelity term is strongly convex with convexity parameter (since it does not involve the terms of related to the derivatives of order greater than 1) and the dual problem is smooth. Therefore, it is possible to accelerate the PDHG algorithm with [15, Algorithm 2]: we can take and update by taking as the Lipschitz constant of , and , with the update rule in Eq. 43 before reading as
[TABLE]
When and then is only partially strongly convex and one can use either [15, Algorithm 1] or the acceleration proposed in [55]. In any case, when makes not strongly convex then the use of [15, Algorithm 1] is recommended: in such case, and are fixed a-priori, e.g. the authors in [32] adopted the parameters for the second-order regulariser and for a grid of grid-size .
4.6 Primal-Dual Gap
As exit condition for the primal-dual algorithm of the minimisation problem, it is possible to either define a maximum number of iterations to reach or to impose a threshold for the primal-dual gap, defined for the current solutions and as
[TABLE]
4.7 Joint minimisation problem
In the continuous setting and for any fixed , the particular choice of and deserves a separate discussion. Indeed, the decomposition of in the continuous setting results as
[TABLE]
in which the sparsity of the inner order of derivatives would not be fully exploited due to the weight . The above decomposition is equivalent to impose for any , resulting in . As discussed in Remark 3.4, in this case the usage of the transfer operators in the discrete setting and for the inner order of derivates is useless and we can therefore discretise the above regulariser as
[TABLE]
In our applications, we will mainly focus on the effect of weighting the highest order derivative by means of taking the exact inner derivatives, but combined jointly with regularisers for different , i.e. we aim to solve in the discrete setting
[TABLE]
with and for each , leading to
[TABLE]
Then it is possible to reduce Algorithm 1 to Algorithm 2, where the accelerated PDHG [15, Algorithm 2] can be used for any choice of that makes strongly convex.
5 Imaging Denoising
In what follows we demonstrate the performance of the introduced regulariser for the applications of image denoising. We focus in particular on the cases for the single Eq. 29 and joint minimisation model Eq. 45. Results are computed on a standard laptop (MATLAB R2019a, MacBook Pro 13'', 2.9 GHz Intel Core i5, 8 GB 1867 MHz DDR3). The code is freely available at the authors' webpage222Code freely available at https://github.com/simoneparisotto.
Let , be a grey-scale image (colour images are considered one colour channel at time), a field with and a given noisy image.
5.1 Estimation of vector field
For estimating we use the following strategy. Let . Let be such that , the ordered eigenvalues of
[TABLE]
and the associated eigenvectors. Let be the local direction of the anisotropy, corresponding to an approximation of . In order to compute a vector field smoother than , we adopt a further regularisation step, similarly as in [34].
Let . We aim to smooth the vector field where the anisotropy weight is close to 0 while keeping the already computed vector field in regions with strong anisotropy. This is equivalent to solving the following problem:
[TABLE]
We use the local estimation of the anisotropy as weights , for :
[TABLE]
We can use also to vary locally : we have already seen that the process is more isotropic as is closer to 1. For this reason, a possible strategy to vary is:
- •
first, estimate the anisotropy (values close to 1 correspond to isotropic regions) by:
[TABLE]
- •
second, rescale in to define :
[TABLE]
With this strategy, the higher the image anisotropy the closer is to : in such cases strong directional structures are emphasised by our directional regulariser. Conversely, when , isotropic smoothing is performed in flat regions. In the following, we will require that for every , thus
[TABLE]
where is a given estimated field. Also, we may refine by updating the parameters so as to restart the denoising problem with a better estimation of the vector field.
5.2 Single minimisation model
Here we describe results with the single model and fixed to be chosen between and . We will explore all the possible combinations for , where each will be either the identity or the anisotropic weights.
Numerical results
In Fig. 4 we report the results from the single problem in Eq. 29, with fixed and different choices for the anisotropy . To compute , we employ the spatially varying strategy based on the structure tensor eigen-decomposition described in Section 5.1, with and , where follows from Eqs. 48, 49 and 50.In these experiments, we keep fixed , and for and we run Algorithm 1 for . We compare our results with (with the same weights) from an online source333Code available at www.gipsa-lab.grenoble-inp.fr/~laurent.condat/download/TGVdenoise.m: our strategy is able to encode spatially varying directional information at all the derivative orders.
For the experiments of Fig. 5, we use the same anisotropy directions as in the experiments of Fig. 4, but we modulate the anisotropy weight to test its impact on the result. More precisely, starting from the same anisotropy matrix of Fig. 4, we define two variants: is obtained from by replacing with , and is obtained from by replacing with . The results of Fig. 5 correspond to different anisotropic weighting of the first and second-order derivatives in the model , see the figure caption for details. Interestingly, a better PSNR is obtained with two orders of derivation and the following strategies: if the first derivative is anisotropically weighted, it is better to opt for an isotropic weight of the second-order derivative. And if the first derivative is isotropically weighted, it is better to opt for a strong anisotropic weight of the second-order derivative. This latter case seems the best in terms of PSNR but, of course, sharper results are obtained when anisotropic weights are chosen. Obviously, the PSNR metric does not capture all aspects of an image and a visual evaluation by the end user remains necessary for the choice of the most suitable combination of derivative orders and anisotropic weights.
Remark 5.1**.**
*In our experiments we observed some checkerboard artifacts when employing strong anisotropies. This phenomenon has been attributed to spectral properties of finite differences [59, 25] and a non-negative stencil avoiding these issues has been introduced in [25] for the case of directional Hessian. We leave a generalization of [25] to our higher-order case for future work. The artifacts are not evident in the results of the next Section 5.3, where the joint model and a specific choice of the weights help in producing better quality results. *
Our approach has a reasonably large number of parameters depending on the noise level and the structural imaging information: the order of the derivatives, the spatially varying directional information from the structure tensor and the weights for the sparsity of the derivatives. This makes the model highly customisable, but also quite sensitive to parameters. In order to simplify the parameter choice, we detail in Section 5.3 a simplified joint model with a suggested choice of parameters, particularly suited for images with large dominant structures, as well as for the reconstruction of images from scattered data, see Section 6.2.
5.3 Joint minimisation model
Following the comments in Remark 5.1, we now consider the joint model in Eq. 13 and described in Section 4.7 for , with and for each . Since we consider the denoising problem, we will make use of the identity operator in the fidelity term, aiming to solve:
[TABLE]
Thus, the denoising problem in equation Eq. 51 can be simplified as
[TABLE]
and solved via the primal-dual Algorithm 2 in which the computations for and are performed alternatingly as described in Algorithm 3 so as to update the estimation of the directions . This updating strategy will allow a refinement for the computation of the directionality in images.
Numerical Results
We discuss denoising results obtained with Algorithm 3 and different denoising approaches (the non-local method BM3D with normal-complexity profile and prior knowledge of the standard deviation of the Gaussian noise [19] and the regularisers [45], [11], and [20]) for images with strong directional features. We run the primal-dual Algorithm 2 for iterations and we restarted the Algorithm 3 once the first denoised image is computed which serves as an oracle for a better estimation of . We show results for grey-scale images in Fig. 8 and for colour images Figs. 10 and 11 and we discuss their PSNR.
Bamboo image
The grey-scale image in Fig. 6(a) shows a strong directional direction. In Fig. 6(b) it has been corrupted by 20% of Gaussian noise using the same random seed as in [20], see Fig. 6(b). In Fig. 6 we report the results from state of the art approaches, as reported in [20], where and are considered with a single fixed choice of anisotropy direction and minor semi-axis (our parameter ). In Figs. 6(d) and 6(e) the staircasing effect is visible as expected while Fig. 6(f) and Fig. 6(g) seem more promising, even if obtained with a single fixed direction only.
In our approach we vary the spatial directions estimating the vector field as described in Eq. 46–Eq. 50 while we fix , so as to fix the elliptic shape of the test functions. First, in Fig. 7 we report the sensitivity to the parameter (with step-size increment of ) for the first order regulariser: from our experiments produces a better result than the first order regularisers, in Fig. 6(d) and in Fig. 6(f), and the second order in Fig. 6(e), showing less staircasing and directional artefacts. In this test, we fixed a-priori a number of parameters, i.e. , and for the anisotropic structure in building and in Algorithm 3.
In Fig. 8 we report the best results obtained with the same fixed choice of parameters but now with all the possible combinations of first, second and third order regularisers, as well as the sketch of the streamlines of in Fig. 8(a). We observed that the combination of the first and third order of regularisers outperforms the results from Fig. 6. In the next paragraph we comment about further choices for the parameters.
Selection of parameters for the bamboo image
Let the choice of the regulariser orders and the maxiter in Algorithm 3 be fixed. We would like to estimate good parameters that directly affect the image reconstruction, namely the fidelity , the directional information provided by of the structure tensor and from for the anisotropic shape of the test functions (assumed for now fixed all over the imaging domain). Unfortunately, the structure tensor depends on the intrinsic image content and standard deviation of the noise.
However, for (among the tested) we obtained the best result for the combination of the first and third order regularisers, i.e. . Therefore we inspect more the PSNR obtained by fixing and by changing both and in the range between : results in Fig. 9 show that the optimal parameters for this case are the ones producing Fig. 8(g) as output. Other strategies for estimating the parameters, including , would require to solve a bilevel problem, e.g. as in [21], or to solve the problem by updating the parameters with a greedy line-search approach onto many directional images, so as to extract a rule of thumb for the choice, e.g. as in [43].
A natural question now is whether allowing also to change in improves the performances. We are going to answer this question in the next two experiments. However, since the scale of the directional texture in the next experiments are spatially different, we fix and focus on the promising case of combination of first and third order regularisers, i.e. , with reasonable choice of .
Rainbow image
The rainbow in Fig. 10(a) has been corrupted by of Gaussian noise in each color channel, see Fig. 10(b). Due to the particular structure of the image, an isotropic approach seems reasonable outside the rainbow while an anisotropic approach inside. This resulted in varying the parameter following equations Eq. 49-Eq. 50: in Fig. 10(d) the black pixels correspond to and the white pixels to . Indeed for we expect to denoise the image following the anisotropy induced by , while for we expect to denoise the image isotropically in both and directions. In order to compute the vector field as in Fig. 10(e), we did not apply the regularisation step Eq. 47 and we did not iterate Algorithm 3 since both the resulting and seem good enough for our purposes, performing better than BM3D in Fig. 10(c), with less wavy artefacts and a smoother global structure.
Desert image
The desert image in Fig. 11(a) is a mix of anisotropic and isotropic information. We denoised Fig. 11(b), corrupted again with of Gaussian noise in each color channel, with and to estimate in Fig. 11(e) as described before. We also allowed to vary across the domain and we did not refine with further iterations. Here, BM3D in Fig. 11(c) performed slightly better than our approach due to the wrong estimation of along some dune waves: this is clearly visible in Figures 11(d) and 11(e) where both the wrong estimation of and the isotropy requirement of result in a smoothing performance, as shown in the zooms of Figures 11(g) and 11(h). However, we recall that BM3D is a non-local method and better performances than local methods are expected. Nevertheless, since this kind of images have patterns and structures at different scales such that they cannot be captured by a single global structure tensor, we expect that the reconstruction quality can be improved, e.g. by a structure tensor workflow with locally adapting parameters .
6 Other imaging applications with the joint model
In what follows we focus on the joint minimisation model Eq. 13 for the applications of image zooming and surface interpolation.
6.1 Wavelet Zooming
In this section we apply our regularisation to wavelet-based image zooming as in [10]. Here, the data fidelity term is modelled by a wavelet transformation operator. Let , be the scaling and mother wavelet function, respectively. Then, a Riesz basis of is obtained from translations and rotations of and . Here, we will consider only functions with compact support, yielding a compactly supported basis elements. Let be a resolution level and be finite index sets in , then:
- •
a Riesz basis of is , ;
- •
the dual Riesz basis of the above is defined as , .
Thus, the following decomposition holds:
[TABLE]
Let be a low resolution version of given by , where the unknown is such that and
[TABLE]
The wavelet-based zooming problem with higher order total directional regularisers reads as
[TABLE]
where and is the convex indicator function w.r.t. , see [10] for more details. Since we did not downsample the original image, we avoided artefacts introduced by algorithms for reducing the image but at the same time no ground truth is available. Results based on CDF 9/7 wavelet are shown in Fig. 12 (grey-scale) and in Fig. 13 (colour).
6.2 Surface Interpolation
In this experiment, we aim to reconstruct a surface from scattered height data available in . The available data lies on partially occluded isolines or on random points in and the challenge is to interpolate them by preserving the anisotropic features via the reconstruction of a suitable vector field . Before presenting our approach for this problem using , we briefly review the state-of-the-art for surface interpolation.
Related works
The reconstruction of surfaces from scattered height values has been approached in two different ways in the literature: based on explicit and implicit models. Surface interpolation is sometimes also addressed as digital elevation map (DEM) problem.
In this paper we focus on implicit surface interpolation which has the advantage of being independent with respect to parametrization. Here the surface is an implicit function of height values over the domain. Two prominent methods in this range are the Thin Plate Spline (TPS) [36] and the Absolute Minimizing Lipschitz Extension (AMLE) [2] approach. TPS is a flexible approach since it can embed both grey values and gradient information. However, it has the drawback to be a fourth order isotropic method and the resulting interpolated surface is isotropically smooth. AMLE, on the other hand, is able to interpolate data given in isolated points and on curves but it fails to interpolate slopes of a surface, resulting in , see [47].
For interpolating surfaces with sharp features, e.g. strong creases, and possibly non-smooth features, e.g. corners in a pyramid, it seems promising therefore to consider (higher-order) total variation () regularisers for surface interpolation.
Our main model approach here is [34], where a third-order directional total variation regulariser has been proposed that reads for a given vector field as
[TABLE]
where is the directional derivative of the Hessian of , along . Note that this is a special case of with , , , , i.e. and , leading to .
The estimation of is crucial to obtain a good quality result. In [34], has been computed as a two step minimization-regularisation problem by solving firstly
[TABLE]
and then applying to the same regularisation step in Eq. 47, where is a weight chosen as the largest singular value of and is a regularising parameter smoothing the vector field where is almost planar and preserving the local values of for level lines of large curvature. As last step, is normalised to be unitary.
Another directional interpolation model for and appears in [8]: differently from our approach in this paper, it requires knowledge of the vector field prior to the interpolation.
In this section, we generalize the approach of [34] for the reconstruction of a surface, given scattered height values lying (possibly) on partial contour lines. Differently from Section 5, the unitary vector field is reconstructed in the missing domain as follows.
Let be a 2D domain () and be sparse sampled height values. In the following, the projection onto the data available is identified by the operator . We aim to find the interpolated surface , driven by the unitary directions . Let be a collection of weighting fields, where for a fixed each collection is defined as with explicit dependence on . We solve by Algorithm 4, alternatingly:
[TABLE]
with the primal-dual in Algorithm 2 for Eq. 54 and a classic primal-dual for Eq. 55. In particular, in Eq. 55 we identify for regularising the vector field and for normalising in the direction of the normalised gradient [4].
Minimization with respect to
Fixing an unitary vector field , the minimization problem Eq. 54 is convex with respect to and the minimization problem can be solved via primal-dual Algorithm 2 without acceleration due to the lack of strong convexity of the projection map , which results in a non-smooth dual problem.
Minimization with respect to
Fixing , the minimization problem Eq. 55 can be solved by the primal-dual algorithm with
[TABLE]
Let , and . Then, the proximal of , with , is the projection onto the polar ball:
[TABLE]
The proximal map of at , for , reads as
[TABLE]
thus
[TABLE]
Since is strongly-convex, we use the accelerated scheme Eq. 43, with instead of .
Numerical Results
We tested Algorithm 4 in MATLAB on synthetic and real surfaces. Differently from [34], we did not use CVX or MOSEK, making our approach suitable for larger surfaces, beyond the variable size limit imposed by CVX. In what follows, we will use Eq. 52 for solving Eq. 54 and we will test both single and joint directional regularisers, namely , and , with and to be chosen appropriately for the situation. For a better visualization of the results, a divergence RGB colormap in the range has been applied.
Pyramid dataset from [34]
A pyramid with height data available on three contour lines and no extra information on the tip is given, so as to test whether our model can reconstruct it. We initialize and randomly. In Fig. 14, we report in the first column the location of the available data (top) and the ground truth (bottom); in the second column the random initialization of (top) and (bottom); in the third, fourth and fifth columns we report the results from different orders of directional regularisers, namely , and , with one level of anisotropy . The similarity of the resulting vector fields in Fig. 14, despite the different derivative orders involved in the minimisation with respect to , shows the robustness of the computation of for such problem. Visual results suggest that a combination between second and third order directional regularisers, e.g. , is desirable since it smooths the second-order result without loosing its features.
SRTM dataset from [54]
This dataset is part of the Shuttle Radar Topography Mission (SRTM) [54] NASA mission so as to obtain elevation data for most areas of the world. We download .hgt ``height'' binary data files from [6], where by selection of latitude and longitude coordinates we get 1x1 degree tiles of 1-arc seconds resolution (around per pixel). We selected some famous mountains within Italy in Fig. 15: Etna volcano (Sicily), Baldo Mountain (Verona), Vesuvio volcano (Naples), Brenner border (Sterzing) and Gran Sasso (L'Aquila), with image size domain of pixels. As input, we randomly sampled about of sparse data on level lines and isolated points, to be interpolated with .
Atomic Force Microscopy dataset from [44]
Atomic force microscopy (AFM), or scanning probe microscopy (SPM), is a topography imaging technique commonly used in the detection of cancer cells in cellular biology: it scans objects at high resolution while recording their topographical information. In [39], the study of a compressed sensing approach on AFM images was motivated by the reduction of the image acquisition time for multiple reasons, e.g. to minimize the operator time spent at the equipment [28], to allow time-dependent dynamic processes [48] and to minimize the interaction of instruments with specimens so as to reduce potential risks of damages [37]. Therefore, the authors proposed to speed up the sampling procedure by scanning height data on spirals rather than exploring pixel by pixel, so as to reconstruct the missing data via compressed sensing. The authors define the under-sampling ratio as , where is the length of the spiral path followed by the probe for acquiring the data and is the distance travelled by the probe in pixels during the raster scan. Note that also counts the path outside the imaging domain due to smooth movement requirement of the probe, while is approximated by the value and the factor of two is due by the usual approach to acquire two topography buffers, even if only one is used for the visualization. In order to test our reconstruction method based on the directional regularisers, we downloaded the open source AFM .mi dataset of height values from [44], exported in ASCII text via the open-source software Gwyddion and imported in MATLAB. Our input are AFM surfaces of size obtained by slicing the orginal surface, for comparison purposes following [39]. We show the results in Fig. 16 for the ground truth image in Fig. 16(a), with different under-sampling ratio , see Figs. 16(c) and 16(d). In Fig. 16(b) we compare the structural similarity index (SSIM) [56] for our results (producing the black line of scores on the top of the graph) with [39, Figure 7], where iterative hard thresholding (IHT), iterative soft tresholding (IST), their weighted version (w-IHT and w-IST) and Basis Pursuilt Denoising (BPDN) were tested: we conclude that our approach is robust throughout different under-sampled data, with good quality surfaces in terms of SSIM.
7 Conclusions
In this work, we have shown that embedding anisotropic directional information into higher order derivatives improves the performance of total variation regularisation in many imaging applications where anisotropy plays a crucial role. In particular, we presented results for image denoising, image zooming and interpolation of scattered measurements, with details on the numerical discretisation and the solution via a primal-dual hybrid gradient algorithm. Among the range of experiments provided, we emphasise that our approach is particularly suitable for the reconstruction task from scattered data, motivating the interest in studying the proposed energy. With this we provided a precise discrete framework which extends the works [34, 20, 11, 10], bringing higher-order total variation together with spatially-varying anisotropy. The continuous model is analysed in the companion paper [42],
Acknowledgements
The authors are grateful to Dr. Martin Holler, University of Graz (Austria) for the useful discussions and to Prof. Thomas Arildsen, Aalborg University (Denmark) for the AFM data.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] T. Aach, C. Mota, I. Stuke, M. Muhlich, and E. Barth , Analysis of Superimposed Oriented Patterns , IEEE Transactions on Image Processing, 15 (2006), pp. 3690–3700, https://doi.org/10.1109/TIP.2006.884921 . · doi ↗
- 2[2] A. Almansa, F. Cao, Y. Gousseau, and B. Rouge , Interpolation of digital elevation models using AMLE and related methods , IEEE Transactions on Geoscience and Remote Sensing, 40 (2002), pp. 314–325, https://doi.org/10.1109/36.992791 . · doi ↗
- 3[3] L. Alvarez, F. Guichard, P.-L. Lions, and J.-M. Morel , Axioms and fundamental equations of image processing , Archive for Rational Mechanics and Analysis, (1993).
- 4[4] C. Ballester, M. Bertalmio, V. Caselles, G. Sapiro, and J. Verdera , Filling-in by joint interpolation of vector fields and gray levels , IEEE Transactions on Image Processing, 10 (2001), pp. 1200–1211, https://doi.org/10.1109/83.935036 . · doi ↗
- 5[5] I. Bayram and M. E. Kamasak , Directional total variation , IEEE Signal Processing Letters, 19 (2012), pp. 781–784, https://doi.org/10.1109/LSP.2012.2220349 . · doi ↗
- 6[6] F. Beauducel , READHGT: Import/download NASA SRTM data files (.HGT) , 2012, https://www.mathworks.com/matlabcentral/fileexchange/36379 .
- 7[7] B. Berkels, M. Burger, M. Droske, O. Nemitz, and M. Rumpf , Cartoon Extraction Based on Anisotropic Image Classification , in Vision, Modeling, and Visualization Proceedings, 2006, pp. 293–300, http://numod.ins.uni-bonn.de/research/papers/public/Be Bu Dr 06.pdf .
- 8[8] T. R. Bin Wu and X.-C. Tai , Sparse-data based 3D surface reconstruction for cartoon and map , Internal report, UCLA, (2017), ftp://ftp.math.ucla.edu/pub/camreport/cam 17-38.pdf .
