Approximation of the Mumford-Shah Functional by Phase Fields of Bounded Variation

Sandro Belz; Kristian Bredies

arXiv:1903.02349·math.AP·September 27, 2021

TL;DR

This paper introduces a phase field approximation of the Mumford-Shah functional in which the phase field is a function of bounded variation rather than an $H^{1}$-function, and shows numerically that it yields sharper phase fields than the classical Ambrosio-Tortorelli approximation.

Contribution

The paper proposes a new phase field model based on BV functions for Mumford-Shah approximation, enhancing image segmentation quality.

Findings

1. The BV-based approximation produces sharper phase fields.

2. The numerical scheme contains a total variation minimization of the phase field variable.

3. A comparison with the classical Ambrosio-Tortorelli approximation shows sharper phase fields for the new model.

Abstract

In this paper we introduce a new phase field approximation of the Mumford-Shah functional similar to the well-known one from Ambrosio and Tortorelli. However, in our setting the phase field is allowed to be a function of bounded variation, instead of an $H^{1}$ -function. In the context of image segmentation, we also show how this new approximation can be used for numerical computations, which contains a total variation minimization of the phase field variable, as it appears in many problems of image processing. A comparison to the classical Ambrosio-Tortorelli approximation, where the phase field is an $H^{1}$ -function, shows that the new model leads to sharper phase fields.

Tables

Table 1. Numerical parameters

$\alpha$	$\beta$	$\gamma$	$\theta$	$\mathit{Tol}_{1}$	$\mathit{Tol}_{2}$	$\mathit{MaxIt}$
$1.75\cdot 10^{-4}$	$1$	$3\cdot 10^{-5}$	$0.99$	$10^{-3}$	$10^{-5}$	$10000$

Equations

\frac{\alpha}{2}\int_{\Omega}\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\frac{\beta}{2}\int_{\Omega}\lvert u-g\rvert^{2}\,\mathrm{d}x+\gamma\mathcal{H}^{1}(\Gamma)

\frac{\alpha}{2}\int_{\Omega}\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\frac{\beta}{2}\int_{\Omega}\lvert u-g\rvert^{2}\,\mathrm{d}x+\gamma\mathcal{H}^{1}(S_{u})

\mathcal{MS}(u)\coloneq\frac{\alpha}{2}\int_{\Omega}\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\gamma\mathcal{H}^{1}(S_{u})\quad\text{for all }u\in\mathrm{GSBV}(\Omega)\,.

\mathcal{AT}_{\varepsilon}(u,v)=\int_{\Omega}(v^{2}+\eta_{\varepsilon})\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\int_{\Omega}\frac{1}{4\varepsilon}(1-v)^{2}+\varepsilon\lvert\nabla v\rvert^{2}\,\mathrm{d}x

\int_{\Omega}(v^{2}+\eta_{\varepsilon})\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\frac{1}{2^{p^{\prime}}\varepsilon}\int_{\Omega}(1-v)^{p^{\prime}}\,\mathrm{d}x+\varepsilon^{p-1}\int_{\Omega}\lvert\nabla v\rvert^{p}\,\mathrm{d}x

\frac{\alpha}{2}\int_{\Omega}(v^{2}+\eta_{\varepsilon})\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\frac{\gamma}{2\varepsilon}\int_{\Omega}(1-v)\,\mathrm{d}x+\frac{\gamma}{2}\lvert\mathrm{D}v\rvert(\Omega)

F(u)\leq\liminf_{j\to\infty}F_{j}(u_{j})\,.

\limsup_{j\to\infty}F_{j}(u_{j})\leq F(u)\,.

\operatorname*{\Gamma-lim\,inf}_{j\to\infty}F_{j}(u)

\operatorname*{\Gamma-lim\,sup}_{j\to\infty}F_{j}(u)

\int_{\Omega}u\operatorname{div}w\,\mathrm{d}x=-\int_{\Omega}w\,\mathrm{d}\mathrm{D}u\quad\text{for all }w\in C_{c}^{1}(\Omega;\mathbb{R}^{n})\,.

V(u,\Omega)=\sup\biggl\{\int_{\Omega}u\operatorname{div}w\,\mathrm{d}x:w\in C_{c}^{1}(\Omega;\mathbb{R}^{n}),\lVert w\rVert_{\infty}\leq 1\biggr\}

u^{+}(x)

u^{-}(x)

H^{+}(x)

H^{-}(x)

V\bigl(u,(a,b)\bigr)=\sup\Biggl\{\sum_{i=1}^{N}\bigl\lvert\tilde{u}(t_{i})-\tilde{u}(t_{i-1})\bigr\rvert:N\in\mathbb{N},a<t_{0}<\dots<t_{N}<b\Biggr\}\,.

\mathrm{D}u=\mathrm{D}^{a}u+\mathrm{D}^{j}u+\mathrm{D}^{c}u\,,

\mathrm{SBV}^{p}(\Omega)

\mathrm{GSBV}^{p}(\Omega)

\Omega_{y}^{\xi}\coloneq\{t\in\mathbb{R}:y+t\xi\in\Omega\}\quad\text{for all }y\in\Omega_{\xi}\,.

\int_{\Omega_{\xi}}\bigl\lvert\mathrm{D}u_{y}^{\xi}\bigr\rvert\bigl(\Omega_{y}^{\xi}\bigr)\,\mathrm{d}\mathcal{L}^{n-1}(y)<\infty\,.

\int_{\Omega_{\xi}}\bigl\lvert\mathrm{D}\bigl((-M)\vee u_{y}^{\xi}\wedge M\bigr)\bigr\rvert\bigl(\Omega_{y}^{\xi}\bigr)\,\mathrm{d}\mathcal{L}^{n-1}(y)<\infty\quad\text{for all }M>0\,.

f^{\ast}(y)=\sup_{x\in\mathbb{R}^{n}}\bigl(\langle x,y\rangle-f(x)\bigr)\quad\text{for all }y\in\mathbb{R}^{n}

\langle x,y\rangle\leq f(x)+f^{\ast}(y)\quad\text{for all }x,y\in\mathbb{R}^{n}\,.

\partial f(x)=\{z\in\mathbb{R}^{n}:f(x)-f(y)\leq\langle z,x-y\rangle\text{ for all }y\in\mathbb{R}^{n}\}\quad\text{for all }x\in\mathbb{R}^{n}\,.

F_{\varepsilon}(u,v)\coloneq\int_{\Omega}\bigl(f(v)+\eta_{\varepsilon}\bigr)\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\int_{\Omega}\varphi_{\varepsilon}\bigl(W_{\varepsilon}(v)\bigr)+\psi_{\varepsilon}\bigl(\lvert\nabla v\rvert\bigr)\,\mathrm{d}x+c_{\varepsilon}\bigl(\lvert\mathrm{D}^{j}v\rvert(\Omega)+\lvert\mathrm{D}^{c}v\rvert(\Omega)\bigr)

F(u,v)\coloneq\begin{cases}\int_{\Omega}f(1)\lvert\nabla u\rvert^{2}\,\mathrm{d}x+2c_{0}\mathcal{H}^{n-1}(S_{u})&\text{for }u\in\mathrm{GSBV}^{2}(\Omega),\ v=1\text{ a.e.,}\\+\infty&\text{otherwise.}\end{cases}

G_{\varepsilon}(u,v)\coloneq\begin{cases}F_{\varepsilon}(u,v)+\frac{\beta}{2}\int_{\Omega}\lvert u-g\rvert^{2}\,\mathrm{d}x&\text{for }u\in H^{1}(\Omega),\ v\in\mathrm{BV}(\Omega;[0,1]),\\+\infty&\text{otherwise.}\end{cases}

G(u,v)\coloneq\begin{cases}F(u,v)+\frac{\beta}{2}\int_{\Omega}\lvert u-g\rvert^{2}\,\mathrm{d}x&\text{for }u\in\mathrm{SBV}^{2}(\Omega)\cap L^{\infty}(\Omega),\ v=1\text{ a.e.,}\\+\infty&\text{otherwise.}\end{cases}

G(u,v)\leq\operatorname*{\Gamma-lim\,inf}_{\varepsilon\to 0}G_{\varepsilon}(u,v)\quad\text{for all }u,v\in L^{1}(\Omega)\,.

\lim_{\varepsilon\to 0}F_{\varepsilon}(u_{\varepsilon},v_{\varepsilon})=F(u,v)\,.

Full text

Approximation of the Mumford-Shah Functional by Phase Fields of Bounded Variation

Sandro Belz

Department of Mathematics, Technical University of Munich

Boltzmannstraße 3, 85748 Garching, Germany

[email protected]

and

Kristian Bredies

Institute of Mathematics and Scientific Computing, University of Graz

Heinrichstraße 36, 8010 Graz, Austria

[email protected]

Key words and phrases:

Mumford-Shah, free-discontinuity problem, $\Gamma$ -convergence, phase field, image segmentation, image denoising

2010 Mathematics Subject Classification:

49J45, 26A45, 68U10

1. Introduction

The Mumford-Shah functional was introduced by D. Mumford and J. Shah in [37] in the context of image segmentation. For a given image $g\in L^{\infty}(\Omega)$, where $\Omega\subset\mathbb{R}^{n}$ represents the image domain, it is given by

$$\frac{\alpha}{2}\int_{\Omega}\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\frac{\beta}{2}\int_{\Omega}\lvert u-g\rvert^{2}\,\mathrm{d}x+\gamma\mathcal{H}^{1}(\Gamma) \tag{1.1}$$

where $\alpha,\beta,\gamma>0$ are freely chosen parameters. One wants to minimize the functional with respect to $u\in C^{1}(\Omega\setminus\Gamma)$, the segment-wise denoised approximation of $g$, and the closed set $\Gamma\subset\Omega$, which describes the contours of the segments. For $\beta=0$ this functional appeared once more in [30] in the context of fracture mechanics. There, $u$ models the displacement, and the closed set $\Gamma\subset\Omega$ represents the fracture set. The minimization is then subject to a Dirichlet boundary condition.

As usual in the theory of free-discontinuity problems (see [8, 16]) the Mumford-Shah functional (1.1) is relaxed to the space of special functions of bounded variation (see Section 2.3 for more details on these functions), where the set $\Gamma$ is replaced by the discontinuity set $S_{u}$ . Namely, instead of (1.1) one considers

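$$\frac{\alpha}{2}\int_{\Omega}\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\frac{\beta}{2}\int_{\Omega}\lvert u-g\rvert^{2}\,\mathrm{d}x+\gamma\mathcal{H}^{1}(S_{u}) \tag{1.2}$$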

for $u\in\mathrm{SBV}(\Omega)$ , the set of special functions of bounded variation. In this setting the existence of the minimizers is well-known and follows from compactness properties of $\mathrm{SBV}(\Omega)\cap L^{\infty}(\Omega)$ and some lower semi-continuity properties (see [7, 5, 6]), using the direct method in the calculus of variations. Furthermore, by the regularity property shown in [26] we know that for any minimizer $u\in\mathrm{SBV}(\Omega)$ of (1.2) the pair $(u,\bar{S}_{u})$ minimizes (1.1).

As already mentioned, in the case of fracture mechanics, we usually have $\beta=0$ and $L^{2}$ -penalization is replaced by a Dirichlet boundary condition. In general, the functional must then be defined on $\mathrm{GSBV}(\Omega)$ , the set of generalized special functions of bounded variation (see Section 2.3), in order to obtain the existence of a minimizer. This is due to the requirement of a uniform bound of the minimizing sequence in the direct method for applying the above-mentioned compactness properties in $\mathrm{SBV}(\Omega)$ . Only for $\beta>0$ , this $L^{\infty}$ -bound is automatically achieved, whereas for $\beta=0$ one has to fall back to $\mathrm{GSBV}(\Omega)$ .

For numerical computations, variational approximations in the sense of $\Gamma$-convergence (see Section 2.2) have turned out to be very useful. $\Gamma$-convergence guarantees that a convergent sequence of minimizers of the approximating functionals converges to a minimizer of the $\Gamma$-limit. We first discuss in Theorem 3.2 an approximation for $\beta=0$. Hence, we consider the functional

$$\mathcal{MS}(u)\coloneq\frac{\alpha}{2}\int_{\Omega}\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\gamma\mathcal{H}^{1}(S_{u})\quad\text{for all }u\in\mathrm{GSBV}(\Omega)\,.$$

One of the first and most popular results in this direction was given by L. Ambrosio and V. M. Tortorelli in [10]. They introduced the functionals

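$$\mathcal{AT}_{\varepsilon}(u,v)=\int_{\Omega}(v^{2}+\eta_{\varepsilon})\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\int_{\Omega}\frac{1}{4\varepsilon}(1-v)^{2}+\varepsilon\lvert\nabla v\rvert^{2}\,\mathrm{d}x \tag{1.3}$$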

for $u\in H^{1}(\Omega)$ and $v\in H^{1}(\Omega;[0,1])$ and showed via a $\Gamma$ -convergence argument that any limit point $(u,1)$ of a sequence of minimizers $(u_{\varepsilon},v_{\varepsilon})$ of $\mathcal{AT}_{\varepsilon}$ is a minimizer of $\mathcal{MS}$ , provided that $\frac{\eta_{\varepsilon}}{\varepsilon}\to 0$ . Many other approximations based on this result have been proven. Just recently, we proved that the Euclidean norms of the gradients can be replaced by Riemannian norms (see [2]). This result finds application in fracture mechanics applied to surfaces. Another approach considering higher order terms of the phase field has been studied e.g. in [14] and [20]. What happens with the approximation $\mathcal{AT}_{\varepsilon}$ when $\frac{\eta_{\varepsilon}}{\varepsilon}$ does not converge to zero is investigated in [25] and [34]. A totally different idea of approximating $\mathcal{MS}$ by finite differences was proposed by E. De Giorgi and proven by M. Gobbino in [32]. In [18] A. Braides and G. Dal Maso used non-local functionals depending on the average of the gradient of $u$ on small balls. From the work presented in [16] one gets an approximation of $\mathcal{MS}$ for the following functional with small $\varepsilon>0$ :

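$$\int_{\Omega}(v^{2}+\eta_{\varepsilon})\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\frac{1}{2^{p^{\prime}}\varepsilon}\int_{\Omega}(1-v)^{p^{\prime}}\,\mathrm{d}x+\varepsilon^{p-1}\int_{\Omega}\lvert\nabla v\rvert^{p}\,\mathrm{d}x \tag{1.4}$$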

for $u\in H^{1}(\Omega)$ and $v\in W^{1,p}(\Omega)$ with $p>1$ and $p^{\prime}$ being the Hölder conjugate of $p$ .

In the approximations (1.3), (1.4) and in the functionals we are going to study in this paper, the additional function $v$ works as a phase field variable describing the discontinuity set of $u$. To be more precise, for small $\varepsilon>0$ the minimizing function $v$ is close to $0$ where $u$ is “steep” or jumps, which means in the context of fracture mechanics the presence of a crack and in the context of image segmentation the presence of a contour. Elsewhere, the phase field variable is close to 1 and $u$ is expected to be “flat” in this area. In practice, the weights of the different integral terms determine what counts as “steep” or “flat”.
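The role of the phase field can be checked on a one-dimensional toy problem. The following is a minimal sketch, assuming a uniform grid, forward differences and the Ambrosio-Tortorelli form (1.3). The function name `at_energy`, the grid size, the values of `eps` and `eta`, and the exponential profile `v_dip` are illustrative choices made here and are not taken from the paper. It compares a phase field that is identically 1 with one that drops to 0 at the jump of a step function.

```python
import numpy as np

def at_energy(u, v, h, eps, eta):
    """Discrete 1D analogue of the Ambrosio-Tortorelli energy (illustrative only)."""
    du = np.diff(u) / h                  # forward differences of u, one value per cell
    dv = np.diff(v) / h                  # forward differences of v, one value per cell
    v_mid = 0.5 * (v[:-1] + v[1:])       # phase field averaged onto cells
    bulk = np.sum((v_mid**2 + eta) * du**2) * h
    phase = np.sum((1.0 - v_mid)**2 / (4.0 * eps) + eps * dv**2) * h
    return bulk, phase

n = 2001
x = np.linspace(-1.0, 1.0, n)
h = x[1] - x[0]
eps, eta = 0.05, 1e-4                    # hypothetical parameter values

u = (x > 0).astype(float)                # step function with a unit jump at 0
v_flat = np.ones(n)                      # phase field that ignores the jump
v_dip = 1.0 - np.exp(-np.abs(x) / (2.0 * eps))   # assumed profile, equal to 0 at the jump

e_flat = sum(at_energy(u, v_flat, h, eps, eta))
e_dip = sum(at_energy(u, v_dip, h, eps, eta))
print(f"energy with v = 1: {e_flat:.3f}, energy with dip: {e_dip:.3f}")
```

On this grid the constant phase field pays for the full discrete gradient of the step, which grows like $1/h$, whereas the dipped phase field pays only a bounded amount for the transition.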

In this paper we present a new approximation of the Mumford-Shah functional, allowing the phase field variable $v$ to be in $\mathrm{BV}(\Omega)$ , the set of functions of bounded variation. Namely, as a special case of our main result we consider the functionals

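$$\frac{\alpha}{2}\int_{\Omega}(v^{2}+\eta_{\varepsilon})\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\frac{\gamma}{2\varepsilon}\int_{\Omega}(1-v)\,\mathrm{d}x+\frac{\gamma}{2}\lvert\mathrm{D}v\rvert(\Omega)$$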

for $u\in H^{1}(\Omega)$ and $v\in\mathrm{BV}(\Omega)$ , which $\Gamma$ -converge in some sense to $\mathcal{MS}$ and represent the case with $p=1$ in (1.4). From there we derive the required setting for $\beta>0$ .

In this way the phase field variable $v$ can have jumps, which is exploited in the proof of the $\limsup$-inequality (see Proposition 2.2), when constructing the recovery sequence for the $\Gamma$-convergence result. Moreover, we expect from this fact that the phase fields become somewhat sharper than the ones obtained from (1.3). We confirm this expectation with some numerical computations in the context of image segmentation. The application of this theory to fracture mechanics remains for future work, which includes studying the convergence behaviour of a time-discrete evolution, as has been done, and is still ongoing, for the classical approach of Ambrosio and Tortorelli. For more details on this topic we refer to [31, 3, 1, 38, 39, 36, 4].
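To make the effect of a jumping phase field concrete, the following sketch evaluates a discrete one-dimensional analogue of the energy $\frac{\alpha}{2}\int(v^{2}+\eta_{\varepsilon})\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\frac{\gamma}{2\varepsilon}\int(1-v)\,\mathrm{d}x+\frac{\gamma}{2}\lvert\mathrm{D}v\rvert(\Omega)$ for a phase field that is 0 on a single grid cell and 1 elsewhere. The discretization, the function name `bv_energy_1d` and all parameter values are assumptions made for illustration; this is not the numerical scheme of the paper.

```python
import numpy as np

def bv_energy_1d(u, v_cell, h, alpha, gamma, eps, eta):
    """Discrete 1D analogue of the BV phase-field energy (illustrative only)."""
    du = np.diff(u) / h                                   # gradient of u on each cell
    bulk = 0.5 * alpha * np.sum((v_cell**2 + eta) * du**2) * h
    penalty = gamma / (2.0 * eps) * np.sum(1.0 - v_cell) * h
    tv = 0.5 * gamma * np.sum(np.abs(np.diff(v_cell)))    # (gamma / 2) * |Dv|
    return bulk, penalty, tv

n = 2001
x = np.linspace(-1.0, 1.0, n)
h = x[1] - x[0]
u = (x > 0).astype(float)                    # unit jump at 0

v_cell = np.ones(n - 1)                      # piecewise constant phase field on cells
jump_cell = np.argmax(np.abs(np.diff(u)))    # index of the cell that contains the jump
v_cell[jump_cell] = 0.0                      # sharp dip of width h

bulk, penalty, tv = bv_energy_1d(
    u, v_cell, h, alpha=1.0, gamma=1.0, eps=0.05, eta=1e-6
)
print(f"bulk={bulk:.2e}, penalty={penalty:.2e}, tv={tv:.3f}")
```

In this toy setting the total variation term contributes exactly $\gamma$ (one jump down and one jump up, each weighted by $\gamma/2$), while the penalty term is proportional to the width of the dip and becomes small when that width is small compared with $\varepsilon$.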

The paper is structured as follows: In Section 2 we start with some preliminaries recalling the necessary technical issues. For the versed reader this section might be skipped or only used as a reference text. In Section 3 we formulate our main result, Theorem 3.2, from which we directly infer all other necessary theorems and corollaries. Section 4 is then dedicated to the proof of Theorem 3.2 and in Section 5 we provide some numerical comparison of our new model and the classical Ambrosio-Tortorelli approximation.

2. Preliminaries and Notation

In this section, we collect the notation and the well-known results from the literature which are used in this paper.

With $B_{\rho}(x)$ we denote the Euclidean ball with radius $\rho>0$ and center $x\in\mathbb{R}^{n}$. For some $S\subset\mathbb{R}^{n}$ the set $B_{\rho}(S)$ refers to the $\rho$-neighborhood of $S$. The set $\mathbb{S}^{n-1}$ is the $(n-1)$-dimensional unit sphere in $\mathbb{R}^{n}$. At some places it is convenient to use the short notation $a\vee b$ and $a\wedge b$ for $\max\{a,b\}$ and $\min\{a,b\}$, respectively.

The essential supremum and the essential infimum of a measurable function $u$ are written as $\operatorname*{ess\,sup}u$ and $\operatorname*{ess\,inf}u$, respectively. The essential support of a measurable function $u$ is denoted by $\operatorname{supp}u$.

2.1. Measure theory

For any set $\Omega\subset\mathbb{R}^{n}$ we denote by $\mathcal{L}^{n}(\Omega)$ the $n$ -dimensional Lebesgue measure and by $\mathcal{H}^{k}(\Omega)$ the $k$ -dimensional Hausdorff measure. Instead of $\mathcal{H}^{0}$ we also use the symbol $\#$ for the counting measure. For a (signed, vector-valued) measure $\mu$ we write $\lvert\mu\rvert$ for its total variation.

2.2. $\Gamma$ -convergence

For some sequence of functionals $(F_{j})$ and a functional $F$ defined on some metric space $X$ we say that $F_{j}$ $\Gamma$ -converges to $F$ as $j\to\infty$ and write $\operatorname*{\Gamma-lim}_{j\to\infty}F_{j}=F$ if there holds the

$\liminf$-inequality:

for all $u\in X$ and all sequences $(u_{j})$ in $X$ with $u_{j}\to u$ there holds

$$F(u)\leq\liminf_{j\to\infty}F_{j}(u_{j})\,. \tag{2.1}$$

$\limsup$-inequality:

for all $u\in X$ there exists a sequence $(u_{j})$ in $X$ such that $u_{j}\to u$ and

$$\limsup_{j\to\infty}F_{j}(u_{j})\leq F(u)\,. \tag{2.2}$$

One often defines

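$$\operatorname*{\Gamma-lim\,inf}_{j\to\infty}F_{j}(u)\coloneq\inf\Bigl\{\liminf_{j\to\infty}F_{j}(u_{j}):u_{j}\to u\Bigr\}\,,\qquad\operatorname*{\Gamma-lim\,sup}_{j\to\infty}F_{j}(u)\coloneq\inf\Bigl\{\limsup_{j\to\infty}F_{j}(u_{j}):u_{j}\to u\Bigr\}\,.$$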

Then the $\liminf$-inequality is equivalent to $F\leq\operatorname*{\Gamma-lim\,inf}_{j\to\infty}F_{j}$ and the $\limsup$-inequality is equivalent to $\operatorname*{\Gamma-lim\,sup}_{j\to\infty}F_{j}\leq F$. Note that $\operatorname*{\Gamma-lim\,inf}_{j\to\infty}F_{j}$ as well as $\operatorname*{\Gamma-lim\,sup}_{j\to\infty}F_{j}$ are lower semi-continuous.

If one has a family of functionals $(F_{\varepsilon})$ for $\varepsilon\in I\subset\mathbb{R}$ the definition is adapted in the usual way, i.e. $F_{\varepsilon}$ $\Gamma$ -converges to $F$ as $\varepsilon\to a$ (for some $a\in\bar{I}$ ) if $F_{\varepsilon_{j}}$ $\Gamma$ -converges to $F$ for all sequences $(\varepsilon_{j})$ in $I$ with $\varepsilon_{j}\to a$ .

The most important property of $\Gamma$ -convergent sequences is the convergence of minimizers to a minimizer of the limit functional, which is stated in the following proposition.

Proposition 2.1.

Let $(F_{\varepsilon})$ , where $F_{\varepsilon}\colon X\to\mathbb{R}\cup\{\infty\}$ , be a sequence of functionals $\Gamma$ -converging to $F\colon X\to\mathbb{R}\cup\{\infty\}$ with respect to the metric space $X$ . Assume that $\inf_{X}F_{\varepsilon}=\inf_{K}F_{\varepsilon}$ for some compact set $K\subset X$ . Then, there holds $\lim_{\varepsilon\to 0}\inf_{X}F_{\varepsilon}=\inf_{X}F$ . Furthermore, for any sequence $(u_{\varepsilon})$ in $X$ converging to $u\in X$ with $F_{\varepsilon}(u_{\varepsilon})=\inf_{X}F_{\varepsilon}$ we have $F(u)=\inf_{X}F$ .

If $F=\operatorname*{\Gamma-lim}_{j\to\infty}F_{j}$ and $u\in X$ , a sequence $(u_{j})$ , for which (2.2) holds, is called a recovery sequence for $u$ , and there clearly holds $\lim_{j\to\infty}F_{j}(u_{j})=F(u)$ . It is actually the case that a convergent sequence of minimizers is a recovery sequence for the minimizer of the $\Gamma$ -limit. For this reason knowing the recovery sequences provides lots of information about the structure of the limit behaviour of the functional sequence.

For more details on the concept of $\Gamma$ -convergence we refer to [17] and [24].
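As a small illustration of the two inequalities, and of the fact that a $\Gamma$-limit can differ from a pointwise limit, consider the standard textbook example $F_{j}(x)=\sin(jx)$ on $X=\mathbb{R}$. This example is not taken from the paper; it is included here only to make the definition concrete.

```latex
% Standard example (not from the paper): F_j(x) = \sin(jx) on X = \mathbb{R}.
\begin{align*}
  &\text{$\liminf$-inequality:}
    && -1\leq\liminf_{j\to\infty}\sin(jx_{j})
       \quad\text{for every sequence } x_{j}\to x,\\
  &\text{recovery sequence:}
    && x_{j}\coloneq\frac{1}{j}\Bigl(-\frac{\pi}{2}
       +2\pi\Bigl\lceil\frac{jx+\pi/2}{2\pi}\Bigr\rceil\Bigr),
       \quad\lvert x_{j}-x\rvert\leq\frac{2\pi}{j},
       \quad\sin(jx_{j})=-1,\\
  &\text{hence:}
    && \operatorname*{\Gamma-lim}_{j\to\infty}F_{j}(x)=-1
       \quad\text{for all } x\in\mathbb{R}.
\end{align*}
```

The $\Gamma$-limit is the constant $-1$, although the sequence $\sin(jx)$ itself does not converge for $x\notin\pi\mathbb{Z}$. The recovery sequence moves each point by at most $2\pi/j$ to the nearest minimum to its right.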

2.3. Functions of bounded variation

In the following we describe the concept and some essential results of functions of bounded variation. For an extensive monograph on this topic we refer to [8]. A more basic introduction can be found in [28].

Let $\Omega\subset\mathbb{R}^{n}$ be non-empty and open for the rest of this section. The set of functions of bounded variation, in short $\mathrm{BV}(\Omega)$ , contains all functions $u\in L^{1}(\Omega)$ whose distributional derivative is a Radon measure, denoted by $\mathrm{D}u$ , i.e. there holds

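$$\int_{\Omega}u\operatorname{div}w\,\mathrm{d}x=-\int_{\Omega}w\,\mathrm{d}\mathrm{D}u\quad\text{for all }w\in C_{c}^{1}(\Omega;\mathbb{R}^{n})\,. \tag{2.3}$$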

Defining the total variation

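$$V(u,\Omega)=\sup\biggl\{\int_{\Omega}u\operatorname{div}w\,\mathrm{d}x:w\in C_{c}^{1}(\Omega;\mathbb{R}^{n}),\lVert w\rVert_{\infty}\leq 1\biggr\} \tag{2.4}$$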

we obtain from the Riesz representation theorem that (2.3) is equivalent to $V(u,\Omega)<\infty$ . Furthermore, there holds $\lvert\mathrm{D}u\rvert(\Omega)=V(u,\Omega)$ for all $u\in\mathrm{BV}(\Omega)$ .

For any measurable function $u\colon\Omega\to\mathbb{R}$ we define for all $x\in\Omega$ the upper and lower approximate limit, respectively, by

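$$u^{+}(x)\coloneq\inf\Bigl\{t\in\mathbb{R}:\lim_{\rho\to 0}\frac{\mathcal{L}^{n}\bigl(\{u>t\}\cap B_{\rho}(x)\bigr)}{\rho^{n}}=0\Bigr\}\,,\qquad u^{-}(x)\coloneq\sup\Bigl\{t\in\mathbb{R}:\lim_{\rho\to 0}\frac{\mathcal{L}^{n}\bigl(\{u<t\}\cap B_{\rho}(x)\bigr)}{\rho^{n}}=0\Bigr\}\,.$$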

For all $x\in\Omega$ there obviously holds $u^{-}(x)\leq u^{+}(x)$ . If $u^{-}(x)=u^{+}(x)$ we write for their common value $u^{\ast}(x)$ . The set $S_{u}$ is the discontinuity set containing all those points $x\in\Omega$ for which there holds $u^{-}(x)<u^{+}(x)$ .

In what follows let $u\in\mathrm{BV}(\Omega)$ . Then, $S_{u}$ has Lebesgue measure 0 and for $\mathcal{H}^{n-1}$ -almost all points $x\in S_{u}$ one can find a unit normal vector $\nu_{u}(x)$ such that $u^{+}(x)=\bigl{(}u\rvert_{H^{+}(x)}\bigr{)}^{\ast}(x)$ and $u^{-}(x)=\bigl{(}u\rvert_{H^{-}(x)}\bigr{)}^{\ast}(x)$ with

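$$H^{+}(x)\coloneq\{y\in\Omega:\langle y-x,\nu_{u}(x)\rangle>0\}\,,\qquad H^{-}(x)\coloneq\{y\in\Omega:\langle y-x,\nu_{u}(x)\rangle<0\}\,.$$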

If this is the case one says that $x$ is a jump point. We call $\tilde{u}$ a precise representative of $u$ if $\tilde{u}(x)=u^{\ast}(x)$ for all $x\in\Omega\setminus S_{u}$ and $\tilde{u}(x)=\frac{1}{2}(u^{+}(x)+u^{-}(x))$ for all jump points $x\in S_{u}$ .

For functions of bounded variation on the real line we actually have that every point in $S_{u}$ is a jump point. Furthermore, on an open interval the pointwise variation of $\tilde{u}$ and the variation as defined in (2.4) coincide. Precisely, for $a<b$ and $u\in\mathrm{BV}(a,b)$ there holds

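$$V\bigl(u,(a,b)\bigr)=\sup\Biggl\{\sum_{i=1}^{N}\bigl\lvert\tilde{u}(t_{i})-\tilde{u}(t_{i-1})\bigr\rvert:N\in\mathbb{N},a<t_{0}<\dots<t_{N}<b\Biggr\}\,.$$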
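The sum $\sum_{i}\lvert\tilde{u}(t_{i})-\tilde{u}(t_{i-1})\rvert$ over a fixed partition is easy to evaluate numerically and always gives a lower bound for the variation. The following is a minimal sketch; the function name `pointwise_variation` and the test function (slope 1 on $(0,1)$ with a jump of height 2 at $t=0.5$) are hypothetical choices made here for illustration.

```python
import numpy as np

def pointwise_variation(values):
    """Sum of |u(t_i) - u(t_{i-1})| over a fixed partition t_0 < ... < t_N."""
    values = np.asarray(values, dtype=float)
    return float(np.sum(np.abs(np.diff(values))))

# Partition points strictly inside the interval (0, 1).
t = np.linspace(0.0, 1.0, 1001)[1:-1]

# Slope 1 everywhere plus a jump of height 2 at t = 0.5.
u = np.where(t < 0.5, t, t + 2.0)

variation = pointwise_variation(u)
print(variation)
```

For this monotone function the sum telescopes to the difference between the last and the first sampled value. Refining the partition towards the endpoints of the interval lets the sum approach the full variation, which here consists of the jump height 2 plus the contribution 1 of the slope.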

For any $u\in\mathrm{BV}(\Omega)$ one can split the measure $\mathrm{D}u$ in the following way

$$\mathrm{D}u=\mathrm{D}^{a}u+\mathrm{D}^{j}u+\mathrm{D}^{c}u\,,$$

where $\mathrm{D}^{a}u=\nabla u\mathcal{L}^{n}$ denotes the absolutely continuous part of $\mathrm{D}u$ with respect to the Lebesgue measure. Therefore, with $\nabla u$ we denote its density function, which we also call the approximate gradient of $u$ . With $\mathrm{D}^{j}u$ we denote the jump part of $u$ , which can be written as $\mathrm{D}^{j}u=(u^{+}-u^{-})\cdot\nu_{u}\mathcal{H}^{n-1}\llcorner S_{u}$ , and $\mathrm{D}^{c}u$ is the Cantor part.

The set of special functions of bounded variation, denoted by $\mathrm{SBV}(\Omega)$ , contains those functions of bounded variation whose Cantor part is zero, i.e. we have $\mathrm{SBV}(\Omega)=\{u\in\mathrm{BV}(\Omega):\mathrm{D}^{c}u=0\}$ . The singular part of such functions is therefore only concentrated on the set of jump points.

A measurable function $u\colon\Omega\to\mathbb{R}$ is a generalized special function of bounded variation, where we write $u\in\mathrm{GSBV}(\Omega)$ , if any truncation of $u$ is locally a special function of bounded variation, i.e. $u^{M}\in\mathrm{SBV}_{\text{loc}}(\Omega)$ for all $M>0$ , with $u^{M}=(-M)\vee u\wedge M$ . Note that for $u\in\mathrm{GSBV}(\Omega)$ we cannot define $\nabla u$ as above, because the distributional derivative does not need to be a measure on that space. However, $\nabla u^{M}$ is well defined for all $M>0$ and converges pointwise a.e. as $M\to\infty$ . Thus, we simply define $\nabla u(x)=\lim_{M\to\infty}\nabla u^{M}(x)$ for a.e. $x\in\Omega$ . Furthermore, one can show that $S_{u}=\bigcup_{M>0}S_{u^{M}}$ . These results and more details can be found in [8, Section 4.5] and the references therein.

Moreover, we will use the following two subspaces of $\mathrm{GSBV}(\Omega)$ and $\mathrm{SBV}(\Omega)$ defined for every $p>0$ by

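$$\mathrm{SBV}^{p}(\Omega)\coloneq\bigl\{u\in\mathrm{SBV}(\Omega):\nabla u\in L^{p}(\Omega;\mathbb{R}^{n}),\ \mathcal{H}^{n-1}(S_{u})<\infty\bigr\}\,,\qquad\mathrm{GSBV}^{p}(\Omega)\coloneq\bigl\{u\in\mathrm{GSBV}(\Omega):\nabla u\in L^{p}(\Omega;\mathbb{R}^{n}),\ \mathcal{H}^{n-1}(S_{u})<\infty\bigr\}\,.$$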

A density result, which plays an important role in the proof of the $\limsup$ -inequality for our main assertion, is stated in the next theorem. It follows directly from [23, Theorem 3.1] and the following remarks therein.

Theorem 2.2.

Let $\Omega\subset\mathbb{R}^{n}$ be non-empty, open and bounded with Lipschitz boundary, and take $u\in\mathrm{SBV}^{2}(\Omega)\cap L^{\infty}(\Omega)$ . Then, there exists a sequence $(w_{j})$ in $\mathrm{SBV}^{2}(\Omega)\cap L^{\infty}(\Omega)$ such that

1. $\overline{S_{w_{j}}}$ is a polyhedral set,

2. $\mathcal{H}^{n-1}\bigl(\overline{S_{w_{j}}}\setminus S_{w_{j}}\bigr)=0$,

3. $w_{j}\in W^{1,\infty}(\Omega\setminus S_{w_{j}})$ for all $j\in\mathbb{N}$,

4. $w_{j}\to u$ in $L^{1}(\Omega)$ as $j\to\infty$,

5. $\nabla w_{j}\to\nabla u$ in $L^{2}(\Omega)$ as $j\to\infty$,

6. $\mathcal{H}^{n-1}(S_{w_{j}})\to\mathcal{H}^{n-1}(S_{u})$ as $j\to\infty$.

We now briefly introduce the concept of slicing, which is essential for the proof of the $\liminf$-inequality. For that purpose, let $\Omega\subset\mathbb{R}^{n}$ be open and bounded, and let $\xi\in\mathbb{S}^{n-1}$ be a unit vector. Then, we write $\Omega_{\xi}$ for the projection of $\Omega$ onto $\xi^{\perp}$, and we set

$$\Omega_{y}^{\xi}\coloneq\{t\in\mathbb{R}:y+t\xi\in\Omega\}\quad\text{for all }y\in\Omega_{\xi}\,.$$

Furthermore, for any function $u\in L^{1}(\Omega)$ and for $\mathcal{L}^{n-1}$ -a.a. $y\in\Omega_{\xi}$ we can define $u_{y}^{\xi}(t)\coloneq u(y+t\xi)$ for a.a. $t\in\Omega^{\xi}_{y}$ .

One can show the following important results revealing the connection between a function $u\in\mathrm{SBV}(\Omega)$ and its sliced functions $u_{y}^{\xi}$ . There are more general results for $\mathrm{BV}$ -functions, which are not needed in this context. The interested reader can find the details in [8, Section 3.11].

Theorem 2.3.

Let $u\in L^{1}(\Omega)$ . Then $u\in\mathrm{SBV}(\Omega)$ if and only if for all $\xi\in\mathbb{S}^{n-1}$ there holds $u_{y}^{\xi}\in\mathrm{SBV}(\Omega_{y}^{\xi})$ for $\mathcal{L}^{n-1}$ -a.a. $y\in\Omega_{\xi}$ and

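$$\int_{\Omega_{\xi}}\bigl\lvert\mathrm{D}u_{y}^{\xi}\bigr\rvert\bigl(\Omega_{y}^{\xi}\bigr)\,\mathrm{d}\mathcal{L}^{n-1}(y)<\infty\,.$$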

Furthermore, if $u\in\mathrm{BV}(\Omega)$ there holds for all $\xi\in\mathbb{S}^{n-1}$ , for $\mathcal{L}^{n-1}$ -a.a. $y\in\Omega_{\xi}$ and for a.a. $t\in\Omega_{y}^{\xi}$ :

1. $(u_{y}^{\xi})^{\prime}(t)=\bigl\langle\nabla u(y+t\xi),\xi\bigr\rangle$,

2. $S_{u_{y}^{\xi}}=(S_{u})_{y}^{\xi}$,

3. $(u_{y}^{\xi})^{\pm}(t)=u^{\pm}(y+t\xi)$,

4. $\bigl\lvert\langle\mathrm{D}^{\ast}u,\xi\rangle\bigr\rvert(\Omega)=\int_{\Omega_{\xi}}\bigl\lvert\mathrm{D}^{\ast}u_{y}^{\xi}\bigr\rvert(\Omega^{\xi}_{y})\,\mathrm{d}\mathcal{L}^{n-1}(y)$ for $\ast=a,j,c$.

The following corollary directly follows by a truncation argument.

Corollary 2.4.

Let $u\in L^{1}(\Omega)$ . Then $u\in\mathrm{GSBV}(\Omega)$ if and only if for all $\xi\in\mathbb{S}^{n-1}$ there holds $u_{y}^{\xi}\in\mathrm{SBV}(\Omega_{y}^{\xi})$ for $\mathcal{L}^{n-1}$ -a.e. $y\in\Omega_{\xi}$ and

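$$\int_{\Omega_{\xi}}\bigl\lvert\mathrm{D}\bigl((-M)\vee u_{y}^{\xi}\wedge M\bigr)\bigr\rvert\bigl(\Omega_{y}^{\xi}\bigr)\,\mathrm{d}\mathcal{L}^{n-1}(y)<\infty\quad\text{for all }M>0\,.$$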

2.4. Convex functions

In particular for the numerical part of this paper, we also need some theory about convex functions. Good references for this topic are [33] and [27].

For $\Omega\subset\mathbb{R}^{n}$ the characteristic function $\chi_{\Omega}$ over $\Omega$ is given by $\chi_{\Omega}=0$ on $\Omega$ and $\chi_{\Omega}=+\infty$ on $\mathbb{R}^{n}\setminus\Omega$ . It is a convex function if and only if $\Omega$ is a convex set. For any function $f\colon\Omega\to\mathbb{R}$ , bounded from below by some affine function, $f^{\ast}\colon\mathbb{R}^{n}\to\mathbb{R}$ denotes its convex conjugate, i.e.

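$$f^{\ast}(y)=\sup_{x\in\mathbb{R}^{n}}\bigl(\langle x,y\rangle-f(x)\bigr)\quad\text{for all }y\in\mathbb{R}^{n}\,,$$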

where $f$ is set to $+\infty$ outside of $\Omega$ . This definition directly yields Fenchel’s inequality, which says

$$\langle x,y\rangle\leq f(x)+f^{\ast}(y)\quad\text{for all }x,y\in\mathbb{R}^{n}\,.$$

We remark that $f^{\ast}$ is always convex and lower semi-continuous and the biconjugate $f^{\ast\ast}=(f^{\ast})^{\ast}$ is the lower semi-continuous convex hull of $f$ . Furthermore, $f$ is convex and lower semi-continuous if and only if $f=f^{\ast\ast}$ .
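The convex conjugate $f^{\ast}(y)=\sup_{x}\bigl(\langle x,y\rangle-f(x)\bigr)$ and Fenchel's inequality can be checked numerically in one dimension by replacing the supremum with a maximum over grid points. The sketch below is an illustration added here, with the hypothetical helper `conjugate_on_grid` and the test function $f(x)=\frac{1}{2}x^{2}$, whose conjugate is known to be $f^{\ast}(y)=\frac{1}{2}y^{2}$.

```python
import numpy as np

def conjugate_on_grid(f_vals, x, y):
    """Approximate f*(y) = sup_x (x*y - f(x)) by a maximum over the grid points x."""
    # Row j of the matrix holds y[j] * x[i] - f(x[i]) for all grid points x[i].
    return np.max(np.outer(y, x) - f_vals[None, :], axis=1)

x = np.linspace(-5.0, 5.0, 2001)      # grid on which f is sampled
y = np.linspace(-2.0, 2.0, 41)        # points at which f* is evaluated
f_vals = 0.5 * x**2                   # f(x) = x^2 / 2

f_star = conjugate_on_grid(f_vals, x, y)
max_error = np.max(np.abs(f_star - 0.5 * y**2))
print(f"largest deviation from y^2 / 2: {max_error:.2e}")
```

The grid maximum never exceeds the true supremum, so the approximation of $f^{\ast}$ is a lower bound that is accurate as long as the maximizer $x=y$ lies well inside the sampled interval.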

We will also make use of the subdifferential of a function $f\colon\mathbb{R}^{n}\to(-\infty,+\infty]$ , which we denote by $\partial f$ . It is given by

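$$\partial f(x)=\{z\in\mathbb{R}^{n}:f(x)-f(y)\leq\langle z,x-y\rangle\text{ for all }y\in\mathbb{R}^{n}\}\quad\text{for all }x\in\mathbb{R}^{n}\,.$$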

If $f$ is differentiable in $x\in\mathbb{R}^{n}$ , we have $\partial f(x)=\{\nabla f(x)\}$ .
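For a non-differentiable function, which is the typical situation for total variation terms, the defining inequality $f(x)-f(y)\leq\langle z,x-y\rangle$ can be evaluated by hand. The choice $f(x)=\lvert x\rvert$ on $\mathbb{R}$ is an illustration added here and not an example from the paper.

```latex
% Illustration (not from the paper): subdifferential of f(x) = |x| on \mathbb{R}.
\begin{align*}
  z\in\partial f(0)
  &\iff -\lvert y\rvert\leq -zy \ \text{ for all } y\in\mathbb{R}
   \iff zy\leq\lvert y\rvert \ \text{ for all } y\in\mathbb{R}
   \iff \lvert z\rvert\leq 1,\\
  \partial f(x) &=
  \begin{cases}
    \{-1\}  & \text{for } x<0,\\
    [-1,1]  & \text{for } x=0,\\
    \{+1\}  & \text{for } x>0.
  \end{cases}
\end{align*}
```

Away from the origin $f$ is differentiable and the subdifferential reduces to the single slope $\pm 1$; at the kink it contains every slope between the two one-sided derivatives.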

3. Main Result

For our main result we need several, quite technical assumptions. In order to keep a better overview we first list them here.

Assumption 3.1.

Let $\varepsilon_{0}>0$ . For each $0<\varepsilon<\varepsilon_{0}$ let

[A1] $W_{\varepsilon}\colon[0,1]\to[0,\infty)$ be continuous such that $W_{\varepsilon}\to W$ in $L^{1}([0,1])$ as $\varepsilon\to 0$ for some $W\in L^{1}([0,1])$ with $1\in\operatorname{supp}W$.

[A2] $\varphi_{\varepsilon}\colon W_{\varepsilon}([0,1])\to\mathbb{R}$ be a convex function such that $\varphi_{\varepsilon}(W_{\varepsilon}(1))\to 0$ and $\varphi_{\varepsilon}(W_{\varepsilon}(\cdot))\to+\infty$ uniformly on $[0,T]$ for all $0<T<1$, i.e. for all $C>0$ there exists $0<\tilde{\varepsilon}<\varepsilon_{0}$ such that $\varphi_{\varepsilon}(W_{\varepsilon}(t))>C$ for all $t\in[0,T]$ and $\varepsilon<\tilde{\varepsilon}$.

[A3] $\psi_{\varepsilon}\colon[0,\infty)\to[0,\infty)$ be a convex function such that $\lim_{t\to\infty}\frac{\psi_{\varepsilon}(t)}{t}=c_{\varepsilon}<\infty$, $\psi_{\varepsilon}(0)\to 0$ and $c_{\varepsilon}\to c_{0}:=\int_{0}^{1}W(s)\;\mathrm{d}s$ as $\varepsilon\to 0$ for $W$ from [A1], $\varphi^{\ast}_{\varepsilon}\leq\psi_{\varepsilon}$ on $[0,\infty)$, where $\varphi^{\ast}_{\varepsilon}$ denotes the convex conjugate of $\varphi_{\varepsilon}$ (see Section 2.4), and $\psi_{\varepsilon}(t)\geq ct+d$ for all $t\geq 0$ and some $c>0$, $d\in\mathbb{R}$ independent of $\varepsilon$.

[A4] $\eta_{\varepsilon}>0$ such that $\eta_{\varepsilon}\varphi_{\varepsilon}(W_{\varepsilon}(0))\to 0$ as $\varepsilon\to 0$.

Furthermore, assume that

[A5] $f\colon[0,1]\to[0,\infty)$ is a Lipschitz continuous, non-decreasing function with $f(0)=0$ and $f>0$ on $(0,1]$.

We are now ready to state our main theorem.

Theorem 3.2.

Let $\Omega\subset\mathbb{R}^{n}$ be a non-empty, open, bounded set with Lipschitz boundary, let $W_{\varepsilon}$ , $\varphi_{\varepsilon}$ , $\psi_{\varepsilon}$ , $\eta_{\varepsilon}$ , $f$ and $c_{\varepsilon},c_{0}>0$ be given as in Assumption 3.1. For each $\varepsilon>0$ , we define the functional $F_{\varepsilon}\colon L^{1}(\Omega)\times L^{1}(\Omega)\to\mathbb{R}$ by

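$$F_{\varepsilon}(u,v)\coloneq\int_{\Omega}\bigl(f(v)+\eta_{\varepsilon}\bigr)\lvert\nabla u\rvert^{2}\,\mathrm{d}x+\int_{\Omega}\varphi_{\varepsilon}\bigl(W_{\varepsilon}(v)\bigr)+\psi_{\varepsilon}\bigl(\lvert\nabla v\rvert\bigr)\,\mathrm{d}x+c_{\varepsilon}\bigl(\lvert\mathrm{D}^{j}v\rvert(\Omega)+\lvert\mathrm{D}^{c}v\rvert(\Omega)\bigr)$$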

for all $u\in H^{1}(\Omega),v\in\mathrm{BV}(\Omega;[0,1])$ and $F_{\varepsilon}(u,v)\coloneq+\infty$ otherwise.

Moreover, define $F\colon L^{1}(\Omega)\times L^{1}(\Omega)\to\mathbb{R}$ by

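$$F(u,v)\coloneq\begin{cases}\int_{\Omega}f(1)\lvert\nabla u\rvert^{2}\,\mathrm{d}x+2c_{0}\mathcal{H}^{n-1}(S_{u})&\text{for }u\in\mathrm{GSBV}^{2}(\Omega),\ v=1\text{ a.e.,}\\+\infty&\text{otherwise.}\end{cases}$$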

Then $F=\operatorname*{\Gamma-lim}_{\varepsilon\to 0}F_{\varepsilon}$ with respect to the strong topology in $L^{1}(\Omega)\times L^{1}(\Omega)$ .

For our application in image segmentation we aim for a minimization of (1.2). In the following corollary we add the missing $L^{2}$-penalization term in the functionals $F$ and $F_{\varepsilon}$.

Corollary 3.3.

Let $\Omega\subset\mathbb{R}^{n}$ be a non-empty, open, bounded set with Lipschitz boundary, let $W_{\varepsilon}$ , $\varphi_{\varepsilon}$ , $\psi_{\varepsilon}$ , $\eta_{\varepsilon}$ , $f$ and $c_{\varepsilon},c_{0}>0$ be given as in Assumption 3.1. Furthermore, let $\beta>0$ and $g\in L^{\infty}(\Omega)$ , and let $F_{\varepsilon}$ and $F$ be given as in Theorem 3.2. We define for every $\varepsilon>0$ the functional

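$$G_{\varepsilon}(u,v)\coloneq\begin{cases}F_{\varepsilon}(u,v)+\frac{\beta}{2}\int_{\Omega}\lvert u-g\rvert^{2}\,\mathrm{d}x&\text{for }u\in H^{1}(\Omega),\ v\in\mathrm{BV}(\Omega;[0,1]),\\+\infty&\text{otherwise.}\end{cases}$$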

Moreover, we define $G\colon L^{1}(\Omega)\times L^{1}(\Omega)\to\mathbb{R}$ by

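$$G(u,v)\coloneq\begin{cases}F(u,v)+\frac{\beta}{2}\int_{\Omega}\lvert u-g\rvert^{2}\,\mathrm{d}x&\text{for }u\in\mathrm{SBV}^{2}(\Omega)\cap L^{\infty}(\Omega),\ v=1\text{ a.e.,}\\+\infty&\text{otherwise.}\end{cases}$$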

Then, $G=\operatorname*{\Gamma-lim}_{\varepsilon\to 0}G_{\varepsilon}$ with respect to the strong topology in $L^{1}(\Omega)\times L^{1}(\Omega)$ .

Proof.

Since $u\mapsto\int_{\Omega}\lvert u-g\rvert^{2}\,\mathrm{d}x$ is lower semi-continuous, the $\liminf$ -inequality follows directly from Theorem 3.2. Hence, we have

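$$G(u,v)\leq\operatorname*{\Gamma-lim\,inf}_{\varepsilon\to 0}G_{\varepsilon}(u,v)\quad\text{for all }u,v\in L^{1}(\Omega)\,. \tag{3.2}$$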

In order to show the $\limsup$ -inequality it suffices to consider $u\in\mathrm{SBV}^{2}(\Omega)\cap L^{\infty}(\Omega)$ and $v=1$ a.e., since otherwise, the left hand side of (3.2) is $+\infty$ and there is nothing to show.

From Theorem 3.2 we know that there exists a sequence $(u_{\varepsilon},v_{\varepsilon})$ in $H^{1}(\Omega)\times\mathrm{BV}(\Omega)$ converging to $(u,v)$ as $\varepsilon\to 0$ in the strong $L^{1}(\Omega)\times L^{1}(\Omega)$ -topology such that

[TABLE]

We consider the truncated function sequence $u_{\varepsilon}^{M}$ with $M=\lVert u\rVert_{L^{\infty}}$ , and note that $u_{\varepsilon}^{M}\to u$ in $L^{2}(\Omega)$ as $\varepsilon\to 0$ . Therefore, we also have

[TABLE]

Furthermore, one can easily verify that $F_{\varepsilon}(u_{\varepsilon}^{M},v_{\varepsilon})\leq F_{\varepsilon}(u_{\varepsilon},v_{\varepsilon})$ , so that

[TABLE]

which is the required $\limsup$ -inequality. ∎

In view of Proposition 2.1 the existence and compactness of minimizers of the approximating functionals $G_{\varepsilon}$ needs to be shown in order to obtain their convergence to a minimizer of the functional $G$ . We give a rigorous proof in the following theorem.

Theorem 3.4.

In the setting of Corollary 3.3 a minimizer of $G_{\varepsilon}$ exists for every $\varepsilon>0$ . Furthermore, let $\varepsilon_{j}$ be an infinitesimal sequence, and let $(u_{\varepsilon_{j}},v_{\varepsilon_{j}})\in H^{1}(\Omega)\times\mathrm{BV}(\Omega;[0,1])$ be a minimizer of $G_{\varepsilon_{j}}$ for every $j\in\mathbb{N}$ . Then, $v_{\varepsilon_{j}}\to 1$ in $L^{1}(\Omega)$ , and there exists $u\in\mathrm{SBV}^{2}(\Omega)\cap L^{\infty}(\Omega)$ , such that, up to a subsequence, $u_{\varepsilon_{j}}\to u$ in $L^{1}(\Omega)$ , and $(u,1)$ minimizes $G$ .

Proof.

In order to show the existence of minimizers of $G_{\varepsilon}$ we fix $\varepsilon>0$ and take a minimizing sequence $(u_{j},v_{j})$ of $G_{\varepsilon}$ , i.e.

[TABLE]

In view of $\psi_{\varepsilon}(t)\geq ct+d$ for all $t\geq 0$ and some $c>0$ as stated in Assumption [A3], it is easy to see that $(\lvert\mathrm{D}v_{j}\rvert(\Omega))$ is bounded. Further, $(u_{j})$ is bounded in $H^{1}(\Omega)$ , since $\eta_{\varepsilon}>0$ . By the compactness properties of $\mathrm{BV}(\Omega)$ (see [8, Theorem 3.23]) and $H^{1}(\Omega)$ there exist subsequences of $(u_{j})$ and $(v_{j})$ (not relabeled) and functions $v\in\mathrm{BV}(\Omega)$ and $u\in H^{1}(\Omega)$ , such that $v_{j}\to v$ in $L^{1}(\Omega)$ , $\mathrm{D}v_{j}$ converges sequentially weakly* (in the space of Radon measures) to $\mathrm{D}v$ and $u_{j}\rightharpoonup u$ weakly in $H^{1}(\Omega)$ .

From Fatou’s Lemma and [8, Theorems 5.4 and 5.8] we get the lower semi-continuity of $G_{\varepsilon}$ so that

[TABLE]

Hence, the pair $(u,v)$ minimizes $G_{\varepsilon}$ .

Now let $(\varepsilon_{j})$ be a sequence converging to $0$ , and let the pair $(u_{\varepsilon_{j}},v_{\varepsilon_{j}})$ be a minimizer of $G_{\varepsilon_{j}}$ for every $j\in\mathbb{N}$ . Then we simply have

[TABLE]

which implies, together with Assumption [A2], $v_{\varepsilon_{j}}\to 1$ in $L^{1}(\Omega)$ as $\varepsilon_{j}\to 0$ .

From a simple cut-off argument we get that $\lVert u_{\varepsilon_{j}}\rVert_{L^{\infty}(\Omega)}\leq\lVert g\rVert_{L^{\infty}(\Omega)}$ . Since $f$ is Lipschitz continuous according to Assumption [A5], we have $f(v_{\varepsilon_{j}})\in\mathrm{BV}(\Omega)$ with $\mathrm{D}f(v_{\varepsilon_{j}})$ obeying the chain rule for $\mathrm{BV}$ -functions (see [8, Theorem 3.99]). Further, the multiplication operation $(s,t)\mapsto st$ is continuously differentiable and Lipschitz continuous on bounded sets, thus, since both $u_{\varepsilon_{j}}$ and $f(v_{\varepsilon_{j}})$ are a.e. bounded, the product rule for $\mathrm{BV}$ -functions holds (see [8, Theorem 3.99]), giving $w_{\varepsilon_{j}}\coloneq u_{\varepsilon_{j}}f(v_{\varepsilon_{j}})\in\mathrm{BV}(\Omega)$ . We moreover have

[TABLE]

Since $u_{\varepsilon_{j}}$ , $f$ and $f^{\prime}$ are bounded, we can estimate, employing a weighted version of the Cauchy–Schwarz inequality and Young’s inequality,

[TABLE]

where $C>0$ depends on $f$ and $\Omega$ .

By Assumption [A3], $t\leq c^{-1}(\psi_{\varepsilon_{j}}(t)-d)$ for all $t\geq 0$ and some $c>0$ , $d\in\mathbb{R}$ , so

[TABLE]

with $C>0$ suitably chosen. Altogether, since $(c_{\varepsilon})$ is bounded, we obtain

[TABLE]

where here $C>0$ is a constant depending on $\Omega$ , $g$ , $f$ and $c_{0}$ . Hence, $\lvert\mathrm{D}w_{\varepsilon_{j}}\rvert(\Omega)$ is bounded.

Clearly, $w_{\varepsilon_{j}}$ is pointwise a.e. bounded independent of $j$ , so by the compactness properties of $\mathrm{BV}(\Omega)$ (see [8, Theorem 3.23]) there exists a subsequence of $\varepsilon_{j}$ (not relabeled) converging to 0, such that $w_{\varepsilon_{j}}$ converges to some $w$ in $L^{1}(\Omega)$ . Since $v_{\varepsilon_{j}}\to 1$ a.e. and $f$ is continuous, we also have that $u_{\varepsilon_{j}}=w_{\varepsilon_{j}}/f(v_{\varepsilon_{j}})\to w/f(1)$ a.e. as $\varepsilon_{j}\to 0$ . Since $\lVert u_{\varepsilon_{j}}\rVert_{L^{\infty}(\Omega)}\leq\lVert g\rVert_{L^{\infty}(\Omega)}$ , the Dominated Convergence Theorem yields $u_{\varepsilon_{j}}\to w/f(1)$ in $L^{1}(\Omega)$ .

The assertion now follows from Proposition 2.1 and Corollary 3.3. ∎

Remark 3.5.

Note that Theorem 3.2 and Corollary 3.3 also hold for $\eta_{\varepsilon}=0$ . However, for the existence of minimizers of $G_{\varepsilon}$ we require $\eta_{\varepsilon}>0$ as indicated in the proof of Theorem 3.4.

The following corollary is a special case of the previous results and is the version relevant for our numerical computations in Section 5.

Corollary 3.6.

Let $\Omega\subset\mathbb{R}^{n}$ be a non-empty, open, bounded set with Lipschitz boundary and let $\alpha,\beta,\gamma>0$ . For each $\varepsilon>0$ let $\eta_{\varepsilon}>0$ be such that $\frac{\eta_{\varepsilon}}{\varepsilon}\to 0$ as $\varepsilon\to 0$ and define the functionals $G_{\varepsilon}\colon L^{1}(\Omega)\times L^{1}(\Omega)\to\mathbb{R}$ by

[TABLE]

if $u\in H^{1}(\Omega),v\in\mathrm{BV}(\Omega;[0,1])$ and $G_{\varepsilon}(u,v)\coloneq+\infty$ otherwise. Moreover, define $G\colon L^{1}(\Omega)\times L^{1}(\Omega)\to\mathbb{R}$ by

[TABLE]

for $u\in\mathrm{SBV}^{2}(\Omega)\cap L^{\infty}(\Omega)$ , $v=1$ a.e., and $G(u,v)=+\infty$ otherwise.

Then, for every infinitesimal sequence $(\varepsilon_{j})$ a minimizer $(u_{\varepsilon_{j}},v_{\varepsilon_{j}})$ of $G_{\varepsilon_{j}}$ exists for every $j\in\mathbb{N}$ . Furthermore, $v_{\varepsilon_{j}}\to 1$ in $L^{1}(\Omega)$ , and up to a subsequence $u_{\varepsilon_{j}}\to u$ in $L^{1}(\Omega)$ with $(u,1)$ being a minimizer of $G$ .

Proof of Corollary 3.6.

We define $\tilde{G}_{\varepsilon}\coloneq\frac{2}{\gamma}G_{\varepsilon}$ and choose the functions $f$ , $W_{\varepsilon}$ , $\varphi_{\varepsilon}$ and $\psi_{\varepsilon}$ in the following way:

[TABLE]

for all $t\in[0,1],s\in[0,\infty)$ and $0<\varepsilon<1$ . Note that in this setting we have

[TABLE]

and hence, one can simply verify that Assumption 3.1 is fulfilled with $c_{0}=1$ .

From Theorem 3.2 we, therefore, get that $\tilde{G}_{\varepsilon}$ $\Gamma$ -converges to $\tilde{G}\coloneq\frac{2}{\gamma}G$ . Since $\Gamma$ -convergence is preserved under constant multiplication we get the result by multiplying $\tilde{G}_{\varepsilon}$ and $\tilde{G}$ with $\frac{\gamma}{2}$ . ∎

4. Proof of Theorem 3.2

The proof of Theorem 3.2 follows the usual strategy that has been used for the classical Ambrosio-Tortorelli approximation and various generalizations (see [9, 10, 16, 25, 34, 35]). Firstly, we show the $\liminf$ -inequality on the real line (see Proposition 4.1). The generalization to the multi-dimensional case, stated in Proposition 4.2, is then shown by a slicing argument.

The $\limsup$ -inequality is shown with the help of the density result, Theorem 2.2. Here, we exploit the fact that the phase field variable is allowed to have jumps, which enables the construction of a much simpler recovery sequence than when the phase field needs to be smooth.

Proposition 4.1.

In the setting of Theorem 3.2 with $\Omega\subset\mathbb{R}$ we redefine $F\colon L^{1}(\Omega)\times L^{1}(\Omega)\to\mathbb{R}$ by

[TABLE]

Then there holds $F\leq\operatorname*{\Gamma-lim\,inf}_{\varepsilon\to 0}F_{\varepsilon}$ .

Proof.

First of all, for each open set $I\subset\Omega$ we define the localized functionals

[TABLE]

for all $u\in H^{1}(I)$ and $v\in\mathrm{BV}(I;[0,1])$ , and $F_{\varepsilon}(u,v;I)\coloneq+\infty$ otherwise.

Now, let $(\varepsilon_{j})$ be a sequence of positive numbers with $\varepsilon_{j}\to 0$ as $j\to\infty$ , and let $(u_{j})$ and $(v_{j})$ be sequences in $L^{1}(\Omega)$ such that $u_{j}\to u$ and $v_{j}\to v$ as $j\to\infty$ . By possibly extracting a subsequence, we can assume that

[TABLE]

Therefore, we must have $\int_{\Omega}\varphi_{\varepsilon_{j}}(W_{\varepsilon_{j}}(v_{j}))\,\mathrm{d}x<\infty$ , and because of the uniform convergence of $\varphi_{\varepsilon_{j}}(W_{\varepsilon_{j}}(\cdot))$ to $+\infty$ as $\varepsilon\to 0$ (see [A2]), we obtain that $v=1$ a.e. on $\Omega$ .

We first show that $\#S_{u}$ is finite and

[TABLE]

For that let $y_{0}\in S_{u}$ , and let $\delta>0$ be sufficiently small such that $B_{\delta}(y_{0})\subset\Omega$ . Set $M\coloneq\liminf_{j\to\infty}\operatorname*{ess\,inf}_{B_{\frac{\delta}{2}}(y_{0})}(f\circ v_{j})$ and assume that $M>0$ . Furthermore, let $0<\kappa<M$ and choose $j_{0}>0$ such that up to a subsequence, there holds $M<\operatorname*{ess\,inf}_{B_{\frac{\delta}{2}}(y_{0})}(f\circ v_{j})+\kappa$ for all $j>j_{0}$ . Then there holds

[TABLE]

so that $u_{j}^{\prime}$ converges weakly to $u^{\prime}$ in $L^{2}(B_{\frac{\delta}{2}}(y_{0}))$ and consequently $y_{0}\notin S_{u}$ . Hence, we must have $M=0$ , and we can find a sequence $(y_{j})$ such that $f(\tilde{v}_{j}(y_{j}))\to 0$ , where $\tilde{v}_{j}$ is a precise representative of $v_{j}$ . The assumptions on $f$ in [A5] imply $\tilde{v}_{j}(y_{j})\to 0$ as $j\to\infty$ . Since $\tilde{v}_{j}\to 1$ a.e. we can, therefore, find $y^{+},y^{-}\in B_{\delta}(y_{0})$ with $y^{-}<y_{0}<y^{+}$ such that $\tilde{v}_{j}(y^{-})\to 1$ as well as $\tilde{v}_{j}(y^{+})\to 1$ .

With this at hand we get from the $L^{1}$ -convergence of $W_{\varepsilon}$ (see [A1]),

[TABLE]

Defining

[TABLE]

we get, for $j$ large enough,

[TABLE]

and together with (2.5)

[TABLE]

Applying Lemma A.2 yields

[TABLE]

By merging (4.2), (4.4) and (4.5) and since $c_{\varepsilon}\to c_{0}$ as $\varepsilon\to 0$ (see [A3]) we deduce

[TABLE]

For every $N\leq\#S_{u}$ we can repeat the preceding arguments for each element in a set $\{y_{1},\dotsc,y_{N}\}\subset S_{u}$ with $\delta>0$ sufficiently small such that $B_{\delta}(y_{k})\cap B_{\delta}(y_{\ell})=\emptyset$ for $k\neq\ell$ in order to obtain

[TABLE]

By assumption the right hand side is finite; hence, there must hold $\#S_{u}<\infty$ and we deduce (4.1).

In the next step we show that for all $\delta>0$ ,

[TABLE]

Let $I\coloneq(a,b)\subset\Omega$ be an open interval such that $I\cap S_{u}=\emptyset$ . For $k\in\mathbb{N}$ and $\ell\in\{1,\dotsc,k\}$ we define the intervals

[TABLE]

and we extract a subsequence of $(v_{j})$ (not relabeled) such that $\lim_{j\to\infty}\operatorname*{ess\,inf}_{I^{k}_{\ell}}v_{j}$ exists for all $\ell$ . Moreover, for $0<z<1$ we define the set

[TABLE]

For every $\ell\in T^{k}_{z}$ there exists a sequence $(x_{j})$ in $I^{k}_{\ell}$ and $y\in I^{k}_{\ell}$ such that

[TABLE]

Thus, analogously to the above it follows that

[TABLE]

for some $C>0$ by assumption.

Repeating this argument for every $\ell\in T^{k}_{z}$ we get

[TABLE]

Note that in view of [A1] there holds $\int_{z}^{1}W(s)\,\mathrm{d}s>0$ , and hence, $\#T_{z}^{k}$ is bounded independently of $k$ . Because $\#T_{z}^{k}$ is also non-decreasing with respect to $k$ it remains constant for $k$ large enough. As a consequence, we can pick $\ell^{k}_{1}<\ell_{2}^{k}<\dotsb<\ell^{k}_{N_{z}}\in T^{k}_{z}$ with $N_{z}\coloneq\max_{k\in\mathbb{N}}\bigl{(}\#T^{k}_{z}\bigr{)}$ , such that each $\ell^{k}_{i}/k$ converges to some $\theta_{i}\in[0,1]$ as $k\to\infty$ . Define $T_{z}\coloneq\{y_{1},\dotsc,y_{N}\}$ with $y_{i}\coloneq a+\theta_{i}(b-a)$ . Let $\rho>0$ , choose $k>2(b-a)/\rho$ large enough, and let $\ell\in T_{z}^{k}$ . Then we have $I^{k}_{\ell}\subset B_{\rho}(T_{z})$ . Therefore,

[TABLE]

From [A5] we have $f(z)>0$ , and thus, we obtain $u_{j}^{\prime}\rightharpoonup u^{\prime}$ in $L^{2}(I\setminus B_{\rho}(T_{z}))$ up to a subsequence, and consequently $u\in H^{1}(I\setminus B_{\rho}(T_{z}))$ . By the weak lower semi-continuity of the norm we have

[TABLE]

Since this inequality holds for all $\rho>0$ , we have $u\in H^{1}(I\setminus T_{z})$ , and since $u\in\mathrm{SBV}(I)$ with $I\cap S_{u}=\emptyset$ , we deduce that $u\in H^{1}(I)$ . Taking the limit for $\rho\to 0$ results in

[TABLE]

Finally, we take the limit for $z\to 1$ and obtain (note that $f$ is continuous from [A5])

[TABLE]

Since $I\subset\Omega$ was chosen arbitrarily such that $I\cap S_{u}=\emptyset$ we end up with (4.6). Together with (4.1) we obtain

[TABLE]

and we conclude the proof by taking the limit for $\delta\to 0$ . ∎

Proposition 4.2.

In the setting of Theorem 3.2 there holds

[TABLE]

Proof.

For the proof we use the usual notation in the setting of slicing, introduced in Section 2.3. In what follows let $\xi\in\mathbb{S}^{n-1}$ and $y\in\Omega_{\xi}$ , let $A\subset\Omega$ be open and choose $u,v\in L^{1}(\Omega)$ arbitrarily. We define the localized version of (3.1) by

[TABLE]

if $u\in H^{1}(A),v\in\mathrm{BV}(A;[0,1])$ and $F_{\varepsilon}(u,v;A)\coloneq{+}\infty$ otherwise. Furthermore, we define for $I\subset\mathbb{R}$ open

[TABLE]

if $u\in H^{1}(I),v\in\mathrm{BV}(I;[0,1])$ and $F_{\varepsilon}(u,v;I)\coloneq{+}\infty$ otherwise. We additionally set

[TABLE]

From Fubini’s theorem and Theorem 2.3 we therefore obtain

[TABLE]

if $\lvert\langle\mathrm{D}u,\xi\rangle\rvert$ is absolutely continuous with respect to $\mathcal{L}^{n}$ , and $F^{\xi}_{\varepsilon}(u,v;A)={+}\infty$ otherwise. Thus, there clearly holds

[TABLE]

From Proposition 4.1 we know that $\overline{F}(u,v;I)\leq\operatorname*{\Gamma-lim\,inf}_{\varepsilon\to 0}\overline{F}_{\varepsilon}(u,v;I)$ with

[TABLE]

Choosing

[TABLE]

there holds for all sequences $(u_{j})$ and $(v_{j})$ with $u_{j}\to u$ and $v_{j}\to v$ in $L^{1}(\Omega)$ as $j\to\infty$

[TABLE]

Fatou’s Lemma and (4.8) yield

[TABLE]

Moreover, by construction, $F^{\xi}(u,v;A)$ is finite if and only if for a.a. $y\in A_{\xi}$ there holds $v_{y}^{\xi}=1$ a.e. on $A_{y}^{\xi}$ , $u_{y}^{\xi}\in\mathrm{SBV}^{2}(A_{y}^{\xi})$ as well as

[TABLE]

Since there holds for every $M>0$ and every $u\in L^{1}(\Omega)$ with $u_{y}^{\xi}\in\mathrm{SBV}^{2}(A_{y}^{\xi})$ for a.a. $y\in A_{\xi}$

[TABLE]

we get by Corollary 2.4 that $F^{\xi}(u,v;A)$ is finite only if $u\in\mathrm{GSBV}^{2}(A)$ and $v=1$ a.e. in $A$ . Hence,

[TABLE]

if $u\in\mathrm{GSBV}^{2}(A)$ and $v=1$ a.e. in $A$ , and $F^{\xi}(u,v;A)={+}\infty$ otherwise.

Since $A$ and $\xi$ were chosen arbitrarily, if $v=1$ a.e. in $A$ , then [16, Theorem 1.16] and (4.9) imply

[TABLE]

Otherwise, the $\liminf$ -inequality follows directly from (4.9) with $\xi$ arbitrary. ∎

The following proposition now shows the $\limsup$ -inequality.

Proposition 4.3.

In the setting of Theorem 3.2 there holds

[TABLE]

Proof.

If $u\notin\mathrm{GSBV}^{2}(\Omega)$ or $v\neq 1$ on some set with non-zero measure the assertion is obvious. We first show that the result holds for $u$ replaced by $w\in\mathrm{SBV}^{2}(\Omega)\cap L^{\infty}(\Omega)$ for which 1.–3. in Theorem 2.2 (replacing $w_{j}$ by $w$ ) hold.

For this purpose choose for every $\varepsilon>0$ some $\delta_{\varepsilon}>0$ such that $\frac{\eta_{\varepsilon}}{\delta_{\varepsilon}}\to 0$ as $\varepsilon\to 0$ but still $\delta_{\varepsilon}\varphi_{\varepsilon}(W_{\varepsilon}(0))\to 0$ as $\varepsilon\to 0$ , for instance

[TABLE]

Take some smooth cutoff function $\phi\colon\mathbb{R}\to[0,1]$ with $\phi=1$ on $B_{\frac{1}{2}}(0)$ and $\phi=0$ on $\mathbb{R}\setminus B_{1}(0)$ , and define $\tau(x)=\operatorname{dist}(x,S_{w})$ for all $x\in\Omega$ . Then, we set $\phi_{\varepsilon}(x)=\phi(\tau(x)/\delta_{\varepsilon})$ for all $x\in\Omega$ , and we fix for every $\varepsilon>0$ the function $w_{\varepsilon}=(1-\phi_{\varepsilon})w$ , for which there holds $w_{\varepsilon}\in H^{1}(\Omega)$ , $w_{\varepsilon}=w$ on $\Omega\setminus B_{\delta_{\varepsilon}}(S_{w})$ and $w_{\varepsilon}\to w$ in $L^{1}(\Omega)$ as $\varepsilon\to 0$ . Furthermore we define

[TABLE]

Since $\overline{S_{w}}$ is polyhedral there holds $\mathcal{H}^{n-1}(\partial B_{\delta_{\varepsilon}}(S_{w})\cap\Omega)<\infty$ . Consequently, we have $v_{\varepsilon}\in\mathrm{BV}(\Omega;[0,1])$ for all $\varepsilon>0$ .

With this at hand, recalling [A5], we get

[TABLE]

By the choice of $w_{\varepsilon}$ , the fact that $\lVert w\rVert_{L^{\infty}(\Omega)}\leq M$ and that $\lvert\nabla\tau(x)\rvert=1$ a.e. on $\Omega$ (see [29, Lemma 3.2.34]) we get on $B_{\delta_{\varepsilon}}(S_{w})$

[TABLE]

which implies

[TABLE]

with $C=2M^{2}\lVert\phi^{\prime}\rVert^{2}_{L^{\infty}(\Omega)}$ independent of $\varepsilon$ . The first and the last term obviously converge to 0 as $\varepsilon\to 0$ . For the second term we remark that for a polyhedral set, the Hausdorff measure coincides with the Minkowski content (see, e.g., [29, Theorem 3.2.29]), so that

[TABLE]

As a consequence, recalling that $\frac{\eta_{\varepsilon}}{\delta_{\varepsilon}}\to 0$ we get

[TABLE]

and therefore

[TABLE]

Additionally, (4.11) and $\delta_{\varepsilon}\varphi_{\varepsilon}(W_{\varepsilon}(0))\to 0$ as $\varepsilon\to 0$ imply

[TABLE]

Furthermore, there holds

[TABLE]

which is again due to $\overline{S_{w}}$ being a polyhedral set.

Applying the previous three convergence statements in (4.10) together with the limit behaviour of $\varphi_{\varepsilon}(W_{\varepsilon}(1))$ , $\psi_{\varepsilon}(0)$ and $c_{\varepsilon}$ from [A2] and [A3], we get

[TABLE]

If $u\in\mathrm{GSBV}^{2}(\Omega)$ we have for every $M>0$ that $u^{M}\in\mathrm{SBV}^{2}(\Omega)\cap L^{\infty}(\Omega)$ with $u^{M}\coloneq(-M)\vee u\wedge M$ , and we can find a sequence $(w_{j})$ in $\mathrm{SBV}^{2}(\Omega)\cap L^{\infty}(\Omega)$ such that 1.–6. in Theorem 2.2 (replacing $u$ by $u^{M}$ ) holds. Together with the lower semi-continuity of $\operatorname*{\Gamma-lim\,sup}F_{\varepsilon}$ in $L^{1}(\Omega)\times L^{1}(\Omega)$ and (4.12) we deduce

[TABLE]

Obviously, there holds $\lVert\nabla u^{M}\rVert_{L^{2}(\Omega)}\leq\lVert\nabla u\rVert_{L^{2}(\Omega)}$ , and from $S_{u}=\bigcup_{M>0}S_{u^{M}}$ (see Section 2.3) follows that $\mathcal{H}^{n-1}(S_{u^{M}})\leq\mathcal{H}^{n-1}(S_{u})$ . Thus, using again the lower semi-continuity of $\operatorname*{\Gamma-lim\,sup}F_{\varepsilon}$ we get

[TABLE]

which concludes the proof. ∎

The proof of Theorem 3.2 is now a direct consequence of Proposition 4.2 and Proposition 4.3.

5. Numerical Examples

The aim of this section is to numerically compare our new approximation from Corollary 3.6 with the classical Ambrosio-Tortorelli approach. We aim for a simple and easy to implement algorithm in order to illustrate the differences between those two models and justify our theory. As an application for the numerical computations we choose the image segmentation problem already described in the introduction.

Thus, for $\Omega\subset\mathbb{R}^{n}$ being non-empty, open, bounded and with Lipschitz boundary, we seek to minimize the following functional with respect to $u\in\mathrm{SBV}^{2}(\Omega)\cap L^{\infty}(\Omega)$

[TABLE]

where $g\in L^{\infty}(\Omega)$ is the original image and $\alpha,\beta,\gamma>0$ are the parameters influencing the smoothing and segment detection in the solution. They have, of course, to be chosen with care in order to get a sensible result.

Using now Corollary 3.6 we can approximately minimize $E$ by minimizing

[TABLE]

for small $\varepsilon>0$ , which we also refer to as the $\mathrm{BV}$ -model.

On the other hand we consider the elliptic approximation (1.3), introduced in [10]:

[TABLE]

for $u\in H^{1}(\Omega)$ and $v\in H^{1}(\Omega;[0,1])$ , which we refer to as the $H^{1}$ -model. Note that we have redefined $\mathcal{AT}_{\varepsilon}$ here; since only (5.3) is used in the following, there is no risk of confusion.

For the discretization of these functionals we consider a 2-dimensional image with its natural pixel grid with pixel length $h>0$ . If the image is given by $M\times N$ pixels, we use the discrete grid $\Omega_{h}=\{h,\ldots,Mh\}\times\{h,\ldots,Nh\}$ and we identify the piecewise constant functions $u,g,v$ as elements in the Euclidean space $\mathbb{R}^{M\times N}$ . Precisely, one sets $u=\sum_{ij}u_{ij}\mathbbm{1}_{[(i-1)h,ih)\times[(j-1)h,jh)}$ for $(u_{ij})\in\mathbb{R}^{M\times N}$ , where $\mathbbm{1}_{A}$ denotes the characteristic function of $A\subset\mathbb{R}^{2}$ , i.e., $\mathbbm{1}_{A}=1$ on $A$ and $\mathbbm{1}_{A}=0$ on $\mathbb{R}^{2}\setminus A$ .

For the discretization of the appearing gradients and the total variation we use a finite difference scheme. For this purpose we define the finite difference operator with zero Neumann boundary condition $\nabla_{h}\colon\mathbb{R}^{M\times N}\to\mathbb{R}^{2\times M\times N}$ by

[TABLE]

with

[TABLE]

Furthermore, we denote the adjoint of $\nabla_{h}$ by $-\operatorname{div}_{h}$ , i.e. for $w\in\mathbb{R}^{2\times M\times N}$ the operator $\operatorname{div}_{h}\colon\mathbb{R}^{2\times M\times N}\to\mathbb{R}^{M\times N}$ is defined by

[TABLE]

where for all $u\in\mathbb{R}^{M\times N}$

[TABLE]
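The pair $\nabla_{h}$ , $\operatorname{div}_{h}$ can be sketched in a few lines of NumPy. The following is a minimal illustration, assuming forward differences that are set to zero at the last row and column (one common way to realize the zero Neumann boundary condition); the function names `grad_h` and `div_h` are chosen here for illustration and are not taken from the paper.

```python
import numpy as np

def grad_h(u, h=1.0):
    """Forward differences, zero at the last row/column; returns shape (2, M, N)."""
    g = np.zeros((2,) + u.shape)
    g[0, :-1, :] = (u[1:, :] - u[:-1, :]) / h   # differences along the first axis
    g[1, :, :-1] = (u[:, 1:] - u[:, :-1]) / h   # differences along the second axis
    return g

def div_h(w, h=1.0):
    """Discrete divergence, defined so that <grad_h u, w> = -<u, div_h w>."""
    d = np.zeros(w.shape[1:])
    d[:-1, :] += w[0, :-1, :]
    d[1:, :] -= w[0, :-1, :]
    d[:, :-1] += w[1, :, :-1]
    d[:, 1:] -= w[1, :, :-1]
    return d / h

# Numerical check of the adjoint relation on random data.
rng = np.random.default_rng(0)
u = rng.standard_normal((5, 7))
w = rng.standard_normal((2, 5, 7))
adjoint_defect = np.sum(grad_h(u, 0.5) * w) + np.sum(u * div_h(w, 0.5))
```

The value `adjoint_defect` should vanish up to rounding errors, which is exactly the statement that $-\operatorname{div}_{h}$ is the adjoint of $\nabla_{h}$ with respect to the Frobenius inner product.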

For functions $u,v\in\mathbb{R}^{M\times N}$ , operations such as the product $uv$ (or $u\cdot v$ ), the minimum $u\wedge v$ , the maximum $u\vee v$ , and the square $u^{2}$ are always meant to be element-wise. With $\lVert u\rVert_{2}$ , $\lVert u\rVert_{1}$ and $\lVert u\rVert_{\infty}$ we respectively refer to the Frobenius norm, the $\ell^{1}$ -norm of $u$ vectorized, and the maximum norm of $u$ . The Frobenius inner product of $u$ and $v$ is written as $\langle u,v\rangle$ . For any field $q=(q^{(1)},q^{(2)})\in\mathbb{R}^{2\times M\times N}$ , like $\nabla_{h}u$ for $u\in\mathbb{R}^{M\times N}$ , we denote by $\lvert q\rvert$ the Euclidean norm along the first axis, i.e. $\lvert q\rvert\in\mathbb{R}^{M\times N}$

[TABLE]

With this strategy we can define the discretized versions of (5.2) and (5.3), respectively, for all $u,v\in\mathbb{R}^{M\times N}$ by

[TABLE]

and

[TABLE]

The symbol $\mathbbm{1}$ refers to the discretized function that is one almost everywhere. Note that we neglected the factor $h^{2}$ in the functionals since it does not change their minimum. Moreover, we chose $\eta_{\varepsilon}=0$ here, because in the discrete setting, the problem of finding a minimizer stays well-posed for this choice.

Remark 5.1.

The choice of the recovery sequence in the proof of Proposition 4.3 suggests that the width of the detected contours represented by the phase field variable $v$ correlates with the parameter $\varepsilon$ . The precise relation between $\varepsilon$ and the width of the phase field is, however, not known. Examining the structure of the approximating functionals, we expect that it depends, in particular, on the trade-off between the two terms $v^{2}\lVert\nabla u\rVert^{2}_{\infty}$ and $\frac{1}{4\varepsilon}(1-v)$ .

Although we would like the width of the phase field, and therefore $\varepsilon$ , to be extremely small, the choice is limited by the pixel size $h$ . To be more precise, choosing $h_{\varepsilon}>0$ depending on $\varepsilon$ , it is well known that $\mathcal{AT}_{\varepsilon}^{h_{\varepsilon}}$ $\Gamma$ -converges as $\varepsilon\to 0$ , provided that $h_{\varepsilon}/{\varepsilon}\to 0$ as $\varepsilon\to 0$ (see [15, 12]). We believe that a corresponding statement is also true for the considered $\mathrm{BV}$ -phase field approximation. A study of this is, however, outside the scope of the present paper.

The difficulty in finding a minimizer lies in the non-convex, and for $G_{\varepsilon}^{h}$ also non-smooth, structure. In previous works an alternating minimization scheme has been commonly used, exploiting the fact that the functionals are convex in each variable separately (see [15, 11, 1]). However, in this work we choose a more recent approach, which is the proximal alternating linearized minimization (in short PALM) presented in [13]. This algorithm is a form of an alternating gradient descent procedure, for which we do not have to solve any linear equation. This makes the algorithm also faster than the alternating minimization scheme, especially for rather large images. Our experience also showed no significant difference in the results.

For the PALM algorithm one uses the fact that the objective functional can be written as $J(u,v)+K(u)+H(v)$ . Then, for some initial value $u^{0},v^{0}\in\mathbb{R}^{M\times N}$ we set for each $k\in\mathbb{N}$

[TABLE]

where $t_{k},s_{k}>0$ . By $\operatorname{prox}_{t}^{g}$ we denote the proximal operator with step size $t>0$ :

[TABLE]

For the right choices of the step sizes $t_{k}$ and $s_{k}$ above one can show that this scheme converges to a critical point of $J(u,v)+K(u)+H(v)$ as $k\to\infty$ (see [13, Proposition 3.1]). Namely, we need to choose $t_{k}=\frac{\theta_{1}}{L_{1}(v^{k-1})}$ and $s_{k}=\frac{\theta_{2}}{L_{2}(u^{k})}$ for some $\theta_{1},\theta_{2}\in(0,1)$ , where $L_{1}(v)$ and $L_{2}(u)$ are Lipschitz constants of $u\mapsto\nabla_{u}J(u,v)$ and $v\mapsto\nabla_{v}J(u,v)$ , respectively. Unfortunately, convergence rates are not known, so that as a stopping criterion, we are limited to measuring the change of the variables in each iteration. We stop the scheme when this change drops below a specified threshold or when a certain number of iterations is reached.
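The alternating structure of the scheme can be illustrated with a small generic sketch. It assumes the usual convention $\operatorname{prox}_{t}^{g}(x)=\operatorname{arg\,min}_{y}\{g(y)+\frac{1}{2t}\lVert y-x\rVert_{2}^{2}\}$ and is run on a hypothetical gradient-free toy problem, $J(u,v)=\frac{a}{2}\lVert uv\rVert_{2}^{2}$ , $K(u)=\frac{b}{2}\lVert u-g\rVert_{2}^{2}$ , $H(v)=c\lVert\mathbbm{1}-v\rVert_{1}$ restricted to $v\in[0,1]$ , which only mimics the structure of the functionals above. All names and parameter values are assumptions made for this illustration.

```python
import numpy as np

def palm(grad_u_J, grad_v_J, prox_K, prox_H, L1, L2, u0, v0,
         theta1=0.99, theta2=0.99, tol=1e-10, max_it=10000):
    """Generic PALM loop; stops when the iterates change by less than tol."""
    u, v = u0.copy(), v0.copy()
    for _ in range(max_it):
        t = theta1 / L1(v)                               # step size for the u-update
        u_new = prox_K(u - t * grad_u_J(u, v), t)
        s = theta2 / L2(u_new)                           # step size for the v-update
        v_new = prox_H(v - s * grad_v_J(u_new, v), s)
        change = max(np.max(np.abs(u_new - u)), np.max(np.abs(v_new - v)))
        u, v = u_new, v_new
        if change < tol:
            break
    return u, v

# Hypothetical toy data: a large value of g plays the role of an "edge".
a, b, c = 1.0, 1.0, 0.1
g = np.array([0.0, 0.2, 2.0])

def objective(u, v):
    return (0.5 * a * np.sum(v**2 * u**2) + 0.5 * b * np.sum((u - g)**2)
            + c * np.sum(1.0 - v))

u, v = palm(
    grad_u_J=lambda u, v: a * v**2 * u,
    grad_v_J=lambda u, v: a * v * u**2,
    prox_K=lambda x, t: (x + t * b * g) / (1.0 + t * b),   # prox of the quadratic K
    prox_H=lambda x, s: np.clip(x + s * c, 0.0, 1.0),      # prox of H incl. box constraint
    L1=lambda v: a,                                        # valid bound since v in [0, 1]
    L2=lambda u: max(a * np.max(u**2), 1e-8),
    u0=g.copy(), v0=np.ones_like(g))
```

In this toy problem the entry of $v$ belonging to the large data value is driven towards $0$ while the others stay at $1$ , and the objective decreases compared with the initial guess.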

We now take a closer look at the algorithm for $G_{\varepsilon}^{h}$ and $\mathcal{AT}_{\varepsilon}^{h}$ separately.

$\mathrm{BV}$ -model

We write $G_{\varepsilon}^{h}(u,v)=J(u,v)+K(u)+H(v)$ with

[TABLE]

and

[TABLE]

We have

[TABLE]

Since the operator norm of $\nabla_{h}$ is strictly below $\frac{\sqrt{8}}{h}$ (see, e.g. [21, 19]), we can choose for some $\theta\in(0,1)$

[TABLE]

such that $t=t_{k}$ is constant throughout the algorithm.
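The bound $\lVert\nabla_{h}\rVert^{2}<\frac{8}{h^{2}}$ used for this step size choice can be checked numerically. The sketch below estimates the largest eigenvalue of $-\operatorname{div}_{h}\nabla_{h}$ by power iteration, assuming a forward-difference discretization with zero differences at the last row and column; the helper names and the grid size are illustrative assumptions.

```python
import numpy as np

def grad_h(u, h):
    g = np.zeros((2,) + u.shape)
    g[0, :-1, :] = (u[1:, :] - u[:-1, :]) / h
    g[1, :, :-1] = (u[:, 1:] - u[:, :-1]) / h
    return g

def div_h(w, h):
    d = np.zeros(w.shape[1:])
    d[:-1, :] += w[0, :-1, :]
    d[1:, :] -= w[0, :-1, :]
    d[:, :-1] += w[1, :, :-1]
    d[:, 1:] -= w[1, :, :-1]
    return d / h

def estimate_grad_norm_sq(shape, h, iters=300, seed=0):
    """Power iteration on -div_h grad_h; returns a lower estimate of ||grad_h||^2."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(shape)
    u /= np.linalg.norm(u)
    est = 0.0
    for _ in range(iters):
        w = -div_h(grad_h(u, h), h)   # apply the symmetric operator grad^T grad
        est = np.linalg.norm(w)       # norm of the image of a unit vector
        u = w / est
    return est

h = 1.0 / 64
est = estimate_grad_norm_sq((64, 64), h)
bound = 8.0 / h**2
```

The estimate `est` approaches the bound `bound` from below as the grid is refined but never reaches it, which is consistent with the strict inequality stated above.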

As a simple computation shows, solving (5.4) is then equivalent to

[TABLE]

By completing squares and ignoring constant terms the problem (5.5) can be equivalently reformulated to

[TABLE]

with $\bar{v}^{k}=v^{k-1}-s_{k}\alpha v^{k-1}\lvert\nabla_{h}u^{k}\rvert^{2}$ . Since the non-smooth term $\lVert\lvert\nabla_{h}v\rvert\rVert_{1}$ is still present, this minimization cannot be solved directly. Instead we tackle the problem with the algorithm introduced by A. Chambolle and T. Pock in [22], solving the corresponding primal-dual problem. Therefore, we define for all $v\in\mathbb{R}^{M\times N}$ and $w\in\mathbb{R}^{2\times M\times N}$ the functions

[TABLE]

such that (5.9) is equivalent to

[TABLE]

Here, $\nabla_{1}$ is the forward difference operator $\nabla_{h}$ for $h=1$ .

The corresponding primal-dual saddle point problem is given by

[TABLE]

where $Q^{\ast}_{k}$ denotes the convex conjugate of $Q_{k}$ , i.e., $Q_{k}^{\ast}=\chi_{\{\lVert\lvert\cdot\rvert\rVert_{\infty}\leq\frac{\gamma s_{k}}{2h}\}}$ . Clearly, for any solution $(p,q)$ of (5.11) we have that $v^{k}=p$ is a solution of (5.10). We solve (5.11) with [22, Algorithm 1]. Namely, for $0<\tau^{2}\leq\frac{1}{8}$ and for some $p_{k}^{0}\in\mathbb{R}^{M\times N}$ , $q_{k}^{0}\in\mathbb{R}^{2\times M\times N}$ as well as $\hat{p}_{k}^{0}\coloneq p_{k}^{0}$ we define for all $\ell\in\mathbb{N}$

[TABLE]

Then, [22, Theorem 1] guarantees the convergence of $(p_{k}^{\ell},q_{k}^{\ell})$ as $\ell\to\infty$ to a solution of (5.11). For a stopping criterion of the primal-dual iteration we consider the primal-dual gap which is for $p\in\mathbb{R}^{M\times N}$ and $q\in\mathbb{R}^{2\times M\times N}$ given by

[TABLE]

It vanishes if and only if $(p,q)$ solves (5.11). For this reason, we stop iteration (5.12)–(5.14) if the corresponding primal-dual gap is smaller than a certain tolerance.
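To make the structure of the primal-dual iteration and its gap-based stopping rule concrete, here is a minimal sketch for a deliberately simplified model problem. As an assumption made only for this illustration, the primal function is $P(p)=\frac{1}{2}\lVert p-f\rVert_{2}^{2}$ and the dual constraint radius is a fixed number $r>0$ , so this is a plain total variation denoising problem and not the exact subproblem (5.10); all function names are hypothetical.

```python
import numpy as np

def grad1(p):
    g = np.zeros((2,) + p.shape)
    g[0, :-1, :] = p[1:, :] - p[:-1, :]
    g[1, :, :-1] = p[:, 1:] - p[:, :-1]
    return g

def div1(q):
    d = np.zeros(q.shape[1:])
    d[:-1, :] += q[0, :-1, :]
    d[1:, :] -= q[0, :-1, :]
    d[:, :-1] += q[1, :, :-1]
    d[:, 1:] -= q[1, :, :-1]
    return d

def tv_energy(p, f, r):
    """Primal objective 0.5*||p - f||^2 + r*|| |grad p| ||_1."""
    return 0.5 * np.sum((p - f)**2) + r * np.sum(np.sqrt(np.sum(grad1(p)**2, axis=0)))

def primal_dual_gap(p, q, f, r):
    """Gap for feasible q; the conjugate of P is y -> 0.5*||y||^2 + <y, f>."""
    dq = div1(q)
    return tv_energy(p, f, r) + 0.5 * np.sum(dq**2) + np.sum(dq * f)

def chambolle_pock_tv(f, r, tol=1e-4, max_it=5000):
    tau = 1.0 / np.sqrt(8.0)                     # satisfies tau^2 <= 1/8
    p, p_hat = f.copy(), f.copy()
    q = np.zeros((2,) + f.shape)
    for _ in range(max_it):
        q = q + tau * grad1(p_hat)               # dual ascent step
        q /= np.maximum(1.0, np.sqrt(np.sum(q**2, axis=0)) / r)   # projection
        p_old = p
        p = (p + tau * div1(q) + tau * f) / (1.0 + tau)           # prox of P
        p_hat = 2.0 * p - p_old                  # extrapolation
        if primal_dual_gap(p, q, f, r) < tol:
            break
    return p, q

rng = np.random.default_rng(1)
f = np.zeros((16, 16))
f[:, 8:] = 1.0
f += 0.1 * rng.standard_normal(f.shape)
p, q = chambolle_pock_tv(f, r=0.2)
```

For the actual subproblems only the proximal map of the primal function and the radius of the dual constraint change; the loop itself keeps the same three steps.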

We now continue with the precise computations of the primal-dual steps for the $\mathrm{BV}$ -phase field approximation. Since $Q_{k}^{\ast}$ is the indicator function of a convex set, the update step (5.12) is the projection of $q_{k}^{\ell-1}+\tau\nabla_{1}\hat{p}_{k}^{\ell-1}$ onto $\{\lVert\lvert\cdot\rvert\rVert_{\infty}\leq\frac{\gamma s_{k}}{2h}\}$ (cf. [22, Section 6.2]). Thus, we simply get

[TABLE]
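The projection onto the set $\{\lVert\lvert\cdot\rvert\rVert_{\infty}\leq r\}$ acts pixel by pixel: every vector whose Euclidean norm exceeds $r$ is rescaled to length $r$ , all others are left unchanged. A short NumPy sketch, where the function name is chosen for illustration and $r$ stands for the radius $\frac{\gamma s_{k}}{2h}$ , could read as follows.

```python
import numpy as np

def project_pointwise_ball(q, r):
    """Project each vector q[:, i, j] onto the Euclidean ball of radius r."""
    norm = np.sqrt(q[0]**2 + q[1]**2)          # pointwise Euclidean norm |q|
    return q / np.maximum(1.0, norm / r)       # rescale only where |q| > r

rng = np.random.default_rng(0)
q = rng.standard_normal((2, 4, 5))
q_proj = project_pointwise_ball(q, 0.3)
q_small = 0.01 * q / np.max(np.abs(q))         # lies strictly inside the ball
```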

The proximal operator appearing in (5.13) can be solved directly. Namely, we get

[TABLE]

with $\bar{p}_{k}^{\ell}=p_{k}^{\ell-1}+\tau\operatorname{div}_{1}q_{k}^{\ell}$ , which yields

[TABLE]

The primal-dual gap for $p_{k}^{\ell}$ and $q_{k}^{\ell}$ can be computed explicitly. Taking into account that $Q^{\ast}(q_{k}^{\ell})=0$ and

[TABLE]

with

[TABLE]

it is given by

[TABLE]

Summing up all the previous computations for our $\mathrm{BV}$ -phase field model, we get Algorithm 1 in the appendix, which is the numerical scheme as implemented.

$H^{1}$ -model (Ambrosio-Tortorelli)

For the elliptic approximation we use $J$ and $K$ as in (5.6) and only redefine $H$ by

[TABLE]

in order to obtain $\mathcal{AT}_{\varepsilon}^{h}(u,v)=J(u,v)+K(u)+H(v)$ . Clearly, $s_{k}$ and $t=t_{k}$ can also be chosen as before in (5.7). Hence, (5.4) results again in (5.8). The difference of the algorithm compared to the one for the $\mathrm{BV}$ -phase field appears in (5.5), which is now equivalent to

[TABLE]

Since this problem is sufficiently smooth it could be easily solved directly, by solving a linear system. Nevertheless, for a better comparability and for saving the effort of solving a large linear equation, we stay as close as possible to the algorithm used for the $\mathrm{BV}$ -model. Thus, we use again the primal-dual scheme as in (5.12)–(5.14), where this time we need to choose

[TABLE]

for $v\in\mathbb{R}^{M\times N}$ and

[TABLE]

for $w\in\mathbb{R}^{2\times M\times N}$ . Note that we have $Q_{k}^{\ast}(w)=\frac{1}{2\mu}\lVert\lvert w\rvert\rVert_{2}^{2}$ and thus (5.12) yields

[TABLE]
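For a quadratic conjugate, the dual update reduces to a simple scaling. The following short derivation is a sketch under the assumption of the usual convention $\operatorname{prox}_{\tau}^{g}(x)=\operatorname{arg\,min}_{y}\{g(y)+\frac{1}{2\tau}\lVert y-x\rVert_{2}^{2}\}$ ; the symbol $\bar{q}$ is introduced here as shorthand for the argument $q_{k}^{\ell-1}+\tau\nabla_{1}\hat{p}_{k}^{\ell-1}$ , and $\mu>0$ is the constant appearing in $Q_{k}^{\ast}$ .

```latex
\begin{align*}
q_{k}^{\ell}
  &= \operatorname*{arg\,min}_{q\in\mathbb{R}^{2\times M\times N}}
     \Bigl\{ \tfrac{1}{2\mu}\lVert\lvert q\rvert\rVert_{2}^{2}
           + \tfrac{1}{2\tau}\lVert\lvert q-\bar{q}\rvert\rVert_{2}^{2} \Bigr\}, \\
0 &= \tfrac{1}{\mu}\,q_{k}^{\ell} + \tfrac{1}{\tau}\bigl(q_{k}^{\ell}-\bar{q}\bigr)
  \quad\Longrightarrow\quad
  q_{k}^{\ell} = \frac{\mu}{\mu+\tau}\,\bar{q}.
\end{align*}
```

In contrast to the $\mathrm{BV}$ -model, no projection is involved in this step, since $Q_{k}^{\ast}$ is finite everywhere.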

and (5.13) results in

[TABLE]

The primal-dual gap for this approximation is given by

[TABLE]

with

[TABLE]

Altogether, this yields Algorithm 2 in the appendix, which is the numerical scheme that we use for computations.

Numerical Results

With the presented algorithms we perform computations for two different images. For all numerical examples we fix the width of the images to $1$ . The pixel size $h$ then depends on the number of pixels and is given by $h=\frac{1}{\text{number of horizontal pixels}}$ .

For the first computation we use the noisy image from Figure 1. The latter is generated by adding Gaussian noise of standard deviation $0.1$ and clipping the result to the original image range $[0,1]$ . In this computation, the input image $g$ corresponds to this noisy image and we only change the approximating variable $\varepsilon$ , in order to investigate its influence, while fixing the other parameters for the algorithms as indicated in Table 1. The result can be observed in Figure 2.
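The generation of the noisy input described above can be reproduced along the following lines. This is a minimal sketch; the function name, the random seed and the synthetic test image are assumptions made for illustration and do not correspond to the images used in the paper.

```python
import numpy as np

def add_clipped_gaussian_noise(image, sigma=0.1, seed=0):
    """Add Gaussian noise of standard deviation sigma and clip to [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = image + sigma * rng.standard_normal(image.shape)
    return np.clip(noisy, 0.0, 1.0)

# Hypothetical test image with values in [0, 1]: a bright square on dark background.
clean = np.zeros((256, 256))
clean[64:192, 64:192] = 1.0
g = add_clipped_gaussian_noise(clean, sigma=0.1)
```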

One can clearly see that the $\mathrm{BV}$ -model produces almost binary phase fields, i.e. $v$ takes only the values $0$ (corresponding to a black pixel) and $1$ (corresponding to a white pixel). In other words these phase fields are much sharper than the ones produced by the $H^{1}$ -model. Moreover, we observe that $\varepsilon$ can be chosen larger when using the $\mathrm{BV}$ -model in order to obtain a result that is comparable to the $H^{1}$ -model.

Besides the comparison of the two models, one can also observe that in both approximations of the Mumford-Shah functional only few edges are detected if $\varepsilon$ is too small, whereas the contours become rather wide if $\varepsilon$ is relatively large. These effects are well-known and have already been mentioned in Remark 5.1, from which we also expect that for small values of $\varepsilon$ , the phase field may detect the edges again when $h$ is reduced. This is confirmed by Figure 3, where we use the same image but this time with 512 $\times$ 512 pixels, keeping the width of the image domain fixed to $1$ as above, so that the value of $h$ is halved.

Figure LABEL:fig:sailing shows another picture of size 512 $\times$ 512 pixels. To the original image we again add Gaussian noise (noise level: $0.1$). This noisy image serves as the input data $g$ for our algorithms. Except for $\alpha$ and $\gamma$, the parameters have been chosen as in Table 1.

Acknowledgements

This work was supported by the International Research Training Group IGDK 1754 “Optimization and Numerical Analysis for Partial Differential Equations with Nonsmooth Structures”, funded by the German Research Council (DFG) and the Austrian Science Fund (FWF) under grant W 1244-N18.

Appendix A Auxiliary statements

Lemma A.1.

Let $\mu$ be a signed Radon measure on $\mathbb{R}$ , $\psi:\mathbb{R}\to[0,\infty]$ a proper, convex and lower semi-continuous function and $\eta\in C_{c}^{\infty}(\mathbb{R};[0,\infty))$ a mollifier, i.e., $\int_{\mathbb{R}}\eta\,\mathrm{d}{x}=1$ . Then,

[TABLE]

where $\mu=\frac{\mathrm{d}{\mu}}{\mathrm{d}{\mathcal{L}^{1}}}\mathcal{L}^{1}+\mu^{s}$ denotes the Lebesgue decomposition of $\mu$ and $\psi^{\infty}(s)=\lim_{t\to\infty}\frac{\psi(st)}{t}$ is the recession function of $\psi$ .
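The statement can be checked numerically on a simple example. The sketch below assumes that the lemma asserts $\int_{\mathbb{R}}\psi(\mu\ast\eta)\,\mathrm{d}x\leq\int_{\mathbb{R}}\psi\bigl(\frac{\mathrm{d}\mu}{\mathrm{d}\mathcal{L}^{1}}\bigr)\,\mathrm{d}x+\int_{\mathbb{R}}\psi^{\infty}\bigl(\frac{\mathrm{d}\mu^{s}}{\mathrm{d}\lvert\mu^{s}\rvert}\bigr)\,\mathrm{d}\lvert\mu^{s}\rvert$, a form inferred from the proof. The choices $\psi(s)=\sqrt{1+s^{2}}-1$ (with $\psi^{\infty}(s)=\lvert s\rvert$), the density `f`, the Dirac weight `c`, and the grid are all hypothetical.

```python
import numpy as np


def psi(s):
    # Convex, nonnegative, psi(0) = 0; its recession function is |s|.
    return np.sqrt(1.0 + s**2) - 1.0


def psi_inf(s):
    return np.abs(s)


# Uniform grid, symmetric around 0.
n = 2001
x = np.linspace(-4.0, 4.0, n)
h = x[1] - x[0]

# mu = f * Lebesgue measure + c * Dirac measure at 0.
f = 2.0 * np.maximum(0.0, 1.0 - np.abs(x))
c = -1.5

# Bump mollifier supported in (-r, r), normalised so that sum(eta) * h = 1.
r = 0.5
eta = np.zeros_like(x)
inside = np.abs(x) < r
eta[inside] = np.exp(-1.0 / (1.0 - (x[inside] / r) ** 2))
eta /= eta.sum() * h

# (mu * eta)(x) = (f * eta)(x) + c * eta(x)
mu_eta = np.convolve(f, eta * h, mode="same") + c * eta

lhs = np.sum(psi(mu_eta)) * h
rhs = np.sum(psi(f)) * h + psi_inf(np.sign(c)) * abs(c)
```

In this example `lhs` stays below `rhs`: mollifying the Dirac mass produces a bounded density on which $\psi$ grows slower than its recession function.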

Proof.

Fix $x\in\mathbb{R}$ , $t>0$ and choose $\lvert\mu\rvert_{x,t}=\eta(x-\cdot)(\mathcal{L}^{1}+\frac{1}{t}\lvert\mu^{s}\rvert)$ as well as $\mu_{x}=\eta(x-\cdot)\mu$ . Then, $\lvert\mu\rvert_{x,t}(\mathbb{R})=1+\frac{1}{t}\int_{\mathbb{R}}\eta(x-\cdot)\,\mathrm{d}{\lvert\mu^{s}\rvert}$ and $\mu_{x}=\frac{\mathrm{d}{\mu}}{\mathrm{d}{\mathcal{L}^{1}}}\eta(x-\cdot)\mathcal{L}^{1}+t\frac{\mathrm{d}{\mu^{s}}}{\mathrm{d}{\lvert\mu^{s}\rvert}}\frac{1}{t}\eta(x-\cdot)\lvert\mu^{s}\rvert$ , such that Jensen’s inequality yields

[TABLE]

Since $\frac{\mathrm{d}{\mu^{s}}}{\mathrm{d}{\lvert\mu^{s}\rvert}}$ is either $1$ or $-1$ $\lvert\mu^{s}\rvert$ -almost everywhere, the rightmost integral reads as $\frac{1}{t}\psi(t)\int_{I_{+}}\eta(x-\cdot)\,\mathrm{d}{\lvert\mu^{s}\rvert}+\frac{1}{t}\psi(-t)\int_{I_{-}}\eta(x-\cdot)\,\mathrm{d}{\lvert\mu^{s}\rvert}$ where $I_{+}=\bigl{\{}x\in\mathbb{R}\colon\frac{\mathrm{d}{\mu^{s}}}{\mathrm{d}{\lvert\mu^{s}\rvert}}(x)=1\bigr{\}}$ and $I_{-}=\bigl{\{}x\in\mathbb{R}\colon\frac{\mathrm{d}{\mu^{s}}}{\mathrm{d}{\lvert\mu^{s}\rvert}}(x)=-1\bigr{\}}$ . Clearly, as $t\to\infty$ , this expression converges to $\int_{\mathbb{R}}\psi^{\infty}\bigl{(}\frac{\mathrm{d}{\mu^{s}}}{\mathrm{d}{\lvert\mu^{s}\rvert}}\bigr{)}\,\eta(x-\cdot)\,\mathrm{d}{\lvert\mu^{s}\rvert}$ (possibly to $\infty$ ). Since $\lim_{t\to\infty}\lvert\mu\rvert_{x,t}(\mathbb{R})=1$ , by lower semi-continuity of $\psi$ ,

[TABLE]

Integrating both sides over $\mathbb{R}$ with respect to $x$ and interchanging order on the right-hand side then yields the result. ∎
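The steps of the proof can be written out as follows. This is a sketch under the assumption that the lemma claims the inequality in the first display, a form inferred from the proof text rather than quoted; the remaining displays use only the quantities $\lvert\mu\rvert_{x,t}$ and $\mu_{x}$ defined in the proof, together with $\mu_{x}(\mathbb{R})=(\mu\ast\eta)(x)$.

```latex
% Assumed claim of Lemma A.1 (inferred from the proof):
\int_{\mathbb{R}} \psi(\mu\ast\eta)\,\mathrm{d}x
  \leq \int_{\mathbb{R}} \psi\Bigl(\frac{\mathrm{d}\mu}{\mathrm{d}\mathcal{L}^{1}}\Bigr)\,\mathrm{d}x
  + \int_{\mathbb{R}} \psi^{\infty}\Bigl(\frac{\mathrm{d}\mu^{s}}{\mathrm{d}\lvert\mu^{s}\rvert}\Bigr)
    \,\mathrm{d}\lvert\mu^{s}\rvert .

% Jensen's inequality for the probability measure
% \lvert\mu\rvert_{x,t} / \lvert\mu\rvert_{x,t}(\mathbb{R}), with x and t fixed:
\psi\Bigl(\frac{(\mu\ast\eta)(x)}{\lvert\mu\rvert_{x,t}(\mathbb{R})}\Bigr)
  \leq \frac{1}{\lvert\mu\rvert_{x,t}(\mathbb{R})}
  \Bigl[ \int_{\mathbb{R}} \psi\Bigl(\frac{\mathrm{d}\mu}{\mathrm{d}\mathcal{L}^{1}}\Bigr)
         \eta(x-\cdot)\,\mathrm{d}\mathcal{L}^{1}
       + \frac{1}{t}\int_{\mathbb{R}}
         \psi\Bigl(t\,\frac{\mathrm{d}\mu^{s}}{\mathrm{d}\lvert\mu^{s}\rvert}\Bigr)
         \eta(x-\cdot)\,\mathrm{d}\lvert\mu^{s}\rvert \Bigr].

% Limit t \to \infty, using lower semi-continuity of \psi on the left-hand side:
\psi\bigl((\mu\ast\eta)(x)\bigr)
  \leq \int_{\mathbb{R}} \psi\Bigl(\frac{\mathrm{d}\mu}{\mathrm{d}\mathcal{L}^{1}}\Bigr)
       \eta(x-\cdot)\,\mathrm{d}\mathcal{L}^{1}
  + \int_{\mathbb{R}} \psi^{\infty}\Bigl(\frac{\mathrm{d}\mu^{s}}{\mathrm{d}\lvert\mu^{s}\rvert}\Bigr)
       \eta(x-\cdot)\,\mathrm{d}\lvert\mu^{s}\rvert .

% Integration over x; Fubini and \int_{\mathbb{R}} \eta(x-y)\,\mathrm{d}x = 1
% for every y remove the factor \eta(x-\cdot) on the right-hand side.
```

The normalisation $\int_{\mathbb{R}}\eta\,\mathrm{d}x=1$ is used twice: it gives $\lvert\mu\rvert_{x,t}(\mathbb{R})\to 1$ as $t\to\infty$, and it makes the mollifier disappear after integrating in $x$.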

Lemma A.2.

Let $I:=(a,b)\subset\mathbb{R}$ be a bounded open interval, $v\in\mathrm{BV}(I;[0,1])$ and $\varepsilon>0$ . Then,

[TABLE]

with $W_{\varepsilon}$ , $\varphi_{\varepsilon}$ and $\psi_{\varepsilon}$ according to [A1], [A2] and [A3], respectively, and $\Phi_{\varepsilon}(s)=\int_{0}^{s}W_{\varepsilon}(t)\,\mathrm{d}t$ for $s\in[0,1]$ .

Proof.

Denote by $v(a)=\lim_{\rho\to 0}\frac{1}{\rho}\int_{a}^{a+\rho}v(x)\,\mathrm{d}{x}$ and $v(b)=\lim_{\rho\to 0}\frac{1}{\rho}\int_{b-\rho}^{b}v(x)\,\mathrm{d}{x}$ and extend $v$ outside of $I$ by $v(x)=v(a)$ for $x\leq a$ and $v(x)=v(b)$ for $x\geq b$ . Then, $v\in\mathrm{BV}_{loc}(\mathbb{R})$ with $\mathrm{D}v$ the zero extension of $\mathrm{D}v$ on $I$ . Choose a mollifier $\eta\in C_{c}^{\infty}(\mathbb{R};[0,\infty))$ , $\int_{\mathbb{R}}\eta\,\mathrm{d}{x}=1$ and denote by $\eta_{\delta}(x)=\frac{1}{\delta}\eta(\frac{x}{\delta})$ for $\delta>0$ . Then, each $v_{\delta}=v\ast\eta_{\delta}$ is in $C^{\infty}(\overline{I};[0,1])$ and by classical differentiation, the Fenchel inequality and [A3],

[TABLE]

We have $v_{\delta}\to v$ in $L^{1}(I)$ as $\delta\to 0$ , so by continuity of $W_{\varepsilon}$ and $\varphi_{\varepsilon}$ (as a consequence of convexity and finiteness on $W_{\varepsilon}([0,1])$ ), one can conclude that $\int_{a}^{b}\varphi_{\varepsilon}\bigl{(}W_{\varepsilon}(v_{\delta})\bigr{)}\,\mathrm{d}{x}\to\int_{a}^{b}\varphi_{\varepsilon}\bigl{(}W_{\varepsilon}(v)\bigr{)}\,\mathrm{d}{x}$ as $\delta\to 0$ . Denoting by $\tilde{\psi}_{\varepsilon}(t)=\psi_{\varepsilon}(\lvert t\rvert)$ for $t\in\mathbb{R}$ yields a convex function since $\psi_{\varepsilon}$ is increasing on $[0,\infty)$ , so applying Lemma A.1 yields

[TABLE]

By continuity of $\Phi_{\varepsilon}$ , one can further conclude that $\Phi_{\varepsilon}\circ v_{\delta}\to\Phi_{\varepsilon}\circ v$ in $L^{1}(I)$ as $\delta\to 0$ . The above then implies that $\mathrm{D}(\Phi_{\varepsilon}\circ v_{\delta})$ as a sequence of $\delta$ is bounded in the space of Radon measures on $I$ , yielding a weak*-convergent subsequence. By strong-weak*-closedness of the weak derivative, the limit has to coincide with $\mathrm{D}(\Phi_{\varepsilon}\circ v)$ . This holds for each subsequence, such that in fact, $\mathrm{D}(\Phi_{\varepsilon}\circ v_{\delta})$ converges weakly* to $\mathrm{D}(\Phi_{\varepsilon}\circ v)$ as $\delta\to 0$ . Thus, using weak* lower semi-continuity, we obtain

[TABLE]

which, together with the above, yields the desired estimate. ∎
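The mollification step of the proof can be illustrated numerically. The sketch below, with hypothetical names `mollify` and `total_variation` and an example phase field with two jumps, extends the samples constantly beyond both end points as in the proof and convolves with a normalised bump kernel; it is a discrete analogue, not the construction from the paper.

```python
import numpy as np


def mollify(v, delta, h):
    """Convolve samples v (spacing h) with a bump kernel of radius delta,
    after constant extension beyond both end points."""
    m = int(np.ceil(delta / h))              # kernel half-width in samples
    y = np.arange(-m, m + 1) * h / delta     # rescaled kernel argument
    kernel = np.zeros_like(y)
    inside = np.abs(y) < 1.0
    kernel[inside] = np.exp(-1.0 / (1.0 - y[inside] ** 2))
    kernel /= kernel.sum()                   # discrete weights sum to one
    padded = np.pad(v, m, mode="edge")       # v(a) to the left, v(b) to the right
    return np.convolve(padded, kernel, mode="valid")


def total_variation(v):
    """Discrete total variation of a one-dimensional signal."""
    return float(np.sum(np.abs(np.diff(v))))


x = np.linspace(0.0, 1.0, 501)
h = x[1] - x[0]
v = np.where((x > 0.3) & (x < 0.6), 0.0, 1.0)    # phase field with two jumps
v_delta = mollify(v, delta=0.05, h=h)
```

Since the kernel weights are nonnegative and sum to one, the mollified samples stay in $[0,1]$ and their discrete total variation does not exceed that of `v`, mirroring the properties of $v_{\delta}$ used in the proof.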

Appendix B Pseudo Codes

Bibliography

[1] S. Almi and S. Belz. Consistent finite-dimensional approximation of phase-field models of fracture. Ann. Mat. Pura Appl., Dec. 2018.
[2] S. Almi, S. Belz, S. Micheletti, and S. Perotto. A dimension-reduction model for brittle fractures on thin shells with mesh adaptivity. Submitted, arXiv:2004.08871 [math.NA], 2020.
[3] S. Almi, S. Belz, and M. Negri. Convergence of discrete and continuous unilateral flows for Ambrosio-Tortorelli energies and application to mechanics. M2AN Math. Model. Numer. Anal., Dec. 2018.
[4] S. Almi and M. Negri. Analysis of staggered evolutions for nonlinear energies in phase field fracture. Archive for Rational Mechanics and Analysis, 236(1):189–252, 2020.
[5] L. Ambrosio. A compactness theorem for a new class of functions of bounded variation. Boll. Un. Mat. Ital. B (7), 3(4):857–881, 1989.
[6] L. Ambrosio. Existence theory for a new class of variational problems. Arch. Rational Mech. Anal., 111(4):291–322, 1990.
[7] L. Ambrosio. A new proof of the SBV compactness theorem. Calc. Var. Partial Differential Equations, 3(1):127–137, 1995.
[8] L. Ambrosio, N. Fusco, and D. Pallara. Functions of bounded variation and free discontinuity problems. Oxford Mathematical Monographs. The Clarendon Press, Oxford University Press, New York, 2000.
