Approximation of the Mumford-Shah Functional by Phase Fields of Bounded Variation
Sandro Belz, Kristian Bredies

TL;DR
This paper introduces a novel phase field approximation of the Mumford-Shah functional using bounded variation functions, enabling sharper image segmentation results compared to traditional methods.
Contribution
The paper proposes a new phase field model based on BV functions for Mumford-Shah approximation, enhancing image segmentation quality.
Findings
The BV-based approximation produces sharper phase fields.
Numerical methods incorporate total variation minimization.
Comparison shows improved segmentation sharpness.
Abstract
In this paper we introduce a new phase field approximation of the Mumford-Shah functional similar to the well-known one from Ambrosio and Tortorelli. However, in our setting the phase field is allowed to be a function of bounded variation, instead of an -function. In the context of image segmentation, we also show how this new approximation can be used for numerical computations, which contains a total variation minimization of the phase field variable, as it appears in many problems of image processing. A comparison to the classical Ambrosio-Tortorelli approximation, where the phase field is an -function, shows that the new model leads to sharper phase fields.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Approximation of the Mumford-Shah Functional by Phase Fields of Bounded Variation
Sandro Belz
Department of Mathematics, Technical University of Munich
Boltzmannstraße 3, 85748 Garching, Germany
and
Kristian Bredies
Institute of Mathematics and Scientific Computing, University of Graz
Heinrichstraße 36, 8010 Graz, Austria
Abstract.
In this paper we introduce a new phase field approximation of the Mumford-Shah functional similar to the well-known one from Ambrosio and Tortorelli. However, in our setting the phase field is allowed to be a function of bounded variation, instead of an -function. In the context of image segmentation, we also show how this new approximation can be used for numerical computations, which contains a total variation minimization of the phase field variable, as it appears in many problems of image processing. A comparison to the classical Ambrosio-Tortorelli approximation, where the phase field is an -function, shows that the new model leads to sharper phase fields.
Key words and phrases:
Mumford-Shah, free-discontinuity problem, -convergence, phase field, image segmentation, image denoising
2010 Mathematics Subject Classification:
49J45, 26A45, 68U10
1. Introduction
The Mumford-Shah functional has been introduced by D. Mumford and J. Shah in [37] in the context of image segmentation. For a given image, , where represents the image domain, it is given by
[TABLE]
where are parameters, free to choose. One wants to minimize the functional with respect to , being the segmentally denoised approximation of , and closed, describing the contours of the segments. For this functional appeared once more in [30] in the context of fracture mechanics. There, models the displacement function, and being closed represents the fracture set. The minimization is then restricted to some Dirichlet boundary condition.
As usual in the theory of free-discontinuity problems (see [8, 16]) the Mumford-Shah functional (1.1) is relaxed to the space of special functions of bounded variation (see Section 2.3 for more details on these functions), where the set is replaced by the discontinuity set . Namely, instead of (1.1) one considers
[TABLE]
for , the set of special functions of bounded variation. In this setting the existence of the minimizers is well-known and follows from compactness properties of and some lower semi-continuity properties (see [7, 5, 6]), using the direct method in the calculus of variations. Furthermore, by the regularity property shown in [26] we know that for any minimizer of (1.2) the pair minimizes (1.1).
As already mentioned, in the case of fracture mechanics, we usually have and -penalization is replaced by a Dirichlet boundary condition. In general, the functional must then be defined on , the set of generalized special functions of bounded variation (see Section 2.3), in order to obtain the existence of a minimizer. This is due to the requirement of a uniform bound of the minimizing sequence in the direct method for applying the above-mentioned compactness properties in . Only for , this -bound is automatically achieved, whereas for one has to fall back to .
For numerical computations some variational approximations in terms of -convergence (see Section 2.2) turned out to be very useful. It guarantees that a convergent sequence of minimizers of the approximating functionals converge to minimizers of the -limit. We firstly discuss in Theorem 3.2 an approximation for . Hence, we consider the functional
[TABLE]
One of the first and most popular results in this direction was given by L. Ambrosio and V. M. Tortorelli in [10]. They introduced the functionals
[TABLE]
for and and showed via a -convergence argument that any limit point of a sequence of minimizers of is a minimizer of , provided that . Many other approximations based on this result have been proven. Just recently, we proved that the Euclidean norms of the gradients can be replaced by Riemannian norms (see [2]). This result finds application in fracture mechanics applied to surfaces. Another approach considering higher order terms of the phase field has been studied e.g. in [14] and [20]. What happens with the approximation when does not converge to zero is investigated in [25] and [34]. A totally different idea of approximating by finite differences was proposed by E. De Giorgi and proven by M. Gobbino in [32]. In [18] A. Braides and G. Dal Maso used non-local functionals depending on the average of the gradient of on small balls. From the work presented in [16] one gets an approximation of for the following functional with small :
[TABLE]
for and with and being the Hölder conjugate of .
In the approximations (1.3), (1.4) and in the functionals we are going to study in this paper the additional function works as a phase field variable describing the discontinuity set of . To be more precise, for small the minimizing function is close to [math] where is “steep” or jumps, which means in the context of fracture mechanics the presence of a crack and in the context of image segmentation the presence of a contour. Elsewhere, the phase field variable is close to 1 and is expected to be “flat” in this area. In practice the weights of the different integral terms declare what is meant to be “steep” or “flat”.
In this paper we present a new approximation of the Mumford-Shah functional, allowing the phase field variable to be in , the set of functions of bounded variation. Namely, as a special case of our main result we consider the functionals
[TABLE]
for and , which -converge in some sense to and represent the case with in (1.4). From there we derive the required setting for .
In this way the phase field variable can have jumps, which is exploited in the proof of the -inequality (see Proposition 2.2), when constructing the recovery sequence for the -convergence result. Moreover, we expect from this fact that the phase fields become somewhat sharper than the ones obtained from (1.3). We approve this expectation with some numerical computations in the context of image segmentation. The application of this theory to fracture mechanics remains for future work, which includes the studying the convergence behaviour of a time-discrete evolution as it was done – and is still ongoing – for the classical approach of Ambrosio and Tortorelli. For more details on this topic we refer to [31, 3, 1, 38, 39, 36, 4].
The paper is structured as follows: In Section 2 we start with some preliminaries recalling the necessary technical issues. For the versed reader this section might be skipped or only used as a reference text. In Section 3 we formulate our main result, Theorem 3.2, from which we directly infer all other necessary theorems and corollaries. Section 4 is then dedicated to the proof of Theorem 3.2 and in Section 5 we provide some numerical comparison of our new model and the classical Ambrosio-Tortorelli approximation.
2. Preliminaries and Notation
In this section, we collect the notation and the well-known results from the literature which are used in this paper.
With we denote the Euclidean ball with radius and center . For some the set refers to the -neighborhood of . The set is the -dimensional sphere in . At some places it is convenient to use the short notation and for and , respectively.
The essential supremum and the essential infimum of some measurable function is written as and , respectively. The essential support of a measurable function is denoted by .
2.1. Measure theory
For any set we denote by the -dimensional Lebesgue measure and by the -dimensional Hausdorff measure. Instead of we also use the symbol for the counting measure. For a (signed, vector-valued) measure we write for its total variation.
2.2. -convergence
For some sequence of functionals and a functional defined on some metric space we say that -converges to as and write if there holds the
**-inequality: **
for all and all sequences in with there holds
[TABLE]
**-inequality: **
for all there exists a sequence in such that and
[TABLE]
One often defines
[TABLE]
Then the -inequality is equivalent to and the -inequality is equivalent to . Note that as well as are lower semi-continuous.
If one has a family of functionals for the definition is adapted in the usual way, i.e. -converges to as (for some ) if -converges to for all sequences in with .
The most important property of -convergent sequences is the convergence of minimizers to a minimizer of the limit functional, which is stated in the following proposition.
Proposition 2.1**.**
Let , where , be a sequence of functionals -converging to with respect to the metric space . Assume that for some compact set . Then, there holds . Furthermore, for any sequence in converging to with we have .
If and , a sequence , for which (2.2) holds, is called a recovery sequence for , and there clearly holds . It is actually the case that a convergent sequence of minimizers is a recovery sequence for the minimizer of the -limit. For this reason knowing the recovery sequences provides lots of information about the structure of the limit behaviour of the functional sequence.
For more details on the concept of -convergence we refer to [17] and [24].
2.3. Functions of bounded variation
In the following we describe the concept and some essential results of functions of bounded variation. For an extensive monograph on this topic we refer to [8]. A more basic introduction can be found in [28].
Let be non-empty and open for the rest of this section. The set of functions of bounded variation, in short , contains all functions whose distributional derivative is a Radon measure, denoted by , i.e. there holds
[TABLE]
Defining the total variation
[TABLE]
we obtain from the Riesz representation theorem that (2.3) is equivalent to. Furthermore, there holds for all .
For any measurable function we define for all the upper and lower approximate limit, respectively, by
[TABLE]
For all there obviously holds . If we write for their common value . The set is the discontinuity set containing all those points for which there holds .
In what follows let . Then, has Lebesgue measure 0 and for -almost all points one can find a unit normal vector such that u^{+}(x)=\bigl{(}u\rvert_{H^{+}(x)}\bigr{)}^{\ast}(x) and u^{-}(x)=\bigl{(}u\rvert_{H^{-}(x)}\bigr{)}^{\ast}(x) with
[TABLE]
If this is the case one says that is a jump point. We call a precise representative of if for all and for all jump points .
For functions of bounded variation on the real line we actually have that every point in is a jump point. Furthermore, on an open interval the pointwise variation of and the variation as defined in (2.4) coincide. Precisely, for and there holds
[TABLE]
For any one can split the measure in the following way
[TABLE]
where denotes the absolutely continuous part of with respect to the Lebesgue measure. Therefore, with we denote its density function, which we also call the approximate gradient of . With we denote the jump part of , which can be written as , and is the Cantor part.
The set of special functions of bounded variation, denoted by , contains those functions of bounded variation whose Cantor part is zero, i.e. we have . The singular part of such functions is therefore only concentrated on the set of jump points.
A measurable function is a generalized special function of bounded variation, where we write , if any truncation of is locally a special function of bounded variation, i.e. for all , with . Note that for we cannot define as above, because the distributional derivative does not need to be a measure on that space. However, is well defined for all and converges pointwise a.e. as . Thus, we simply define for a.e. . Furthermore, one can show that . These results and more details can be found in [8, Section 4.5] and the references therein.
Moreover, we will use the following two subspaces of and defined for every by
[TABLE]
A density result, which plays an important role in the proof of the -inequality for our main assertion, is stated in the next theorem. It follows directly from [23, Theorem 3.1] and the following remarks therein.
Theorem 2.2**.**
Let be non-empty, open and bounded with Lipschitz boundary, and take . Then, there exists a sequence in such that
** 2. 2.
\displaystyle\mathcal{H}^{n-1}\bigl{(}\overline{S_{w_{j}}}\setminus S_{w_{j}}\bigr{)}=0\,,** 3. 3.
** 4. 4.
** 5. 5.
** 6. 6.
**
We now shortly introduce the concept of slicing, which is essential for the proof of the -inequality. For that purpose, let be open and bounded, and let be a unique normal vector. Then, we write for the projection of onto , and we set
[TABLE]
Furthermore, for any function and for -a.a. we can define for a.a. .
One can show the following important results revealing the connection between a function and its sliced functions . There are more general results for -functions, which are not needed in this context. The interested reader can find the details in [8, Section 3.11].
Theorem 2.3**.**
Let . Then if and only if for all there holds for -a.a. and
[TABLE]
Furthermore, if there holds for all , for -a.a. and for a.a. :
(u_{y}^{\xi})^{\prime}(t)=\bigl{\langle}\nabla u(y+t\xi),\xi\bigr{\rangle}, 2. 2.
, 3. 3.
, 4. 4.
\bigl{\lvert}\langle\mathrm{D}^{\ast}u,\xi\rangle\bigr{\rvert}(\Omega)=\int_{\Omega_{\xi}}\bigl{\lvert}\mathrm{D}^{\ast}u_{y}^{\xi}\bigr{\rvert}(\Omega^{\xi}_{y})\,\mathrm{d}\mathcal{L}^{n-1}(y)\quad\text{ for }\ast=a,j,c.
The following corollary directly follows by a truncation argument.
Corollary 2.4**.**
Let . Then if and only if for all there holds for -a.e. and
[TABLE]
2.4. Convex functions
Especially, for the numerical part of this paper we also need some theory about convex functions. A good reference for this topic is [33] and [27].
For the characteristic function over is given by on and on . It is a convex function if and only if is a convex set. For any function , bounded from below by some affine function, denotes its convex conjugate, i.e.
[TABLE]
where is set to outside of . This definition directly yields Fenchel’s inequality, which says
[TABLE]
We remark that is always convex and lower semi-continuous and the biconjugate is the lower semi-continuous convex hull of . Furthermore, is convex and lower semi-continuous if and only if .
We will also make use of the subdifferential of a function , which we denote by . It is given by
[TABLE]
If is differentiable in , we have .
3. Main Result
For our main result we need several, quite technical assumptions. In order to keep a better overview we first list them here.
Assumption 3.1**.**
Let . For each let
- [A1]
be continuous such that in as for some with . 2. [A2]
be a convex function such that and uniformly on for all , i.e. for all there exists such that for all and . 3. [A3]
be a convex function such that , and as for from [A1], on , where denotes the convex conjugate of (see Section 2.4), and for all and some , independent of . 4. [A4]
such that as .
Furthermore, assume that
- [A5]
is a Lipschitz continuous, non-decreasing function with and on .
We are now ready to state our main theorem.
Theorem 3.2**.**
Let be a non-empty, open, bounded set with Lipschitz boundary, let , , , , and be given as in Assumption 3.1. For each , we define the functional by
[TABLE]
for all and otherwise.
Moreover, define by
[TABLE]
Then with respect to the strong topology in .
For our application in image segmentation we aim for a minimization of . In the following corollary we add the missing -penalization term in the functionals and .
Corollary 3.3**.**
Let be a non-empty, open, bounded set with Lipschitz boundary, let , , , , and be given as in Assumption 3.1. Furthermore, let and , and let and be given as in Theorem 3.2. We define for every the functional
[TABLE]
Moreover, we define by
[TABLE]
Then, with respect to the strong topology in .
Proof.
Since is lower semi-continuous, the -inequality follows directly from Theorem 3.2. Hence, we have
[TABLE]
In order to show the -inequality it suffices to consider and a.e., since otherwise, the left hand side of (3.2) is and there is nothing to show.
From Theorem 3.2 we know that there exists a sequence in converging to as in the strong -topology such that
[TABLE]
We consider the truncated function sequence with , and note that in as . Therefore, we also have
[TABLE]
Furthermore, one can easily verify that , so that
[TABLE]
which is the required -inequality. ∎
In view of Proposition 2.1 the existence and compactness of minimizers of the approximating functionals needs to be shown in order to obtain their convergence to a minimizer of the functional . We give a rigorous proof in the following theorem.
Theorem 3.4**.**
In the setting of Corollary 3.3 a minimizer of exists for every . Furthermore, let be an infinitesimal sequence, and let be a minimizer of for every . Then, in , and there exists , such that, up to a subsequence, in , and minimizes .
Proof.
In order to show the existence of minimizers of we fix and take a minimizing sequence of , i.e.
[TABLE]
In view of for all and some as stated in Assumption [A3], it is easy to see that is bounded. Further, is bounded in , since . By the compactness properties of (see [8, Theorem 3.23]) and there exist subsequences of and (not relabeled) and functions and , such that in , converges sequentially weakly* (in the space of Radon measures) to and weakly in .
From Fatou’s Lemma and [8, Theorems 5.4 and 5.8] we get the lower semi-continuity of so that
[TABLE]
Hence, the pair minimizes .
Now let be a sequence converging to [math], and let the pair be a minimizer of for every . Then we simply have
[TABLE]
which implies, together with Assumption [A2], in as .
From a simple cut-off argument we get that . Since is Lipschitz continuous according to Assumption [A5], we have with obeying the chain rule for -functions (see [8, Theorem 3.99]). Further, the multiplication operation is continuously differentiable and Lipschitz continuous on bounded sets, thus, since both and are a.e. bounded, the product rule for -functions holds (see [8, Theorem 3.99]), giving . We moreover have
[TABLE]
Since , and are bounded, we can estimate, employing a weighted version of the Cauchy–Schwarz inequality and Young’s inequality,
[TABLE]
where depends on and .
By Assumption [A3], for all and some , , so
[TABLE]
with suitably chosen. Altogether, since is bounded, we obtain
[TABLE]
where here is a constant depending on , , and . Hence, is bounded.
Clearly, is pointwise a.e. bounded independent of , so by the compactness properties of (see [8, Theorem 3.23]) there exists a subsequence of (not relabeled) converging to 0, such that converges to some in . Since a.e. and is continuous, we also have that a.e. as . Since , the Dominated Convergence Theorem yields in .
The assertion now follows from Proposition 2.1 and Corollary 3.3. ∎
Remark 3.5*.*
Note that Theorem 3.2 and Corollary 3.3 also holds for . However, for the existence of minimizers of we require as indicated in the proof of Theorem 3.4.
The following corollary represents a special case of the previous results, which represents the version that is relevant for our numerical computation in Section 5.
Corollary 3.6**.**
Let be a non-empty, open, bounded set with Lipschitz boundary and let . For each let such that as and define the functionals by
[TABLE]
if and otherwise. Moreover, define by
[TABLE]
for , a.e., and otherwise.
Then, for every infinitesimal sequence a minimizer of exists for every . Furthermore, in , and up to a subsequence in with being a minimizer of .
Proof of Corollary 3.6.
We define and, choose the functions , , and in the following way:
[TABLE]
for all and . Note that in this setting we have
[TABLE]
and hence, one can simply verify that Assumption 3.1 is fulfilled with .
From Theorem 3.2 we, therefore, get that -converges to . Since -convergence is preserved under constant multiplication we get the result by multiplying and with . ∎
4. Proof of Theorem 3.2
The proof of Theorem 3.2 follows the usual strategy that has been used for the classical Ambrosio-Tortorelli approximation and various generalizations (see [9, 10, 16, 25, 34, 35]). Firstly, we show the -inequality on the real line (see Proposition 4.1). The generalization to the multi-dimensional case, stated in Proposition 4.2, is then shown by a slicing argument.
The -inequality is shown with the help of the density result, Theorem 2.2. Here, we exploit the fact that the phase field variable is allowed to have jumps, which enables the construction of a much simpler recovery sequence than when the phase field needs to be smooth.
Proposition 4.1**.**
In the setting of Theorem 3.2 with we redefine by
[TABLE]
Then there holds .
Proof.
First of all, for each open set we define the localized functionals
[TABLE]
for all and , and otherwise.
Now, let be a sequence greater than zero with as , and let and be sequences in such that and as . By possibly extracting a subsequence, we can assume that
[TABLE]
Therefore, we must have , and because of to the uniform convergence of to as (see [A2]), we obtain that a.e. on .
We first show that is finite and
[TABLE]
For that let , and let sufficiently small such that . Set and assume that . Furthermore, let and choose such that up to a subsequence, there holds for all . Then there holds
[TABLE]
so that converges weakly to in and consequently . Hence, we must have , and we can find a sequence such that , where is a precise representative of . The assumptions on in [A5] imply as . Since a.e. we can, therefore, find with such that as well as .
With this at hand we get from the -convergence of (see [A1]),
[TABLE]
Defining
[TABLE]
we get, for large enough,
[TABLE]
and together with (2.5)
[TABLE]
Applying Lemma A.2 yields
[TABLE]
By merging (4.2), (4.4) and (4.5) and since as (see [A3]) we deduce
[TABLE]
For every we can repeat the preceding arguments for each element in a set with sufficiently small such that for in order to obtain
[TABLE]
By assumption the right hand side is finite; hence, there must hold and we deduce (4.1).
In the next step we show that for all ,
[TABLE]
Let be an open interval such that . For and we define the intervals
[TABLE]
and we extract a subsequence of (not relabeled) such that exists for all . Moreover, for we define the set
[TABLE]
For every there exists a sequence in and such that
[TABLE]
Thus, analogously to the above it follows that
[TABLE]
for some by assumption.
Repeating this argument for every we get
[TABLE]
Note that in view of [A1] there holds , and hence, is bounded independently of . Because is also non-decreasing with respect to it remains constant for large enough. As a consequence, we can pick with N_{z}\coloneq\max_{k\in\mathbb{N}}\bigl{(}\#T^{k}_{z}\bigr{)}, such that each converges to some as . Define with . Let , choose large enough, and let . Then we have . Therefore,
[TABLE]
From [A5] we have , and thus, we obtain in up to a subsequence, and consequently . By the weak lower semi-continuity of the norm we have
[TABLE]
Since this inequality holds for all , we have , and since with , we deduce that . Taking the limit for results in
[TABLE]
Finally, we take the limit for and obtain (note that is continuous from [A5])
[TABLE]
Since was chosen arbitrarily such that we end up with (4.6). Together with (4.1) we obtain
[TABLE]
and we conclude the proof by taking the limit for . ∎
Proposition 4.2**.**
In the setting of Theorem 3.2 there holds
[TABLE]
Proof.
For the proof we use the usual notation in the setting of slicing, introduced in Section 2.3. In what follows let and , let be open and choose arbitrarily. We define the localized version of (3.1) by
[TABLE]
if and otherwise. Furthermore, we define for open
[TABLE]
if and otherwise. We additionally set
[TABLE]
From Fubini’s theorem and Theorem 2.3 we therefore obtain
[TABLE]
if is absolutely continuous with respect to , and otherwise. Thus, there clearly holds
[TABLE]
From Proposition 4.1 we know that with
[TABLE]
Choosing
[TABLE]
there holds for all sequences and with and in as
[TABLE]
Fatou’s Lemma and (4.8) yield
[TABLE]
Moreover, by construction, is finite if and only if for a.a. there holds a.e. on , as well as
[TABLE]
Since there holds for every and every with for a.a.
[TABLE]
we get by Corollary 2.4 that is finite only if and a.e. in . Hence,
[TABLE]
if and a.e. in , and otherwise.
Since and were chosen arbitrarily, if a.e. in , then [16, Theorem 1.16] and (4.9) imply
[TABLE]
Otherwise, the -inequality follows directly from (4.9) with arbitrary. ∎
The following proposition now shows the -inequality.
Proposition 4.3**.**
In the setting of Theorem 3.2 there holds
[TABLE]
Proof.
If or on some set with non-zero measure the assertion is obvious. We first show that the result holds for replaced by for which 1.–3. in Theorem 2.2 (replacing by ) hold.
For this purpose choose for every some such that as but still as , for instance
[TABLE]
Take some smooth cutoff function with on and on , and define for all . Then, we set for all , and we fix for every the function , for which holds , on and in as . Furthermore we define
[TABLE]
Since is polyhedral there holds . Consequently, we have for all .
With this at hand, recalling [A5], we get
[TABLE]
By the choice of , the fact that and that a.e. on (see [29, Lemma 3.2.34]) we get on
[TABLE]
which implies
[TABLE]
with independent of . The first and the last term obviously converge to 0 as . For the second term we remark that for a polyhedral set, the Hausdorff measure coincides with the Minkowski content (see, e.g., [29, Theorem 3.2.29]), so that
[TABLE]
As a consequence, recalling that we get
[TABLE]
and therefore
[TABLE]
Additionally, (4.11) and as imply
[TABLE]
Furthermore, there holds
[TABLE]
which is again due to being a polyhedral set.
Applying the previous three convergence statements in (4.10) together with the limit behaviour of , and from [A2] and [A3], we get
[TABLE]
If we have for every that with , and we can find a sequence in such that 1.–6. in Theorem 2.2 (replacing by ) holds. Together with the lower semi-continuity of in and (4.12) we deduce
[TABLE]
Obviously, there holds , and from (see Section 2.3) follows that . Thus, using again the lower semi-continuity of we get
[TABLE]
which concludes the proof. ∎
The proof of Theorem 3.2 is now a direct consequence of Proposition 4.2 and Proposition 4.3.
5. Numerical Examples
The aim of this section is to numerically compare our new approximation from Corollary 3.6 with the classical Ambrosio-Tortorelli approach. We aim for a simple and easy to implement algorithm in order to illustrate the differences between those two models and justify our theory. As an application for the numerical computations we choose the image segmentation problem already described in the introduction.
Thus, for being non-empty, open, bounded and with Lipschitz boundary, we seek to minimize the following functional with respect to
[TABLE]
where is the original image and are the parameters influencing the smoothing and segment detection in the solution. They have, of course, to be chosen with care in order to get a sensible result.
Using now Corollary 3.6 we can approximately minimize by minimizing
[TABLE]
for small , which we also refer to as the -model.
On the other hand we consider the elliptic approximation (1.3), introduced in [10]:
[TABLE]
for and , which we refer to as the -model (note that we “redefined” as in the following, we will only use (5.3) such that there is no chance of confusion).
For the discretization of these functionals we consider a 2-dimensional image with its natural pixel grid with pixel length . If the image is given by pixels, we use the discrete grid and we identify the piecewise constant functions as elements in the Euclidean space . Precisely, one sets for , where denotes the characteristic function of , i.e., on and on .
For the discretization of the appearing gradients and the total variation we use a finite difference scheme. For this purpose we define the finite difference operator with zero Neumann boundary condition by
[TABLE]
with
[TABLE]
Furthermore, we denote the adjoint of by , i.e. for the operator is defined by
[TABLE]
where for all
[TABLE]
For functions , operations such as the product (or ), the minimum , the maximum , and the square are always meant to be element-wise. With , and we respectively refer to the Frobenius norm, the -norm of vectorized, and the maximum norm of . The Frobenius inner product of and is written as . For any field , like for , we denote by the Euclidean norm along the first axis, i.e.
[TABLE]
With this strategy we can define the discretized versions of (5.2) and (5.3), respectively, for all by
[TABLE]
and
[TABLE]
The symbol refers to the discretized function that is one almost everywhere. Note that we neglected the factor in the functionals since it does not change their minimum. Moreover, we chose here, because in the discrete setting, the problem of finding a minimizer stays well-posed for this choice.
Remark 5.1*.*
The choice of the recovery sequence in the proof of Proposition 4.3 suggests that the width of the detected contours represented by the phase field variable correlates with the parameter . The precise relation between and the width of the phase field is, however, not known. Examining the structure of the approximating functionals, we expect that it depends, in particular, on the trade-off between the two terms and .
Although, we would like to have the width of the phase field and therefore extremely small, there is a limit of choice depending on the pixel size . To be more precise, choosing depending on , it is well known that -converges as , provided that as (see [15, 12]). We believe that a corresponding statement is also true for the considered -phase field approximation. A study of this is, however, outside the scope of the present paper.
The difficulty in finding a minimizer lies in the non-convex, and for also non-smooth, structure. In previous works an alternating minimization scheme has been commonly used, exploiting the fact that the functionals are convex in each variable separately (see [15, 11, 1]). However, in this work we choose a more recent approach, which is the proximal alternating linearized minimization (in short PALM) presented in [13]. This algorithm is a form of an alternating gradient descent procedure, for which we do not have to solve any linear equation. This makes the algorithm also faster than the alternating minimization scheme, especially for rather large images. Our experience also showed no significant difference in the results.
For the PALM algorithm one uses the fact that the objective functional can be written as . Then, for some initial value we set for each
[TABLE]
where . By we denote the proximal operator with step size :
[TABLE]
For the right choices of the step sizes and above one can show that this scheme converges to a critical point of as (see [13, Proposition 3.1]). Namely, we need to choose and for some , where and are Lipschitz constants of and , respectively. Unfortunately, convergence rates are not known, so that as a stopping criterion, we are limited to measure the change of the variables in each iteration. We stop the scheme when this change drops under a specified threshold or if a certain number of iterations is reached.
We will now have a closer look on how the algorithm looks like for and separately.
-model
We write with
[TABLE]
and
[TABLE]
We have
[TABLE]
Since the operator norm of is strictly below (see, e.g. [21, 19]), we can choose for some
[TABLE]
such that is constant throughout the algorithm.
As a simple computation shows, solving (5.4) is then equivalent to
[TABLE]
By completing squares and ignoring constant terms the problem (5.5) can be equivalently reformulated to
[TABLE]
with . Since the non-smooth term is still present, this minimization can not be solved directly. Instead we tackle the problem with the algorithm introduced by A. Chambolle and T. Pock in [22], solving the corresponding primal-dual problem. Therefore, we define for all and the functions
[TABLE]
such that (5.9) is equivalent to
[TABLE]
Here, is the forward difference operator for .
The corresponding primal-dual saddle point problem is given by
[TABLE]
where denotes the convex conjugate of , i.e., . Clearly, for any solution of (5.11) we have that is a solution of (5.10). We solve (5.11) with [22, Algorithm 1]. Namely, for and for some , as well as we define for all
[TABLE]
Then, [22, Theorem 1] guarantees the convergence of as to a solution of (5.11). For a stopping criterion of the primal-dual iteration we consider the primal-dual gap which is for and given by
[TABLE]
It vanishes if and only if solves (5.11). For this reason, we stop iteration (5.12)–(5.14) if the corresponding primal-dual gap is smaller than a certain tolerance.
We now continue with the precise computations of the primal-dual steps for the -phase field approximation. Since is the indicator function of a convex set, the update step (5.12) is the projection of onto (cf. [22, Section 6.2]). Thus, we simply get
[TABLE]
The proximal operator appearing in (5.13) can be solved directly. Namely, we get
[TABLE]
with , which yields
[TABLE]
The primal-dual gap for and can be computed explicitly. Taking into account that and
[TABLE]
with
[TABLE]
it is given by
[TABLE]
Summing up all the previous computations for our -phase field model, we get Algorithm 1 in the appendix, which is the numerical scheme as implemented.
-model (Ambrosio-Tortorelli)
For the elliptic approximation we use and as in (5.6) and only redefine by
[TABLE]
in order to obtain . Clearly, and can also be chosen as before in (5.7). Hence, (5.4) results again in (5.8). The difference of the algorithm compared to the one for the -phase field appears in (5.5), which is now equivalent to
[TABLE]
Since this problem is sufficiently smooth it could be easily solved directly, by solving a linear system. Nevertheless, for a better comparability and for saving the effort of solving a large linear equation, we stay as close as possible to the algorithm used for the -model. Thus, we use again the primal-dual scheme as in (5.12)–(5.14), where this time we need to choose
[TABLE]
for and
[TABLE]
for . Note, that we have and thus (5.12) yields
[TABLE]
and (5.13) results in
[TABLE]
The primal-dual gap for this approximation is given by
[TABLE]
with
[TABLE]
Altogether, this yields Algorithm 2 in the appendix, which is the numerical scheme that we use for computations.
Numerical Results
With the presented algorithms we perform computations for two different images. For all numerical examples we fix the width of the images to . The pixel size then depends on the number of pixels and is given by .
For the first computation we use the noisy image from Figure 1. The latter is generated by adding Gaussian noise of standard deviation and clipping the result to the original image range . In this computation, the input image corresponds to this noisy image and we only change the approximating variable , in order to investigate its influence, while fixing the other parameters for the algorithms as indicated in Table 1. The result can be observed in Figure 2.
One can clearly see that the -model produces almost binary phase fields, i.e. takes only the values [math] (corresponding to a black pixel) and (corresponding to a white pixel). In other words these phase fields are much sharper than the ones produced by the -model. Moreover, we observe that can be chosen larger when using the -model in order to obtain a result that is comparable to the -model.
Besides the comparison of the two models one can also observe, that in both approximations of the Mumford-Shah functional, only few edges are detected if is too small. Whereas, if is relatively large, the contours become rather wide. These effects are well-known and have already been mentioned in Remark 5.1, from which we also expect that for small values of , the phase field may detect the edges again, when reducing . Also this can be confirmed from Figure 3, where we use the same image but this time with 512 512 pixels keeping the width of the image domain fixed to as above, resulting in the value of being halved.
Figure LABEL:fig:sailing shows another picture with 512 512 pixel size. To the original image we again add Gaussian noise (noise level: ). This noisy image serves as the input data for our algorithms. Besides and , the parameters have a been chosen like in Table 1.
Acknowledgements
This work was supported by the International Research Training Group IGDK 1754 “Optimization and Numerical Analysis for Partial Differential Equations with Nonsmooth Structures”, funded by the German Research Council (DFG) and the Austrian Science Fund (FWF):[W 1244-N18].
Appendix A Auxiliary statements
Lemma A.1**.**
Let be a signed Radon measure on , a proper, convex and lower semi-continuous function and a mollifier, i.e., . Then,
[TABLE]
where denotes the Lebesgue decomposition of and is the recession function of .
Proof.
Fix , and choose as well as . Then, and , such that Jensen’s inequality yields
[TABLE]
Since is either or -almost everywhere, the rightmost integral reads as where I_{+}=\bigl{\{}x\in\mathbb{R}\colon\frac{\mathrm{d}{\mu^{s}}}{\mathrm{d}{\lvert\mu^{s}\rvert}}(x)=1\bigr{\}} and I_{-}=\bigl{\{}x\in\mathbb{R}\colon\frac{\mathrm{d}{\mu^{s}}}{\mathrm{d}{\lvert\mu^{s}\rvert}}(x)=-1\bigr{\}}. Clearly, as , this expression converges to \int_{\mathbb{R}}\psi^{\infty}\bigl{(}\frac{\mathrm{d}{\mu^{s}}}{\mathrm{d}{\lvert\mu^{s}\rvert}}\bigr{)}\,\eta(x-\cdot)\,\mathrm{d}{\lvert\mu^{s}\rvert} (possibly to ). Since , by lower semi-continuity of ,
[TABLE]
Integrating both sides over with respect to and interchanging order on the right-hand side then yields the result. ∎
Lemma A.2**.**
Let be a bounded open interval, and . Then,
[TABLE]
with , and according to [A1], [A2] and [A3], respectively, and for .
Proof.
Denote by and and extend outside of by for and for . Then, with the zero extension of on . Choose a mollifier , and denote by for . Then, each is in and by classical differentiation, the Fenchel inequality and [A3],
[TABLE]
We have in as , so by continuity of and (as a consequence of convexity and finiteness on ), one can conclude that \int_{a}^{b}\varphi_{\varepsilon}\bigl{(}W_{\varepsilon}(v_{\delta})\bigr{)}\,\mathrm{d}{x}\to\int_{a}^{b}\varphi_{\varepsilon}\bigl{(}W_{\varepsilon}(v)\bigr{)}\,\mathrm{d}{x} as . Denoting by for yields a convex function since is increasing on , so applying Lemma A.1 yields
[TABLE]
By continuity of , one can further conclude that in as . The above then implies that as a sequence of is bounded in the space of Radon measures on , yielding a weak*-convergent subsequence. By strong-weak*-closedness of the weak derivative, the limit has to coincide with . This holds for each subsequence, such that in fact, converges weakly* to as . Thus, using weak* lower semi-continuity, we obtain
[TABLE]
which, together with the above, yields the desired estimate. ∎
Appendix B Pseudo Codes
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] S. Almi and S. Belz. Consistent finite-dimensional approximation of phase-field models of fracture. Ann. Mat. Pura Appl. , Dec. 2018.
- 2[2] S. Almi, S. Belz, S. Micheletti, and S. Perotto. A dimension-reduction model for brittle fractures on thin shells with mesh adaptivity. submitted, ar Xiv:2004.08871 [math.NA], 2020.
- 3[3] S. Almi, S. Belz, and M. Negri. Convergence of discrete and continuous unilatersl flows for Ambrosio-Tortorelli energies and application to mechanics. M 2AN Math. Model. Numer. Anal. , Dec. 2018.
- 4[4] S. Almi and M. Negri. Analysis of staggered evolutions for nonlinear energies in phase field fracture. Archive for Rational Mechanics and Analysis , 236(1):189–252, 2020.
- 5[5] L. Ambrosio. A compactness theorem for a new class of functions of bounded variation. Boll. Un. Mat. Ital. B (7) , 3(4):857–881, 1989.
- 6[6] L. Ambrosio. Existence theory for a new class of variational problems. Arch. Rational Mech. Anal. , 111(4):291–322, 1990.
- 7[7] L. Ambrosio. A new proof of the SBV compactness theorem. Calc. Var. Partial Differential Equations , 3(1):127–137, 1995.
- 8[8] L. Ambrosio, N. Fusco, and D. Pallara. Functions of bounded variation and free discontinuity problems . Oxford Mathematical Monographs. The Clarendon Press, Oxford University Press, New York, 2000.
