Parameter-free shape optimization: various shape updates for engineering applications

Lars Radtke; Georgios Bletsos; Niklas Kühl; Tim Suchan; Thomas Rung; Alexander Düster; Kathrin Welker

arXiv:2302.12100·cs.CE·October 4, 2023

TL;DR

This paper reviews various parameter-free shape optimization methods used in engineering, explaining auxiliary problems for shape updates, and compares their effectiveness through numerical examples in fluid dynamics.

Contribution

It provides a formal explanation of different auxiliary problems for shape updates and compares their performance in engineering applications.

Findings

1. Different auxiliary problems influence shape update effectiveness

2. Numerical examples demonstrate practical differences in CFD applications

3. Parameter-free methods are versatile for complex engineering problems

Abstract

In the last decade, parameter-free approaches to shape optimization problems have matured to a state where they provide a versatile tool for complex engineering applications. However, sensitivity distributions obtained from shape derivatives in this context cannot be directly used as a shape update in gradient-based optimization strategies. Instead, an auxiliary problem has to be solved to obtain a gradient from the sensitivity. While several choices for these auxiliary problems were investigated mathematically, the complexity of the concepts behind their derivation has often prevented their application in engineering. This work aims at an explanation of several approaches to compute shape updates from an engineering perspective. We introduce the corresponding auxiliary problems in a formal way and compare the choices by means of numerical examples. To this end, a test case and…

Tables

Table 1: Cylinder ($\mathrm{Re}_{\mathrm{D}}=20$): Results of the mesh dependence study. For illustrative purposes we denote here $\hat{J}^{\prime}=\int_{\Gamma^{\mathrm{d}}}s\,d\Gamma$. Index $i$ refers to the mesh refinement level. Note $\rho=20\,\mathrm{kg/m^{3}}$, $\mu=1\,\mathrm{Pa\cdot s}$, $U_{in}=1\,\mathrm{m/s}$ and $D=1\,\mathrm{m}$.

refinement level	number of FV	$\frac{2 J_{i}}{\rho U_{in}^{2} D^{2}}$	$\frac{2 \hat{J}_{i}^{\prime}}{\rho U_{in}^{2} D}$	$\frac{J_{i} - J_{i-1}}{J_{i-1}}$ (%)	$\frac{\hat{J}_{i}^{\prime} - \hat{J}_{i-1}^{\prime}}{\hat{J}_{i-1}^{\prime}}$ (%)
M0	300	2.1197	-3.325	-	-
M1	1200	2.1433	-3.612	1.11	-8.64
M2	4800	2.1356	-3.822	-0.35	-5.81
M3	19200	2.1334	-3.937	-0.11	-3.01
M4	76800	2.1334	-3.932	-0.003	0.14
M5	307200	2.1334	-3.936	-0.001	-0.11

Table 2: Double-bent pipe ($\mathrm{Re}_{\mathrm{D}}=500$): Results of the mesh dependence study. For illustrative purposes we denote here $\hat{J}^{\prime}=\int_{\Gamma^{\mathrm{d}}}s\,d\Gamma$. Index $i$ refers to the mesh refinement level. Note $\rho=500\,\mathrm{kg/m^{3}}$, $\mu=1\,\mathrm{Pa\cdot s}$, $U=1\,\mathrm{m/s}$ and $D=1\,\mathrm{m}$.

refinement level	number of FV	$\frac{2 J_{i}}{\rho U^{3} D^{2}}$	$\frac{2 \hat{J}_{i}^{\prime}}{\rho U^{3} D}$	$\frac{J_{i} - J_{i-1}}{J_{i-1}}$ (%)	$\frac{\hat{J}_{i}^{\prime} - \hat{J}_{i-1}^{\prime}}{\hat{J}_{i-1}^{\prime}}$ (%)
M0	11250	2.18	-5.55	-	-
M1	90000	3.091	-11.44	41.73	106.13
M2	720000	3.15	-11.38	1.91	-0.53
M3	5760000	3.17	-11.38	0.41	0.0

Table 3: Double-bent pipe ($\mathrm{Re}_{\mathrm{D}}=500$): Measured computation time in CPUh ($n^{\mathrm{opt}}\cdot\overline{t^{\mathrm{wc}}}\cdot n^{\mathrm{CPU}}$) for all five optimization studies, where $\overline{t^{\mathrm{wc}}}$ refers to the mean wall clock time per primal/adjoint run, and $n^{\mathrm{opt}}$ and $n^{\mathrm{CPU}}$ denote the number of performed optimization steps and the number of employed CPU cores, respectively.

approach	$n^{\mathrm{opt}}$ [-]	primal $\overline{t^{\mathrm{wc}}}$ [h]	adjoint $\overline{t^{\mathrm{wc}}}$ [h]	total CPUh [h]
DS	42	0.1325	0.1176	10.5042
SLB ( $A / D = 1$ )	241	0.1005	0.0994	48.1759
VLB ( $A / D = 1$ )	235	0.0991	0.0981	46.342
SP-WD	441	0.1255	0.1109	86.9652
PHD ( $p = 4$ )	491	0.1914	0.1070	146.5144

Equations

$$L(\gamma)=\int_{0}^{1}\sqrt{g_{m}(\dot{\gamma}(t),\dot{\gamma}(t))}\,dt$$

$$d(m_{1},m_{2})=\inf_{\gamma}L(\gamma),\quad\text{with }\gamma(0)=m_{1}\text{ and }\gamma(1)=m_{2}.$$

$$g_{\Gamma^{i}}(\nabla J(\Gamma^{i}),\bm{v}^{\Gamma})=(J_{*})_{\Gamma^{i}}(\bm{v}^{\Gamma})\quad\forall\,\bm{v}^{\Gamma}\in T_{\Gamma^{i}}(M)$$

$$\nabla J(\bm{x}^{i})=\left.\frac{\partial J}{\partial\bm{x}}\right|_{\bm{x}^{i}}$$

$$g_{\Gamma}(\nabla J(\Gamma),\bm{v}^{\Gamma})=(J_{*})_{\Gamma}(\bm{v}^{\Gamma})\quad\forall\,\bm{v}^{\Gamma}\in T_{\Gamma}M.$$

$$g_{\Gamma}(\bm{u}^{\Gamma},\bm{v}^{\Gamma})=J^{\prime}(\Gamma)(\bm{v}^{\Gamma})\quad\forall\,\bm{v}^{\Gamma}\in T_{\Gamma}M.$$

$$DJ(\Gamma)(\bm{v}^{\Gamma})=J^{\prime}(\Gamma)(\bm{v}^{\Gamma})=\lim_{t\to 0^{+}}\frac{J(\Gamma_{t})-J(\Gamma)}{t}.$$

$$J^{\prime}(\Gamma)(\bm{v}^{\Gamma})=\int_{\Gamma}\bm{v}^{\Gamma}\cdot\bm{n}\,s(\bm{x})\,d\Gamma,$$

$$\mathcal{R}_{\Gamma^{i}}\colon T_{\Gamma^{i}}(M)\to M,\quad\bm{v}^{\Gamma}\mapsto\mathcal{R}_{\Gamma^{i}}(\bm{v}^{\Gamma})=\Gamma^{i}+\bm{v}^{\Gamma},$$

$$g_{\Gamma}\colon T_{\Gamma}(B_{e})\times T_{\Gamma}(B_{e})\to\mathbb{R},\quad(\bm{u}^{\Gamma},\bm{v}^{\Gamma})\mapsto\int_{\Gamma}\bm{u}^{\Gamma}\cdot\bm{v}^{\Gamma}\,d\Gamma$$

$$g_{\Gamma}\colon T_{\Gamma}(B_{e})\times T_{\Gamma}(B_{e})\to\mathbb{R},\quad(\bm{u}^{\Gamma},\bm{v}^{\Gamma})\mapsto\int_{\Gamma}\Phi\,\bm{u}^{\Gamma}\cdot\bm{v}^{\Gamma}\,d\Gamma$$

$$g_{\Gamma}\colon T_{\Gamma}(B_{e})\times T_{\Gamma}(B_{e})\to\mathbb{R},\quad(\bm{u}^{\Gamma},\bm{v}^{\Gamma})\mapsto\int_{\Gamma}\bm{u}^{\Gamma}\cdot\bm{v}^{\Gamma}+A\,\nabla_{\Gamma}\bm{u}^{\Gamma}\cdot\nabla_{\Gamma}\bm{v}^{\Gamma}\,d\Gamma$$

$$g_{\Gamma}(\bm{u}^{\Gamma},\bm{v}^{\Gamma}):=\int_{\Gamma}\bm{u}^{\Gamma}\cdot\bm{v}^{\Gamma}-A\,\Delta_{\Gamma}\bm{u}^{\Gamma}\cdot\bm{v}^{\Gamma}\,d\Gamma,$$

$$g_{\Gamma}\colon T_{\Gamma}(B_{e})\times T_{\Gamma}(B_{e})\to\mathbb{R},\quad(\bm{u}^{\Gamma},\bm{v}^{\Gamma})\mapsto\int_{\Gamma}\Phi\,(\bm{u}^{\Gamma}\cdot\bm{v}^{\Gamma}+A\,\nabla_{\Gamma}\bm{u}^{\Gamma}\cdot\nabla_{\Gamma}\bm{v}^{\Gamma})\,d\Gamma,$$

$$g_{\Gamma}(\bm{u}^{\Gamma},\bm{v}^{\Gamma}):=\int_{\Gamma}\Phi\,(\bm{u}^{\Gamma}\cdot\bm{v}^{\Gamma}-A\,\Delta_{\Gamma}\bm{u}^{\Gamma}\cdot\bm{v}^{\Gamma})\,d\Gamma.$$

$$g_{s}(\bm{u}^{\Gamma},\bm{v}^{\Gamma})=J^{\prime}(\Gamma)(\bm{v}^{\Gamma})=a(\bm{u},\bm{v})\quad\forall\,\bm{v}\in V(\Omega).$$

$$a(\bm{u},\bm{v})=\int_{\Omega}\nabla\bm{u}\cdot\nabla\bm{v}\,d\Omega\quad\text{or}\quad a(\bm{u},\bm{v})=\int_{\Omega}\nabla\bm{u}\cdot D\,\nabla\bm{v}\,d\Omega,$$

$$-\frac{\nabla J(\Gamma)}{\|\nabla J(\Gamma)\|_{g_{\Gamma}}}=\underset{\bm{u}^{\Gamma}\in T_{\Gamma}(M):\ \|\bm{u}^{\Gamma}\|_{g_{\Gamma}}=1}{\arg\min}\ J^{\prime}(\Gamma)(\bm{u}^{\Gamma}).$$

$$\min_{\bm{u}\in W^{1,p}(\Omega,\mathbb{R}^{d})}\int_{\Omega}\frac{1}{p}\,|\nabla\bm{u}|^{p}\,d\Omega+J^{\prime}(\Gamma)(\bm{u}^{\Gamma})$$

$$a(\bm{u},\bm{v})=\int_{\Omega}|\nabla\bm{u}|^{p-2}\,(\nabla\bm{u}\cdot\nabla\bm{v})\,d\Omega=J^{\prime}(\Gamma)(\bm{v}^{\Gamma})\quad\forall\,\bm{v}\in W^{1,p}(\Omega,\mathbb{R}^{d}),$$

$$\Omega^{i+1}=\{\tilde{\bm{x}}:\tilde{\bm{x}}=\bm{x}+\alpha\,\bm{\theta}(\bm{x})\ \forall\,\bm{x}\in\Omega^{i}\},$$

$$J(\Gamma)=\int_{\Omega}j_{\Omega}\,d\Omega+\int_{\Gamma}j_{\Gamma}\,d\Gamma,$$

$$J(\Gamma^{i+1})\approx J(\Gamma^{i})+\alpha\,J^{\prime}(\Gamma^{i})(\bm{\theta}^{\Gamma}).$$

$$J(\Gamma^{i+1})\approx J(\Gamma^{i})-\alpha\int_{\Gamma}s^{2}\,d\Gamma\lessapprox J(\Gamma^{i}),$$

$$J(\Gamma^{i+1})\approx J(\Gamma^{i})+\alpha\,g_{\Gamma}(\bm{u}^{\Gamma},\bm{\theta}^{\Gamma}).$$

$$J(\Gamma^{i+1})\approx J(\Gamma^{i})-\alpha\,g_{\Gamma}(\bm{\theta}^{\Gamma},\bm{\theta}^{\Gamma}),$$

$$\bm{\theta}_{n}^{\Gamma}=\bm{\theta}^{\Gamma}(\bm{x}_{n})=-\sum_{j\in N_{n}}w_{n,j}\,s_{j}\,\bm{n}_{j}.$$

$$\bm{n}_{n}=\frac{1}{2}\,(\bm{n}^{e_{1}}+\bm{n}^{e_{2}}).$$

$$\text{Find }\bm{u}^{\Gamma}\text{ s.t. }\int_{\Gamma^{\mathrm{d}}}A\,\nabla_{\Gamma}\bm{u}^{\Gamma}\cdot\nabla_{\Gamma}\bm{v}^{\Gamma}+\bm{u}^{\Gamma}\cdot\bm{v}^{\Gamma}\,d\Gamma^{\mathrm{d}}=J^{\prime}(\Omega)(\bm{v}^{\Gamma})=\int_{\Gamma^{\mathrm{d}}}\bm{n}\cdot\bm{v}^{\Gamma}\,s\,d\Gamma\quad\forall\,\bm{v}^{\Gamma}\in V(\Gamma^{\mathrm{d}}).$$

$$\bm{u}^{\Gamma}-A\,\Delta_{\Gamma}\bm{u}^{\Gamma}$$

Full text

Parameter-free shape optimization: various shape updates for engineering applications

Lars Radtke
Institute for Ship Structural Design and Analysis (M-10)
Numerical Structural Analysis with Appl. in Ship Technology
Hamburg University of Technology
Hamburg, 21073

Georgios Bletsos
Institute for Fluid Dynamics and Ship Theory (M-8)
Hamburg University of Technology
Hamburg, 21073

Niklas Kühl
Institute for Fluid Dynamics and Ship Theory (M-8)
Hamburg University of Technology
Hamburg, 21073

Tim Suchan
Faculty for Mechanical and Civil Engineering
Helmut Schmidt University
Hamburg, 22008

Thomas Rung
Institute for Fluid Dynamics and Ship Theory (M-8)
Hamburg University of Technology
Hamburg, 21073

Alexander Düster
Institute for Ship Structural Design and Analysis (M-10)
Num. Struct. Analysis with Appl. in Ship Technology
Hamburg University of Technology
Hamburg, 21073

Kathrin Welker
Institute of Numerical Mathematics and Optimization
Technische Universität Bergakademie Freiberg
Freiberg, 09599

Abstract

In the last decade, parameter-free approaches to shape optimization problems have matured to a state where they provide a versatile tool for complex engineering applications. However, sensitivity distributions obtained from shape derivatives in this context cannot be directly used as a shape update in gradient-based optimization strategies. Instead, an auxiliary problem has to be solved to obtain a gradient from the sensitivity. While several choices for these auxiliary problems were investigated mathematically, the complexity of the concepts behind their derivation has often prevented their application in engineering. This work aims at an explanation of several approaches to compute shape updates from an engineering perspective. We introduce the corresponding auxiliary problems in a formal way and compare the choices by means of numerical examples. To this end, a test case and exemplary applications from computational fluid dynamics are considered.

Keywords shape optimization $\cdot$ shape gradient $\cdot$ steepest descent $\cdot$ continuous adjoint method $\cdot$ computational fluid dynamics

1 Introduction

Shape optimization is a broad topic with many applications and a large variety of methods. We focus on optimization methods designed to solve optimization problems that are constrained by partial differential equations (PDEs). These arise, for example, in many fields of engineering such as fluid mechanics [90, 71, 60], structural mechanics [3, 94] and acoustics [79, 43].

In order to solve the PDE constraint of an optimization problem computationally, the domain under investigation needs to be discretized, i.e., a computational mesh is required. In this paper, we are particularly concerned with boundary-fitted meshes and methods, where shape updates are realized through updates of the mesh. In the context of boundary-fitted meshes, solution methods for shape optimization problems may be loosely divided into parameterized and parameter-free approaches. With parameterized, we denote methods that apply a finite-dimensional description of the geometry, which is prescribed beforehand and is part of the derivation process of suitable shape updates, see e.g. [72]. With parameter-free, we denote methods that are derived on the continuous level independently of a parameterization. Of course, in an application scenario, parameter-free approaches also ultimately discretize the shape using the mesh needed for the solution of the PDE.

In general, optimization methods for PDE-constrained problems aim at the minimization (or maximization, respectively) of an objective functional that depends on the solution (also called the state) of the PDE, e.g. the compliance of an elastic structure [3] or dissipated power in a viscous flow [71]. Since a maximization problem can be expressed as a minimization problem by considering the negative objective functional, we only consider minimization problems in this paper. An in-depth introduction is given in [38]. In this paper, we are concerned with iterative methods that generate shape updates such that the objective functional is reduced. In order to determine suitable shape updates, the so-called shape derivative of the objective functional is utilized. Typically, adjoint methods are used to compute shape derivatives when the number of design variables is high. This is the case in particular for parameter-free shape optimization approaches, where shapes are not explicitly parameterized, e.g. by splines. After a final discretization, the number of design variables then typically corresponds to the number of nodes in the mesh that is used to solve the constraining PDE. Adjoint methods are favorable in this scenario, because their computational cost to obtain the shape derivative is independent of the number of design variables. Only a single additional problem, the adjoint problem, needs to be derived and solved to obtain the shape derivative. For a general introduction to the adjoint method, we refer to [25, 32]. In the continuous adjoint method, the shape derivative is usually obtained as an integral expression over the design boundary identified with the shape and gives rise to a scalar distribution over the boundary, the sensitivity distribution, which is expressed in terms of the solution of the adjoint problem. As an alternative to the continuous adjoint method, the discrete adjoint method may be employed. It directly provides sensitivities at discrete points, typically nodes of the computational mesh. A summary of the continuous and the discrete adjoint approach is given in [31].
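The cost argument for adjoint methods can be illustrated with a small discrete analogue. The following Python sketch is not the continuous adjoint formulation used in the paper. It assumes a hypothetical linear state equation $A(p)\,u=b$ with one design variable per unknown and a linear objective $J=c^{T}u$; the names `assemble`, `adjoint_gradient` and `fd_gradient` are illustrative only. One adjoint solve yields all $n$ derivatives, whereas central finite differences need $2n$ state solves.

```python
import numpy as np


def assemble(p):
    """Hypothetical state matrix A(p): 1D diffusion stencil plus diag(p)."""
    n = p.size
    stencil = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return stencil + np.diag(p)


def objective(p, b, c):
    """J(p) = c^T u(p), where the state u solves A(p) u = b."""
    return c @ np.linalg.solve(assemble(p), b)


def adjoint_gradient(p, b, c):
    """All n derivatives dJ/dp_i from one primal and one adjoint solve."""
    A = assemble(p)
    u = np.linalg.solve(A, b)      # primal (state) problem
    lam = np.linalg.solve(A.T, c)  # adjoint problem: A^T lam = c
    # dA/dp_i = e_i e_i^T, hence dJ/dp_i = -lam^T (dA/dp_i) u = -lam_i * u_i
    return -lam * u


def fd_gradient(p, b, c, h=1e-6):
    """Reference gradient by central differences (2 n state solves)."""
    grad = np.zeros_like(p)
    for i in range(p.size):
        e = np.zeros_like(p)
        e[i] = h
        grad[i] = (objective(p + e, b, c) - objective(p - e, b, c)) / (2.0 * h)
    return grad


rng = np.random.default_rng(0)
p = rng.uniform(0.1, 1.0, 6)   # assumed design variables
b = rng.standard_normal(6)     # assumed right-hand side
c = rng.standard_normal(6)     # assumed objective weights
print(np.max(np.abs(adjoint_gradient(p, b, c) - fd_gradient(p, b, c))))
```

The printed number is the largest deviation between the two gradients. In this toy setting the adjoint route costs two linear solves regardless of `p.size`, which is the property that makes it attractive when every mesh node is a design variable.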

Especially in combination with continuous adjoint approaches, it is not common to use the derived expression for the sensitivity directly as a shape update within the optimization loop. Instead, sensitivities are usually smoothed or filtered [16]. A focus of this work lies on the explanation of several approaches to achieve this in such a way that they can be readily applied in the context of engineering applications. To this end, we concentrate on questions like "How to apply an approach?" and "What are the benefits and costs?" rather than "How can approaches of this type be derived?"
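To give a first impression of what smoothing a sensitivity can look like, the following Python sketch solves a discrete version of $u-A\,\Delta_{\Gamma}u=s$ on a closed curve. The operator appears in the equations listed for this paper; everything else is an assumption made for illustration: the curve is discretized by `n` equidistant nodes, the Laplace-Beltrami operator is replaced by a periodic second difference, and the function name `smooth_sensitivity`, the value of `A` and the noisy input are hypothetical.

```python
import numpy as np


def smooth_sensitivity(s, length, A):
    """Solve (I - A * D2) u = s on a closed curve with equidistant nodes.

    s      : nodal sensitivity values (1D array)
    length : total arc length of the closed curve
    A      : smoothing parameter; A = 0 returns s unchanged
    """
    n = s.size
    h = length / n
    eye = np.eye(n)
    # periodic second-difference matrix approximating the surface Laplacian
    D2 = (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1) - 2.0 * eye) / h**2
    return np.linalg.solve(eye - A * D2, s)


n = 64
length = 2.0 * np.pi
x = np.arange(n) * length / n
rng = np.random.default_rng(0)
s_noisy = np.sin(x) + 0.3 * rng.standard_normal(n)  # assumed noisy sensitivity
u = smooth_sensitivity(s_noisy, length, A=0.05)

# squared jumps between neighboring nodes as a simple roughness measure
roughness = lambda v: np.sum((np.roll(v, -1) - v) ** 2)
print(roughness(s_noisy), roughness(u))
```

In this discretization each Fourier mode of `s` is damped by the factor $1/(1+A\lambda_{k})$, where $\lambda_{k}$ is the corresponding eigenvalue of the negative second difference, so high-frequency noise is reduced more strongly than the smooth part while the mean value is preserved.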

Nevertheless, we would like to point out that there is a large amount of literature concerned with the mathematical foundation of shape optimization. For a deeper introduction, one may consult standard textbooks such as [20, 89]. More recently, an in-depth overview of state-of-the-art concepts has been given in [2], including many references. We include Sobolev gradients in our studies, which can be seen as a well-established concept that is applied in many studies to obtain a so-called descent direction (which leads to the shape update) from a shape derivative, see e.g. [48, 15] for engineering and [82, 98, 99] for mathematical studies. We also look at more recently developed approaches like the Steklov-Poincaré approach developed in [81] and further investigated in [83, 99] and the $p$-harmonic descent approach, which was proposed in [19] and further investigated in [69]. In addition, we include discrete filtering approaches as used e.g. in [91, 16] in our studies.

The considered shape updates have to perform well in terms of mesh distortion. Over the course of the optimization algorithm, the mesh has to be updated several times, including the position of the nodes in the domain interior. The deterioration of mesh quality, especially if large steps are taken in a given direction, is a severe issue that is the subject of several works, see e.g. [70, 91], and plays a major role in the present study as well. Using an illustrative example and an application from computational fluid dynamics (CFD), the different approaches are compared and investigated. However, we do not extensively discuss the derivation of the respective adjoint problem or the numerical solution of the primal and the adjoint problem but refer to the available literature on this topic, see e.g. [68, 71, 37, 78, 90, 96, 97]. Instead, we focus on an investigation of the performance of the different approaches to compute a suitable shape update from a given sensitivity.

The remainder of this paper is structured as follows. In Sec. 2, we explain the shape optimization approaches from a mathematical perspective and provide some glimpses on the mathematical concepts behind the approaches. This includes an introduction to the concept of shape spaces, and the definition of metrics on tangent spaces that lead to the well-known Hilbertian approaches or Sobolev gradients. These concepts are then applied in Sec. 3 to formulate shape updates that reduce an objective functional. In Sec. 4, we apply the various approaches to obtain shape updates in the scope of an illustrative example, which is not constrained by a PDE. This outlines the different properties of the approaches, e.g. their convergence behavior under mesh refinement. In Sec. 5 a PDE-constrained optimization problem is considered. In particular, the energy dissipation for a laminar flow around a two-dimensional obstacle and in a three-dimensional duct is minimized. The different approaches to compute a shape update are investigated and compared in terms of applicability in the sense of being able to yield good mesh qualities and efficiency in the sense of yielding fast convergence.

2 Shape spaces, metrics and gradients

This section focuses on the mathematical background behind parameter-free shape optimization and aims at introducing the required terminology and definitions for Sec. 3, which is aimed more at straightforward application. However, we will reference back to the mathematical section several times, since some information in Sec. 3 may be difficult to understand without the mathematical background. In general, we follow the explanations in [21, 1], to which we also refer for further reading, and for application to shape optimization, we refer to [74, 98, 2].

2.1 Definition of shapes

To enable a theoretical investigation of gradient descent algorithms, we first need to define what we describe as a shape. There are multiple options, e.g. the usage of landmark vectors [18, 36, 44, 73, 88], plane curves [65, 66, 64, 67] or surfaces [9, 10, 45, 54, 63] in higher dimensions, boundary contours of objects [27, 59, 75], multiphase objects [100], characteristic functions of measurable sets [103] and morphologies of images [22]. For our investigations in a two-dimensional setting, we will describe the shape as a plane curve embedded in the surrounding two-dimensional space, the so-called hold-all domain $D\subset\mathbb{R}^{2}$ similar to [29], and for three-dimensional models, we use a two-dimensional surface embedded in the surrounding three-dimensional space $D\subset\mathbb{R}^{3}$ . Additionally, we need the definition of a Lipschitz shape, which is a curve embedded in $\mathbb{R}^{2}$ or a surface embedded in $\mathbb{R}^{3}$ that can be described by (a graph of) a Lipschitz-continuous function. Furthermore, we define a Lipschitz domain as a domain that has a Lipschitz shape as boundary. The concept of smoothness of shapes in two dimensions is sketched in Fig. 1.

2.2 The concept of shape spaces

The definition of a shape space, i.e. a space of all possible shapes, is required for theoretical investigations of shape optimization. Since we focus on gradient descent algorithms, using them requires the existence of gradients. Gradients are readily computed in Euclidean space (e.g. $\mathbb{R}^{d}$, $d\in\mathbb{N}$); however, shape spaces usually do not have a vector space structure. Determining what type of structure a shape space possesses is usually a challenging task and is beyond the scope of this paper. In the absence of a vector space structure, the next-best option is to aim for a manifold structure with an associated Riemannian metric, a so-called Riemannian manifold.

A finite-dimensional manifold is a topological space that additionally fulfills the following three conditions.

1. Locally, it can be described by a Euclidean space.

2. It can be described completely by countably many subsets (second axiom of countability).

3. Distinct points in the space have disjoint neighborhoods (Hausdorff space).

If the subsets, so-called charts, are compatible, i.e. there are differentiable transitions between charts, then the manifold is a differentiable manifold and allows the definition of tangent spaces and directions, which are paramount for further analysis in the field of shape optimization. The tangent space at a point on the manifold is a space tangential to the manifold and describes all directions in which the point could move. It is of the same dimension as the manifold. If the transition between charts is infinitely smooth, then we call the manifold a smooth manifold.

Extending the previous definition of a finite-dimensional manifold to infinite dimensions while dropping the second axiom of countability and the Hausdorff property yields infinite-dimensional manifolds. A brief introduction to and overview of concepts for infinite-dimensional manifolds is given in [98, Section 2.3] and the references therein.

In case a manifold structure cannot be established for the shape space in question, an alternative option is a diffeological space structure. These describe a generalization of manifolds, i.e. any previously-mentioned manifold is also a diffeological space. Here, the subsets to completely parametrize the space are called plots. As explained in [41], these plots do not necessarily have to be of the same dimension as the underlying diffeological space, and the mappings between plots do not necessarily have to be reversible. In contrast to shape spaces as Riemannian manifolds, research for diffeological spaces as shape spaces has just begun, see e.g. [34, 99]. Therefore, for the following section we will focus on Riemannian manifolds first, and then briefly consider diffeological spaces.

2.3 Metrics on shape spaces

In order to define distances and angles on the shape space, a metric on the shape space is required. Distances between iterates (in our setting, shapes) are necessary, e.g. to state convergence properties or to formulate appropriate stopping criteria of optimization algorithms. On a manifold $M$, a Riemannian metric defines a positive definite inner product $g_{m}(\cdot,\cdot)$ on the tangent space $T_{m}(M)$ at each $m\in M$ (see footnote 1). This yields a family of inner products such that we have a positive definite inner product available at any point of the manifold. Additionally, it also defines a norm on the tangent space at $m$ as $\|\cdot\|_{g_{m}}=\sqrt{g_{m}(\cdot,\cdot)}$. If such a Riemannian metric exists, then we call the differentiable manifold a Riemannian manifold, often denoted as $(M,g)$.

Footnote 1: If the inner product is not positive definite but at least non-degenerate as defined in e.g. [56, Def. 8.6], then we call the metric a pseudo-Riemannian metric.

Different types of metrics on shape spaces can be identified, e.g. inner metrics [9, 10, 66], outer metrics [11, 17, 33, 44, 66], metamorphosis metrics [40, 93], the Wasserstein or Monge-Kantorovich metric for probability measures [4, 12, 13], the Weil-Petersson metric [55, 85], current metrics [23, 24, 95] and metrics based on elastic deformations [27, 76].

In addition to the Riemannian metric, we also need a definition of distance to obtain a metric in the classical sense. Following [1, 74, 98], to obtain an expression for distances on the manifold, we first define the length of a differentiable curve $\gamma$ on the manifold starting at $m$ using the Riemannian metric $g_{m}(\cdot,\cdot)$ as

$$L(\gamma)=\int_{0}^{1}\sqrt{g_{m}(\dot{\gamma}(t),\dot{\gamma}(t))}\,dt \tag{1}$$

and then define the distance function $d(m_{1},m_{2})$ as the infimum of the lengths of all curves that start at $m_{1}$ and end at $m_{2}$, i.e.

$$d(m_{1},m_{2})=\inf_{\gamma}L(\gamma),\quad\text{with }\gamma(0)=m_{1}\text{ and }\gamma(1)=m_{2}. \tag{2}$$

This distance function is called the Riemannian distance or geodesic distance, since the so-called geodesic describes the shortest distance between two points on the manifold. For more details about geodesics, we refer to [57].

If one were able to obtain the geodesic, then a local mapping from the tangent space to the manifold would already be available: the so-called exponential map. However, finding the exponential map requires the solution of a second-order ordinary differential equation. This is often prohibitively expensive or inaccurate using numerical schemes. The exponential map is a specific retraction (cf. e.g. [1, 74, 98]), but different retractions can also be used to locally map an element of the tangent space back to the manifold. A retraction is a mapping $\mathcal{R}_{m}\colon T_{m}(M)\rightarrow M$ which fulfills the following two conditions.

1. The zero-element of the tangent space at $m$ gets mapped to $m$ itself, i.e. $\mathcal{R}_{m}(0)=m$.

2. The tangent vector $\dot{\gamma}(t)$ of a curve $\gamma:t\mapsto\mathcal{R}_{m}(t\,\xi)$ starting at $m$ satisfies $\dot{\gamma}(0)=\xi$. Figuratively speaking, this means that a movement along the curve $\gamma$ is described by a movement in the direction $\xi$ while being constrained to the manifold $M$.
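The two retraction conditions can be checked numerically on a simple finite-dimensional manifold. The following Python sketch uses the unit sphere in $\mathbb{R}^{3}$ as a stand-in for a shape space, which is an assumption made purely for illustration. It compares the exponential map on the sphere with a cheaper retraction that adds the tangent vector and rescales to unit length; the function names `retract` and `exp_map` are hypothetical.

```python
import numpy as np


def retract(m, xi):
    """Projection-type retraction on the unit sphere: rescale m + xi."""
    y = m + xi
    return y / np.linalg.norm(y)


def exp_map(m, xi):
    """Exponential map on the unit sphere: follow the great circle."""
    nrm = np.linalg.norm(xi)
    if nrm == 0.0:
        return m.copy()
    return np.cos(nrm) * m + np.sin(nrm) * xi / nrm


m = np.array([0.0, 0.0, 1.0])    # point on the sphere
xi = np.array([0.3, -0.2, 0.0])  # tangent vector at m (orthogonal to m)

# Condition 1: the zero tangent vector is mapped to m itself.
print(retract(m, np.zeros(3)))

# Condition 2: the curve t -> R_m(t xi) has initial velocity xi
# (checked here with a central difference quotient).
t = 1e-4
print((retract(m, t * xi) - retract(m, -t * xi)) / (2.0 * t))

# The retraction agrees with the exponential map for small steps only.
for scale in (0.1, 1.0, 5.0):
    gap = np.linalg.norm(retract(m, scale * xi) - exp_map(m, scale * xi))
    print(scale, gap)
```

Both maps return points on the sphere and satisfy the two conditions. The gap between them grows with the step length, so the cheaper retraction is a substitute for the exponential map only locally.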

Example

To illustrate the previous point, we introduce a relatively simple example. Consider a sphere without interior (a two-dimensional surface) embedded in $\mathbb{R}^{3}$ as illustrated in Fig. 2. This sphere represents a manifold $M$. Additionally, let us take two arbitrary points $m_{1}$ and $m_{2}$ on the sphere. The shortest distance between these two points while remaining on the sphere is not trivial to compute. If one were to use that the sphere is embedded in $\mathbb{R}^{3}$, then the shortest distance between the two points can be computed as the length of the difference of their position vectors, depicted by the red dashed line. However, this path does not stay on the sphere, but instead goes through it. In consideration of the above concepts, the shortest distance between two points on the manifold is given by the geodesic, indicated by a solid red line. Obtaining the shortest distance along the earth's surface suffers from the same issue: using the straight path through the earth is not an option (for obvious reasons). In a local vicinity around point $m_{1}$ it is sufficient to move on the tangent space $T_{m_{1}}(M)$ at point $m_{1}$ and project back to the manifold using the exponential map to calculate the shortest distance to point $m_{2}$. However, at larger distances, this may not be a valid approximation anymore.
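The sphere example can be made concrete with a few lines of Python. The sketch below assumes a unit sphere with the metric induced by the surrounding $\mathbb{R}^{3}$ and two hypothetical points $m_{1}=(1,0,0)$ and $m_{2}=(0,1,0)$. It compares the straight chord through the sphere with the geodesic distance, and approximates the length of two curves on the sphere by summing short segments; the curve `detour` is an arbitrary illustrative choice.

```python
import numpy as np


def chord_distance(m1, m2):
    """Straight-line distance in R^3 (leaves the sphere)."""
    return np.linalg.norm(m2 - m1)


def geodesic_distance(m1, m2):
    """Great-circle distance between two points on the unit sphere."""
    return np.arccos(np.clip(m1 @ m2, -1.0, 1.0))


def curve_length(gamma, n=2000):
    """Approximate the length of gamma: [0, 1] -> sphere by a polyline."""
    t = np.linspace(0.0, 1.0, n + 1)
    pts = np.array([gamma(ti) for ti in t])
    return np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1))


def great_circle(t):
    """Geodesic from m1 = (1, 0, 0) to m2 = (0, 1, 0)."""
    a = 0.5 * np.pi * t
    return np.array([np.cos(a), np.sin(a), 0.0])


def detour(t):
    """Another curve on the sphere with the same end points."""
    a = 0.5 * np.pi * t
    p = np.array([np.cos(a), np.sin(a), 0.5 * np.sin(np.pi * t)])
    return p / np.linalg.norm(p)


m1 = np.array([1.0, 0.0, 0.0])
m2 = np.array([0.0, 1.0, 0.0])
print("chord        :", chord_distance(m1, m2))
print("geodesic     :", geodesic_distance(m1, m2))
print("great circle :", curve_length(great_circle))
print("detour       :", curve_length(detour))
```

The chord is shorter than any path on the sphere but is not admissible, the great-circle arc reproduces the geodesic distance, and the detour is longer. This mirrors the definition of the distance as an infimum over curve lengths.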

Several difficulties arise when trying to transfer the previous concepts to infinite-dimensional manifolds. As described in [30], most Riemannian metrics are only weak, i.e. lack an invertible mapping between tangent and cotangent spaces, which is required for inner products (see footnote 2). Further, the geodesic may not exist or may not be unique, or the distance between two different elements of the infinite-dimensional manifold may be $0$ (the so-called vanishing geodesic distance phenomenon). Thus, even though a family of inner products is a Riemannian metric on a finite-dimensional differentiable manifold, it may not be a Riemannian metric on an infinite-dimensional manifold. Due to these challenges, infinite-dimensional manifolds as shape spaces are still the subject of ongoing research.

Footnote 2: We do not go into more detail about this issue; the interested reader is referred to [8] for more information on this topic.

Metrics for diffeological spaces have been researched to a lesser extent. However, most concepts can be transferred, and in [34] a Riemannian metric is defined for a diffeological space, which yields a Riemannian diffeological space. Additionally, the Riemannian gradient and a steepest descent method on diffeological spaces are defined, assuming a Riemannian metric is available. To enable the usage of diffeological spaces in an engineering context, further research is required in this field.

2.4 Riemannian shape gradients

The previous sections were kept relatively general and tried to explain the concept of manifolds and metrics on manifolds. Now we focus specifically on shape optimization based on Riemannian manifolds. Following [98], we introduce an objective functional which is dependent on a shape $\Gamma\in M$ (see footnote 3), where $M$ denotes the shape space, in this case a Riemannian manifold. In shape optimization, it is often also called shape functional and reads $J\colon M\rightarrow\mathbb{R},\,\Gamma\mapsto J(\Gamma)$. Furthermore, we denote the perturbation of the shape $\Gamma$ as $\Gamma_{t}=F_{t}(\Gamma)=\{F_{t}(\bm{x}):\bm{x}\in\Gamma\}$ with $t\geq 0$. The two most common approaches for $F_{t}$ are the velocity method and the perturbation of identity. The velocity method or speed method requires the solution of an initial value problem as described in [89], while the perturbation of identity is defined by $F_{t}(\bm{x})=\bm{x}+t\,\bm{v}^{\Gamma}(\bm{x})$, $\bm{x}\in\Gamma$, with a sufficiently smooth vector field $\bm{v}^{\Gamma}$ on $\Gamma$. We focus on the perturbation of identity for this publication. Recalling Sec. 2.1, a shape is described here as a plane curve in two-dimensional or as a surface in three-dimensional surrounding space, which means that shapes are always embedded in the hold-all domain $D$.

Footnote 3: We use the description of a shape as an element of the manifold and as a $(d-1)$-dimensional subset of the hold-all domain $D\subset\mathbb{R}^{d}$ interchangeably.

To minimize the shape functional, i.e. $\min_{\Gamma\in M}J(\Gamma)$ , we are interested in performing an optimization based on gradients. In general, the concept of a gradient can be generalized to Riemannian (shape) manifolds, but some differences between a standard gradient descent method and a gradient descent method on Riemannian manifolds exist. For comparison, we show a gradient descent method on $\mathbb{R}^{d}$ , $d\in\mathbb{N}$ and on Riemannian manifolds in Algorithms 2 and 1, respectively, for which we introduce the required elements in the following.

On Euclidean spaces, an analytic or numerical differentiation suffices to calculate gradients. In contrast, we consider a Riemannian manifold $(M,g)$ now, where the pushforward is required in order to determine the Riemannian (shape) gradient of $J$ . We use the definition of the pushforward from [57, p. 28] and [58, p. 56], which has been adapted to shape optimization in e.g. [29]. The pushforward $(J_{*})_{\Gamma}$ describes a mapping between the tangent spaces $T_{\Gamma}(M)$ and $T_{J(\Gamma)}(\mathbb{R})$ . Using the pushforward, the Riemannian (shape) gradient $\nabla J(\Gamma)$ of a (shape) differentiable function $J$ at $\Gamma\in M$ is then defined as

[TABLE]

Further details about the pushforward can be found in e.g. [46, 57].

As is apparent from the computation of the gradient in line 4 of Algorithm 1 (cf. Eq. (3)), the Riemannian shape gradient lives on the tangent space at $\Gamma$ , which (in contrast to the gradient for Euclidean space) is not directly compatible with the shape $\Gamma$ . A movement on this tangent space will lead to leaving the manifold, unless a projection back to the manifold is performed by means of a retraction as in line 10 of the algorithm and as previously described in Sec. 2.3.

In practical applications the pushforward is often replaced by the so-called shape derivative. A shape update direction $\bm{u}^{\Gamma}$ of a (shape) differentiable function $J$ at $\Gamma\in M$ is computed by solving

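$$g(\bm{u}^{\Gamma},\bm{v}^{\Gamma})=J^{\prime}(\Gamma)(\bm{v}^{\Gamma})\quad\forall\,\bm{v}^{\Gamma}\in T_{\Gamma}(M).\tag{4}$$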

The term $J^{\prime}(\Gamma)(\bm{v}^{\Gamma})$ describes the shape derivative of $J$ at $\Gamma$ in the direction of $\bm{v}^{\Gamma}$ . The shape derivative is defined by the so-called Eulerian derivative. The Eulerian derivative of a functional $J$ at $\Gamma$ in a sufficiently smooth direction $\bm{v}^{\Gamma}$ is given by

$$J^{\prime}(\Gamma)(\bm{v}^{\Gamma})=\lim_{t\to 0^{+}}\frac{J(\Gamma_{t})-J(\Gamma)}{t}.\tag{5}$$

If the Eulerian derivative exists for all directions $\bm{v}^{\Gamma}$ and if the mapping $\bm{v}^{\Gamma}\mapsto J^{\prime}(\Gamma)(\bm{v}^{\Gamma})$ is linear and continuous, then we call the expression $J^{\prime}(\Gamma)(\bm{v}^{\Gamma})$ the shape derivative of $J$ at $\Gamma$ in the direction $\bm{v}^{\Gamma}$ .

In general, a shape derivative depends only on the displacement of the shape $\Gamma$ in the direction of its local normal $\bm{n}$ such that it can be expressed as

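$$J^{\prime}(\Gamma)(\bm{v}^{\Gamma})=\int_{\Gamma}s\,\bm{v}^{\Gamma}\cdot\bm{n}\,\mathrm{d}\Gamma,\tag{6}$$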

the so-called Hadamard form or strong formulation, where $s$ is called sensitivity distribution here. The existence of such a scalar distribution $s$ is the outcome of the well-known Hadamard theorem, see e.g. [35, 89, 20]. It should be noted that a weak formulation of the shape derivative is derived as an intermediate result; however, in this publication only strong formulations as in Eq. (6) will be considered. (If the objective functional is defined over the surrounding domain, then the weak formulation is also an integral over the domain; if it is defined over $\Gamma$ , then the weak formulation is an integral over $\Gamma$ , however not in Hadamard form. Using the weak formulation reduces the analytical effort for the derivation of shape derivatives, but for an objective functional given as a domain integral it requires an integration over the surrounding domain instead of over $\Gamma$ . Further details as well as additional advantages and drawbacks can be found e.g. in [89, 81, 98, 99].)
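The relation between the Eulerian derivative and the Hadamard form can be checked numerically on a toy problem. The following sketch is not taken from the paper: it assumes a domain functional $J(\Omega)=\int_{\Omega}f\,\mathrm{d}\Omega$ with the hypothetical integrand $f(\bm{x})=|\bm{x}|^{2}-1$ on a disk of radius $r_{0}$, for which the standard formula gives the sensitivity $s=f|_{\Gamma}$. The perturbation direction is $\bm{v}^{\Gamma}=g(\varphi)\,\bm{n}$ with an arbitrarily chosen smooth function $g$; all function names are illustrative.

```python
import math

def J_polar(radius, n=2000):
    """J = integral of f = rho^2 - 1 over the star-shaped domain rho < radius(phi).

    The radial integral of f * rho is R^4/4 - R^2/2; the angular integral
    is approximated with the midpoint rule.
    """
    dphi = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        R = radius((k + 0.5) * dphi)
        total += (R**4 / 4.0 - R**2 / 2.0) * dphi
    return total

def eulerian_derivative_fd(r0, g, t=1e-6):
    """One-sided difference quotient (J(Gamma_t) - J(Gamma)) / t.

    Gamma_t is the perturbation of identity x + t * g(phi) * n of a circle,
    i.e. the curve with radius r0 + t * g(phi).
    """
    J0 = J_polar(lambda phi: r0)
    Jt = J_polar(lambda phi: r0 + t * g(phi))
    return (Jt - J0) / t

def hadamard_form(r0, g, n=2000):
    """Boundary integral of s * (v . n) over the circle, with s = r0^2 - 1."""
    s = r0**2 - 1.0
    dphi = 2.0 * math.pi / n
    return sum(s * g((k + 0.5) * dphi) * r0 * dphi for k in range(n))

def g(phi):
    # Hypothetical smooth normal perturbation.
    return 1.0 + 0.5 * math.cos(phi)

fd_value = eulerian_derivative_fd(1.5, g)
hadamard_value = hadamard_form(1.5, g)
print(fd_value, hadamard_value)
```

Both numbers should agree up to the truncation error of the difference quotient, which illustrates that only the normal component of the perturbation enters the derivative.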

2.5 Examples of shape spaces and their use for shape optimization

Now we shift our focus towards specific spaces which have been used as shape spaces, and metrics on these shape spaces. In this publication, we concentrate on the class of inner metrics, i.e. metrics defined on the shape itself, see Sec. 2.3.

The shape space $\mathcal{B}_{e}$

Among the most common is the shape space often denoted by $\mathcal{B}_{e}$ from [65]. We avoid a mathematical definition here and instead describe it as follows: The shape space $\mathcal{B}_{e}$ contains all shapes which stem from embeddings of the unit circle into the hold-all domain excluding reparametrizations. This space only contains infinitely-smooth shapes (see Fig. 1). It has been shown in [65] that this shape space is an infinite-dimensional Riemannian manifold, which means we can use the previously-described concepts to attain Riemannian shape gradients for the gradient descent algorithm in Algorithm 1 on $\mathcal{B}_{e}$ , but two open questions still have to be addressed: Which Riemannian metric can (or should) we choose as $g$ ? and Which method do we use to convert a direction on the tangent space into movement on the manifold? The latter question has been answered in [84, 29], where a possible retraction on $\mathcal{B}_{e}$ is described as

[TABLE]

i.e. all $\bm{x}\in\Gamma^{i}$ are displaced to $\bm{x}+\bm{v}^{\Gamma}(\bm{x})$ $\forall\bm{x}\in\Gamma^{i}$ . Due to its simplicity of application this is what will be used throughout this paper.

The former question is not so easily answered. Multiple types of Riemannian metrics could be chosen in order to compute the Riemannian shape gradient, each with its advantages and drawbacks. To introduce the three different classes of Riemannian metrics, we first introduce an option which does not represent a Riemannian metric on $\mathcal{B}_{e}$ .

As has been proven in [65], the standard $L^{2}$ metric on $T_{\Gamma}(\mathcal{B}_{e})$ defined as

[TABLE]

is not a Riemannian metric on $\mathcal{B}_{e}$ because it suffers from the vanishing geodesic distance phenomenon. This means that the whole theory for Riemannian manifolds cannot be used, i.e. it is not guaranteed that the computed “gradient” w.r.t. the $L^{2}$ metric is a steepest descent direction.

Based on the $L^{2}$ metric not being a Riemannian metric on $\mathcal{B}_{e}$ , alternative options have been proposed which do not suffer from the vanishing geodesic distance phenomenon. As described in [98], three groups of $L^{2}$ -metric-based Riemannian metrics can be identified.

1. Almost local metrics incorporate weights into the $L^{2}$ metric (cf. [7, 10, 66]).

2. Sobolev metrics incorporate derivatives into the $L^{2}$ metric (cf. [9, 66]).

3. Weighted Sobolev metrics incorporate both weights and derivatives into the $L^{2}$ metric (cf. [10]).

The first group of Riemannian metrics can be summarized as

[TABLE]

with an arbitrary function $\Phi$ . As described in [66], this function could be dependent e.g. on the length of the two-dimensional shape to varying degrees, the curvature of the shape, or both.

According to [66], the more common approach falls into the second group. In this group, higher derivatives are used to avoid the vanishing geodesic distance phenomenon. The so-called Sobolev metrics exist up to arbitrarily high order. Commonly used (cf. e.g. [82]) is the first-order Sobolev metric

[TABLE]

with the arc length derivative $\nabla_{\Gamma}$ and a metric parameter $A>0$ . An equivalent metric can be obtained by partial integration and reads

[TABLE]

where $\Delta_{\Gamma}$ represents the Laplace-Beltrami operator. Therefore, the first-order Sobolev metric is also sometimes called the Laplace-Beltrami approach.

The third group combines the previous two, thus a first-order weighted Sobolev metric is given by

[TABLE]

or equivalently,

[TABLE]

As already described in Algorithm 1, the solution of a PDE to obtain the Riemannian shape gradient cannot be avoided. In most cases, the PDE cannot be solved analytically. Instead, a discretization has to be used to numerically solve the PDE. However, the discretized domain $\Omega\subseteq D$ in which the shape $\Gamma$ is embedded will not move along with the shape itself, which causes a quick deterioration of the computational mesh. Therefore, the Riemannian shape gradient has to be extended into the surrounding domain. The Laplace equation $\Delta\bm{u}=\bm{0}$ is commonly used for this, with the Riemannian shape gradient as a Dirichlet boundary condition on $\Gamma$ . Then, we call $\bm{u}$ the extension of the Riemannian shape gradient into the domain $\Omega$ , i.e. $\bm{u}^{\Gamma}$ denotes the restriction of $\bm{u}$ to $\Gamma$ .

An alternative approach on $\mathcal{B}_{e}$ that avoids the use of Sobolev metrics has been introduced in [83] and is named Steklov-Poincaré approach, where one uses a member of the family of Steklov-Poincaré metrics $g_{s}(\cdot,\cdot)$ to calculate the shape update. The name stems from the Poincaré-Steklov operator, which is an operator to transform a Neumann- to a Dirichlet boundary condition. Its inverse is then used to transform the Dirichlet boundary condition on $\Gamma$ to a Neumann boundary condition. More specifically, the resulting Neumann boundary condition gives a deformation equivalent to a Dirichlet boundary condition. Let $V(\Omega)$ be an appropriate function space with an inner product defined on the domain $\Omega$ . Then, using the Neumann solution operator $E_{N}(\bm{u}^{\Gamma})=\bm{u}$ , where $\bm{u}$ is the solution of the variational problem $a(\bm{u},\bm{v})=\int_{\Gamma}\bm{u}^{\Gamma}\cdot\bm{v}^{\Gamma}\,\mathrm{d}\Gamma$ $\forall\bm{v}\in V(\Omega)$ , we can combine the Steklov-Poincaré metric $g_{s}$ , the shape derivative $J^{\prime}(\Gamma)(\bm{v})$ , and the symmetric and coercive bilinear form $a(\cdot,\cdot)$ defined on the domain $\Omega$ to determine the extension of the Riemannian shape gradient w.r.t. the Steklov-Poincaré metric into the domain, which we denote by $\bm{u}\in V(\Omega)$ , as

[TABLE]

For further details we refer the interested reader to [98]. Different choices for the bilinear form $a(\cdot,\cdot)$ yield different Steklov-Poincaré metrics, which motivates the expression of the family of Steklov-Poincaré metrics. Common choices for the bilinear form are

[TABLE]

where $\mathcal{D}$ could represent the material tensor of linear elasticity. The extension of the Riemannian shape gradient $\bm{u}$ w.r.t. the Steklov-Poincaré metric $g_{s}$ is directly obtained and can immediately be used to update the mesh in all of $\Omega$ , which avoids the solution of an additional PDE on $\Gamma$ . Additionally, the weak formulation of the shape derivative can be used in equation (13) to simplify the analytical derivation, as already described in Sec. 2.4.

The shape space $\mathcal{B}^{\frac{1}{2}}$

An alternative to the shape space $\mathcal{B}_{e}$ has been introduced in [99]. It is denoted as $\mathcal{B}^{\frac{1}{2}}(\Gamma^{0})$ and it is shown that this shape space is a diffeological space. This shape space contains all shapes which arise from admissible transformations of an initial shape $\Gamma^{0}$ , where $\Gamma^{0}$ is at least Lipschitz-continuous. This is a much weaker requirement on the smoothness of admissible shapes (compared to the infinitely-smooth shapes in $\mathcal{B}_{e}$ ). An overview of shapes with different smoothness has already been given in Fig. 1. In contrast to optimization on Riemannian manifolds, optimization on diffeological spaces is not yet a well-established topic. Therefore, the main objective for formulating optimization algorithms on a shape space, i.e. the generalization of concepts like the definition of a gradient, a distance measure and optimality conditions, is not yet reached for the novel space $\mathcal{B}^{\frac{1}{2}}(\Gamma^{0})$ . However, the necessary objects for the steepest descent method on a diffeological space are established and the corresponding algorithm is formulated in [34]. It is nevertheless worth mentioning that various numerical experiments, e.g. [81, 80, 87, 14], have shown that shape updates obtained from the Steklov-Poincaré metric can also be applied to problems involving non-smooth shapes. However, questions about the vanishing geodesic distance, a proper retraction and the dependency of the space on the initial shape $\Gamma^{0}$ remain open.

The largest-possible space of bi-Lipschitz transformations $W^{1,\infty}(\Omega,\mathbb{R}^{d})$

On finite-dimensional manifolds, the direction of steepest descent can be described by two equivalent formulations, see [1] (the source gives the direction of steepest ascent, but the direction of steepest descent is defined accordingly), and reads

[TABLE]

Instead of solving for the shape gradient $\nabla J(\Gamma)$ , another option to obtain a shape update direction is to solve the optimization problem on the right-hand side of equation (15), but this usually is prohibitively expensive. Introduced in [42] and applied in shape optimization in [19] as the $W^{1,\infty}$ approach, it is proposed to approximate the solution to the minimization problem (15) by solving

[TABLE]

while taking $p\to\infty$ with $p>2$ , see [26]. Due to the equivalence to the extension equation as described in [26, 42, 69] in weak formulation

[TABLE]

this PDE can be solved numerically with iteratively increasing $p$ . In a similar fashion to the Steklov-Poincaré approach, we can equate the weak form of the extension equation $a(\bm{u},\bm{v})$ to the shape derivative $J^{\prime}(\Gamma)(\bm{v}^{\Gamma})$ in strong or weak formulation to obtain the shape update direction. In [69], this approach is called the $p$ -harmonic descent approach. The Sobolev space for the extension of the shape update direction $W^{1,\infty}(\Omega,\mathbb{R}^{d})$ is motivated as the largest-possible space of bi-Lipschitz shape updates. However, it is not yet clear which additional assumptions are needed in order to guarantee that a Lipschitz shape update preserves Lipschitz continuity in this manner, see [99, Sec. 3.2] and [39, Sec. 4.1] for further details on this topic. Moreover, a theoretical investigation of the underlying shape space that results in shape update directions from the space $W^{1,\infty}(\Omega,\mathbb{R}^{d})$ is still required. Since neither a manifold structure has been established which would motivate the minimization over the tangent space in equation (15), nor has it been shown that $g_{s}$ is possibly a Riemannian metric for this manifold, it is not guaranteed that equation (13) yields a steepest descent direction in this scenario. (There is no inner product defined on $W^{1,p}(\Omega,\mathbb{R}^{d})$ unless $p=2$ , and $a(\bm{u},\bm{v})$ does not fulfill the condition of linearity in the arguments unless $p=2$ to classify as a bilinear form. A bilinear form is required for Eq. (13) to hold.)

If we assume $W^{1,\infty}(\Omega,\mathbb{R}^{d})$ to be the largest possible space for $\bm{u}$ that yields shape updates conserving Lipschitz continuity, then only $W^{1,\infty}(\Omega,\mathbb{R}^{d})$ itself or subspaces of $W^{1,\infty}(\Omega,\mathbb{R}^{d})$ yield shape updates conserving Lipschitz continuity. For example, when working with the Sobolev metrics of higher order and an extension which does not lose regularity, one needs to choose the order $p$ high enough such that the corresponding solution from the Hilbert space $H^{p}(\Omega,\mathbb{R}^{d})$ is also an element of $W^{1,\infty}(\Omega,\mathbb{R}^{d})$ . The Sobolev embedding theorem yields that this is only the case for $p\geq\frac{d}{2}+1$ . Therefore, one would need to choose at least $p=2$ in two dimensions and $p=3$ in three dimensions. However, this requirement is usually not fulfilled in practice due to the demanding requirement of solving nonlinear PDEs for the shape update direction. Further, already the shape gradient w.r.t. the first-order Sobolev metric is sufficient to meet the above requirement under certain conditions, as described in [98, Sec. 2.2.2].
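As a small worked check of the arithmetic above, the following sketch evaluates the smallest integer Sobolev order that satisfies the bound $p\geq\frac{d}{2}+1$ exactly as stated in the text. The function name is illustrative and not part of the original work.

```python
import math

def minimal_sobolev_order(d):
    """Smallest integer p with p >= d/2 + 1, the bound quoted in the text."""
    return math.ceil(d / 2 + 1)

for d in (2, 3):
    print(d, minimal_sobolev_order(d))
```

For $d=2$ the bound is attained exactly, while for $d=3$ the value $2.5$ is rounded up to the next integer order.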

After introducing the necessary concepts to formulate shape updates from a theoretical perspective, we will now reiterate these concepts in the next section with a focus on applicability.

3 Parameter-free shape optimization in engineering

In an engineering application, the shape $\Gamma$ to be optimized may be associated with a computational domain $\Omega$ in different ways as illustrated in Fig. 3. Independently of this setting the main goal of an optimization algorithm is not only to compute updated shapes $\Gamma^{i+1}$ from a given shape $\Gamma^{i}$ such that $J(\Gamma^{i+1})<J(\Gamma^{i})$ but also to compute updated domains $\Omega^{i+1}$ that preserve the quality of a given discretization of $\Omega^{i}$ . Similar to the updated shape according to the perturbation of identity, the updated domain is computed as

[TABLE]

which is applied in a discrete sense, e.g. by a corresponding displacement of all nodes by $\alpha\,\bm{\theta}$ . Summarizing the elaborations in the previous section, a gradient descent algorithm that achieves a desired reduction of the objective function involves four steps that compute

1. the objective function $J(\Gamma^{i})$ and its shape derivative $J^{\prime}(\Gamma^{i})(\bm{v}^{\Gamma})$ ,

2. the shape update direction $\bm{\theta}^{\Gamma}$ (the negative shape gradient $-\bm{u}^{\Gamma}$ ),

3. the domain update direction $\bm{\theta}$ (the extension of the negative shape gradient $-\bm{u}$ ),

4. a step size $\alpha$ and an updated domain $\Omega^{i+1}$ .

We introduce $\bm{\theta}^{\Gamma}$ and $\bm{\theta}$ here in a general way as shape update direction and domain update direction, respectively, because not all approaches yield an actual shape gradient according to its definition in Eq. (4). In the remainder of this section, we focus on Steps 2–4, starting with a description of several approaches to compute $\bm{\theta}^{\Gamma}$ in a simplified way that allows for a direct application. Some approaches combine Steps 2 and 3 and directly yield the domain update direction $\bm{\theta}$ . For all other approaches, the extension is computed separately as explained at the end of this section, which includes an explanation of the step size control.

We do not give details about Step 1 (the computation of the shape derivative $J^{\prime}(\Gamma^{i})$ ) and refer to the literature cited in Sec. 1 about the derivation of adjoint problems in order to compute $J^{\prime}(\Gamma)$ in an efficient way independently of the number of design variables. However, we assume that the objective function is given as

[TABLE]

which is the case for all problems considered in this work and arises in many engineering applications as well. Further, we assume that the shape derivative is given in the strong formulation (see Eq. (6)). The main input for Step 2 is accordingly the sensitivity distribution $s$ .

3.1 Shape and domain update approaches

Before collecting several approaches for the computation of a shape update direction $\bm{\theta}^{\Gamma}$ from a sensitivity $s$ we would like to give some general remarks about why the computed directions are reasonable candidates for a shape update that yields a reduction of $J$ . To this end, the definition of the shape derivative in Eq. (5) can be used to obtain a first-order approximation

$$J(\Gamma^{i+1})\approx J(\Gamma^{i})+\alpha\,J^{\prime}(\Gamma^{i})(\bm{\theta}^{\Gamma}).\tag{20}$$

Using the expression of the shape derivative from Eq. (6) and setting $\bm{\theta}^{\Gamma}=-\bm{n}\,s$ , one obtains

$$J(\Gamma^{i+1})\approx J(\Gamma^{i})-\alpha\int_{\Gamma^{i}}s^{2}\,\mathrm{d}\Gamma,\tag{21}$$

which formally shows that a decrease of the objective function can be expected at least for small $\alpha$ . However, several problems arise when trying to use $\bm{\theta}^{\Gamma}=-\bm{n}\,s$ in practice and in theory, when used for further mathematical investigations as detailed in Sec. 2. An obvious practical problem is that neither $\bm{n}$ nor $s$ can be assumed to be smooth enough such that their product and the subsequent extension result in a valid displacement field $\bm{\theta}$ that can be applied according to Eq. (18). All approaches considered here overcome this problem by providing a shape update direction $\bm{\theta}^{\Gamma}$ , which is smoother than $\bm{n}\,s$ . Several approaches make use of the Riemannian shape gradient $\bm{u}^{\Gamma}$ as defined in Eq. (4) for this purpose. A corresponding first-order approximation reads

$$J(\Gamma^{i+1})\approx J(\Gamma^{i})+\alpha\,g(\bm{u}^{\Gamma},\bm{\theta}^{\Gamma}).\tag{22}$$

Setting $\bm{\theta}^{\Gamma}=-\bm{u}^{\Gamma}$ , one obtains

$$J(\Gamma^{i+1})\approx J(\Gamma^{i})-\alpha\,g(\bm{u}^{\Gamma},\bm{u}^{\Gamma}),\tag{23}$$

which shows that these approaches also yield a decrease in the objective function provided that $\alpha$ is small.
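The first-order argument can be illustrated with a closed-form toy problem. The sketch below is an assumption-laden illustration, not one of the paper's test cases: the objective is $J(\Omega)=\int_{\Omega}(|\bm{x}|^{2}-1)\,\mathrm{d}\Omega$ on a disk of radius $r$, so that $s=r^{2}-1$ on the boundary and the update $\bm{\theta}^{\Gamma}=-\bm{n}\,s$ simply changes the radius. The minimizer is the unit disk.

```python
import math

def J(r):
    """Integral of f = |x|^2 - 1 over a disk of radius r (closed form)."""
    return 2.0 * math.pi * (r**4 / 4.0 - r**2 / 2.0)

def descent_step(r, alpha):
    """One update with theta = -n * s and the first-order prediction of J."""
    s = r**2 - 1.0                      # sensitivity on the circle
    r_new = r - alpha * s               # every boundary point moves by -alpha * s * n
    boundary_integral = 2.0 * math.pi * r * s**2   # integral of s^2 over the circle
    predicted = J(r) - alpha * boundary_integral
    return r_new, predicted

r, alpha = 1.5, 0.01
r_new, predicted = descent_step(r, alpha)
print(J(r), J(r_new), predicted)
```

For this small step the objective decreases and the actual value stays close to the linear prediction; the remaining gap is of second order in $\alpha$.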

3.1.1 Discrete filtering approaches

Several authors successfully apply discrete filtering techniques to obtain a smooth shape update, see e.g. [91, 16, 48]. As the name suggests, they are formulated based on the underlying discretization, e.g. on the nodes or points $\bm{x}_{n}$ on $\Gamma$ and the sensitivity at these points $s_{n}=s(\bm{x}_{n})$ . The shape update direction at the nodes, i.e. the direction of the displacement to be applied there, is computed by

[TABLE]

Therein, $w_{n,j}$ denotes the weight and $N_{n}$ is the set of indices of nodes in the neighborhood of node $n$ . We introduce a particular choice for the neighborhoods $N_{n}$ and the weights $w_{n,j}$ in Sec. 4 and denote it as the Filtered Sensitivity (FS) approach.

The discrete nature of a filter according to Eq. (24) requires the computation of a normal vector $\bm{n}_{n}$ at the nodal positions. Since $\bm{n}(\bm{x}_{n})$ is not defined, a special heuristic computation rule must be applied. In the example considered in Sec. 4, the nodes on $\Gamma$ are connected by linear edges, and we compute the normal vector $\bm{n}_{n}$ as the average of normal vectors $\bm{n}^{e_{1}}$ and $\bm{n}^{e_{2}}$ of the two adjacent edges,

[TABLE]

An analogous computation rule is established for the three-dimensional problem considered in Sec. 5. In this discrete setting, it also becomes possible to directly use the sensitivity and the normal vector as a shape update direction, even for non-smooth geometries. It is just a special case of (24) using a neighborhood $N_{n}=\left\{n\right\}$ and weight $w_{n,n}=1$ , which results in $\bm{\theta}^{\Gamma}_{n}=-\bm{n}_{n}\,s_{n}$ . The resulting approach is denoted here as the direct sensitivity (DS) approach.
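A minimal sketch of such a discrete filter on a closed polygon may help to fix ideas. It rests on assumptions that are not confirmed by the text: the filter is taken as $\bm{\theta}^{\Gamma}_{n}=-\bm{n}_{n}\sum_{j\in N_{n}}w_{n,j}\,s_{j}$, the averaged nodal normal is renormalized to unit length, the nodes are ordered counter-clockwise, and the weights are given per index offset. All function names are hypothetical.

```python
import math

def edge_normals(points):
    """Outward unit normals of the edges of a closed counter-clockwise polygon."""
    count = len(points)
    normals = []
    for k in range(count):
        (x0, y0), (x1, y1) = points[k], points[(k + 1) % count]
        tx, ty = x1 - x0, y1 - y0
        length = math.hypot(tx, ty)
        normals.append((ty / length, -tx / length))
    return normals

def node_normals(points):
    """Nodal normals as the normalized average of the two adjacent edge normals."""
    en = edge_normals(points)
    result = []
    for k in range(len(points)):
        ax, ay = en[k - 1]          # edge ending at node k
        bx, by = en[k]              # edge starting at node k
        sx, sy = ax + bx, ay + by
        norm = math.hypot(sx, sy)
        result.append((sx / norm, sy / norm))
    return result

def filtered_update(points, s, weights):
    """Shape update direction at the nodes.

    weights maps an index offset to a weight, e.g. {-1: 0.25, 0: 0.5, 1: 0.25}.
    The choice {0: 1.0} recovers the direct sensitivity (DS) update.
    """
    count = len(points)
    normals = node_normals(points)
    theta = []
    for n in range(count):
        s_filtered = sum(w * s[(n + off) % count] for off, w in weights.items())
        theta.append((-normals[n][0] * s_filtered, -normals[n][1] * s_filtered))
    return theta

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
theta_ds = filtered_update(square, [1.0, 1.0, 1.0, 1.0], {0: 1.0})
theta_fs = filtered_update(square, [4.0, 0.0, 0.0, 0.0], {-1: 0.25, 0: 0.5, 1: 0.25})
print(theta_ds[0], theta_fs[0])
```

In the filtered case, the isolated peak in the sensitivity at the first node is spread over its neighbors, while the DS variant uses the nodal values unchanged.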

We would like to emphasize that the corresponding choice in the continuous setting $\bm{\theta}^{\Gamma}=-\bm{n}\,s$ that led to Eq. (21) cannot be applied for the piece-wise linear shapes that arise when working with computational meshes – the normal vectors at the nodal points are simply not defined. The same problem arises for any shape update in normal direction. However, we include such methods in our study, because they are widely used in literature and can be successfully applied when combined with a special computation rule for the normal direction at singular points like Eq. (25). It is noted that having computed $\bm{\theta}^{\Gamma}$ according to the FS or DS approach one needs to extend it into the domain to obtain $\bm{\theta}$ as described in Sec. 3.2.

Finally, we would like to point out that in an application scenario, also the continuously-derived shape update directions eventually make use of a discrete update of nodal positions (Sec. 4) or cell centers (Sec. 5). Accordingly, all approaches – including those introduced in the following sections – finally undergo an additional discrete filtering.

3.1.2 Laplace-Beltrami approaches

A commonly applied shape update is based on the first-order Sobolev metric (see Eq. (10)), which yields as an identification problem for the shape gradient:

[TABLE]

We denote the constitutive parameter $A$ as conductivity here. A strong formulation involves the tangential Laplace-Beltrami operator $\Delta_{\Gamma}$ suggesting the name for this type of approach. Formulated as a boundary value problem it reads

[TABLE]

This auxiliary problem yields $\bm{u}^{\Gamma}$ on $\Gamma^{\mathrm{d}}$ , while on $\Gamma\setminus\Gamma^{\mathrm{d}}$ we set $\bm{u}^{\Gamma}=\bm{0}$ . Means to extend $\bm{\theta}^{\Gamma}=-\bm{u}^{\Gamma}$ into the domain to obtain $\bm{\theta}$ , respectively $\bm{u}$ , are described in Sec. 3.2. We denote this approach as Vector Laplace Beltrami (VLB) in the following. Due to the fact that $\Delta_{\Gamma}$ operates only in the tangential direction, the components of $s\,\bm{n}$ are mixed, such that $\bm{\theta}^{\Gamma}$ is not parallel to $\bm{n}$ , see [48, 91] for further details.

As an alternative, we consider a scalar variant of the VLB approach applied in [37] and call it Scalar Laplace Beltrami (SLB) in the following. A scalar field $\bar{u}$ is computed using the tangential Laplace Beltrami operator and the sensitivity $s$ as a right-hand side:

[TABLE]

As a shape update direction $\bm{\theta}^{\Gamma}=-\bar{u}\,\bm{n}$ is taken. As in the VLB case, some smoothness is gained in the sense that $\bar{u}$ is smoother than $s$ . However, this choice has the same shortcomings as any direction that is parallel to the normal direction. It is further noted that the discrete filtering approach from Sec. 3.1.1 is equivalent to a finite-difference approximation of the VLB method, if the weights in Eq. (24) are chosen according to the bell-shaped Gaussian function, see [16, 48].
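The smoothing effect of the scalar variant can be sketched on a closed curve. The example assumes that the SLB problem takes the form $\bar{u}-A\,\Delta_{\Gamma}\bar{u}=s$ on a closed curve without further boundary conditions, which is an assumption about the omitted equation. It discretizes the arc-length second derivative by central differences on equidistant nodes and uses Gauss-Seidel sweeps as a simple stand-in solver; the function name and the parameter values are illustrative.

```python
import math

def solve_slb(s, A, h, sweeps=500):
    """Gauss-Seidel solve of u_i - A * (u_{i-1} - 2 u_i + u_{i+1}) / h^2 = s_i.

    The nodes are equidistant (spacing h) on a closed curve, so the
    indices wrap around periodically.
    """
    n = len(s)
    c = A / h**2
    u = list(s)
    for _ in range(sweeps):
        for i in range(n):
            # u[i - 1] wraps to the last node for i = 0.
            u[i] = (s[i] + c * (u[i - 1] + u[(i + 1) % n])) / (1.0 + 2.0 * c)
    return u

# Unit circle with a smooth sensitivity plus a high-frequency perturbation.
n_nodes = 64
h = 2.0 * math.pi / n_nodes
phis = [k * h for k in range(n_nodes)]
s_noisy = [math.cos(2.0 * p) + 0.5 * math.cos(20.0 * p) for p in phis]
u_bar = solve_slb(s_noisy, A=0.1, h=h)
print(max(abs(v) for v in s_noisy), max(abs(v) for v in u_bar))
```

On the unit circle, a mode $\cos(k\varphi)$ is damped by the factor $1/(1+A\,k^{2})$ in the continuous problem, so the high-frequency part of $s$ is suppressed much more strongly than the smooth part.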

3.1.3 Steklov-Poincaré approaches

As mentioned in Sec. 2, these approaches combine the identification of $\bm{\theta}^{\Gamma}$ and the computation of its extension into the domain. This leads to an identification problem, similar to Eq. (26), however, now using a function space $V(\Omega)$ defined over the domain $\Omega$ and a bilinear form $a(\cdot,\cdot)$ on $\Omega$ instead of an inner product $g(\cdot,\cdot)$ on $\Gamma$ . Choosing the second bilinear form from Eq. (14), the identification problem for the shape gradient reads

[TABLE]

If $\mathcal{D}$ is chosen as the constitutive tensor of an isotropic material, Eq. (31) can be interpreted as a weak formulation of the balance of linear momentum. In this linear elasticity context, $s\,\bm{n}$ plays the role of a surface traction. Appropriately in this regard, the approach is also known as the traction method, see e.g. [6, 5]. To complete the formulation, the constitutive tensor is expressed as

[TABLE]

where $\mathcal{T}$ denotes the fourth-order tensor that yields the trace ( $\mathcal{T}\,\bm{A}=\mathrm{tr}\left(\bm{A}\right)\,\bm{I}$ ), $\mathcal{S}$ is the fourth-order tensor that yields the symmetric part ( $\mathcal{S}\,\bm{A}=\frac{1}{2}\left(\bm{A}+\bm{A}^{\mathrm{T}}\right)$ ) and $\lambda$ and $\mu$ are the Lamé constants. Suitable choices for these parameters are problem-dependent and are usually made such that the quality of the underlying mesh is preserved as well as possible. Through integration by parts, a strong formulation of the identification problem can be obtained that further needs to be equipped with Dirichlet boundary conditions to arrive at

[TABLE]

We will refer to this choice as Steklov-Poincaré structural mechanics (SP-SM) in the following. An advantage is the quality of the domain transformation that is brought along with it – a domain that is perturbed like an elastic solid with a surface load will likely preserve the quality of the elements that its discretization is made of. Of course, the displacement must be rather small, as no geometric or physical nonlinearities are considered. Further, the approach makes it possible to use weak formulations of the shape derivative as mentioned in Sec. 2.4. To this end, the integrand in the shape derivative can then be interpreted as a volume load in the elasticity context and applied as a right-hand side in (33).

Diverse alternatives exist that employ an effective simplification of the former. In [49] the spatial cross coupling introduced by the elasticity theory is neglected and a spatially varying scalar conductivity is introduced. The conductivity is identified with the inverse distance to the boundary such that

[TABLE]

where $\mathcal{I}$ denotes the fourth-order identity tensor and $w$ refers to the distance to the boundary. A small value $\varepsilon$ is introduced to circumvent singularities for points located on the wall. In the sequel, we denote this variant as Steklov-Poincaré wall distance (SP-WD). It is emphasized that it is now a diffusivity or heat transfer problem that is solved, instead of an elasticity problem. More precisely, $d$ decoupled diffusivity or heat transfer problems are solved – one for each component of $\bm{u}=\left[u_{1}\;u_{2}\;u_{3}\right]$ – since with (36) the PDE (33) reduces to

[TABLE]

For completeness, we would like to refer to an alternative from [70] that introduces a nonlinearity into the identification problem (31). Another choice for $\mathcal{D}$ employed in [80, 28] is $\mathcal{D}=2\,\mu\,\mathcal{S}$ , where $\mu$ is set to a user-defined maximum value on $\Gamma^{\mathrm{d}}$ and a minimum value on the remaining part of the boundary. Values inside $\Omega$ are computed as the solution of a Laplace equation such that the given boundary values are smoothly interpolated. However, we do not consider these choices in our investigations in Sections 4 and 5.

3.1.4 $p$ -harmonic descent approach

As introduced at the end of Sec. 2.5, the $p$ -harmonic descent approach (PHD) yields another identification problem for the domain update direction $\bm{\theta}^{*}$ as given in Eq. (17). A minor reformulation yields

[TABLE]

A strong form of the problem reads

[TABLE]

The domain update direction is then taken to be $\bm{\theta}=-\frac{1}{\alpha}\bm{u}$ . Due to the nonlinearity of (39) we have introduced the scaling parameter $\alpha$ here. In the scope of an optimization algorithm $\alpha$ represents a step size and may be determined by a step size control. All other approaches introduced above establish a linear relation between $s$ and $\bm{\theta}$ such that the scaling can be done independently of the solution of the auxiliary problem. For the PHD approach, Problem (39 – 41) may need to be solved repeatedly in order to find the desired step size.

The main practical advantage of this choice is the parameter $p$ , which allows one to get arbitrarily close to the case of bi-Lipschitz transformations $W^{1,\infty}(\Omega,\mathbb{R}^{d})$ . Sharp corners can therefore be resolved arbitrarily closely, as discussed in Sec. 2 and demonstrated in [69, 19]. Another positive aspect demonstrated therein is that the PHD approach yields comparably good mesh qualities. Like the SP approaches, the PHD approach further allows for a direct utilization of a weak formulation of the shape derivative.

3.2 Mesh morphing and step size control

Several methods are commonly applied to extend shape update directions $\bm{\theta}^{\Gamma}$ obtained from the approaches DS, FS, VLB, and SLB into the domain. For example, interpolation methods like radial basis functions may be used, see e.g. [37]. Another typical choice is the solution of a Laplace equation, with $\bm{\theta}$ as its state and $\bm{\theta}^{\Gamma}$ as a Dirichlet boundary condition on $\Gamma^{\mathrm{d}}$ for this purpose, see e.g. [61]. We follow a similar methodology and base our extension on the general PDE introduced for the Steklov-Poincaré approach. The boundary value problem to be solved when applied in this context reads

[TABLE]

As a constitutive relation, we choose again linear elasticity (see Eq. (32)) or component-wise heat transfer (see Eq. (36)). Once a deformation field is available in the entire domain, its discrete representation can be updated according to Eq. (18). It is recalled here that the domain update direction $\bm{\theta}$ can be computed independently of the step size $\alpha$ for all approaches except for the PHD approach, where it has a nonlinear dependence on $\alpha$ , see Sec. 3.1.4.

In order to compare different shape updates, we apply a step size control. We follow two different methods to obtain a suitable step size $\alpha$ for the optimization.

1. We perform a line search, where $\alpha$ is determined by a divide and conquer approach such that $J(\Omega^{i+1})$ is minimized. By construction, the algorithm approaches the optimal value from below and leads to the smallest $\alpha>0$ that yields such a local minimum. If the mesh quality falls below a certain threshold, the algorithm quits before a minimum is found and yields the largest $\alpha$ for which the mesh is still acceptable. For all considered examples and shape update directions, this involves repeated evaluations of $J$. For the PHD approach, it further involves repeated computations of $\bm{\theta}$.

2. We prescribe the maximum displacement for the first shape update $\theta^{\mathrm{max}}=\underset{\bm{x}\in\Omega_{0}}{\max}\|\alpha\,\bm{\theta}(\bm{x})\|$. This does not involve evaluations of $J$, but for the PHD approach it again involves repeated computations of $\bm{\theta}$. For all other methods, we simply set

$$\alpha=\frac{\theta^{\mathrm{max}}}{\underset{\bm{x}\in\Omega_{0}}{\max}\|\bm{\theta}(\bm{x})\|}.$$
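For the approaches with a linear relation between $s$ and $\bm{\theta}$, prescribing the maximum displacement amounts to a simple rescaling. A minimal sketch, assuming the update direction is stored as an array of nodal vectors (the function name is hypothetical):

```python
import numpy as np

def step_size_from_max_displacement(theta, theta_max):
    """Return alpha such that max_x ||alpha * theta(x)|| equals theta_max.

    theta     : (N, d) array of nodal update directions
    theta_max : prescribed maximum displacement
    """
    theta = np.asarray(theta, dtype=float)
    largest = np.linalg.norm(theta, axis=1).max()
    if largest == 0.0:
        raise ValueError("update direction is identically zero")
    return theta_max / largest
```

For the PHD approach this shortcut does not apply, since $\bm{\theta}$ itself depends nonlinearly on $\alpha$.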

Because we aim at a comparison of the different approaches to compute a shape update rather than an optimal efficiency of the steepest descent algorithm, we do not make use of advanced step size control strategies such as Armijo backtracking.
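The line search of the first method can be sketched as follows. The exact procedure used by the authors is not spelled out in the text, so the variant here (growing the step from below, then refining by ternary search, with bisection on a mesh-quality check) is an assumption; `J` and `mesh_ok` are hypothetical callables supplied by the user.

```python
def line_search(J, mesh_ok, alpha0=1e-3, growth=2.0, max_grow=60, n_refine=60):
    """Approach a local minimiser of J(alpha) from below.

    J       : callable, objective value after a step of size alpha
    mesh_ok : callable, True if the mesh deformed by alpha is acceptable
              (assumed to hold for alpha = 0 and to be monotone in alpha)
    Returns (alpha, stopped_by_mesh).
    """
    a_prev, a_lo, j_lo = 0.0, 0.0, J(0.0)
    a_hi = alpha0
    for _ in range(max_grow):
        if not mesh_ok(a_hi):
            # bisect for the largest acceptable step between a_lo and a_hi
            for _ in range(n_refine):
                mid = 0.5 * (a_lo + a_hi)
                if mesh_ok(mid):
                    a_lo = mid
                else:
                    a_hi = mid
            return a_lo, True
        j_hi = J(a_hi)
        if j_hi >= j_lo:
            break  # J stopped decreasing: a minimum lies in [a_prev, a_hi]
        a_prev, a_lo, j_lo = a_lo, a_hi, j_hi
        a_hi *= growth
    # ternary search on the bracket, assuming J is unimodal there
    left, right = a_prev, a_hi
    for _ in range(n_refine):
        m1 = left + (right - left) / 3.0
        m2 = right - (right - left) / 3.0
        if J(m1) <= J(m2):
            right = m2
        else:
            left = m1
    return 0.5 * (left + right), False
```

Each call to `J` stands for a full evaluation of the objective on the deformed domain, which is why this strategy becomes expensive for the CFD applications considered later.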

As mentioned in the previous section, the evaluation of the shape update direction depends on the application and the underlying numerical method. In particular, the evaluation of the normal vector $\bm{n}$ is a delicate issue that may determine whether or not a method is applicable. We include a detailed explanation of the methods used for this purpose in Sections 4 and 5.

4 Illustrative test case

In order to investigate the different shape and domain updates, we consider the following unconstrained optimization problem.

[TABLE]

where

[TABLE]

The graph of $f$ is shown in Fig. 4, including an indication of the curve, where $f=0$ , i.e. the level-set of $f$ . Since inside this curve, $f\leq 0$ and outside $f>0$ , the level-set is exactly the boundary of the minimizing domain. Through the term that is multiplied by $C_{1}$ , a singularity is introduced – if $C_{1}\neq 0$ , the optimal shape has two kinks, while it is smooth for $C_{1}=0$ . Through the term that is multiplied by $C_{2}$ , high-frequency content is introduced. Applying the standard formula for the shape derivative (see e.g. [2]), we obtain

[TABLE]

such that $s=f$ .

We start the optimization process from a smooth initial shape – a disc with outer radius $R=1$ and inner radius $r=0.3$ . The design boundary $\Gamma^{\mathrm{d}}$ corresponds to the outer boundary only, the center hole is fixed. This ensures the applicability of the SP-SM approach, which can only be applied as described if at least rigid body motions are prevented by Dirichlet boundary conditions. This requires $\Gamma\backslash\Gamma^{\mathrm{d}}\neq\emptyset$ in the corresponding auxiliary problem (Eqs. (33–34)).

We perform an iterative algorithm to solve the minimization problem by successively updating the shape (and the domain) using the various approaches introduced in Sec. 3. For a fair comparison of the different shape and domain updates the line search technique sketched in Section 3.2 is used to find the step size $\alpha$ that minimizes $J(\Gamma^{i+1})$ for a given $\bm{\theta}$ , i.e. the extension of $\bm{\theta^{\Gamma}}$ into the domain is taken into account when determining the step size $\alpha$ .

4.1 Discretization

We discretize the initial domain using a triangulation and in a first step keep this mesh throughout the optimization. In a second step, re-meshing is performed every third optimization iteration and additionally, whenever the line search method yields a step size smaller than $10^{-6}$. The boundary is accordingly discretized by lines (triangle edges). In order to practically apply the theoretically infeasible shape updates, which are parallel to the boundary normal field, the morphing of the mesh is done based on the nodes. A smoothed normal vector is obtained at all boundary nodes by averaging the normal vectors of the two adjacent edges. The sensitivity $s$ is evaluated at the nodes as well and then used in combination with the respective auxiliary problem to obtain the domain update direction $\bm{\theta}$, respectively $\bm{\theta}^{\Gamma}$ at the nodes. The evaluation of the integral in $J$ is based on values at the triangle centers.
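The averaging of edge normals at the boundary nodes can be sketched as follows for a closed polygonal boundary. The counter-clockwise node ordering and the function name are assumptions of this sketch; the published code is written in MATLAB and may differ in detail.

```python
import numpy as np

def nodal_normals(nodes):
    """Smoothed unit normals at the nodes of a closed polygon.

    nodes : (N, 2) array of boundary nodes, ordered counter-clockwise
            (assumed), so that the resulting normals point outward.
    """
    nodes = np.asarray(nodes, dtype=float)
    edges = np.roll(nodes, -1, axis=0) - nodes           # edge j: node j -> node j+1
    n_edge = np.column_stack((edges[:, 1], -edges[:, 0]))
    n_edge /= np.linalg.norm(n_edge, axis=1, keepdims=True)
    # average the normals of the two edges meeting at node j (edges j-1 and j)
    n_node = n_edge + np.roll(n_edge, 1, axis=0)
    n_node /= np.linalg.norm(n_node, axis=1, keepdims=True)
    return n_node
```

For a regular polygon inscribed in the unit circle, the averaged normals coincide with the radial directions, which provides a simple check.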

The auxiliary problems for the choices from Section 3.1.2 (VLB and SLB) are solved using finite differences. Given $\bm{u}^{\Gamma}$, the tangential divergence at a boundary node $j$ is approximated based on the adjacent boundary nodes by

[TABLE]

where $h_{j}=\|\bm{x}_{j}-\bm{x}_{j-1}\|$ denotes the distance between nodes $j$ and ${j-1}$ .
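To give an impression of how such a finite-difference discretization on the boundary can be used, the sketch below solves a scalar smoothing problem of Laplace-Beltrami type on a closed polygon. It assumes the form $g-A\,\Delta_{\Gamma}g=s$ with a scalar coefficient $A$ and a standard three-point stencil on non-uniform spacing; both are assumptions of this sketch and need not match the stencil or scaling used in this work.

```python
import numpy as np

def smooth_on_closed_curve(nodes, s, A):
    """Solve (I - A * Laplace-Beltrami) g = s on a closed polygon.

    nodes : (N, 2) array of boundary nodes (closed, N >= 3)
    s     : (N,) nodal sensitivity values
    A     : smoothing coefficient (A = 0 returns s unchanged)
    """
    nodes = np.asarray(nodes, dtype=float)
    s = np.asarray(s, dtype=float)
    n = len(nodes)
    h = np.linalg.norm(nodes - np.roll(nodes, 1, axis=0), axis=1)  # h[j] = |x_j - x_{j-1}|
    h_next = np.roll(h, -1)                                        # |x_{j+1} - x_j|
    M = np.eye(n)
    for j in range(n):
        w = 2.0 / (h[j] + h_next[j])
        jm, jp = (j - 1) % n, (j + 1) % n
        # three-point approximation of the second tangential derivative
        M[j, jm] -= A * w / h[j]
        M[j, jp] -= A * w / h_next[j]
        M[j, j] += A * w * (1.0 / h[j] + 1.0 / h_next[j])
    return np.linalg.solve(M, s)
```

Constant sensitivities pass through unchanged, while oscillatory components are damped more strongly the larger $A$ is chosen.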

The auxiliary problems for the choices from Sections 3.1.3 and 3.1.4 (SP-SM, SP-TM, and PHD) are solved with the finite element method. Isoparametric elements with linear shape functions based on the chosen triangulation are used. Dirichlet boundary conditions are prescribed by elimination of the corresponding degrees of freedom.
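Prescribing Dirichlet boundary conditions by elimination of degrees of freedom can be sketched as follows for an already assembled linear system. The dense matrix and the function name are assumptions chosen for brevity; a practical finite element code would use sparse storage.

```python
import numpy as np

def solve_with_dirichlet(K, f, fixed_dofs, fixed_values):
    """Solve K u = f with u[fixed_dofs] = fixed_values by elimination."""
    K = np.asarray(K, dtype=float)
    f = np.asarray(f, dtype=float)
    fixed_dofs = np.asarray(fixed_dofs, dtype=int)
    fixed_values = np.asarray(fixed_values, dtype=float)
    n = K.shape[0]
    free = np.setdiff1d(np.arange(n), fixed_dofs)
    u = np.zeros(n)
    u[fixed_dofs] = fixed_values
    # move the known values to the right-hand side and solve the reduced system
    rhs = f[free] - K[np.ix_(free, fixed_dofs)] @ fixed_values
    u[free] = np.linalg.solve(K[np.ix_(free, free)], rhs)
    return u
```

For a one-dimensional chain of linear elements with both end values prescribed, the routine returns the expected linear profile.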

The auxiliary problem (42-44) needed in combination with all choices from Section 3.1 that provide only $\bm{\theta}^{\Gamma}$ (DS, FS, VLB, SLB) is solved using the same finite-element method. All computations are done in MATLAB [62]. The code is available through http://collaborating.tuhh.de/M-10/radtke/soul.

4.2 Results

Figure 5 illustrates the optimization process with and without remeshing for a coarse discretization to give an overview. The mean edge length is set to $h=0.1$ for this case. In the following, a finer mesh with $h=0.05$ is used if not stated differently. Preliminary investigations based on a solution with $h=0.01$ show that the approximation error when evaluating $J$ drops below $10^{-6}$ then.

To begin with, we consider the smooth case without high frequency content, i.e. $C_{1}=0$ and $C_{2}=0$. Figure 6 (left) shows the convergence of $J$ over the optimization iterations for the different approaches to compute the shape update. For this particular example, the DS approach yields the fastest reduction of $J$, while the PHD approach yields the slowest. In order to ensure that the line search algorithm works correctly and does not terminate early due to mesh degeneration, a check was performed as shown in Figure 6 (right). The thin lines indicate the values of $J$ that correspond to steps with sizes from [math] to $2\,\alpha$. It can be seen that the line search iterations did not quit early but led to the optimal step size at all times.

The progression of the norm of the domain update direction and the step size is shown in Fig. 7. More precisely, we plot there the mean norm of the displacement of all nodes on the boundary, i.e.

[TABLE]

where $N^{\mathrm{n}}$ is the total number of nodes on the boundary. As expected, $G$ converges to a small value, which yields no practical shape updates anymore after a certain number of iterations.

4.2.1 Behavior under mesh refinement

While we have ensured that the considered discretizations are fine enough to accurately compute the cost functional in a preliminary step, the effect of mesh refinement on the computed optimal shape shall be looked at more closely. To this end, the scenario $C_{1}=0$ and $C_{2}=0$ considered so far does not yield new insight. All methods successfully converged to the same optimal shape as shown in Fig. 5 and the convergence behavior was indistinguishable from that shown in Fig. 6. This result was obtained with and without remeshing.

For the scenario $C_{1}=1$ and $C_{2}=0$ with sharp corners (see Fig. 4), different behaviors were observed. Figure 8 shows the convergence of the objective functional (left) and the final shapes obtained with the different shape updates (right). All shapes are approximately equal except in the region of the sharp corners on the $x$-axis close to $x=-1$ and on the $y$-axis close to $y=-1$.

Figure 9 shows a zoom into the region of the first sharp corner for the final shapes obtained with different mesh densities. It is observed that only the DS approach resolves the sharp corner while all other approaches yield smoother shapes. For further mesh refinements the obtained shapes were indistinguishable from those shown in Figure 9 (right).

Next we consider the scenario $C_{1}=0$ and $C_{2}=1$, which introduces high-frequency content into the optimal shape. The high-frequency content may be interpreted in three different ways when making an analogy to real world applications.

1. It may represent a numerical artifact, arising due to the discretization of the primal and the adjoint problem (we do not want to find it in the predicted optimal shape then).

2. It may represent physical fluctuations, e.g. due to a sensitivity that depends on a turbulent flow field (we do not want to find it in the predicted optimal shape then).

3. It may represent the actual and desired optimal shape (we want to find it in the predicted optimal shape).

With this being said, no judgement about the suitability of the different approaches can be made. Depending on the interpretation, a convergence to a shape that includes the high-frequency content can be desired or not.

Fig. 10 shows the shapes obtained with selected approaches when refining the mesh. The approaches FS, SLB and PHD were excluded because they yield qualitatively the same results as the SP-SM approach, i.e. convergence to a smooth shape without high frequency content. In order to illustrate the influence of the conductivity $A$ , three variants are considered for the VLB approach. For a large conductivity of $A=1$ , the obtained shape is even smoother than that obtained for the SP-SM approach, while $A=0.1$ (the value chosen so far in all studies) yields a similar shape. Reducing the conductivity to $A=0.01$ , the obtained shape is similar to that obtained for the DS approach, which does resolve the high frequency content.

4.2.2 Behavior for a non-smooth initial shape

Finally, we test the robustness of the different shape updates by starting the optimization process from a non-smooth initial shape. A corresponding mesh is shown in Fig. 11 (left). The convergence behavior in Fig. 11 (right) already indicates that not all approaches converged to the optimal shape. Instead, the DS and the SLB approach yield different shapes with a much higher value of the objective functional.

Figure 12 provides an explanation for the convergence behavior. After the first iteration, the DS and the SLB approach show a severe mesh distortion in those regions where the initial shape had a sharp corner (see Fig. 11 (left)). In order to prevent at least self-penetration of the triangular elements, the step sizes become very small for the following iterations and after 9 (for DS) or 8 (for SLB) iterations, no step sizes larger than $10^{-6}$ could be found that reduce the objective functional. Opposed to that, the FS and the SP approach yield shapes which are very close to the optimal shape. Still, the initial corners are visible also for these approaches, not only due to the distorted internal mesh but also as a remaining corner in the shape, which is more pronounced for the FS approach. The VLB and the PHD approach behave very similarly to the SP approach and are therefore not shown here.

We would like to emphasize that even if different approaches yield approximately the same optimal shape, the intermediate shapes, i.e. the path taken during the optimization, may be fundamentally different, as is apparent in Fig. 12. This is to be kept in mind especially when comparing the outcome of optimizations with different shape updates that had to be terminated early, e.g. due to mesh degeneration, which is the case for several of the studies presented in the next section.

5 Exemplary applications

In this section we showcase CFD-based shape optimization applications on a 2D and 3D geometry, while considering the introduced shape update approaches. Emphasis is given to practical aspects and restrictions that arise during an optimization procedure. The investigated applications refer to steady, laminar internal and external flows. The optimization problems are constrained by the Navier-Stokes (NS) equations of an incompressible, Newtonian fluid with density $\rho$ and dynamic viscosity $\mu$ , viz.

[TABLE]

where $\bm{u}$, $p$, $\bm{S}=1/2(\nabla\bm{u}+(\nabla\bm{u})^{\mathrm{T}})$ and $\bm{I}$ refer to the velocity, static pressure, strain-rate tensor and identity tensor, respectively. The adjoint state of (51)-(52) is defined by the adjoint fluid velocity $\hat{\bm{u}}$ and adjoint pressure $\hat{p}$ that follow from the solution of

[TABLE]

where $\hat{\bm{S}}=1/2(\nabla\hat{\bm{u}}+(\nabla\hat{\bm{u}})^{\mathrm{T}})$ refers to the adjoint strain-rate tensor.

The employed numerical procedure refers to an implicit, second-order accurate finite-volume method (FVM) using arbitrarily-shaped/structured polyhedral grids. The segregated algorithm uses a cell-centered, collocated storage arrangement for all transport properties, cf. [77]. The primal and adjoint pressure-velocity coupling, which has been extensively verified and validated [92, 47, 52, 50, 15], follows the SIMPLE method, and possible parallelization is realized using a domain decomposition approach [101, 102]. Convective fluxes for the primal [adjoint] momentum are approximated using the Quadratic Upwind [Downwind] Interpolation of Convective Kinematics (QUICK) [QDICK] scheme [92] and the self-adjoint diffusive fluxes follow a central difference approach.

The auxiliary problems of the various approaches to compute a shape update are solved numerically using the finite-volume strategies described in the previously-mentioned publications. Accordingly, $\bm{\theta}$ is computed at the cell centers $\bm{c}_{c}$ in a first step. In a second step, it needs to be mapped to the nodal positions $\bm{x}_{n}$ , which is done using an inverse distance weighting, also known as Shepard’s interpolation [86]. We use $\bm{\theta}_{n}$ to denote the value at a node

[TABLE]

Therein, $C_{n}$ contains the $N_{n}^{\mathrm{c}}$ indices of all adjacent cells at node $n$ . After the update of the grid, geometric quantities are recalculated for each FV. Topological relationships remain unaltered and the simulation continues by restarting from the previous optimization step to evaluate the new objective functional value. Due to the employed iterative optimization algorithm and comparably small step sizes, field solutions of two consecutive shapes are usually nearby. Compared to a simulation from scratch, a speedup in total computational time of about one order of magnitude is realistic for the considered applications.
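The mapping from cell centers to a node by inverse distance weighting can be sketched as follows. The weighting exponent (`power=1.0`) and the function name are assumptions of this sketch, since the exact weights are not restated here.

```python
import numpy as np

def shepard_to_node(x_n, centers, theta_c, power=1.0):
    """Inverse-distance-weighted value at node x_n from adjacent cell centers.

    x_n     : (d,) node position
    centers : (M, d) centers of the cells adjacent to the node
    theta_c : (M, d) update direction at those cell centers
    """
    x_n = np.asarray(x_n, dtype=float)
    centers = np.asarray(centers, dtype=float)
    theta_c = np.asarray(theta_c, dtype=float)
    d = np.linalg.norm(centers - x_n, axis=1)
    if np.any(d == 0.0):
        # node coincides with a cell center: take that value directly
        return theta_c[np.argmin(d)]
    w = d ** (-power)
    return (w[:, None] * theta_c).sum(axis=0) / w.sum()
```

Because every adjacent cell contributes to the nodal value, the interpolation acts as a mild smoothing of the cell-centered field.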

5.1 Two-dimensional flow around a cylinder

We consider a benchmark problem which refers to a fluid flow around a cylinder, as schematically depicted in Fig. 13(a). This application aims to minimize the flow-induced drag of the cylinder by optimizing parts of its shape. The objective $J(\Gamma)$ and its shape derivative read

[TABLE]

where $\bm{e}_{1}$ denotes the basis vector in the $x$ -direction (the main flow direction), see [52] for a more detailed explanation. Note that the objective is evaluated along the complete circular obstacle $\Gamma$ , but its shape derivative is evaluated only along the section under design $\Gamma^{\mathrm{d}}$ as shown in Fig. 13(a). The decision of optimizing a section of the obstacle’s shape instead of the complete shape is made to avoid trivial solutions such as, e.g. a singular point or a straight line without the need for applying additional geometric constraints.

The steady and laminar study is performed at $\mathrm{Re}_{D}=\rho\,U_{\mathrm{in}}\,D/\mu=20$, based on the cylinder’s diameter $D$ and the inflow velocity $U_{\mathrm{in}}$. The two-dimensional domain has a length and height of $40\,D$ and $20\,D$, respectively. At the inlet, velocity values are prescribed, slip walls are used along the top as well as bottom boundaries and a pressure value is set along the outlet.

To ensure the independence of the objective functional $J$ and its shape derivative $J^{\prime}$ in Eq. (56) w.r.t. the spatial discretization, a grid study is first conducted, as presented in Tab. 1. Since the monitored integral quantities do not show a significant change from refinement level 4 on, level 3 is employed for all following optimizations. A detail of the utilized structured numerical grid is displayed in Fig. 13 (b) and consists of approximately 19000 control volumes. The cylinder is discretized with 200 surface patches along its circumference.

In contrast to the theoretical framework, we now have to take into consideration further practical aspects in order to realize our numerical optimization process. A crucial aspect that needs to be taken into account in any CFD simulation is the quality of the employed numerical grid. As the optimization progresses, the grid is deformed on the fly rather than following a remeshing approach. Hence, we have to ensure that the quality of the mesh is preserved to such an extent that the numerical solution converges and produces reliable results. An intuitive method to ensure that grid quality is not heavily deteriorated is to restrict large deformations by using a small step size $\alpha$ .

In the numerical investigations of the 2D case, the step size remains constant through the optimization process and is determined by prescribing the maximum displacement in the first iteration ( $\theta^{\mathrm{max}}$ ) as described in Sec. 3.2. We set it to two percent of the diameter of the cylinder, i.e. $\theta^{\mathrm{max}}=0.02\,D$ , based on the experience of the authors on this particular case, cf. [53]. Further investigations in combination with the line search method are presented in Appendix A.

5.1.1 Results

The investigated approaches are DS, VLB with $A=0.1D$, VLB with $A=0.5D$, VLB with $A=D$, SP-WD and PHD. For all approaches that yield $\bm{\theta}^{\Gamma}$ only, the extension into the domain is done as described in Sec. 3.2 (see Eq. (42)) with a constitutive relation based on Eq. (36). Figure 14(a) shows the relative decrease of $J(\Gamma)$ w.r.t. the initial shape for all aforementioned approaches. As can be seen, the investigated domain expressions SP-WD and PHD reached a reduction greater than 9%, while the remaining boundary expressions fell short of this, with a maximum reduction of 8.2% for the DS approach. The same figure shows that none of the employed approaches reached a converged state with its applied constant step size. The reason behind this shortcoming is shown in Fig. 14(b), where the minimum orthogonality of the computational mesh is monitored during the optimization runs. In all cases, mesh quality deteriorates heavily during the final steps of the optimization algorithm, leading to unusable computational meshes. This is partially attributed to the selected section of design ($\Gamma^{\mathrm{d}}$) and the mesh update approach, as described by Eq. (55). A natural question that arises by virtue of Eq. (55) is what happens at nodes connecting a design and a non-design surface patch. To this end, we present Fig. 15, which shows the discretized rightmost connecting section of the cylinder between the aforementioned surfaces at the end of the optimization process of VLB with $A=0.1D$. As can be seen, a sharp artificial kink appears at the connection between design and non-design surfaces. This is due to the displacement of the connecting vertex, which is computed based on contributions of all adjacent surface patches, as illustrated in Fig. 15(b). Therefore, if our auxiliary problem results in shape updates that do not smoothly fade out to zero at the connection between a design and non-design boundary, a kink is bound to appear. The resulting significant deterioration of the surrounding mesh leads to a premature termination of the computational study due to divergence of the primal or adjoint solver. This behavior is noticed for all shape updates, but it appears earlier or later in the optimization run depending on the approach.

Furthermore, it is interesting to note that the shapes found by each metric differ significantly and the paths towards them as well. This is shown in Fig. 16. We note that SP-WD and PHD result in smoother solutions while shapes produced by the VLB approach become less and less smooth as $A$ decreases. Note that in the limit $A\to 0$, VLB is equivalent to DS (see Eq. (27)).

5.1.2 Step size control through line search

Similar to the illustrative test case of Sec. 4, we apply the line search technique described in Sec. 3.2 to find an optimal step size for the 2D cylinder application. Due to significant numerical effort needed to test different step sizes, we restrict our investigations to the SP-WD and DS approach. Figure 17 (a) shows the dependence of the objective functional $J(\Gamma^{i+1})$ on the step size for the first two optimization iterations. Contrary to the illustrative test case, we cannot reach a step size in which $J$ starts increasing. Instead, the line search ends early, due to a low mesh quality. In particular, we monitor the minimum mesh orthogonality and quit at a threshold of $45^{\circ}$ . This choice is confirmed by the results shown in Fig. 17 (b) where for most descent directions, a rapid deterioration of the mesh is noticed after $45^{\circ}$ .

This study highlights the significant numerical restrictions that one may face when considering CFD-based shape optimization studies. While preferably, we would like to employ the optimal step size for each descent direction, we are inevitably restricted by the quality of the employed mesh. To this extent, one may pose the question of what the optimal balance between an extensive mesh refinement, which implies increased computational effort, and a straightforward, experience-based choice of the step size is. An answer to such a question stems from the goal of the optimization at hand and the available computational resources of the user.

5.2 Three-dimensional flow through a double-bent duct

The second test case examines a more involved, three-dimensional, double-bent duct as shown in Fig. 18. The flow has a bulk Reynolds-number of $\mathrm{Re_{D}}=\rho UD/\mu=500$ where $U$ and $D$ refer to the bulk velocity as well as the inlet diameter, respectively. Along the inlet, a uniform velocity profile is imposed and a zero pressure value is prescribed at the outlet. The ducted geometry is optimized w.r.t. the total power loss, i.e.

[TABLE]

for which the corresponding shape derivative $J^{\prime}(\Gamma)(\bm{\theta})$ corresponds to that of the previous section, see Eq. (56). A detailed explanation of the adjoint problem including boundary conditions is provided in [92, 51].

Like for the two-dimensional flow, a grid study is first conducted, as presented in Tab. 2. In order to enable a computationally feasible study as well as ensure a reliable estimation of the objective, level 2 is employed for all cases presented hereafter. This corresponds to a structured numerical grid of 90000 control volumes. Three diameters downstream of the inlet, the curved area is free for design and discretized with 5600 surface elements and the numerical grid is refined towards the transition region between design and non-design wall as depicted in Fig. 19.

During the optimization of the 3D case, the step size remains constant through the process and is determined by prescribing the maximum displacement in the first iteration ( $\theta^{\mathrm{max}}$ ) as described in Sec. 3.2. We set it to one percent of the initial tube’s diameter, i.e. $\theta^{\mathrm{max}}=0.01D$ . The investigated shape and domain updates are DS, SLB with $A/D=1$ , VLB with $A/D=1$ , SP-WD and PHD with $p=4$ . Here $A$ is used in a similar context as in Sec. 5.1. All investigated shape updates are extended into the domain as in the two-dimensional case.

5.2.1 Results

Figure 20 (a) shows the relative decrease of $J(\Omega)$ w.r.t. the initial shape. A stopping criterion of the optimization runs is fulfilled when the relative change of the objective functional between two domain updates falls below $0.1\%$, i.e. when $|J_{i}-J_{i-1}|/|J_{i-1}|\cdot 100\%<0.1\%$.
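The stopping criterion can be written as a small helper. A minimal sketch, assuming that the magnitude of the relative change is meant (so that a decreasing objective does not trivially satisfy the test); the function name is hypothetical.

```python
def has_converged(J_prev, J_curr, tol_percent=0.1):
    """True if the relative change of J between two updates is below tol_percent."""
    rel_change = abs(J_curr - J_prev) / abs(J_prev) * 100.0
    return rel_change < tol_percent
```

With the default tolerance, a change from 1.0 to 0.9995 (0.05%) stops the run, whereas a change from 1.0 to 0.99 (1%) does not.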

The investigated boundary-based approaches SLB and VLB reached a reduction greater than 40%, which corresponds to the SP-WD gain. The PHD approach reduces the cost functional by $\approx 36\%$, which is still about 10 percentage points more than the DS approach, which terminated after 42 iterations. The reason for termination is the divergence of the primal solver due to insufficient mesh quality, as already described in the previous section. Note that solver settings, such as relaxation parameters, are the same for all simulations during all optimizations.

The degraded grid quality within the DS procedure can be anticipated from the representation of the shape update direction in Fig. 21 (a). Compared to the shape updates in (b) SLB, (c) SP-WD, and (d) PHD with $p=4$, a rough shape update field is apparent for the DS approach, especially in the straight region between the two tube bends. It is noted that the figure is based on the cell-centered finite-volume approximation, and the results have to be interpolated to the CV vertices using Eq. (55). This procedure results in a smoothing, which allows the numerical process to perform at least a few shape updates without immediate divergence of the solver. Compared to the DS approach, the shape update is significantly smoother for the SLB approach with a filter width of $A/D=1$, cf. Fig. 21 (b). Even smoother shape changes follow from the remaining approaches, with comparatively little difference in the respective deformation field between SP-WD and PHD in the region between the tube’s bends.

Perspective views of the final shapes obtained with the four different approaches are shown in Fig. 22. Again, it can be seen that the DS approach (a) results in local dents in the region between the bends, which is ultimately the reason for the divergence of the SIMPLE solver after a few iterations. On the other hand, shape updates of the SLB, SP-WD, and PHD approaches are all smooth but still noticeably different.

The results in Fig. 22 are consistent with the expectation that an increased volume should accompany a reduction in pressure drop. The fact that the different shape update approaches yield different final shapes can be alternatively observed by tracking the pipe’s volume. For this purpose, Fig. 20 (b) is presented, in which the relative volume changes (i.e., the sum of all FVs) over the number of shape changes are depicted for all approaches. The LB-based methods require about 55% relative volume increase to achieve roughly 43% relative cost functional reduction. On the other hand, the SP-WD approach converts relative volume change of approximately 40% almost directly into a relative objective decrease of also 40%. Only the PHD and DS approaches reduce the cost functional significantly more than the volume increase. Thus, the PHD [DS] approach gained about 36% [26%] relative objective decrease with about 25% [17%] relative volume increase.
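The relation between objective reduction and volume increase can be condensed into a single ratio per approach. The short sketch below uses only the approximate percentages quoted in the paragraph above; the ratio itself is an illustrative quantity introduced here, not one reported by the authors.

```python
# (relative objective decrease in %, relative volume increase in %), approximate
reported = {
    "LB-based": (43.0, 55.0),
    "SP-WD": (40.0, 40.0),
    "PHD": (36.0, 25.0),
    "DS": (26.0, 17.0),
}

# objective decrease gained per unit of volume increase
ratios = {name: dJ / dV for name, (dJ, dV) in reported.items()}
```

A ratio above one means that the cost functional is reduced by more than the volume is increased, which holds for the PHD and DS approaches only.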

Due to the increased computational effort required for this study compared to the two-dimensional example shown previously, it is interesting to compare the methods with respect to computation time. Such a comparison is given in Tab. 3, distinguishing between mean primal and mean adjoint computation time. For the underlying process, the mesh is adjusted before each primal simulation and thus, the averaged primal time consists of the time required to compute the shape update and the solution to the primal Navier-Stokes system (51)-(52). In all cases, the average adjoint simulation time is in the range of 0.1 CPUh. Interestingly, the values of the optimizations based on the Laplace-Beltrami approach are slightly below this value, while all others are slightly above it. Starting from an approximately similar simulation time of all primal NS approximations, a significant increase in computation time can be seen for the volume-based methods. Therein, the PHD approach is particularly costly, since the nonlinearity in (39) is resolved iteratively by means of a Picard linearization, which drastically increases the total simulation time.

5.3 Discussion

Overall, the numerical studies shown herein highlight how different shape updates on the same CFD-based optimization problem impact not only the steepness of the objective reduction curve but also the final shape. From a practical point of view, we identified mesh quality preservation to be the bottleneck of the applied approaches. Indeed, one can sustain a better mesh quality or even progress the optimization of non-converged runs by auxiliary techniques, such as remeshing or additional artificial smoothing; however, this goes beyond the scope of the paper. Furthermore, it is of interest to note that the computational cost for each shape update is not the same but rather increases with the complexity of the utilized shape update. Finally, based on the presented results, we would like to emphasize that the intention is not to enable a direct comparison of different shape updates with regard to performance in general. Rather, we would like to show how a range of practical shape updates may result in different shapes because typically, the optimization runs have to be stopped before an optimal shape is reached due to mesh distortion issues. Which shape update yields the largest reduction until the mesh becomes heavily deteriorated depends on the application. For example, by comparing the applications presented herein, one can notice that VLB performs much better in the double-bent pipe than in the cylinder case.

6 Summary and conclusion

We have explained six approaches to compute a shape update based on a given sensitivity distribution in the scope of an iterative optimization algorithm. To this end, we elaborated on the theory of shape spaces and Riemannian shape gradients from a mathematical perspective, before introducing the approaches from an engineering perspective. We included two variants of the well known Hilbertian approaches based on the Laplace-Beltrami operator that yield first order Sobolev gradients (SLB and VLB). For comparison, a discrete filtering technique and a direct application of the sensitivity was considered as well (FS and DS). Further, two alternative approaches that have not yet been extensively used for engineering applications were investigated (SP and PHD). They directly yield the domain update direction, such that an extra step that extends the shape update direction into the domain can be avoided.

Based on an illustrative example, the characteristic behavior of the approaches was shown. While the FS and the DS approach manage to find the optimal shape even in regions where it is not smooth or has a high curvature, the SP approaches yield shapes which differ in these regions. For the PHD, VLB and SLB approaches, the parameters $p$ and $A$ can be used to regulate the smoothness of the obtained shape. Due to the possibility of remeshing for the comparably simple problem, mesh quality was not an issue.

Regarding the simulations of the CFD problems, for which remeshing was not realized, the decrease in mesh quality became a severe issue preventing the optimization algorithm from convergence. For the two-dimensional case, the PHD approach yielded the steepest decrease in the objective functional, however, the smallest objective functional value was obtained using the SP method, which managed to preserve a reasonable mesh quality for more iterations than all other approaches. For the three-dimensional case, the VLB and SLB approaches outperformed all other approaches in terms of steepest decrease of the objective functional as well as the smallest value that could be achieved before the mesh quality became critical.

In conclusion, we have observed that the behavior of the approaches is strongly connected to the considered problem. We suggest using the SP approach as a first choice, as it is computationally less involved than the PHD approach and, unlike the SLB and VLB approaches, does not require an extension of the shape update into the domain in a second step. The performance of the latter two should nevertheless be compared for a given application scenario: despite the extension in a separate step, their overall computational cost may still be lower than that of the SP approach due to a steeper descent. Finally, we suggest not using the DS approach, since it was weaker than all other approaches in terms of mesh quality, irrespective of the problem.

Acknowledgment

This work is part of the research training group ‘Simulation-Based Design Optimization of Dynamic Systems Under Uncertainties’ (SENSUS) funded by the state of Hamburg within the Landesforschungsförderung under project number LFF-GK11.

The authors gratefully acknowledge the computing time made available to them on the high-performance computers Lise and Emmy at the NHR centers ZIB and Göttingen. These centers are jointly supported by the Federal Ministry of Education and Research and the state governments participating in the NHR (www.nhr-verein.de/unsere-partner).

Contribution

Lars Radtke: Conceptualization, Software, Formal analysis, Investigation, Writing – Original Draft (Sec. 1, 3, 4, 6), Writing – Review & Editing, Visualization, Project administration. Georgios Bletsos, Niklas Kühl: Conceptualization, Software, Formal analysis, Investigation, Writing – Original Draft (Sec. 5), Writing – Review & Editing, Visualization. Tim Suchan: Conceptualization, Formal analysis, Writing – Original Draft (Sec. 2), Writing – Review & Editing. Kathrin Welker: Conceptualization, Formal analysis, Writing – Review & Editing, Supervision, Project administration, Funding acquisition. Thomas Rung, Alexander Düster: Writing – Review & Editing, Supervision, Project administration, Funding acquisition.

