A mesh-free method for interface problems using the deep learning approach
Zhongjian Wang, Zhiwen Zhang

Department of Mathematics, The University of Hong Kong, Pokfulam Road, Hong Kong SAR, China.
Abstract
In this paper, we propose a mesh-free method to solve interface problems using the deep learning approach. Two interface problems are considered. The first is an elliptic PDE with a discontinuous and high-contrast coefficient, while the second is a linear elasticity equation with a discontinuous stress tensor. In both cases, we formulate the PDEs as variational problems, which can be solved via the deep learning approach. To deal with inhomogeneous boundary conditions, we use a shallow neural network to approximate the boundary conditions. Unlike approaches that rely on adaptive mesh refinement, specially designed basis functions, or special numerical schemes to compute the PDE solutions, the proposed method is mesh-free and easy to implement. Finally, we present numerical results to demonstrate the accuracy and efficiency of the proposed method for interface problems.
AMS subject classification: 35J20, 35R05, 65N30, 68T99, 74B05.
Keywords: Deep learning; variational problems; mesh-free method; linear elasticity; high-contrast; interface problems.
1 Introduction
In recent years, deep learning methods have achieved unprecedented success in various application fields, including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, and bioinformatics, where they have produced results comparable to, and in some cases superior to, those of human experts [17, 12]. Motivated by this exciting progress, there is growing research interest in applying deep learning methods to scientific computation, including approximating multivariate functions and solving differential equations using deep neural networks; see [13, 20, 27, 28, 15, 32] and references therein.
In [13], the authors investigate the relationship between deep neural networks with the rectified linear unit (ReLU) activation function and continuous piecewise linear functions in the finite element method (FEM). A new error bound for the approximation of multivariate functions using deep ReLU networks is presented in [20], which shows that the curse of dimensionality is lessened by establishing a connection between deep networks and sparse grids. In [28], the authors solve Poisson problems and eigenvalue problems in the context of the Ritz method by representing the trial functions with deep neural networks. Meanwhile, in [27], the authors propose deep learning-based numerical methods for solving high-dimensional parabolic partial differential equations and backward stochastic differential equations. In [15], a neural network is proposed to learn the physical quantity of interest as a function of random input coefficients, and the accuracy and efficiency of the approach for solving parametric PDE problems are demonstrated. In [32], the authors propose a Bayesian approach to develop deep convolutional encoder-decoder networks, which give surrogate models for uncertainty quantification and propagation in problems governed by stochastic PDEs. In [26], the authors design multi-layer neural network architectures for multiscale simulations of flows that take into account the observed data and physical modeling concepts. In [24], the authors estimate the expressive power of a class of deep neural networks on a class of countably-parametric maps, which arise as response surfaces of parametric PDEs with distributed uncertain inputs.
In this paper, we investigate the deep learning approach to solving interface problems, which have many applications in the physical and engineering sciences. For example, to model a heterogeneous porous medium in reservoir simulation, the permeability field is often assumed to be a multiscale function with high-contrast and discontinuous features. Another example is the study of the evolution of the shape and location of fibroblast cells under stress [31]. The model is based on ideas of a continuum mechanical description of stress-induced phase transitions, where the cell is modeled as a transformed inclusion in a linear elastic matrix and the cell surface evolves according to a special kinetic relation. In this model, the stress tensor is discontinuous across the cell surface due to the transformation in the strain tensor caused by contraction in the cell.
There has been a lot of effort in developing accurate and efficient finite element methods (FEMs) for interface problems. In [19, 11], Li et al. developed the immersed-interface finite element method to solve elliptic interface problems with non-homogeneous jump conditions. Their method considers uniform triangular grids and approximates the interface by a straight line segment where it intersects a coarse grid element. By matching the jump condition, they created special basis functions for the elements cut by the interface and proved a second-order convergence rate in the $L^2$ norm and a first-order convergence rate in the $H^1$ semi-norm. However, the constants in their error estimate depend on the contrast of the coefficient. In [6], Hou et al. developed a new multiscale finite element method that accurately captures solutions of elliptic interface problems with high-contrast coefficients using only coarse quasi-uniform meshes, without resolving the interfaces. Moreover, they provided an optimal error estimate in the sense that the hidden constants in the estimates are independent of the contrast of the PDE coefficients. Much earlier, Babuška [2] studied the convergence of methods based on a minimization problem equivalent to elliptic PDEs with discontinuous coefficients, in which the boundary and jump conditions were incorporated in the cost functions. In [5], Chen and Zou approximated the smooth interface by a polygon and used classical finite element methods to solve both elliptic and parabolic interface equations, where the mesh must align with the interface.
Alternatively, some efficient finite difference methods (FDMs) have been proposed to solve interface problems. In [21], Peskin developed the immersed boundary method (IBM) to study the motion of one or more massless, elastic surfaces immersed in an incompressible, viscous fluid, particularly in bio-fluid dynamics problems where complex geometries and immersed elastic membranes are present. The IBM employs a uniform Eulerian grid over the entire domain to describe the velocity field of the fluid and a Lagrangian description for the immersed elastic structure. We refer to [22] for an extensive review of this method and its various applications. Another related work is the immersed interface method (IIM) for elliptic interface problems developed by LeVeque and Li [18]. By incorporating the jump condition across the interface to modify the finite difference approximation near the interface, second-order accuracy is maintained. An important development among interface capturing methods is the ghost fluid method (GFM) developed by Osher et al. [10], which incorporates the interface jump condition into the finite difference discretization by tracking the interface with a level set function. The GFM has been applied to capture discontinuities in multi-medium compressible multiphase flows.
In this paper, we are interested in developing numerical methods to solve interface problems in a mesh-free manner. Our work is inspired by the deep Ritz method proposed in [28], where Poisson problems and eigenvalue problems were studied. We intend to investigate the expressive power of deep neural networks in representing solutions of interface problems. Two typical interface problems are considered. The first is an elliptic PDE with a discontinuous and high-contrast coefficient, which is a challenging problem and has been intensively studied; see [3, 19, 6, efendiev2011multiscale]. The second is a linear elasticity equation with a discontinuous stress tensor [31].
In both problems, we formulate the PDEs as variational problems, which can be solved using the deep learning approach. Then, we use the stochastic gradient descent (SGD) method to solve the variational problem. To impose inhomogeneous boundary conditions, we propose to use a shallow neural network to approximate the boundary conditions. We find that the proposed method is easy to implement and mesh-free, since we do not need to choose an adaptive mesh to discretize the PDEs. Our numerical results show that the proposed method can efficiently solve the interface problems. Moreover, we observe that the convergence time of the SGD method is random, which may be due to the fact that the iteration process of the SGD method can get stuck in local minima. In particular, we find that it takes longer to escape local minima in a ‘harder’ case of the high-contrast problem; see Section 5.1 for more details.
The rest of the paper is organized as follows. In Section 2, we review the basic ideas of deep neural networks and the deep Ritz method. In Section 3, we discuss how to impose inhomogeneous boundary conditions. In Section 4, we derive the formulation of the deep learning method for solving interface problems and discuss issues regarding its implementation. In Section 5, we present numerical results to demonstrate the accuracy of our method. Concluding remarks are made in Section 6.
2 Some preliminaries
In this section, we briefly discuss the definition and properties of the deep neural network (DNN), including its approximation property and then the formulation of the deep Ritz method [28].
2.1 The DNN and its approximation property
There are two ingredients in defining a DNN. The first one is a (vector) linear function of the form $T:\mathbb{R}^n\to\mathbb{R}^m$, defined as $T(x)=Ax+b$, where $A=(a_{ij})\in\mathbb{R}^{m\times n}$, $x\in\mathbb{R}^n$, and $b\in\mathbb{R}^m$. The second one is a nonlinear activation function $\sigma:\mathbb{R}\to\mathbb{R}$. A frequently used activation function, known as the rectified linear unit (ReLU), is defined as $\sigma(x)=\max(0,x)$ [17]. In the artificial neural network literature, the Sigmoid function is another frequently used activation function, which is defined as $\sigma(x)=(1+e^{-x})^{-1}$. By applying the activation function in an element-wise manner, one can define a (vector) activation function $\sigma:\mathbb{R}^m\to\mathbb{R}^m$.
Equipped with those definitions, we are able to define a continuous function $F(x)$ by a composition of linear transforms and activation functions, i.e.,
$$F(x)=T^{k}\circ\sigma\circ T^{k-1}\circ\sigma\circ\cdots\circ T^{1}\circ\sigma\circ T^{0}(x), \tag{1}$$
where $T^{i}(x)=A_ix+b_i$, with $A_i$ undetermined matrices and $b_i$ undetermined vectors, and $\sigma(\cdot)$ is the element-wise activation function. The dimensions of $A_i$ and $b_i$ are chosen to make (1) meaningful. Such a DNN is called a $(k+1)$-layer DNN, which has $k$ hidden layers. We denote all the undetermined coefficients (e.g., $A_i$ and $b_i$) in (1) by $\theta\in\Theta$, where $\theta$ is a high-dimensional vector and $\Theta$ is the space of $\theta$. The DNN representation of a continuous function can then be viewed as
$$F=F(x;\theta). \tag{2}$$
Let $\mathbb{F}=\{F(\cdot;\theta)\,|\,\theta\in\Theta\}$ denote the set of all functions expressible by the DNN parametrized by $\theta\in\Theta$. Then $\mathbb{F}$ provides an efficient way to represent unknown continuous functions, compared with the linear solution spaces used in classical numerical methods, e.g., a trial space spanned by linear nodal basis functions in the FEM. In the sequel, we shall discuss the approximation property of the DNN, which is relevant to the study of the expressive power of a DNN model [7, 24].
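As a rough illustration of the composition in (1), the following NumPy sketch evaluates a small fully connected network as alternating affine maps and an element-wise activation, with no activation after the final affine map. The helper names `init_params` and `dnn_forward`, the layer widths, and the random initialization are illustrative assumptions, not the configuration used in this paper.

```python
import numpy as np

def relu(z):
    """Element-wise rectified linear unit."""
    return np.maximum(0.0, z)

def init_params(layer_sizes, rng):
    """Return a list of (A_i, b_i) pairs, one per affine map (hypothetical initialization)."""
    return [
        (rng.standard_normal((m, n)) / np.sqrt(n), np.zeros(m))
        for n, m in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def dnn_forward(x, params, activation=relu):
    """Apply affine map, activation, affine map, ..., ending with a plain affine map."""
    z = np.asarray(x, dtype=float)
    *hidden, last = params
    for A, b in hidden:
        z = activation(A @ z + b)   # hidden layer: activation of an affine map
    A, b = last
    return A @ z + b                # output layer: affine map only

rng = np.random.default_rng(0)
params = init_params([2, 15, 15, 1], rng)   # input in R^2, two hidden layers, scalar output
value = dnn_forward([0.3, -0.7], params)
```

The parameter vector of the text corresponds here to the flattened collection of all matrices and bias vectors stored in `params`.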
Early studies of the approximation properties of neural networks can be found in [8, 14], where the authors studied approximation properties for the function classes given by a feed-forward neural network with a single hidden layer. Later, many authors studied error estimates for such neural networks in terms of the number of neurons, the number of layers, and the activation functions; see [9, 23] for a good review of relevant works.
In recent years, the DNN has shown successful applications in a broad range of problems, including classification for complex systems and construction of response surfaces for high-dimensional models. Significant efforts have been devoted to study the benefits on the expressive power of NNs afforded by NN depth. For example, in [7], the authors proved that convolutional DNNs were able to express multivariate functions given in so-called Hierarchic Tensor (HT) formats. In [30], the author studied the expressive power of shallow and deep neural networks with piece-wise linear activation functions and established new rigorous upper and lower bounds for the network complexity in approximating Sobolev spaces.
In [13], the authors studied the relationship between DNNs with ReLU function as the activation function and continuous piecewise linear functions from the linear FEM. They proved the following statement.
Proposition 2.1.
Given a locally convex finite element grid $\mathcal{T}_h$, any linear finite element function with $N$ degrees of freedom can be written as a ReLU-DNN with at most $k=\lceil\log_2 k_h\rceil+1$ hidden layers and at most $\mathcal{O}(k_hN)$ neurons, where $k_h$ denotes the maximum number of neighboring elements of one node.
Proposition 2.1 provides upper bounds for setting the number of hidden layers and the number of neurons within each layer when one uses the DNN to approximate the solution space spanned by the FEM basis. In our numerical experiments, we find that a relatively small number of hidden layers and neurons is sufficient to obtain accurate numerical results.
2.2 Formulation of the deep Ritz method
The deep Ritz method is a deep learning based numerical method for solving variational problems [28]; it can therefore be used naturally to solve PDEs. For example, we consider a Poisson equation defined on a compact domain $D\subset\mathbb{R}^d$,
$$-\Delta u(x)=f(x),\quad x\in D;\qquad u(x)=0,\quad x\in\partial D. \tag{3}$$
Given the Poisson equation (3), we can derive the corresponding variational problem as
$$I(u)=\int_D\Big(\frac12|\nabla u(x)|^2-f(x)u(x)\Big)\,dx. \tag{4}$$
Then, the solution of (3) can be obtained by
$$u=\arg\min_{v\in H_0^1(D)}I(v). \tag{5}$$
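To make the variational viewpoint concrete, the sketch below estimates a Ritz energy of the form $I(u)=\int_D(\tfrac12|\nabla u|^2-fu)\,dx$ by Monte Carlo sampling, assuming the Poisson problem is written as $-\Delta u=f$ with homogeneous Dirichlet data. The function `ritz_energy` and the one-dimensional test problem are hypothetical choices made only for illustration.

```python
import numpy as np

def ritz_energy(u, grad_u, f, points, volume):
    """Monte Carlo estimate of the integral of 0.5*|grad u|^2 - f*u over D.

    points: array of shape (N, d) sampled uniformly in D; volume: |D|.
    """
    g = grad_u(points)                                   # shape (N, d)
    integrand = 0.5 * np.sum(g**2, axis=1) - f(points) * u(points)
    return volume * np.mean(integrand)

# Assumed test case on D = (0, 1): u(x) = sin(pi x) solves -u'' = pi^2 sin(pi x), u(0) = u(1) = 0.
u = lambda x: np.sin(np.pi * x[:, 0])
grad_u = lambda x: np.pi * np.cos(np.pi * x)
f = lambda x: np.pi**2 * np.sin(np.pi * x[:, 0])

rng = np.random.default_rng(0)
points = rng.uniform(0.0, 1.0, size=(200_000, 1))
energy = ritz_energy(u, grad_u, f, points, volume=1.0)
```

For this test case the exact energy is $-\pi^2/4$, so the sampled estimate should lie close to that value; any other admissible function would give a larger energy.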
From the perspective of scientific computing, the Poisson equation (3) can be solved using numerical methods such as FDMs and FEMs. From the perspective of machine learning, however, the numerical solution $u(x)$ is interpreted as a function with $x\in\mathbb{R}^d$ as its input and $u\in\mathbb{R}$ as its output, where $d$ denotes the dimension of the physical domain $D$. Thus, it can be approximated by $F(x)$ in (1).
Let $F(x)$ denote the DNN representation of the solution of the Poisson equation. Substituting $F(x)$ into the variational problem (4), we get the optimization problem
$$\min_{F\in\mathbb{F}_0}I(F), \tag{6}$$
where $\mathbb{F}_0$ is the subspace of $\mathbb{F}$ whose elements satisfy the boundary condition on $\partial D$; this subspace may be limited in how boundary conditions can be imposed. The justification of this assumption will be discussed later.
After parameterizing the expressible function space by $\theta$, we equivalently define the variational problem (4) as
$$\min_{\theta:\,F(\cdot;\theta)\in\mathbb{F}_0}\int_D\Big(\frac12|\nabla_xF(x;\theta)|^2-f(x)F(x;\theta)\Big)\,dx. \tag{7}$$
The variational problem (7) is not convex in general, even when the original variational problem (4) is. In other words, the variational problem (4) is convex with respect to the solution $u$, whereas the variational problem (7) is non-convex with respect to the parameters $\theta$ of the DNN. The issue of local minima and saddle points is therefore nontrivial and poses essential challenges to many existing optimization methods.
Since the parameter space is typically very large, one usually uses the stochastic gradient descent (SGD) method [4] to solve (7). There are plenty of optimization methods to search among the large parameter space. To accelerate the training of the neural network, we use the Adam optimizer version of the SGD [16].
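For readers unfamiliar with the Adam update, here is a minimal NumPy sketch of its standard form (exponential moving averages of the gradient and its square, with bias correction), applied to a toy quadratic. The function name `adam_minimize`, the hyperparameter values, and the test objective are assumptions for illustration; the experiments in this paper rely on a library implementation rather than this code.

```python
import numpy as np

def adam_minimize(grad, theta0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Run a fixed number of Adam steps on parameters theta using the gradient callback."""
    theta = np.asarray(theta0, dtype=float).copy()
    m = np.zeros_like(theta)   # running average of gradients
    v = np.zeros_like(theta)   # running average of squared gradients
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)          # bias-corrected first moment
        v_hat = v / (1 - beta2**t)          # bias-corrected second moment
        theta -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

# Toy objective (theta - 3)^2 with gradient 2 (theta - 3); minimizer is theta = 3.
theta_star = adam_minimize(lambda th: 2.0 * (th - 3.0), theta0=[0.0])
```

In the DNN setting, `grad` would return a mini-batch estimate of the gradient of the functional with respect to all network parameters.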
Imposing boundary conditions is an important issue in the DNN representation. In the homogeneous Dirichlet problem (3), a relaxation approach was proposed to address this issue. Specifically, one adds a soft constraint (a boundary integral term) to the functional defined in (7) and obtains
[TABLE]
Notice that the soft constraint term will approach zero when we decrease the parameter in the calculation. Therefore, the homogeneous boundary condition is satisfied in a certain weak sense.
3 Inhomogeneous boundary condition
As an extension of the deep Ritz method, we consider the inhomogeneous Dirichlet problem
$$\mathcal{L}u(x)=f(x),\quad x\in D;\qquad u(x)=g(x),\quad x\in\partial D, \tag{9}$$
where $\mathcal{L}$ is a linear PDE operator, $f$ is a source function, and $g$ is the boundary condition. Let $I(\cdot)$ denote the Lagrangian form associated with the homogeneous Dirichlet problem of (9), i.e., the functional whose minimizer solves (9) with $g=0$; see (4) for instance.
To deal with the inhomogeneous boundary condition in (9), we first choose a shallow neural network to approximate the boundary condition $g$. Let $\tilde g$ denote the approximation of $g$ by the neural network, which is defined on the whole domain $D$. However, only the boundary values of $\tilde g$ are used, so it can be obtained by solving the following optimization problem
$$\tilde g=\arg\min_{F\in\mathbb{F}_s}\int_{\partial D}\big(F(x)-g(x)\big)^2\,ds, \tag{10}$$
where $\mathbb{F}_s$ denotes the set of all functions expressible by a shallow neural network. The optimization problem (10) can be approximated by
$$\tilde g\approx\arg\min_{F\in\mathbb{F}_s}\frac{|\partial D|}{N}\sum_{i=1}^{N}\big(F(x_i)-g(x_i)\big)^2, \tag{11}$$
where $x_i\in\partial D$ are the sample points and $N$ is the number of sample points. In practice, a uniform sampler on $\partial D$ is not necessary: one can change the integrand of (10) by multiplying by the Radon-Nikodym derivative associated with the sampler's distribution. As long as the sampler's distribution is absolutely continuous with respect to the Lebesgue measure on $\partial D$, we can still minimize (11) to obtain $\tilde g$.
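The remark on non-uniform samplers can be sketched as follows. The code estimates the boundary misfit integral over the boundary of the unit square, once with a uniform sampler and once with a sampler that visits one side twice as often, dividing each sample by the sampler's density with respect to arclength. The unit-square domain, the side probabilities in `SIDE_PROBS`, and the functions `F` and `g` are hypothetical choices for illustration only.

```python
import numpy as np

SIDE_PROBS = np.array([0.4, 0.2, 0.2, 0.2])  # hypothetical non-uniform sampler over the 4 sides

def square_boundary_point(s):
    """Map arclength s in [0, 4) to a point on the boundary of the unit square."""
    s = np.asarray(s, dtype=float)
    side = np.floor(s).astype(int)
    t = s - side
    x = np.where(side == 0, t, np.where(side == 1, 1.0, np.where(side == 2, 1.0 - t, 0.0)))
    y = np.where(side == 0, 0.0, np.where(side == 1, t, np.where(side == 2, 1.0, 1.0 - t)))
    return np.stack([x, y], axis=-1)

def boundary_loss(F, g, s, density):
    """Importance-weighted estimate of the boundary integral of (F - g)^2."""
    pts = square_boundary_point(s)
    return np.mean((F(pts) - g(pts)) ** 2 / density(s))

rng = np.random.default_rng(0)
N = 200_000
F = lambda p: p[:, 0] + p[:, 1]      # stand-in for the shallow network
g = lambda p: np.zeros(len(p))       # stand-in for the boundary data

# Uniform sampler: density 1/|boundary| = 1/4, which reduces to the plain average times |boundary|.
s_uniform = rng.uniform(0.0, 4.0, size=N)
loss_uniform = boundary_loss(F, g, s_uniform, lambda s: np.full(s.shape, 0.25))

# Non-uniform sampler: pick a side with SIDE_PROBS, then a uniform position along that side.
sides = rng.choice(4, size=N, p=SIDE_PROBS)
s_biased = sides + rng.uniform(0.0, 1.0, size=N)
loss_biased = boundary_loss(F, g, s_biased, lambda s: SIDE_PROBS[np.floor(s).astype(int)])
```

Both estimates target the same integral (here $16/3$ for the chosen `F` and `g`), so they agree up to sampling error even though the point distributions differ.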
In our proposed approach, the reasons for choosing a shallow network to approximate $g$ are twofold. First, $\tilde g$ plays the role of an initial guess for the inhomogeneous boundary condition. As explained above, only the values of $\tilde g$ on $\partial D$ are used, so a limited number of parameters in $\tilde g$ is sufficient. This helps shorten the training of $\tilde g$. Second, due to the simple structure of $\tilde g$, the term $\mathcal{L}\tilde g$ in (13) does not oscillate in $D$ (especially in the weak form), which leads to faster convergence in solving the optimization problems.
Fig.1 and Fig.2 show the network layouts for approximating the boundary condition and the solution of the auxiliary problem, respectively; all hidden layers in a network have the same width. For example, Layer 2 in Fig.2 is a vector whose dimension equals this width. To be more precise, Layer 2 is obtained from Layer 1 by
[TABLE]
where is a matrix and is a vector to be determined.
Since the neural network used to represent $\tilde g$ is shallow, i.e., $\tilde g$ is represented by a composition of smooth functions, $\mathcal{L}\tilde g$ is expressible. Then, we solve an auxiliary PDE as follows,
$$\mathcal{L}u'(x)=f(x)-\mathcal{L}\tilde g(x),\quad x\in D;\qquad u'(x)=0,\quad x\in\partial D. \tag{13}$$
Problem (13) is a homogeneous Dirichlet problem, which can be solved using the deep Ritz approach; see Section 2.2. Finally, the solution of the inhomogeneous Dirichlet problem (9) can be represented as $u=u'+\tilde g$.
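As a worked instance of this lifting, assume the operator is the negative Laplacian (the Poisson case of Section 2.2; this choice is an illustration, not a restriction of the method). Here $\tilde g$ and $u'$ are notation assumed for the shallow-network approximation of the boundary data and for the solution of the auxiliary problem. Splitting the solution and integrating by parts, using that $u'$ vanishes on the boundary, gives:

```latex
\begin{aligned}
u &= u' + \tilde g, \\
-\Delta u' &= f + \Delta \tilde g \quad \text{in } D, \qquad
u' = g - \tilde g \approx 0 \quad \text{on } \partial D, \\
I(u') &= \int_D \Big( \tfrac12 |\nabla u'|^2 - (f + \Delta \tilde g)\, u' \Big)\, dx
       = \int_D \Big( \tfrac12 |\nabla u'|^2 + \nabla \tilde g \cdot \nabla u' - f\, u' \Big)\, dx .
\end{aligned}
```

The last form involves only first derivatives of $\tilde g$, which is one way to read the remark that the weak form is less sensitive to oscillations in the boundary approximation.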
4 Derivation of the methodology
4.1 Elliptic PDEs with discontinuous and high-contrast coefficients
We first consider elliptic PDEs with discontinuous coefficients defined as follows,
$$-\nabla\cdot\big(a(x)\nabla u(x)\big)=f(x),\quad x\in D, \tag{14}$$
$$u(x)=0,\quad x\in\partial D, \tag{15}$$
where $D\subset\mathbb{R}^d$ is a bounded spatial domain whose boundary $\partial D$ is a convex polygon. To simplify the notation, we first study a homogeneous Dirichlet problem. Elliptic PDEs with inhomogeneous boundary conditions can be solved using the approach studied in Section 3.
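The coefficient $a(x)$ is assumed to be a scalar and has jumps across a number of smooth interior interfaces. Denoting the inclusions by $D_1,\dots,D_m$ and setting $D_0=D\setminus\bigcup_{i=1}^{m}\overline{D_i}$, we assume that the coefficient $a(x)$ is piecewise constant with respect to the decomposition $\{D_i\}_{i=0}^{m}$. Setting $a_{\min}=\min_{x\in D}a(x)$ and dividing (14) by $a_{\min}$, we rescale the problem. Specifically, let $\hat a(x)$ denote the rescaled coefficient, which is piecewise constant with respect to the partition $\{D_i\}_{i=0}^{m}$ and satisfies $\hat a(x)\ge1$ for all $x\in D$. Letting $\hat a_i$ denote the restriction of $\hat a$ to $D_i$, we are interested in studying two types of high-contrast cases,
$$\text{Case 1:}\quad\min_{1\le i\le m}\hat a_i\gg1,\quad\hat a_0=1, \tag{16}$$
$$\text{Case 2:}\quad\hat a_0\gg1,\quad\max_{1\le i\le m}\hat a_i\le K, \tag{17}$$
for some positive constant $K$. In Case 1, the inclusions have high permeability compared to the background, while Case 2 is the converse configuration.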
Now, we are in a position to derive the formulation of the deep learning approach for solving the elliptic PDE (14)(15) with the high-contrast coefficients (16)(17). We define the corresponding variational problem as
$$I(u)=\int_D\Big(\frac12a(x)|\nabla u(x)|^2-f(x)u(x)\Big)\,dx. \tag{18}$$
Then, the solution of (14)(15) can be obtained by $u=\arg\min_{v\in H_0^1(D)}I(v)$, where $I$ is defined in (18). Again, we denote the set of all expressible functions by $\mathbb{F}$ and set $\mathbb{F}_{0}=\{F\in\mathbb{F}\,|\,F|_{\partial D}=0\}$. Moreover, let $\Theta_0$ denote the set of parameters that satisfy the homogeneous boundary condition, i.e., $\Theta_0=\{\theta\in\Theta\,|\,F(\cdot;\theta)\in\mathbb{F}_0\}$. The approximation property of the DNN implies that functions in $\mathbb{F}_0$ can approximate the solution space $H_0^1(D)$. Therefore, we represent the solution of Eq.(14) using the DNN method.
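Let $F(x;\theta)$ denote the DNN representation; see Eq.(1). Then the optimal parameter $\theta^*$, taken over the parameter set $\Theta_0$ satisfying the homogeneous boundary condition, solves the following variational problem
$$\theta^*=\arg\min_{\theta\in\Theta_0}\int_D\Big(\frac12a(x)|\nabla_xF(x;\theta)|^2-f(x)F(x;\theta)\Big)\,dx. \tag{19}$$
Since the number of degrees of freedom in the variational problem (19) is quite large, we apply the SGD method on the parameter space to solve it. To this end, we approximate the gradient with respect to one parameter $\theta_k$ by
$$\frac{\partial I}{\partial\theta_k}\approx\frac{|D|}{N}\sum_{i=1}^{N}\frac{\partial}{\partial\theta_k}\Big(\frac12a(x_i)|\nabla_xF(x_i;\theta)|^2-f(x_i)F(x_i;\theta)\Big), \tag{20}$$
where $x_i$, $i=1,\dots,N$, are randomly sampled from the physical domain $D$, $|D|$ is the volume of the domain, and $N$ is called the batch number in the context of deep learning (meaning the number of training examples utilized in one iteration). Notice that $\theta$ is a high-dimensional vector and $\theta_k$ is any component of $\theta$. After we get the approximation of the gradient with respect to $\theta_k$, we can update each component of $\theta$ as
$$\theta_k^{n+1}=\theta_k^{n}-\eta\,\frac{\partial I}{\partial\theta_k}\Big|_{\theta=\theta^{n}}, \tag{21}$$
where $\eta$ is the learning rate. To accelerate the training of the neural network, we use the Adam optimizer version of the SGD method [16].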
Remark 4.1.
From the derivation of the DNN formulation, one can see that the proposed method automatically deals with the interface condition (or discontinuous coefficients) without knowing the locations of the interfaces a priori.
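The following toy sketch illustrates the sampled gradient and the plain SGD update on a one-dimensional interface problem, and why no interface tracking is needed: the coefficient is only ever evaluated at the sampled points. Everything in it is an assumption made for illustration: the domain $(0,1)$, the coefficient jumping from 1 to 3 at $x=0.5$, the source $f=2$, and a one-parameter ansatz $F(x;\theta)=\theta x(1-x)$ standing in for a network. For this ansatz the energy is minimized at $\theta^*=2/(1+3)=0.5$.

```python
import numpy as np

def a(x):
    """Piecewise-constant coefficient with a jump at x = 0.5 (illustrative values)."""
    return np.where(x < 0.5, 1.0, 3.0)

def sampled_gradient(theta, x):
    """Mini-batch estimate of dI/dtheta for F(x; theta) = theta * x * (1 - x) and f = 2.

    The integrand of I is 0.5 * a * (dF/dx)^2 - f * F, and |D| = 1 here.
    """
    dphi = 1.0 - 2.0 * x            # derivative of x (1 - x)
    phi = x * (1.0 - x)
    return np.mean(a(x) * theta * dphi**2 - 2.0 * phi)

rng = np.random.default_rng(0)
theta, lr, batch = 0.0, 0.1, 256
for step in range(500):
    x = rng.uniform(0.0, 1.0, size=batch)     # fresh uniform batch, no mesh, no interface info
    theta -= lr * sampled_gradient(theta, x)  # plain SGD update
```

After the loop, `theta` fluctuates around 0.5; replacing the plain update with Adam and the ansatz with a DNN gives the structure of the method described above.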
4.2 Linear elasticity with discontinuous stress tensors
In this subsection, we consider the DNN approach to solve linear elasticity interface problems. One application of the linear elasticity problem is to model the shape and location of fibroblast cells under stress [31]. The model is based on the idea of a continuum mechanical description of stress-induced phase transitions. To demonstrate the main idea, we consider a two-dimensional linear elasticity problem.
Suppose the matrix (meaning the material or tissue in cells) plus the cell together occupy a bounded domain $D\subset\mathbb{R}^2$, which is composed of linear elastic homogeneous isotropic material. We assume the cell undergoes small deformations, so that the linearized theory of elasticity applies. Let $\mathbf{u}$ denote the displacement field. Then, the strain tensor is
$$\varepsilon=\frac12\big(\nabla\mathbf{u}+\nabla\mathbf{u}^{T}\big). \tag{22}$$
In the matrix outside the cell, the stress tensor $\sigma$ is related to the strain tensor (gradient of the displacement) by $\sigma=\mathbb{C}[\varepsilon]$, where the elasticity tensor $\mathbb{C}$ is a linear transformation on tensors. In the isotropic case, we have
$$\mathbb{C}[A]=\lambda\,\mathrm{tr}(A)\,\mathbf{1}+\mu\,(A+A^{T}) \tag{23}$$
for any two-dimensional matrix $A$. In Eq.(23), $\lambda$ and $\mu$ are the Lamé constants, $\mathrm{tr}$ is the trace operator, and $\mathbf{1}$ is the identity matrix. In components, the action of the elasticity tensor reads
$$(\mathbb{C}[A])_{ij}=\mathbb{C}_{ijkl}A_{kl}=\lambda\,\delta_{ij}A_{kk}+\mu\,(A_{ij}+A_{ji}), \tag{24}$$
where the Einstein summation convention is used.
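A small NumPy sketch of these two operations may help fix ideas. It assumes the standard isotropic form, in which the elasticity tensor maps a matrix $A$ to $\lambda\,\mathrm{tr}(A)\mathbf{1}+\mu(A+A^{T})$, and the linearized strain is the symmetric part of the displacement gradient. The function names and the numerical values of the Lamé constants are hypothetical.

```python
import numpy as np

def strain(grad_u):
    """Linearized strain: symmetric part of the displacement gradient."""
    grad_u = np.asarray(grad_u, dtype=float)
    return 0.5 * (grad_u + grad_u.T)

def elasticity_tensor(A, lam, mu):
    """Isotropic elasticity tensor acting on a square matrix A (assumed standard form)."""
    A = np.asarray(A, dtype=float)
    return lam * np.trace(A) * np.eye(A.shape[0]) + mu * (A + A.T)

lam, mu = 1.5, 0.7                       # illustrative Lamé constants
grad_u = np.array([[0.02, 0.01],
                   [-0.03, 0.04]])       # illustrative displacement gradient
eps = strain(grad_u)
stress = elasticity_tensor(eps, lam, mu)
```

Two quick sanity checks follow from the formula: the identity matrix in two dimensions is mapped to $2(\lambda+\mu)\mathbf{1}$, and any antisymmetric matrix (an infinitesimal rotation) is mapped to zero.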
The cell is modeled by a compact region with smooth boundary; see Fig.7. Let denote a transformation strain, which is a constant symmetric matrix. We assume the stress tensor has a jump across the cell, i.e.,
[TABLE]
In our cell model, we set the transformation strain to be a contraction, which is represented by an isotropic compression with . We suppose the cell model is in a quasi-static state. Therefore, the displacement field u satisfies the following linear elasticity PDE with a discontinuous stress tensor in a weak sense,
[TABLE]
where is the characteristic function of the cell domain and is a constant symmetric matrix, which measures the effect on the cell boundary due to the contraction. We impose Dirichlet boundary conditions on . On the cell boundary , the solution u satisfies the following jump conditions
[TABLE]
where is the outward unit normal vector on and denotes the jump across the interface.
Then, the linear elasticity interface problem (28)-(29) can be computed by numerical methods, such as the immersed interface method [29] or matched interface and boundary method [25]. However, the implementation of the numerical scheme is not simple due to the jump conditions on the interface, especially when the interface has a complicated geometry.
In the sequel, we shall develop the formulation of solving the linear elasticity interface problem (28)(29) using the DNN method. In the isotropic case, let , where and is a vector valued function. Then, (28) is equivalent to,
[TABLE]
Then, the variational problem associated with (30) is given by,
[TABLE]
where denotes the inner product between matrices, i.e., . Finally, the solution of (30) can be obtained by , where is defined in (31). The remaining implementation of the DNN method for (31) is exactly the same as we discussed in Section 4.1, so we skip the details here.
5 Numerical Example
In this section, we carry out numerical experiments to demonstrate the performance of the DNN method in solving interface problems. In addition, we are interested in understanding the behavior of the SGD method in solving the non-convex optimization problem. TensorFlow [1] provides an efficient tool to calculate the partial derivatives in (20), which is used in our implementation.
5.1 2D high-contrast elliptic problems
We consider 2D elliptic PDEs with high-contrast coefficients defined as follows,
[TABLE]
where , the domain is , and the coefficient is a piecewise constant defined by
[TABLE]
where and . Moreover, the source term and the boundary condition . We choose the source term and boundary condition in such a way that the exact solution (in the polar coordinate) is
[TABLE]
In our first experiment, we choose and in (34); see Fig.3 for the profile of the coefficient. Notice that problem (32)(33) is an inhomogeneous Dirichlet problem. We use the immersed-interface FEM with a fine mesh to compute the reference solution and the DNN method to compute the numerical solution. The implementation of the DNN method has been discussed in detail in Section 3 and Section 4.1. The networks that we use are illustrated in Fig.1 and Fig.2: the deep network has 4 intermediate layers of width 15 and approximates the solution of the auxiliary PDE (13), while the shallow network has 3 intermediate layers of width 10 and approximates the boundary condition. The networks are not specially designed for the target problem. The expressibility of the DNN discussed in Section 2.1 ensures an adequate approximation of the solution once the width of each intermediate layer is adjusted. In the learning process, i.e., the running of the SGD method, we choose the batch number (number of samples per gradient update) to be (that contains points in the interior domain of and points on the boundary , which is used to evaluate second term in (8)) and generate a new batch every steps of updating. And the learning rate is . Once we have a uniform sampler, the network automatically deals with the interface without knowing the location of the interface a priori.
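The batch construction described above, with one set of points inside the domain and another on its boundary, can be sketched as follows. The square domain $[-1,1]^2$, the function name `sample_batch`, and the batch sizes are hypothetical placeholders, since the actual values are not restated here.

```python
import numpy as np

def sample_batch(n_interior, n_boundary, rng, half_width=1.0):
    """Draw uniform points inside the square [-h, h]^2 and uniform points on its boundary."""
    h = half_width
    interior = rng.uniform(-h, h, size=(n_interior, 2))
    # Boundary: pick one of the four equal-length sides, then a uniform position along it.
    t = rng.uniform(-h, h, size=n_boundary)
    side = rng.integers(0, 4, size=n_boundary)
    fixed = np.where(side % 2 == 0, -h, h)
    boundary = np.where((side < 2)[:, None],
                        np.stack([t, fixed], axis=1),   # bottom (side 0) or top (side 1)
                        np.stack([fixed, t], axis=1))   # left (side 2) or right (side 3)
    return interior, boundary

rng = np.random.default_rng(0)
interior, boundary = sample_batch(n_interior=1000, n_boundary=200, rng=rng)
```

Interior points would feed the volume integral of the functional and boundary points the penalty term; drawing a fresh batch every few optimizer steps mirrors the resampling strategy described in the text.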
In Fig.4, we show the corresponding numerical results. In Fig.4(a) and Fig.4(b), we plot the profiles of a shallow network approximation of the boundary condition and the deep network approximation of solution to the auxiliary PDE (13), respectively. In Fig.4(d) and Fig.4(e), we show the comparison between the DNN solution and the reference solution. One can see that the DNN method provides an accurate result for this interface problem.
In Fig.4(c) and Fig.4(f), we plot the decay of the Lagrangian and the relative error between the DNN solution and the reference solution during the training process. Interestingly, we observe that the optimization process gets stuck at a local minimum at the beginning, i.e., during the first four thousand steps, where the Lagrangian functional does not decay and the error between the DNN solution and the reference solution stays constant. After that, the optimization process escapes the local minimum, and the Lagrangian functional and the error continue to decay. Finally, the error oscillates around 5%.
In our second experiment, we choose and in (34). The profile of the new coefficient looks like an upside-down version of the profile shown in Fig.3, so we do not show it here. Again, we use the immersed-interface FEM with a fine mesh to compute the reference solution and the DNN method to compute the numerical solution. The setting of the DNN method is the same as in the first experiment.
In Fig.5, we show the corresponding numerical results. In Fig.5(a) and Fig.5(b), we plot the profiles of a shallow network approximation of the boundary condition and the deep network approximation of solution to the auxiliary PDE (13), respectively. In Fig.5(d) and Fig.5(e), we show the comparison between the DNN solution and the reference solution. The DNN method also provides an accurate result for this interface problem.
In Fig.5(c) and Fig.5(f), we plot the decay of the Lagrangian and the relative error between the DNN solution and the reference solution during the training process. We find that the decay pattern of the second experiment is different from that of the first one. The Lagrangian functional shows momentary fluctuations during the optimization process; however, it does not get stuck at a local minimum. The error decreases monotonically and is finally reduced to about 2%.
The DNN method is a probabilistic method, since the initial values of the parameters in the network, i.e., $\theta$, and the Adam SGD optimizer are random. We are interested in investigating the convergence speed when and , which is a ‘harder’ case of the high-contrast problem since the optimization process of the DNN method gets stuck at a local minimum. In Fig.6, we show results of the convergence speed study when and , respectively. Specifically, we plot the histogram of the number of steps to converge. The total number of iterations is when and when . We find that a higher contrast in the coefficient leads to slower convergence of the DNN method. We also find that about of trials failed to converge within the designed steps.
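A convergence-speed study of this kind can be organized with a few lines of code. The sketch below assumes that each trial records its relative error at regular intervals and that "converged" means the error has dropped below a chosen tolerance; both the criterion and the toy error histories are assumptions for illustration, not the settings used in the experiments.

```python
def steps_to_converge(errors, tol):
    """Index of the first recorded step whose error is <= tol, or None if never reached."""
    for step, err in enumerate(errors):
        if err <= tol:
            return step
    return None

def summarize_trials(error_histories, tol):
    """Return the convergence steps of successful trials and the fraction of failed trials."""
    steps = [steps_to_converge(h, tol) for h in error_histories]
    converged = [s for s in steps if s is not None]
    failure_rate = 1.0 - len(converged) / len(steps)
    return converged, failure_rate

# Four made-up error histories standing in for independent training runs.
histories = [[0.9, 0.5, 0.04], [0.9, 0.8, 0.7], [0.03, 0.02, 0.01], [0.6, 0.05, 0.01]]
converged, failure_rate = summarize_trials(histories, tol=0.05)
```

The list `converged` is what one would pass to a histogram routine, while `failure_rate` corresponds to the fraction of trials that do not converge within the allotted steps.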
5.2 2D Linear elasticity interface problem
We consider a linear elasticity PDE with a discontinuous stress tensor as follows,
[TABLE]
where , the domain , , the elasticity tensor is defined by (23) or (24) with and .
In the cell model [31], keratocytes typically have a roughly circular shape with an annular lamellipodium surrounding the nucleus, when they are in stationary state. Contact and force transmission with the substrate occurs only at the lamellipodium and not the nucleus and organelles. Accordingly, we choose the initial lamellipodium region to be an annulus in the center of the square domain , with the nucleus excluded; see Fig.7.
We set on the boundary of , which gives a null displacement or traction-free boundary condition. On the boundary of the cell , we impose the jump conditions (29).
We use the immersed-interface FEM with a fine mesh to compute the reference solution and the DNN method to compute the numerical solution. The network maps to and uses 4 intermediate layers. The width of each layer is 20 and the layout is the same as in Fig.2. When running the SGD method, we choose the batch size to be and generate a new batch every steps of updating. The learning rate is .
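The network size and the batch refresh schedule described above can be sketched as follows. This is only an illustration under several assumptions: a plain fully connected stack is used (the layout in Fig.2 may differ), the input and output dimensions are both taken to be 2 since the two displacement components are functions of two coordinates, `tanh` is an assumed activation, and the batch size and refresh interval are placeholder values. The sketch shows the forward pass and the sampling schedule only; no gradient update is performed.

```python
import numpy as np

def init_mlp(rng, sizes=(2, 20, 20, 20, 20, 2)):
    """Random weights for a fully connected net with 4 hidden layers of width 20."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = rng.normal(scale=np.sqrt(1.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((W, b))
    return params

def forward(params, x):
    """Evaluate the network at points x of shape (n, 2); returns shape (n, 2)."""
    h = x
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)      # tanh is an assumption, not from the paper
    W, b = params[-1]
    return h @ W + b                # linear output layer

def batch_schedule(rng, n_steps, batch_size, refresh_every):
    """Yield (step, batch); a new random batch is drawn every refresh_every steps."""
    batch = None
    for step in range(n_steps):
        if step % refresh_every == 0:
            batch = rng.uniform(0.0, 1.0, size=(batch_size, 2))
        yield step, batch

rng = np.random.default_rng(0)
params = init_mlp(rng)
n_params = sum(W.size + b.size for W, b in params)   # total trainable parameters

history = []
for step, batch in batch_schedule(rng, n_steps=6, batch_size=128, refresh_every=3):
    u = forward(params, batch)      # an optimizer update would go here
    history.append((step, batch, u.shape))
```

Reusing one batch for several consecutive updates reduces the sampling cost, while the periodic refresh keeps the optimizer from fitting a single fixed point set.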
In Fig.8, we show the corresponding numerical results. In Fig.8(a) and Fig.8(b), we plot the profiles of DNN solutions and , which are the displacements in and coordinates, respectively. The corresponding reference solutions are shown in Fig.8(d) and Fig.8(e). We find that the DNN solutions agree well with the reference solutions. In Fig.8(c) and Fig.8(f), we plot the decay of the Lagrangian and the relative error between the DNN solution and the reference solution during the training process. We find that the decay pattern of the third experiment is the same as that of the second one. Finally, the error is reduced to about 4%. Our numerical results indicate that the DNN method is efficient in solving the 2D linear elasticity interface problem (36). Most importantly, its implementation is very simple.
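A relative error of this kind can be evaluated by comparing the two solutions at a common set of points. The sketch below assumes a discrete relative L2 norm; the norm is an assumption made for illustration, and the numbers are toy values rather than data from the experiment.

```python
import numpy as np

def relative_l2_error(u_approx, u_ref):
    """Discrete relative L2 error ||u_approx - u_ref|| / ||u_ref||.

    Both inputs hold values at the same evaluation points; for a vector
    field such as a displacement, the shape can be (n_points, 2).
    """
    u_approx = np.asarray(u_approx, dtype=float)
    u_ref = np.asarray(u_ref, dtype=float)
    ref_norm = np.linalg.norm(u_ref)
    if ref_norm == 0.0:
        raise ValueError("reference solution has zero norm")
    return np.linalg.norm(u_approx - u_ref) / ref_norm

# Toy values: the reference has norm 5 and the difference has norm 0.2.
u_ref = np.array([3.0, 4.0])
u_dnn = np.array([3.0, 4.2])
err = relative_l2_error(u_dnn, u_ref)
```

Evaluating this quantity at regular intervals during training gives an error curve of the kind plotted against the training steps.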
6 Conclusions
In this paper, we studied a deep-learning-based method for solving interface problems. By formulating the PDEs as variational problems, we convert the interface problems into optimization problems. Since the DNN can be used to approximate the linear space spanned by FEM nodal basis functions, we parameterize the PDE solutions using the DNN and solve the interface problems by searching for the minimizer of the associated optimization problems. Although the parameter space of the DNN is huge, the SGD method can be applied to solve the optimization problems efficiently. In this framework, once we have samplers of points on the domain and the boundary, we do not need any special treatment to deal with the interface inside the domain. Therefore, the proposed method is easy to implement and mesh-free. Finally, we presented numerical experiments to demonstrate the performance of the proposed method. Specifically, we used the DNN method to solve elliptic PDEs with discontinuous and high-contrast coefficients and linear elasticity equations with discontinuous stress tensors. We find that the DNN method gives accurate results for both experiments. Several issues remain open. For instance, we have not obtained a convergence rate for the DNN method, and we have little understanding of the parameter space of the DNN. In addition, the issue of local minima and saddle points in the optimization problem is highly nontrivial. We are interested in studying these issues in our future research.
Acknowledgements
The research of Z. Wang is partially supported by the Hong Kong PhD Fellowship Scheme. The research of Z. Zhang is supported by Hong Kong RGC grants (Projects 27300616, 17300817, and 17300318), National Natural Science Foundation of China (Project 11601457), Seed Funding Programme for Basic Research (HKU), an RAE Improvement Fund from the Faculty of Science (HKU), and the Hung Hing Ying Physical Sciences Research Fund (HKU). The computations were performed using the HKU ITS research computing facilities that are supported in part by the Hong Kong UGC Special Equipment Grant (SEG HKU09). We would like to thank Professor Thomas Hou for stimulating discussions.
References
- [1]
Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al.
TensorFlow: a system for large-scale machine learning.
In OSDI, volume 16, pages 265–283, 2016.
- [2]
I. Babuška.
The finite element method for elliptic equations with discontinuous coefficients.
Computing, 5(3):207–213, 1970.
- [3]
C. Bernardi and R. Verfürth.
Adaptive finite element methods for elliptic equations with non-smooth coefficients.
Numerische Mathematik, 85(4):579–608, 2000.
- [4]
L. Bottou.
Large-scale machine learning with stochastic gradient descent.
In Proceedings of COMPSTAT’2010, pages 177–186. Springer, 2010.
- [5]
Z. Chen and J. Zou.
Finite element methods and their convergence for elliptic and parabolic interface problems.
Numerische Mathematik, 79(2):175–202, 1998.
- [6]
C. Chu, I. Graham, and T. Y. Hou.
A new multiscale finite element method for high-contrast elliptic interface problems.
Math. Comp., 79:1915–1955, 2010.
- [7]
N. Cohen, O. Sharir, and A. Shashua.
On the expressive power of deep learning: A tensor analysis.
In Conference on Learning Theory, pages 698–728, 2016.
- [8]
G. Cybenko.
Approximation by superpositions of a sigmoidal function.
Mathematics of control, signals and systems, 2(4):303–314, 1989.
- [9]
S. Ellacott.
Aspects of the numerical analysis of neural networks.
Acta Numerica, 3:145–202, 1994.
- [10]
R. Fedkiw, T. Aslam, B. Merriman, and S. Osher.
A non-oscillatory Eulerian approach to interfaces in multimaterial flows (the ghost fluid method).
Journal of computational physics, 152(2):457–492, 1999.
- [11]
Y. Gong, B. Li, and Z. Li.
Immersed-interface finite-element methods for elliptic interface problems with nonhomogeneous jump conditions.
SIAM Journal on Numerical Analysis, 46(1):472–495, 2008.
- [12]
Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
Deep learning, volume 1.
MIT press Cambridge, 2016.
- [13]
J. He, L. Li, J. Xu, and C. Zheng.
ReLU deep neural networks and linear finite elements.
arXiv:1807.03973, 2018.
- [14]
K. Hornik, M. Stinchcombe, and H. White.
Multilayer feedforward networks are universal approximators.
Neural networks, 2(5):359–366, 1989.
- [15]
Y. Khoo, J. Lu, and L. Ying.
Solving parametric PDE problems with artificial neural networks.
arXiv:1707.03351, 2017.
- [16]
D. Kingma and J. Ba.
Adam: A method for stochastic optimization.
arXiv:1412.6980, 2014.
- [17]
Yann LeCun, Yoshua Bengio, and Geoffrey Hinton.
Deep learning.
Nature, 521(7553):436, 2015.
- [18]
R. Leveque and Z. Li.
The immersed interface method for elliptic equations with discontinuous coefficients and singular sources.
SIAM Journal on Numerical Analysis, 31(4):1019–1044, 1994.
- [19]
Z. Li, T. Lin, and X. Wu.
New Cartesian grid methods for interface problems using the finite element formulation.
Numerische Mathematik, 96(1):61–98, 2003.
- [20]
H. Montanelli and Q. Du.
New error bounds for deep ReLU networks using sparse grids.
arXiv:1712.08688, 2018.
- [21]
C. Peskin.
Numerical analysis of blood flow in the heart.
Journal of computational physics, 25(3):220–252, 1977.
- [22]
C. Peskin.
The immersed boundary method.
Acta numerica, 11:479–517, 2002.
- [23]
A. Pinkus.
Approximation theory of the MLP model in neural networks.
Acta numerica, 8:143–195, 1999.
- [24]
C. Schwab and J. Zech.
Deep learning in high dimension.
Research Report, 2017.
- [25]
B. Wang, K. Xia, and G. Wei.
Matched interface and boundary method for elasticity interface problems.
Journal of computational and applied mathematics, 285:203–225, 2015.
- [26]
Y. Wang, S. Cheung, E. Chung, Y. Efendiev, and M. Wang.
Deep multiscale model learning.
arXiv:1806.04830, 2018.
- [27]
Weinan E, Jiequn Han, and Arnulf Jentzen.
Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations.
Communications in Mathematics and Statistics, 5(4):349–380, 2017.
- [28]
Weinan E and Bing Yu.
The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems.
Communications in Mathematics and Statistics, 6(1):1–12, 2018.
- [29]
X. Yang, B. Li, and Z. Li.
The immersed interface method for elasticity problems with interface.
Dyn. Contin. Discrete Impuls. Syst. Ser. A Math. Anal., 10:783–808, 2003.
- [30]
D. Yarotsky.
Error bounds for approximations with deep ReLU networks.
Neural Networks, 94:103–114, 2017.
- [31]
Zhiwen Zhang, Phoebus Rosakis, Thomas Y Hou, and Guruswami Ravichandran.
A minimal mechanosensing model predicts keratocyte evolution on flexible substrates.
arXiv:1803.09220, 2018.
- [32]
Y. Zhu and N. Zabaras.
Bayesian deep convolutional encoder-decoder networks for surrogate modeling and uncertainty quantification.
Journal of Computational Physics, 366:415–447, 2018.
