A general approach to transforming finite elements
Robert C. Kirby

TL;DR
This paper introduces a general method for transforming finite element bases, enabling the use of reference elements for a broader class of finite elements that previously could not be handled in this way.
Contribution
It develops a unified approach to map bases for complex finite elements like Hermite and Argyris, expanding the applicability of reference element techniques.
Findings
Provides a new framework for finite element basis transformation
Enables efficient implementation of complex finite elements
Broadens the class of finite elements compatible with reference element methods
Abstract
The use of a reference element on which a finite element basis is constructed once and mapped to each cell in a mesh greatly expedites the structure and efficiency of finite element codes. However, many famous finite elements such as Hermite, Morley, Argyris, and Bell, do not possess the kind of equivalence needed to work with a reference element in the standard way. This paper gives a generalizated approach to mapping bases for such finite elements by means of studying relationships between the finite element nodes under push-forward.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Numerical Methods in Computational Mathematics · Electromagnetic Simulation and Numerical Methods · Numerical methods in engineering
A general approach to transforming finite elements
Robert C. Kirby Department of Mathematics, Baylor University; One Bear Place #97328; Waco, TX 76798-7328. Email: [email protected]. This work was supported by NSF grant 1525697.
Abstract
The use of a reference element on which a finite element basis is constructed once and mapped to each cell in a mesh greatly expedites the structure and efficiency of finite element codes. However, many famous finite elements such as Hermite, Morley, Argyris, and Bell, do not possess the kind of equivalence needed to work with a reference element in the standard way. This paper gives a generalizated approach to mapping bases for such finite elements by means of studying relationships between the finite element nodes under push-forward. MSC 2010: 65N30. Keywords: Finite element method, basis function, pull-back.
1 Introduction
At the heart of any finite element implementation lies the evaluation of basis functions and their derivatives on each cell in a mesh. These values are used to compute local integral contributions to stiffness matrices and load vectors, which are assembled into a sparse matrix and then passed on to an algebraic solver. While it is fairly easy to parametrize local integration routines over basis functions, one must also provide an implementation of those basis functions. Frequently, finite element codes use a reference element, on which a set of basis functions is constructed once and mapped via coordinate change to each cell in a mesh. Alternately, many finite element bases can be expressed in terms of barycentric coordinates, in which case one must simply convert between the physical and barycentric coordinates on each cell in order evaluate basis functions. Although we refer the reader to recent results on Bernstein polynomials [1, 22] for interesting algorithms in the latter case, the prevelance of the reference element paradigm in modern high-level finite element software [4, 6, 24, 25, 30, 31] we shall restrict ourselves to the former.
The development of FIAT [21] has had a significant impact on finite element software, especially through its adoption in high-level software projects such as FEniCS [24] and Firedrake [31]. FIAT provides tools to describe and construct reference bases for arbitrary-order instances of many common and unusual finite elements. Composed with a domain-specific language for variational problems like UFL [2] and a form compiler mapping UFL into efficient code for element integrals [18, 23, 26] gives a powerful, user-friendly tool chain.
However, any code based on the reference element paradigm operates under the assumption that finite elements satisfy a certain kind of equivalence. Essentially, one must have a pull-back operation that puts basis functions on each cell into one-to-one correspondence with the reference basis functions. Hence, the original form of ffc [23] used only (arbitrary order) Lagrange finite elements, although this was generalized to and elements using Piola transforms in [32]. Current technology captures the full simplicial discrete de Rham complex and certain other elements, but many famous elements are not included. Although it is possible to construct reference elements in FIAT or some other way, current form compilers or other high-level libraries do not provide correct code for mapping them.
Elements such as Hermite [11], Argyris [3], Morley [28], and Bell [5], shown alongside the Lagrange element in Figure 1, do not satisfy the proper equivalence properties to give a simple relationship between the reference basis and nodal basis on a general cell. Typically, implementations of such elements require special-purpose code for constructing the basis functions separately on each element, which can cost nearly as much in terms of work and storage as building the element stiffness matrix itself. It also requires a different internal workflow in the code. Although Domínguez and Sayas [29] give a technique for mapping bases for the Argyris element and a separate computer implementation is available (https://github.com/VT-ICAM/ArgyrisPack) and Jardin [19] gives a per-element construction technique for the Bell element, these represents the exception rather than the rule. The literature contains no general approach for constructing and mapping finite element bases in the absence of affine equivalence or a suitable generalization thereof.
In this paper we provide such a general theory for transforming finite elements that supplements the theory on which FIAT is based for constructing those elements. Our focus is on the case of scalar-valued elements in affine spaces, although we indicate how the techniques generalize on both counts. We begin the rest of the paper by recalling definitions in § 2. The bulk of the paper occurs in § 3, where we show how to map finite element bases under affine equivalence, affine-interpolation equivalence, and when neither holds. We also sketch briefly how the theory is adapted to the case of more general pullbacks such as non-affine coordinate mappings or Piola transforms. All the theory in § 3 assumes that the natural pull-back operation (i.e. composition with coordinate change) exactly preserves the function spaces between reference and physical space. However, in certain notable cases such as the Bell element, this condition fails to hold. In § 4, we give a more general theory with application to the Bell element. Finally, in § 5, we present some numerical results using these elements.
2 Definitions and preliminaries
Througout, we let denote the space of functions with continuous and bounded derivatives up to and including order over , and its topological dual.
Definition 2.1**.**
A finite element is a triple such that
- •
* is a bounded domain.*
- •
* for some integer is a finite-dimensional function space.*
- •
* is a collection of linearly independent functionals whose actions restricted to form a basis for .*
The nodes in are taken as objects in the full infinite-dimensional dual, although sometimes we will only require their restrictions to members of . For any , define by restriction. That is, define for any .
Further, with a slight abuse in notation, we will let denote a functional on , or equivalently, a vector of members of the dual space.
As shorthand, we define these spaces consisting of vectors of functions or functionals by
[TABLE]
We can “vectorize” the restriction operator , so that for any , has .
Galerkin methods work in terms of a basis for the approximating space, and these are typically built out of local bases for each element:
Definition 2.2**.**
Let be a finite element with . The nodal basis for is the set such that for each .
The nodal basis also can be written as .
Traditionally, finite element codes construct the nodal basis for a reference finite element and then map it into the basis for for each in the mesh. Let be the geometric mapping, as in Figure 2. We let denote the Jacobian matrix of this transformation.
Similarly to (1), we define the vector spaces relative to the reference cell:
[TABLE]
As with , we define as the restriction of to , and can vectorize it over accordingly.
This geometric mapping induces a mapping between spaces of functions over and as well as between the dual spaces. These are called the pull-back, and push-forward operations, respectively:
Definition 2.3**.**
The pull-back operation mapping is defined by
[TABLE]
for each .
Definition 2.4**.**
The push-forward operation mapping the dual space into is defined by
[TABLE]
for each .
It is easy to verify that the pull-back and push-forward are linear operations preserving the vector space operations. Moreover, they are invertible iff itself is. Therefore, we have
Proposition 2.1**.**
Given finite elements and such that and , and are isomorphisms.
The pull-back and push-forward operations are also defined over the vector spaces , , , and . If is a vector of functionals and a vector of functions, then the vector push-forward and pull-back are, respectively
[TABLE]
It will also be useful to consider vectors of functionals acting on vectors of functions. We define this to produce a matrix as follows. If is a collection of functionals and a collection of functions, then we define the (outer) product to be the matrix
[TABLE]
For example, if is the vector of nodes of a finite element and contains the nodal basis functions, then the Kronecker delta property is expressed as
If is a matrix of numbers of appropriate shape and members of a function space , then is just defined by according to the usual rule for matrix-vector multiplication.
Lemma 2.1**.**
Let and and . Then
[TABLE]
Proof.
The proof is a simple calculation:
[TABLE]
∎
The relationship between pull-back and push-forward also leads to the vectorized relation
Lemma 2.2**.**
Let and . Then
[TABLE]
Definition 2.5**.**
Let and be finite elements and an affine mapping on . Then and are affine equivalent if
- •
,
- •
The pullback maps (in the sense of equality of vector spaces),
- •
* (in the sense of equality of finite sets).*
Definition 2.6**.**
Let be a finite element of class and its nodal basis. The nodal interpolant is defined by
[TABLE]
This interpolant plays a fundamental role in establishing approximation properties of finite elements via the Bramble-Hilbert Lemma [7, 14]. The homogeneity arguments in fact go through for the following generalized notion of element equivalence:
Definition 2.7**.**
Two finite elements and are interpolation equivalent if .
Definition 2.8**.**
If is affine equivalent to and interpolation equivalent to , then and are affine-interpolation equivalent.
Brenner and Scott [8] give the following result, of which we shall make use:
Proposition 2.2**.**
Finite elements and are interpolation equivalent iff the spans of and , (viewed as subsets of ), are equal.
For Lagrange and certain other finite elements, one simply has that , which allows for the traditional use of reference elements used in FEniCS, Firedrake, and countless other codes. However, for many other elements this is not the case. It is our goal in this paper to give a general approach that expresses as a linear transformation applied to .
Before proceeding, we note that approximation theory for Argyris and other families without affine-interpolation equivalence can proceed by means of establishing the almost-affine property [10]. Such proofs can involve embedding the inequivalent element family into an equivalent one with the requisite approximation properties. For example, the Argyris element is proved almost-affine by comparison to the “type (5)” quintic Hermite element. Although we see definite computational consequences of affine-equivalence, affine-interpolation equivalence, and neither among our element families, we our approach to transforming inequivalent families does not make use of any almost-affine properties.
3 Transformation theory when
For now, we assume that the pull-back operation (3) appropriately converts the reference element function space into the physical function space and discuss the construction of nodal bases based on relationships between the reference nodes and the pushed-forward physical nodes .
We focus on the simplicial case, although generalizations do not have a major effect, as we note later. Throughout, we will use following convention, developed in [32] for handling facet orientation in mixed methods but also useful in order higher-order Lagrange degrees of freedom. Since our examples are triangles (2-simplices), it is not necessary to expand on the entire convention. Given a triangle with vertices , we define edge of the triangle to connect the vertices other than . The (unit) tangent vector , points in the direction from the lower- to the higher-numbered vertex. When triangles share an edge, then, they agree on its orientation. The normal to an edge is defined by rotating the tangent by applying the matrix so that We also let denote the midpoint of .
Now, we fix some notation for describing nodes. First, we define acting on any continuous function by pointwise evaluation. That is:
[TABLE]
We let denote the directional derivative in direction at a point , so that
[TABLE]
We use repeated superscripts to indicate higher-order derivatives, so that defines the second directional derivative along the -axis at point .
It will also be convenient to use block notation, with a single symbol representing two or items. For example, the gradient notation
[TABLE]
gives the pair of functionals evaluating the Cartesian derivatives at a point . To denote a gradient in a different basis, we append the directions as superscripts so that
[TABLE]
contains the normal and tangential derivatives at a point .
Similarly, we let
[TABLE]
denote the vector of three functionals evaluating the unique (supposing sufficient smoothness) second partials at .
Let be the nodal basis for a finite element and that for a reference element . We also assume that and . Because the pull-back is invertible, it maps linearly independent sets to linearly independent sets. So, must also be a basis for . There exists an invertible matrix such that
[TABLE]
or equivalently, that each nodal basis function is some linear combination of the pull-backs of the reference nodal basis functions.
Our theory for transforming the basis functions (i.e. computing the matrix ) will work via duality – relating the matrix to how the nodes, or at least their restrictions to the finite-dimensional spaces, push forward.
It will be useful to define as an intermediate matrix . Recall from (6) that its entries for are
[TABLE]
This matrix, having nodes only applied to members of is indifferent to restrictions and so as well.
Because of Proposition 2.1 and finite-dimensionality, the the nodal sets and are both bases for , and so there exists an invertible matrix such that
[TABLE]
Frequently, it may be easier to express the pushed-forward nodes as a linear combination of the reference nodes. In this case, one obtains the matrix . At any rate, the matrices and are closely related.
Theorem 3.1**.**
For finite elements and with and , the matrices in (12) and (14) satisfy
[TABLE]
Proof.
We proceed by relating both matrices to defined in (13) via the Kronecker property of nodal bases. First, we have
[TABLE]
so that
[TABLE]
Similarly,
[TABLE]
so that and the result follows. ∎
That is, to relate the pullback of the reference element basis functions to any element’s basis functions, it is sufficient to determine the relationship between the nodes.
3.1 Affine equivalence: The Lagrange element
When elements form affine-equivalent families, the matrix has a particularly simple form.
Theorem 3.2**.**
If and are affine-equivalent finite elements then the transformation matrix is the identity.
Proof.
Suppose the two elements are affine-equivalent, so that . Then, a direct calculation gives
[TABLE]
so that . ∎
The Lagrange elements are the most widely used finite elements and form the prototypical affine-equivalent family [8]. For a simplex in dimension and integer , one defines to be the space of polynomials over of total degree no greater than , which has dimension . The nodes are taken to be pointwise evaluation at a lattice of points. Classically, these are taken to be regular and equispaced, although options with superior interpolation and conditioning properties for large are also known [17]. One must ensure that nodal locations are chosen at the boundary to enable continuity between adjacent elements. A cubic Lagrange triangle ( and ) is shown earlier in Figure 1(a).
The practical effect of Theorem 3.2 is that the reference element paradigm “works.” That is, a computer code contains a routine to evaluate the nodal basis and its derivatives for a reference element . Then, this routine is called at a set of quadrature points in . One obtains values of the nodal basis at quadrature points on each cell by pull-back, so no additional work is required. To obtain the gradients of each basis function at each quadrature point, one simply multiplies each basis gradient at each point by .
On the other hand, when , the usage of tabulated reference values is more complex. Given a table
[TABLE]
of the reference basis at the reference quadrature points, one finds the nodal basis for by constructing for that element and then computing the matrix-vector product so that
[TABLE]
Mapping gradients from the reference element requires both multiplication by as well as application of by the chain rule. We define by
[TABLE]
Then, the basis gradients requires contraction with
[TABLE]
followed by the chain rule
[TABLE]
In fact, the application of and can be performed in either order. Note that applying requires an matrix-vector multiplication and in principle couples all basis functions together, while applying works pointwise on each basis function separately. When is quite sparse, one expects this to be a small additional cost compared to the other required arithmetic. We present further details for this in the case of Hermite elements, to which we now turn.
3.2 The Hermite element: affine-interpolation equivalence
The Hermite triangle [11], show in Figure 1(b) is based cubic polynomials, although higher-order instances can also be defined [8]. In contrast to the Lagrange element, its node set includes function values and derivatives at the nodes, as well as an interior function value. The resulting finite element spaces have continuity with continuity at vertices. They provide a classic example of elements that are not affine equivalent but instead give affine-interpolation equivalent families.
We will let be a cubic Hermite triangle, specifying the gradient at each vertex in terms of the Cartesian derivatives – see Figure 3(b). Let be the three vertices of and its barycenter. We order the nodes by
[TABLE]
using block notation.
Now, we fix the reference element with as the unit right triangle and express the gradient by the derivatives in the direction of the reference Cartesian coordinates, as in Figure 3(a). Let be the three vertices of and its barycenter. We define analogously to .
Consider the relationship between the nodal basis functions and the pulled-back . For any , the chain rule leads to
[TABLE]
Now, suppose that is a nodal basis function corresponding to evaluation at a vertex or the barycenter, so that for some , with the remaining reference nodes vanishing on . We compute that
[TABLE]
while for with . Also, since the reference gradient of vanishes at each vertex, (23) implies that the physical gradient of must also vanish at each vertex. So, pulling back gives the corresponding nodal basis function for .
The situation changes for the derivative basis functions. Now take to be the basis function with unit-valued derivative in, say, the direction at vertex and other degrees of freedom vanishing. Since it vanishes at each vertex and the barycenter of , will vanish at each vertex and the barycenter of . The reference gradient of vanishes at the vertices other than , so the physical gradient of its pullback must also vanish at the corresponding vertices of . However, (23) shows that will typically not yield at . Consequently, the pull-backs of the reference derivative basis functions do not produce the physical basis functions.
Equivalently, we may express this failure in terms of the nodes – pushing forward does not yield . We demonstrate this pictorially in Figure 4, showing the images of the derivative nodes under push-forward do not correspond to the reference derivative nodes. Taking this view allows us to address the issue using Theorem 15.
This discussion using the chain rule can be summarized by the matrix-valued equation
[TABLE]
noting that the second, fourth, and sixth rows and columns of this matrix are blocks of two, and each “[math]” is taken to be the zero matrix of appropriate size. This is exactly the inverse of from Theorem 15.
In this case, the transformation is quite local – that is, only the push-forward of nodes at a given point are used to construct the reference nodes at the image of that point. This seems to be generally true for interpolation-equivalent elements, although functionals with broader support (e.g. integral moments over the cell or a facet thereof) would require a slight adaptation. We will see presently for Morley and Argyris elements that the transformation neeed not be block diagonal for elements without interpolation equivalence. At any rate, the following elementary observation from linear algebra suggests the sparsity of :
Proposition 3.1**.**
Let be a vector space with sets of vectors and . Suppose that so that there exists a matrix such that . If we further have that some for some , then for all .
Our theory applies equally to the general family of Hermite triangles of degree . In those cases, the nodes consist of gradients at vertices together with point-wise values at appropriate places. All higher-order cases generate families of elements with -continuity at vertices. The matrix remains analogous to the cubic case, with on the diagonal in three places corresponding to the vertex derivative nodes. No major differences appear for the tetradral Hermite elements, either.
As we saw earlier, Hermite and other elements for which incur an additional cost in mapping from the reference element, as one must compute basis function values and gradients via (18) and (21). The key driver of this additional cost is the application of . Since is very sparse for Hermite elements – just 12 nonzeros counting the 1’s on the diagonal – evaluating (18) requires just operations per column, so a 10-point quadrature rule requires 120 operations. Evaluating (20) requires twice this, or 240 operations. Applying in (21) is required whether Hermite or Lagrange elements are used. It requires times the number of quadrature points used – so a 10-point rule would require 400 operations. Hence, the chain rule costs more than the application of in this situation. On the other hand, building an element stiffness matrix requires a double loop over these 10 basis functions nested with a loop over the, say, 10 quadrature points. Hence, the loop body requires 1000 iterations, and with even a handful of operations will easily dominate the additional cost of multiplying by .
3.3 The Morley and Argyris elements
The construction of finite elements, required for problems such as plate bending or the Cahn-Hilliard equations, is a long-standing difficulty. Although it is possible to work around this requirement by rewriting the fourth-order problem as a lower order system or by using elements in conjunction with variational form penalizing the jumps in derivatives [15, 33], this doesn’t actually give a solution.
The quadratic Morley triangle [28], shown in Figure 1(c), finds application in plate-bending problems and also provides a relatively simple motivation for and application of the theory developed here. The six degrees of freedom, vertex values and the normal derivatives on each edge midpoint, lead to an assembled finite element space that is neither nor , but it is still suitable as a convergent nonconforming approximation for fourth-order problems.
The quintic Argyris triangle [3], shown in Figure 1(d), with its 21 degrees, gives a proper finite element. Hence it can be used generically for fourth-order problems as well as second-order problems for which a continuously differentiable solution is desired. The Argyris elements use the values, gradients, and second derivatives at each triangle vertex plus the normal derivatives at edge midpoints as the twenty-one degrees of freedom.
It has been suggested that the Bell element [5] represents a simpler element than the Argyris element, on the account that it has fewer degrees of freedom. Shown in Figure 1(e), we see that the edge normal derivatives have been removed from the Argyris element. However, this comes with a (smaller but) more complicated function space. Rather than full quintic polynomials, the Bell element uses quintic polynomials that have normal derivatives on each edge of only third degree. This constraint on the polynomial space turns out to complicate the transformation of Bell elements compared to Hermite or even Argyris. For the rest of this section, we focus on Morley and Argyris, returning to Bell later.
It can readily be seen that, like the Hermite element, the standard affine mapping will not preserve nodal bases. Unlike the Hermite element, however, the Morley and Argyris elements do not form affine-interpolation equivalent families – the spans of the nodes are not preserved under push-forward thanks to the edge normal derivatives – see Figure 5. As the Morley and Aryris nodal sets do not contain a full gradient at edge midpoints, the technique used for Hermite elements cannot be directly applied.
To work around this, we introduce the following idea:
Definition 3.1**.**
Let and be finite elements of class with affine mapping and associated pull-back and push-forward and . Suppose also that . Let and be such that
- •
* (taken as sets rather than vectors),*
- •
* (again as sets),*
- •
* in .*
Then and form a compatible nodal completion of and .
Example 3.1**.**
Let and be the Morley triangle and reference triangle. Take to contain all the nodes of together with the tangential derivatives at the midpoint of each edge of and similarly for . In this case, . Then, both and contain complete gradients at each edge midpoint and function values at each vertex. The push-forward of has the same span as and so and form a compatible nodal completion of and . This is shown pictorially in Figure 6.
A similar completion – supplementing the nodes with tangential derivatives at edge midpoints – exists for the Argyris nodes and reference nodes [29].
Now, since the spans of and agree (even in ), there exists a matrix , typically block diagonal, such that
[TABLE]
Let be the Boolean matrix with iff so that
[TABLE]
and it is clear that
[TABLE]
That is, the reference nodes are linear combinations of the pushed-forward nodes and the extended nodes, but we must have the linear combination in terms of the pushed-forward nodes alone.
Recall that building the nodal basis only requires the action of the nodes on the polynomial space. Because , the set of nodes must be linearly dependent. So, we seek a matrix such that
[TABLE]
Since is an isomorphism, such a also gives
[TABLE]
Rows of the matrix such that for some will just have for . The remaining rows must be constructed somehow via an interpolation argument, although the details will vary by element.
This discussion suggests a three-stage process, each encoded by matrix multiplication, for converting the push-forwards of the physical nodes to the reference nodes, hence giving a factored form of in (14). Before working examples, we summarize this in the following theorem:
Theorem 3.3**.**
Let and be finite elements with affine mapping and suppose that . Let and be a compatible nodal completion of and . Then given matrices from (26), from (25) and from (28) that builds the (restrictions of) the extended nodes out of the given physical nodes, the nodal transformation matrix satisfies
[TABLE]
This gives a general outline for mapping finite elements, and we illustrate now by turning to the Morley element.
3.3.1 The Morley element
Following our earlier notation for the geometry and nodes, we order the nodes of a Morley triangle by
[TABLE]
Nodes will also include tangential derivatives at the edge midpoint. We put
[TABLE]
Again, this is a block vector the last three entries each consist of two values. We give the same ordering of reference element nodes and .
The matrix simply extracts the members of that are also in , so with , we have the block matrix
[TABLE]
Because the gradient nodes in use normal and tangential coordinates, will be slightly more more complicated than for the Hermite element. For local edge , we define the (orthogonal) matrix
[TABLE]
with the normal and tangent vector in the rows. Similarly, we let
[TABLE]
contain the unit normal and tangent to edge of the reference cell . It is clear that
[TABLE]
so, defining
[TABLE]
we have that
[TABLE]
Now, we turn to the matrix , writing members of in terms of alone. The challenge is to express the tangential derivative nodes in terms of the remaining six nodes – vertex values and normal derivatives. In fact, only the vertex values are needed. Along any edge, any member of is just a univariate quadratic polynomial, and so the tangential derivative is linear. Linear functions attain their average value over an interval at its midpoint. But the average value of the derivative over the edge is just the difference between vertex values divided by the edge length. The matrix must be
[TABLE]
We can also arrive at this formulation of in another way, that sets up the discussion used for Argyris and later Bell elements. Consider the following univariate result:
Proposition 3.2**.**
Let any quadratic polynomial on . Then
[TABLE]
Proof.
Write . Then so that . Also note that and . Wanting to write for constants and leads to a linear system, which is readily solved to give . ∎
Then, by a change of variables, this rule can be mapped to so that
[TABLE]
Finally, one can apply this rule on the edge of a triangle running from to to find that
[TABLE]
It is interesting to explicitly compute the product , as giving a single formula rather than product of matrices is more useful in practice. Multiplying through gives:
[TABLE]
From the definition of , it is possibly to explicitly calculate its entries in terms of the those of the Jacobian and the normal and tangent vectors for and . Only the first row of each is needed
[TABLE]
We can also recall that the normal and tangent vectors are related by and to express these entries purely in terms of either the normal or tangent vectors. Each entry of the Jacobian and normal and tangent vectors of and enter into the transformation.
In this form, has 12 nonzero entries, although the formation of those entries, which depend on normal and tangent vectors and the Jacobian, from the vertex coordinates requires an additional amount of arithmetic. The Jacobian will typically be computed anyway in a typical code, and the cost of working with will again be subdominant to the nested loops over basis functions and quadrature points required to form element matrices, much like Hermite.
3.3.2 The Argyris element
Because it is higher degree than Morley and contains second derivatives among the nodes, the Argyris transformation is more involved. However, it is a prime motivating example and also demonstrates that the general theory here reproduces the specific technique in [29]. The classical Argyris element has as polynomials of degree 5 over a triangle , a 21-dimensional space. The 21 associated nodes are selected as the point values, gradients, and all three unique second derivatives at the vertices together with the normal derivatives evaluated at edge midpoints. These nodal choices lead to a proper element, and continuity is obtained at vertices.
Since the Argyris elements do not form an affine-interpolation equivalent family, we will need to embed the physical nodes into a larger set. Much as with Morley elements, the edge normal derivatives will be augmented by the tangential derivatives.
With this notation, is a vector of 21 functionals and a vector of 24 functions written as
[TABLE]
with corresponding ordering of reference nodes and . The matrix just selects out the items in that are also in , so that
[TABLE]
The matrix relating the push-forward of the extended nodes to the extended reference nodes is block diagonal and similar to our earlier examples. We use (23) to map the vertex gradient nodes as in the Hermite case. Mapping the three unique second derivatives by the chain rule requires the matrix:
[TABLE]
The edge midpoint nodes transform by just as in (35), so that the is
[TABLE]
Constructing , like for Morley, is slightly more delicate. The additional nodes acting on quintic polynomials – tangential derivatives at edge midpoints – must be written in terms of the remaining nodes. The first aspect of this involves a univariate interpolation-theoretic question. On the biunit interval , we seek a rule of the form
[TABLE]
that is exact when is a quintic polynomial. The coefficients may be determined to by writing a linear system asserting correctness on the monomial basis. The answer, given in [29], is that
Proposition 3.3**.**
Any quintic polynomial defined on satisfies
[TABLE]
This can be mapped to the interval by a change of variables:
[TABLE]
Now, we can use this to compute the tangential derivative at an edge midpoint, expanding the tangential first and second derivatives in terms of the Cartesian derivatives. If and are the beginning and ending vertex of edge with midpoint and length , we write the tangential derivative acting on quintics as
[TABLE]
For each edge , define the vector by
[TABLE]
The end result is that
[TABLE]
If this transformation is kept in factored form, contains 57 nonzero entries and contains 54 nonzero entries. is just a Boolean matrix and its application requires copies. So, application of requires no more than floating-point operations, besides the cost of forming the entries themselves. While this is about ten times the cost of the Hermite transformation, it is for about twice the number of basis functions and still well-amortized over the cost of integration loops. Additionally, one can multiply out the product symbolically and find only 81 nonzero entries, which reduces the cost of multiplication accordingly.
3.4 Generalizations
3.4.1 Non-affine mappings
Non-affine geometric transformations, whether for simplicial or other element shapes, present no major complications to the theory. In this case, and are related by a non-affine map, and is taken to be the image of under pull-back
[TABLE]
although this space need not consist of polynomials for non-affine . At any rate, one may define Hermite elements on curvilinear cells [10, 13]. In this case, the Jacobian matrix varies spatially so that each instance of in (24) must be replaced by the particular value of at each vertex.
3.4.2 Generalized pullbacks
Many vector-valued finite element spaces make use of pull-backs other than composition with affine maps. For example, the Raviart-Thomas and Nédélec elements use contravariant and covariant Piola maps, respectively. Because these preserve either normal or tangential components, one can put the nodal basis functions of a given element and reference element into one-to-one correspondence by means of the Piola transform, a fact used heavily in [32] possible. It would be straightforward to give a generalization of affine equivalence to equivalence under an arbitrary pull-back , with push-forward defined in terms of . In this case, the major structure of § 3.1 would be unchanged.
However, not all elements form equivalent families under the contravariant Piola transform. For example, Mardal, Tai, and Winther [27] give an element that can be paired with discontinuous polynomials to give uniform inf-sup stability on a scale of spaces between and , although it is -nonconforming. The degrees of freedom include constant and linear moments of normal components on edges, which are preserved under Piola mapping. However, the nodes also include the constant moments of the tangential component on edges, which are not preserved under Piola transform. One could push-forward both the normal and tangential constant moments, then express them as a linear combination of the normal and tangential moments on the reference cell in a manner like (24). One could see the Mardal–Tai–Winther element as satisfying a kind of “Piola-interpolation equivalence” and readily adapt the techniques for Hermite elements,
3.5 A further note on computation
We have commented on the added cost of multiplying the set of basis functions by during local integration. It also also possible to apply the transformation in a different way that perhaps more fully leverages pre-existing computer routines. With this approach, can also be included in local matrix assembly by means means of a congruence transform acting on the “wrong” element matrix as follows.
Given a finite element with nodal basis and bilinear form over the domain , we want to compute the matrix
[TABLE]
Suppose that a computer routine existed for evaluating via a reference mapping for affine-equivalent elements. That is, given the mapping , this routine maps all integration to the reference domain assuming that the integrand over is just the affine pull-back of something on . Consider the following computation:
[TABLE]
Now, this is just expressed in terms of the affine pullback of reference-element integrands and so could use the hypothesized computer routine. We then have
[TABLE]
or, more compactly,
[TABLE]
where is the matrix one would obtain by using the pull-back of the reference element nodal basis functions instead of the actual nodal basis for . Hence, rather than applying invasively at each quadrature point, one may use existing code for local integration and pre- and post-multiply the resulting matrix by the basis transformation. In the case of Hermite, for example, applying to a vector costs 12 operations, so applying to all 10 columns of costs 120 operations, plus another 120 for the transpose. This adds 240 extra operations to the cost of building , or just 2.4 extra FLOPs per entry of the matrix.
One may also apply this idea in a “matrix-free” context. Given a routine for applying to a vector, one may simply apply to the input vector, apply to the result, and post-multiply by . Hence, one has the cost of muliplying by plus the cost of applying and its transpose to a single vector. In the case of Hermite, one has the cost of computing the “wrong” local matrix-vector product via an existing kernel plus 24 additional operations.
Finally, we comment on evaluating discrete functions over elements requiring such transforms. Discrete function evaluation is frequently required in matrix-free computation, nonlinear residual evaluation, and in bilinear form evaluation when a coefficient is expressed in a finite element space. Suppose one has on a local element a function expressed by
[TABLE]
where is the vector of coefficients and is the nodal basis for . In terms of pulled-back reference basis functions, is given by
[TABLE]
which can also be written as
[TABLE]
Just as one can build element matrices by means of the “wrong” basis functions and a patch-up operation, one can also evaluate functions by transforming the coefficients and then using the standard pullback of the reference basis functions. Such observations may make incorporating nonstandard element transformations into existing code more practical.
4 What if ?
The theory so far has been predicated on providing an isomorphism between the reference and physical function spaces. In certain cases, however, this fails. Our main motivation here is to transform the Bell element, a near-relative of the quintic Argyris element. In this case, one takes to be the subspace of that has cubic normal derivatives on edges rather than the typical quartic values. This reduction of by three dimensions is accompanied by removing the three edge normal derivatives at midpoints from . In general, however, the pull-back does not coincide with . Instead of cubic normal derivatives on edges, has reduced degree in some other direction corresponding to the image of the normal under affine mapping. The theory developed earlier can be extended somewhat to resolve this situation.
4.1 General theory: extending the finite element
Abstractly, one may view the Bell element or other spaces built by constraint as the intersection of the null spaces of a collection of functionals acting on some larger space as follows. Let be a finite element. Suppose that and that are linearly independent functionals that when acting on satisfy
[TABLE]
The following result is not difficult to prove:
Proposition 4.1**.**
Let be a finite element with as per (53). Similarly, let Let be a reference element with . Suppose that Then iff
[TABLE]
In the case of the Bell element, the span condition (54) fails and so that the function space is not preserved under affine mapping. Consequently, the theory of the previous section predicated on this preservation does not directly apply. Instead, we proceed by making the following observation.
Proposition 4.2**.**
Let be a finite element with satisfying for linearly independent functionals . Define
[TABLE]
to include the nodes of together with . Then is a finite element.
Proof.
Since we have a finite-dimensional function space, it remains to show that is linearly independent and hence spans . Consider a linear combination in
[TABLE]
Apply this linear combination to any to find
[TABLE]
since for . Because is a finite element, the are linearly independent in so for . Applying the same linear combination to any then gives that since the constraint functionals are also linearly independent. ∎
Given a nodal basis , it is easy to obtain one for .
Proposition 4.3**.**
Let , , and be as in Proposition 4.2. Order the nodes in by with for . Let be the nodal basis for . Then is the nodal basis for .
Proof.
Clearly, for by the ordering of the nodes in . Moreover, because for each . ∎
4.2 The Bell element
So, we can obtain a nodal basis for the Bell element or others with similarly constrained function spaces by mapping the nodal basis for a slightly larger finite element and extracting a subset of the basis functions. Let and be the Bell elements over and reference cell .
Recall that the Legendre polynomial of degree is orthogonal to polynomials of degree or less. Let be the Legendre polynomial of degree mapped from the biunit interval to edge of . Define a functional
[TABLE]
For any , its normal derivative on edge is cubic iff . So, the constraint functionals are given in and as in Proposition 4.2. We define
[TABLE]
and hence as well as and in a similar way.
and are the constrained spaces – quintic polynomials with cubic normal derivatives on edges, while and are the spaces of full quintic polynomials over and , respectively. We must construct a nodal basis for , map it to a nodal basis for by the techniques in Section 3, and then take the subset of basis functions corresponding to the Bell basis.
This is accomplished by specifying a compatible nodal extension of and by including the edge moments of tangential derivatives against with those of and . We define
[TABLE]
We must specify the , , and matrices for this extended set of finite element nodes. We focus first on , needing to compute each in terms of the remaining functionals. As with Morley and Argyris, we begin with univariate results.
The following is readily confirmed, for example, by noting the right-hand side is a quintic polynomial and computing values and first and second derivatives at :
Proposition 4.4**.**
Let be any quintic polynomial on . Then
[TABLE]
The formula (58) can be differentiated and then integrated against to show that
[TABLE]
Then, this can be mapped to a general interval by a simple change of variables:
[TABLE]
Now, we can use this to express the functionals from (57) as linear combinations of the Bell nodes:
Proposition 4.5**.**
Let be a triangle and and are the beginning and ending vertex of edge with length . Let be any bivariate quintic polynomial over and defined in (57). Then the restriction of to bivariate quintic polynomials satisfies
[TABLE]
and hence
[TABLE]
Now, is quite similar to that for the Argyris element. There is a slight difference in the handling the edge nodes, for we have an integral moment instead of a point value and must account for the edge length accordingly. By converting between normal/tangent and Cartesian coordinates via the matrix and mapping to the reference element, we find that for any ,
[TABLE]
This calculation shows that for the Bell element is identical to (43) for Argyris, except with a geometric scaling of the matrices.
The extraction matrix for the extended Bell elements consisting of full quintics now is identical to that for Argyris. Then, when evaluating basis functions, one multiplies the affinely mapped set of basis values by and then takes only the first 18 entries to obtain the local Bell basis.
4.3 A remark on the Brezzi-Douglas-Fortin-Marini element
In [21], we describe a two-part process for computing the triangular Brezzi-Douglas-Fortin-Marini (BDFM) element [16], an conforming finite element based on polynomials of degree with normal components constrained to have degree . This is a reduction of the Brezzi-Douglas-Marini element [9] somewhat as Bell is of Argyris. However, as both elements form Piola-equivalent families, the transformation techniques developed here are not needed.
Like the Bell element, one can define constraint functionals (integral moments of normal components against the degree Legendre polynomial) for BDFM. In [21], we formed a basis for the intersection of the null spaces of these functionals by means of a singular value decomposition. A nodal basis for the BDFM space then followed by building and inverting a generalized Vandermonde matrix on the basis for this constrained space.
In light of Propositions 4.2 and 4.3, however, this process was rather inefficient. Instead, we could have merely extended the BDFM nodes by the constraint functionals, building and inverting a single Vandermonde-like matrix. If one takes the BDM edge degrees of freedom as moments of normal components against Legendre polynomials up to degree instead of pointwise normal values, then one can even build a basis for BDM that includes a a basis for BDFM as a proper subset.
5 Numerical results
Incorporation of these techniques into high-level software tools such as Firedrake is the subject of ongoing investigation. In the meantime, we provide some basic examples written in Python, with sparse matrix assemble and solvers using petsc4py [12].
5.1 Scaling degrees of freedom
Before considering the accuracy of the projection, achieved via the global mass matrix, we comment on the conditioning of the mass and other matrices when both derivative and point value degrees of freedom appear. The Hermite element is illustrative of the situation.
On a cell of typical diameter , consider a basis function corresponding to the point value at a given vertex. Since the vertex basis function has a size of on a triangle of size , its norm should be . Now, consider a basis function corresponding to a vertex derivative. Its derivative is now on the cell, so that the seminorm is . Inverse inequalities suggest that the norm could then be as large as . That is, the different kinds of nodes introduce multiple scales of basis function sizes under transformation, which manifests in ill-conditioning. Where one expects a mass matrix to have an condition number, one now obtains an condition number. This is observed even on a unit square mesh, in Figure 7. All condition numbers are computed by converting the PETSc mass matrix to a dense matrix and using LAPACK via scipy [20]
However, there is a simple solution. For the Hermite element, one can scale the derivative degrees of freedom locally by an “effective ”. All cells sharing a given vertex must agree on that , which could be the average cell diameter among cells sharing a vertex. Scaling the nodes/basis functions (which amounts to multiplying on the right by a diagonal matrix with 1’s or ’s) removes the scale separation among basis functions and leads again to an condition number for mass matrices, also seen in Figure 7. From here, we will assume that all degrees of freedom are appropriately scaled to give conditioning for the mass matrix.
5.2 Accuracy of projection
Now, we demonstrate that optimal-order accuracy is obtained by performing projection of smooth functions into the Lagrange, Hermite, Morley, Argyris, and Bell finite element spaces. In each case we use an mesh divided into right triangles. Defining on , we seek such that
[TABLE]
for each , where is one of the the finite element spaces. Predicted asymptotic convergence rates – third for Morley, fourth for Hermite and Lagrange, fifth for Bell, and sixth for Argyris, are observed in Figure 8.
Note that the Hermite and Lagrange elements have the same order of approximation, but the Lagrange element delivers a slightly lower error. This is to be expected, as the space spanned by cubic Hermite triangles is a proper subset of that spanned by Lagrange.
5.3 The Laplace operator
As a simple second-order elliptic operator, we consider the Dirichlet problem for the Laplace operator on the unit square :
[TABLE]
equipped with homogeneous Dirichlet boundary conditions on .
We divide into an mesh of triangles and let be one of the Lagrange, Hermite, Argyris, or Bell finite element spaces, all of which are -conforming, over this mesh. The Morley element is not a suitable nonconforming element, so we do not use it here. We then seek such that
[TABLE]
for all .
Enforcing strong boundary conditions on elements with derivative degrees of freedom is delicate in general. However, with grid-aligned boundaries, it is less difficult. To force a function to be zero on a given boundary segment, we simply require the vertex values and all derivatives tangent to the edge vanish. This amounts to setting the -derivatives on the top and bottom edges of the box and -derivative on the left and right for Hermite, Argyris, and Bell elements. Dirichlet conditions for Lagrange are enforced in the standard way.
By the method of manufactured solutions, we select so that . In Figure 9, we show the error in the computed solution for both element families. As the mesh is refined, both curves approach the expected order of convergence – fourth for Hermite and Lagrange, fifth for Bell, and sixth for Argyris. Again, the error for Lagrange is slightly smaller than for Hermite, albeit with more global degrees of freedom.
5.4 The clamped plate problem
We now turn to a fourth-order problem for which the Argyris and Bell elements provide conforming discretizations and Morley a suitable nonconforming one. Following [8], we take the bilinear form defined on to be
[TABLE]
where yields a coercive bilinear form for any closed subspace of that does not contain nontrivial linear polynomials. We fix .
Then, we consider the variational problem
[TABLE]
posed over suitable subspaces of . It is known [8] that solutions of (68) that lie in satisfy the biharmonic equation in an sense.
We consider the clamped plate problem, in which both the function value and outward normal derivative are set to vanish, which removes nontrivial linear polynomials from the space. Again, we use the method of manufactured solutions on the unit square to select such that , which satifies clamped boundary conditions. We solve this problem with Argyris and Bell elements, and then also use the nonconforming Morley element in the bilinear form. Again, expected orders of convergence are observed in Figure 10.
6 Conclusions
Many users have wondered why FEniCS, Firedrake, and most other high-level finite element tools lack the full array of triangular elements, including Argyris and Hermite. One answer is that fundamental mathematical aspects of mapping such elements have remained relatively poorly understood. This work demonstrates the challenges involved with mapping such elements from a reference cell, but also proposes a general paradigm for overcoming those challenges by embedding the nodes into a larger set that transforms more cleanly and using interpolation techniques to relate the additional nodes back to original ones. In the future, we hope to incorporate these techniques in FInAT (https://github.com/FInAT/FInAT), a successor project to FIAT that produces abstract syntax for finite element evaluation rather than flat tables of numerical values. TSFC [18] already relies on FInAT to enable sum-factorization of tensor-product bases. If FInAT can provide rules for evaluating the matrix in terms of local geometry on a per-finite element basis, then TSFC and other form compilers should be able to seamlessly (from the end-users’ perspective) generate code for many new kinds of finite elements.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Mark Ainsworth, Gaelle Andriamaro, and Oleg Davydov. Bernstein-Bézier finite elements of arbitrary order and optimal assembly procedures. SIAM Journal on Scientific Computing , 33(6):3087–3109, 2011.
- 2[2] Martin S. Alnæs, Anders Logg, Kristian B. Ølgaard, Marie E. Rognes, and Garth N. Wells. Unified form language: A domain-specific language for weak formulations of partial differential equations. ACM Transactions on Mathematical Software , 40(2):9, 2014.
- 3[3] J. H. Argyris, I. Fried, and D. W. Scharpf. The TUBA family of plate elements for the matrix displacement method. Aeronautical Journal , 72:701–709, 1968.
- 4[4] Wolfgang Bangerth, Rolf Hartmann, and Guido Kanschat. deal.II — a general purpose object oriented finite element library. ACM Trans. Math. Softw. , 33(4), 2007.
- 5[5] Kolbein Bell. A refined triangular plate bending finite element. International Journal for Numerical Methods in Engineering , 1(1):101–122, 1969.
- 6[6] Pavel B. Bochev, H. Carter Edwards, Robert C. Kirby, Kara Peterson, and Denis Ridzal. Solving PD Es with Intrepid. Scientific Programming , 20(2):151–180, 2012.
- 7[7] James H. Bramble and S. R. Hilbert. Bounds for a class of linear functionals with applications to Hermite interpolation. Numerische Mathematik , 16(4):362–369, 1971.
- 8[8] Susanne C. Brenner and L. Ridgway Scott. The mathematical theory of finite element methods , volume 15 of Texts in Applied Mathematics . Springer, New York, third edition, 2008.
