D\"orfler marking with minimal cardinality is a linear complexity problem
Carl-Martin Pfeiler, Dirk Praetorius

TL;DR
This paper presents an algorithm for Dörfler marking in adaptive finite element methods that constructs a minimal element set with linear computational complexity, improving efficiency in mesh refinement strategies.
Contribution
The paper introduces and analyzes a novel algorithm that achieves minimal Dörfler marking with linear complexity, addressing a key challenge in adaptive finite element methods.
Findings
Algorithm constructs minimal marking sets efficiently
Achieves linear computational complexity
Provides pseudocode for implementation
Abstract
Most adaptive finite element strategies employ the D\"orfler marking strategy to single out certain elements of a triangulation for refinement. In the literature, different algorithms have been proposed to construct , where usually two goals compete: On the one hand, should contain a minimal number of elements. On the other hand, one aims for linear costs with respect to the cardinality of . Unlike expected in the literature, we formulate and analyze an algorithm, which constructs a minimal set at linear costs. Throughout, pseudocodes are given.
| sort | xStar | sort | xStar | sort | xStar | sort | xStar | sort | xStar | |
|---|---|---|---|---|---|---|---|---|---|---|
| sort | xStar | sort | xStar | sort | xStar | sort | xStar | sort | xStar | |
|---|---|---|---|---|---|---|---|---|---|---|
| sort | xStar | sort | xStar | sort | xStar | sort | xStar | sort | xStar | |
|---|---|---|---|---|---|---|---|---|---|---|
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Dörfler marking with minimal cardinality
is a linear complexity problem
Carl-Martin Pfeiler and Dirk Praetorius
TU Wien, Institute for Analysis and Scientific Computing, Wiedner Hauptstr. 8-10/E101/4, 1040 Vienna, Austria
[email protected] (corresponding author)
Abstract.
Most adaptive finite element strategies employ the Dörfler marking strategy to single out certain elements of a triangulation for refinement. In the literature, different algorithms have been proposed to construct , where usually two goals compete: On the one hand, should contain a minimal number of elements. On the other hand, one aims for linear costs with respect to the cardinality of . Unlike expected in the literature, we formulate and analyze an algorithm, which constructs a minimal set at linear costs. Throughout, pseudocodes are given.
Key words and phrases:
Dörfler marking criterion, adaptive finite element method, optimal complexity.
2010 Mathematics Subject Classification:
65N50, 65N30, 68Q25.
Acknowledgement. The authors thankfully acknowledge support by the Austrian Science Fund (FWF) through the doctoral school Dissipation and dispersion in nonlinear PDEs (grant W1245) and through the research project Optimal adaptivity for BEM and FEM-BEM coupling (grant P27005).
1. Introduction
In the last decade, the mathematical understanding of adaptive finite element methods (AFEM) has matured. For many elliptic model problems, one can mathematically prove that AFEM leads to optimal convergence behavior; see, e.g., [Dör96, MNS00, BDD04, Ste07, CKNS08] for some of the seminal works for symmetric problems, [MN05, CN12, FFP14] for the extension to nonsymmetric problems, or to [CFPP14] for some recent review on the state of the art.
Starting from an initial mesh , the usual AFEM algorithms iterate the loop
[TABLE]
The latter generates a sequence of successively refined meshes together with the associated FEM solutions and a posteriori error estimators , where the index is the step counter of the adaptive loop. Formally, the algorithm reads as follows: For all , iterate the following steps:
Compute the FEM solution corresponding to .
Compute certain refinement indicators for all .
Determine a subset of elements for refinement.
Generate a new mesh by refinement of (at least) all marked elements.
Usually, the set from then contains the elements with the largest contributions . Often (and, in particular, for the analysis of rate optimality [CFPP14]), the Dörfler marking criterion [Dör96] is used: Given , construct such that
[TABLE]
i.e., the marked elements control a fixed percentage of the overall error estimator. Clearly, one aims to choose the set with as few elements as possible.
As far as convergence of AFEM is concerned, also other marking criteria can be considered [MSV08, Sie11]. Current proofs of rate optimality of AFEM, however, rely on the (quasi-) minimal Dörfler marking (2), where the set has to be chosen with minimal cardinality (at least up to some -independent generic constant); see [CFPP14]. Moreover, when the focus comes to the overall computational cost of AFEM, it is important that all steps of the adaptive algorithm can be performed at linear cost with respect to the number of elements . This is usually a reasonable assumption if employs iterative solvers like PCG [FHPS19] or multigrid [Ste07], and it requires appropriate data structures for and .
If aims for a set , which satisfies (3) with minimal cardinality, then linear cost is less obvious: The work [Dör96] notes that a possible strategy is to sort the indicators, which, however, results in log-linear costs. Instead, the work [Ste07] employs an approximate sorting by binning. While this leads to linear costs, the resulting set has only minimal cardinality up to a multiplicative factor , and [Ste07, Section 5] notes:
Selecting that satisfies (2) with true minimal cardinality would require sorting all by the values of , which takes operations.
The present work bridges the approaches of [Dör96, Ste07] and proves that the latter statement is wrong: Based on ideas of the (Quick-) Selection algorithm [Hoa61], we present a linear-cost algorithm for , which provides a set , which satisfies the Dörfler criterion (2) with minimal cardinality.
The outline of the present work reads as follows: In Section 2, we formulate the Dörfler marking and briefly discuss the algorithms from [Dör96, Ste07]. In Section 3, we present and analyze our new approach for named QuickMark. Section 3.4 concludes with a C++11 STL-based implementation of the new algorithm.
2. Dörfler marking
2.1. Setting
Let and . Given a vector x\in\mathbb{R}^{N}_{\star}:=\big{\{}x\in\mathbb{R}^{N}\backslash\{0\}\colon x_{j}\geq 0\text{ for all }j\in\mathcal{I}\big{\}}, an index set satisfies the Dörfler criterion, if
[TABLE]
By , we denote the number of elements in . Let N_{\rm min}:=\min\big{\{}\#\mathcal{M}\colon\mathcal{M}\subseteq\mathcal{I}\text{ satisfies }\eqref{eq:doerfler}\big{\}} denote the minimal number of indices which are required to satisfy the Dörfler criterion (3). We note that the minimizing set is not unique in general, e.g., if for all and .
**Remark 1. ***For , the set of minimal cardinality satisfying (3) is unique and given by . Clearly, this set can be determined at linear costs. *
We say that an algorithm realizes the minimal Dörfler marking, if, for all , for all , and for all , the algorithm constructs a set , which satisfies (3) with . We say that an algorithm realizes the quasi-minimal Dörfler marking, if, for all , there exists a constant such that, for all and for all , the algorithm constructs a set , which satisfies (3) with .
For current proofs of rate optimality of AFEM, the marking algorithm has to realize the quasi-minimal Dörfler marking [CFPP14], while available results on optimal computational costs require also that the marking step has linear costs [Ste07, GHPS18, FHPS19].
2.2. Minimal Dörfler marking based on sorting
It is already noted in [Dör96] that a set , which satisfies (3) as well as , can easily be constructed by sorting.
*Algorithm 2. *** For the setting from Section 2.1, perform the following steps (i)–(iii):
- (i)
Determine a permutation such that .
- (ii)
Compute .
- (iii)
Determine the minimal index such that .
Output:* *
In practice, step (i) of Algorithm 2.2 will be performed by sorting the vector . This leads to operations for, e.g., the Introsort algorithm [Mus97].
**Proposition 3. ***The set generated by Algorithm 2.2 satisfies (3) as well as , i.e., Algorithm 2.2 realizes the minimal Dörfler marking. Up to step (i), the computational cost of Algorithm 2.2 is linear. *
Proof.
Let satisfy (3) with . By construction of , it holds that
[TABLE]
Hence, we see that . This implies that . It is obvious that step (ii)–(iii) of Algorithm 2.2 have linear cost . ∎
2.3. Dörfler marking without sorting
To avoid sorting, the work [Dör96] proposes (a variant of) the following algorithm; see [Dör96, Section 5.2].
*Algorithm 4. *** For the setting from Section 2.1 and given , perform the following steps (i)–(vi):
- (i)
Initialize , and for all .
- (ii)
Compute and .
- (iii)
For , iterate the following steps:
- (iv)
For all with i\not\in\big{\{}\pi(j)\colon j=1,\dots,n\big{\}}, iterate the following steps:
- (v)
if , then define and update
- (vi)
if , then terminate.
Output:* *
**Remark 5. ***The algorithm proposed in [Dör96, Section 5.2] has the stopping criterion (vi) as part of step (iii), i.e., steps (iv)–(v) are iterated, until . If is constant, i.e., for all , then this variant leads to for all and hence does not realize quasi-minimal Dörfler marking. Our formulation of Algorithm 2.3 excludes such a simple counterexample. *
**Proposition 6. *** Algorithm 2.3 terminates after finitely many steps. The computational cost of Algorithm 2.3 is . The set generated by Algorithm 2.3 realizes (3), but it is not quasi-minimal in general. *
Proof.
Steps (i)–(ii) have linear costs . Obviously, if in step (vi) the sum is rather updated than recomputed, step (iii)–(vi) lead to total costs for Algorithm 2.3. To see that satisfies (3), note that (at latest) for , it holds that and hence is satisfied for all . It only remains to show that Algorithm 2.3 does not realize the quasi-minimal Dörfler marking.
Let and be arbitrary. We aim to show that for any constant , there exist and such that the set generated by Algorithm 2.3 satisfies . Without loss of generality, we may assume . The idea now is the following:
- •
For some and , define the vector of the form
[TABLE]
- •
Then, choose and such that satisfies (3), but neither nor do.
- •
If moreover and are chosen such that the condition in Step (v) of Algorithm 2.3 is not satisfied for any of the indices and any of the loop iterations of Step (iii), then for the last loop iteration , starting from the index , all indices will be added to until (3) is satisfied.
- •
Now, if is chosen small enough, then the set returned by Algorithm 2.3 will be a superset of , i.e., .
- •
Since satisfies (3), it holds that and hence .
It remains to define , and such that the desired properties hold. Define
[TABLE]
Note that implies that . First, note that
[TABLE]
Hence, satisfies (3) and therefore . Next, we claim that Algorithm 2.3 will construct a set , which thus contains more than indices: Observe that
[TABLE]
This proves that . Together with , this implies that the condition in Step (v) of Algorithm 2.3 will not be satisfied for any before the last iteration of the loop in Step (iii) of Algorithm 2.3 (i.e., before ). Thus, for , we have , , and for all . Note, that
[TABLE]
Consequently, after the last iteration of the -loop it holds that for all and . Hence, the set returned by Algorithm 2.3 satisfies . This concludes the proof. ∎
2.4. Quasi-minimal Dörfler marking with linear complexity by binning
The following strategy has been proposed in the seminal work [Ste07], which gave the first optimality proof for a standard AFEM loop of type (1) for the 2D Poisson problem. The main observation is the following: If the reduction of the threshold in step (v) of Algorithm 2.3 is done by multiplication instead of subtraction, then the resulting algorithm satisfies the quasi-minimal Dörfler marking. While [Ste07, Section 5] outlines the proposed strategy for the choice , we work out all details in our proof of Proposition 2.4.
*Algorithm 7. *** For the setting from Section 2.1 and given , perform the following steps (i)–(v):
- (i)
Compute and .
- (ii)
Determine the minimal with .
- (iii)
For , fill bins \mathcal{B}_{k}:=\big{\{}j\in\mathcal{I}\colon\nu^{k+1}<x_{j}/M\leq\nu^{k}\big{\}} and define .
- (iv)
This yields a permutation such that
* for all and with .*
- (v)
Determine the minimal index such that .
Output:* *
**Proposition 8. *** For arbitrary , Algorithm 2.4 terminates after finitely many steps. The constructed set satisfies (3) with . Moreover, a proper implementation of Algorithm 2.4 leads to a total computational cost of \mathcal{O}\big{(}N+K\big{)} with K=\mathcal{O}\big{(}\log_{1/\nu}(N/(1-\theta))\big{)}. *
Proof.
The only non-obvious statement is the bound : For , it holds that and hence
[TABLE]
Since and for all , it follows that satisfies (3). Let be the largest index such that . If no such index exists, i.e., , define . Clearly, it holds that and . Further, there exists such that satisfies (3) with minimal cardinality .
To show , it suffices to show that satisfies . Consider . Then, , , and with , it holds that
[TABLE]
It immediately follows, that . Altogether, satisfies (3) with . ∎
3. Minimal Dörfler marking with linear complexity
This section constitutes the main contribution of this work. **Theorem 9. *** Dörfler marking with minimal cardinality is a linear complexity problem. More precisely, a call of Algorithm 3.1 below with a vector leads after operations to a set with (3) and . * We prove this main theorem explicitly by introducing the QuickMark algorithm in Section 3.1. The correctness of the QuickMark algorithm is proved in Section 3.2 and the linear complexity of QuickMark is shown in Section 3.3. Section 3.4 concludes with some remarks on the implementation of the algorithm.
3.1. The QuickMark algorithm
Adapting the divide-and-conquer strategy of efficient selection algorithms [Hoa61], we propose a new strategy to determine, at linear costs, a subset with (3) and . The proposed algorithm consists of an initial call (Algorithm 3.1) and the function QuickMark (Algorithm 3.1), which steers the divide-and-conquer strategy based on the subroutines Pivot (Algorithm 3.1) and Partition (Algorithm 3.1).
To improve readability throughout this chapter, whenever a permutation on would be altered by a function, that function instead is written to take the permutation as input and returns as output the new permutation . If a permutation is not changed by a function, it is simply denoted by . Moreover, let represent the identity permutation on , i.e., for all . For an index set define .
*Algorithm 10 (Initial call of QuickMark). *** For the setting from Section 2.1, we perform the following steps (i)–(iv):
- (i)
Initialize the identity permutation .
- (ii)
Define lower index and upper index .
- (iii)
Compute the goal value .
- (iv)
Call QuickMark
Output:* *
Analogously to selection algorithms [Hoa61], the QuickMark algorithm is based on the subroutine Partition, where elements are essentially separated into two classes: Those elements with smaller value than the pivot element, and those with greater value than the pivot element. Then, the algorithm decides, which of the two classes is not to be inspected further.
*Algorithm 11 ( QuickMark). *** Input: Vector , permutation on , goal value , lower and upper indices .
- (i)
Determine a pivot index Pivot.
- (ii)
Determine a new permutation via Partition.
- (iii)
Compute the sum of the greatest elements .
- (iv)
If , then return QuickMark
- (v)
Else, if , then return
- (vi)
Else return QuickMark
Output:* Permutation of and index . *
The Pivot subroutine should determine a feasible pivot element of a given (sub-) array. While the concrete choice of the pivot strategy is irrelevant for the correctness of the procedure, it is the decisive factor for the computational complexity of the divide-and-conquer strategy. For now, we consider an arbitrarily (e.g., randomly) chosen . While in Section 3.2 correctness of the algorithm is proved independently of the concrete pivot strategy, in Section 3.3 we propose a pivot strategy that leads — even in the worst case — to linear complexity of Algorithm 3.1.
*Algorithm 12 ( Pivot). *** Input: Vector , permutation on , lower and upper indices .
- (i)
Use to determine a pivot index .
Output:* Pivot index . *
For a given pivot element, the Partition subroutine reorganizes the elements of an (sub-) array depending on whether they are greater than, smaller than, or equal to the pivot.
*Algorithm 13 ( = Partition). *** Input: Vector , permutation on , lower and upper indices , pivot index .
- (i)
Compute a permutation on together with the unique indices and such that the following three implications hold true for all :
If , then .
If , then .
If , then .
- (ii)
Define
Output:* Permutation of together with indices and . *
The following remark collects some important observations (4)–(5) about the state of and in Algorithm 3.1. The validity of (4) will be shown in Proposition 3.2.1 in Section 3.2. The properties (5) follow directly from Algorithm 3.1. **Remark 14. **When Partition (Algorithm 3.1) is called in step (ii) of QuickMark (Algorithm 3.1), the permutation and the indices satisfy
[TABLE]
This is illustrated in Figure 1.
The permutation defined in step (ii) of Algorithm 3.1 differs from only at the indices . Consequently, (4a)–(4b) are preserved by . With the indices returned by Algorithm 3.1 and the pivot index, it additionally holds that
[TABLE]
*This is illustrated in Figure 2. *
3.2. Correctness of the QuickMark algorithm
We consider , permutations on , indices with , and a value . Proving the correctness of QuickMark (Algorithm 3.1) is organized into three steps: In Section 3.2.1 we verify some essential properties satisfied by the input parameters of calls to Algorithm 3.1. Section 3.2.2 introduces auxiliary subproblems generated and solved by Algorithm 3.1 and gives insight on the idea behind the QuickMark strategy. Termination of Algorithm 3.1 is investigated in Section 3.2.3, where the correctness is proved.
3.2.1. Admissible calls to QuickMark
We consider the following crucial properties, which will be shown to be always satisfied in Proposition 3.2.1. *Definition 15. *** A call QuickMark to Algorithm 3.1 is called admissible, if the inputs satisfy the following conditions (a)–(b):
- (a)
It holds that
[TABLE]
- (b)
It holds that
[TABLE]
In fact, the following proposition shows that recursive calls of QuickMark preserve the admissibility conditions. **Proposition 16. *** If QuickMark is initially called by Algorithm 3.1(iv), then each subsequent recursive call QuickMark from step (iv) or (vi) of Algorithm 3.1 is admissible. *
Proof.
The statement follows directly by induction. First, we show that the initial call QuickMark of Algorithm 3.1 initiated by Algorithm 3.1(iv) with the inputs , , , , and is admissible: Since and , Definition 3.2.1(a) contains only statements about indices in the empty set and is therefore satisfied. Definition 3.2.1(b) follows from , , and the definition of .
For the induction step, consider an admissible call QuickMark of Algorithm 3.1. We show that a potential subsequent call QuickMark initiated by either Algorithm 3.1(iv) (i.e., , , ), or by Algorithm 3.1(vi) (i.e., , , ), is also admissible: By (a), (b) we refer to the assumption, i.e., the admissibility conditions of Definition 3.2.1 satisfied by QuickMark. We aim to show the corresponding admissibility conditions of Definition 3.2.1 for the call QuickMark, which will be denoted by , .
Recall, that in either case (step (iv) or step (vi) in Algorithm 3.1), differs from only on the index set . Therefore, in both cases follows from (5a)–(5c) and (a). If recursion relies on Algorithm 3.1(iv), then , , and . Hence,
[TABLE]
proves . If recursion relies on Algorithm 3.1(vi), then , , and
[TABLE]
Combining (b) and the last estimate yields for that
[TABLE]
This shows . ∎
3.2.2. Subproblems generated by QuickMark
To analyze Algorithm 3.1, we introduce some auxiliary notation. In particular, the symbol will be used differently than in Section 2.1. The connection between the two notations is clarified in Remark 3.2.2.
By , we denote the power set of . For any admissible call QuickMark to Algorithm 3.1, let consist of all such that
[TABLE]
The following remark follows immediately from (9a)–(9b) and connects the introduced notation to the Dörfler marking criterion (3) from Section 2.1. **Remark 17. *** For arbitrary , the set satisfies (3) with minimal cardinality . *
Later in Section 3.2.3, we will prove that QuickMark called by Algorithm 3.1 determines a set . The core idea behind the proof is the observation that for an admissible call QuickMark, the set can be written as
[TABLE]
Hence, an admissible call QuickMark to Algorithm 3.1 either determines a set and terminates in step (v), or it initiates another admissible recursive call denoted by QuickMark in step (iv) or step (vi), where , i.e., the problem is reduced to a strict subproblem.
First, we will show, that all occurring subproblems of finding are well-posed. In fact, for an admissible call QuickMark the set is always nonempty and all attain the same minimum in . *Lemma 18. *** Let QuickMark be an admissible call to Algorithm 3.1. Then, . Moreover, the definition
[TABLE]
*is independent of the concrete choice of . *
Proof.
To show that , we explicitly construct some : Starting with , for define
[TABLE]
i.e., is generated by extracting the index with the smallest value in from . By construction, (9a) holds for all , . Further, the values are monotonically decreasing in . Since , the admissibility (7) of implies that
[TABLE]
Consequently, there exists a unique such that
[TABLE]
By construction, for all (and in particular for ) it holds that
[TABLE]
Hence, combining the last two estimates shows that also satisfies (9b) and thus . This proves .
To show that the definition (10) is independent of , we claim that
[TABLE]
To prove this claim, we argue by contradiction and assume and, without loss of generality, . Hence, we have and
[TABLE]
If there exists , then (9a) gives that . This contradicts the last estimate and hence proves that . Therefore, we deduce that . Using the second inequality in (9b) for and then using the first inequality in (9b) for , we see that
[TABLE]
This contradiction implies that and concludes the proof. ∎
3.2.3. Termination of QuickMark
For any admissible call QuickMark of Algorithm 3.1, exactly one of three cases — recursion by step (iv), termination by step (v), or recursion by step (vi) — applies. The next lemma connects the termination in step (v) directly to the pivot index chosen in step (i).
**Lemma 19. *** Let QuickMark be an admissible call to Algorithm 3.1. Then, Algorithm 3.1 terminates with step (v), if and only if the pivot index from step (i) satisfies . *
Proof.
After step (ii) of Algorithm 3.1, it holds that and hence .
First, suppose that Algorithm 3.1 terminates with step (v), i.e.,
[TABLE]
[TABLE]
By definition (10) and (5a)–(5b), it follows that .
Conversely, suppose that and let be arbitrary. Then, (5a)–(5c) and imply that
[TABLE]
Therefore, (9b) leads to
[TABLE]
Consequently, Algorithm 3.1 terminates in step (v). ∎
Whenever an admissible call of Algorithm 3.1 terminates in step (v), a solution to the corresponding auxiliary subproblem is provided.
**Lemma 20. *** Let QuickMark be an admissible call to Algorithm 3.1. If QuickMark terminates in step (v), then the output guarantees that . *
Proof.
With from steps (i)–(iii), the termination in Algorithm 3.1(v) implies that
[TABLE]
Obviously, . Together with (5), this shows that returned in Algorithm 3.1(v) satisfies that . Again, (5) implies that satisfies (9a). It remains to show (9b): The definition of and the choice of show that for all it holds
[TABLE]
Similarly, we see that
[TABLE]
Consequently, satisfies (9b) and we conclude that . ∎
Algorithm 3.1 always terminates and provides a set of minimal cardinality satisfying the Dörfler marking criterion.
**Theorem 21. ***If initially called by Algorithm 3.1, then QuickMark terminates after finitely many operations and the output guarantees that satisfies the Dörfler criterion (3) with minimal cardinality. *
Proof.
At latest the -st recursive call of QuickMark terminates in step (v) of Algorithm 3.1: Proposition 3.2.1 shows that all (subsequent) calls of QuickMark are admissible. For any recursive call QuickMark initiated by step (iv) or step (vi) of QuickMark, it holds that . Therefore, if none of the first recursive calls of QuickMark terminates in step (v) of Algorithm 3.1, for the -st recursive call denoted by QuickMark it holds that . Consequently, for this call the pivot index is chosen as in step (i) of Algorithm 3.1. Using Lemma 3.2.2, the admissibility of QuickMark implies that . We infer that and thus
[TABLE]
Hence, Lemma 3.2.3 implies termination of QuickMark in Algorithm 3.1(v).
It remains to show that satisfies (3) with minimal cardinality. In view of Remark 3.2.2, we will show that : Suppose that are obtained by Algorithm 3.1(iv). Denote the last recursive call of Algorithm 3.1 by QuickMark. By Proposition 3.2.1, this call is admissible and differs from only for the indices .
By Lemma 3.2.3, it holds that . Thus, the partial ordering (6a)–(6b) shows that
[TABLE]
By Definition 3.2.1(b), it holds that
[TABLE]
Since , condition (9b) reads
[TABLE]
Using the partial ordering (6a) and adding to the last estimate, we get
[TABLE]
Consequently, (11)–(12) show that . ∎
3.3. Computational complexity of the QuickMark algorithm
Exploiting the fact that selection problems can always be solved in linear time [BPT*+*73], we show that the pivoting strategy in Algorithm 3.1 can be chosen such that, for any and any , Algorithm 3.1 always terminates after operations. Consider choosing the median of as pivot element. *Algorithm 22 ( Median). *** Input: Vector , permutation on , lower and upper index .
- (i)
Determine an index such that
[TABLE]
Output:* Median index . *
According to [BPT*+*73], Algorithm 3.3 can be implemented such that it always terminates in linear time . This leads to the following theorem. **Theorem 23. *** If Pivot is replaced by Median in Algorithm 3.1(i), then, for any and any , Algorithm 3.1 terminates after operations. In particular, the multiplicative constant hidden in the Landau notation is generic and independent of and . *
Proof.
Obviously, steps (i)–(iii) of Algorithm 3.1 can be realized using operations. Moreover, the permutation can be represented by additionally storing an array containing indices. It remains to show that the call to QuickMark in step (iv) terminates at linear costs .
Consider a (possibly recursive) call of QuickMark. The median (-index) of with respect to the indices of Algorithm 3.1(i) can be determined at linear cost ; see [BPT*+*73, Theorem 1]. The partition in Algorithm 3.1(ii) can be determined at linear cost . In particular, this can easily be implemented by temporarily storing not more than additional indices . Algorithm 3.1(iii) is of cost and steps (iv)–(vi) of Algorithm 3.1 are of constant cost plus, in the case of step (iv) or step (vi), the cost of the recursive call on at most indices; see (13). We have shown that for a generic constant , the costs for an iteration of Algorithm 3.1 are bounded by plus the costs of a potential recursive call.
Now, denote the computational costs of a call of QuickMark by , where is the number of elements under consideration. Then, due to the choice Pivot Median, using (13b) in Algorithm 3.1(iv), or (13a) in Algorithm 3.1(vi), respectively, it follows inductively that
[TABLE]
For the choice Pivot Median, we conclude that Algorithm 3.1, and hence Algorithm 3.1, always terminates at linear costs. ∎
**Remark 24. *** (i) In the complexity estimate of Theorem 3.3 the dependency on is avoided due to the choice of Median as pivoting strategy. Other pivoting strategies may lead to a hidden constant depending on .
(ii) If Algorithm 3.1(i) chooses the pivot index always randomly, then the algorithm might perform faster on average. However, this would lead to quadratic worst-case performance of Algorithm 3.1.
(iii) Theorem 3.3 is proved for choosing the -quantile, i.e., the median element is the pivot (Algorithm 3.3). If any other fixed quantile is chosen as the pivot, then Theorem 3.3 still holds true.
(iv) If for fixed one chooses pivoting by the -quantile rather than by the median in Theorem 3.3, then a call of QuickMark() potentially leads to a recursive call in step (iv) or step (vi) of Algorithm 3.1 on up to indices. Hence, the computational costs of Algorithm 3.1 with this pivoting strategy called on indices can then be estimated by*
[TABLE]
*Obviously, choosing the median as pivot (i.e., ) optimizes this estimate. *
3.4. Remarks on the implementation of QuickMark
Up to now, we focused on the idea and the theoretical aspects of the QuickMark algorithm, namely verifying Theorem 3. We conclude this section by discussing some adaptions to the algorithm as it is presented in Section 3.1, in order to arrive at an efficient competitive C++11 implementation using routines provided by the standard library. Ultimately, we compare the performance of our implementation to an implementation of Algorithm 2.2 based on the sorting routine provided by the standard library.
The following observations lead to an efficient QuickMark implementation relying on routines provided by the standard library. **Remark 25. *** (i) The data structure for given refinement indicators for all is usually a vector eta, where eta[j] refers to the estimated error for the -th element in the data structure representing the mesh . To preserve this relation, one aims to avoid manipulating (i.e., reordering) eta.
(ii) QuickMark as formulated in Algorithm 3.1 avoids manipulation of eta by operating on a permutation only. Hence, in a straight forward implementation of Algorithm 3.1, which uses a permutation to access elements of the array , data is not accessed contiguously and a considerable performance penalty is introduced.
(iii) Hence, to achieve a more efficient implementation of QuickMark, one would rather alter the algorithm to operate on (and modify) a temporary copy x of eta to determine the value . The desired set is then given by the union of and a proper subset of .
(iv) For the ease of presentation, in Partition (Algorithm 3.1) a partition into three subarrays — elements strictly greater than, equal to, and strictly smaller than the pivot element — is demanded. In view of using standard library partition implementations, we note that this is not necessary: It suffices to partition into two subarrays: One with elements greater than or equal to the pivot element, the pivot element itself, and one with elements smaller than or equal to the pivot element. Then, as long as it is ensured, that other elements with the same value as the pivot element are distributed evenly among the two subarrays, Theorem 3.3 holds true.
(v) When using a partition based algorithm to determine a quantile, e.g., the median element, as the pivot element, the subarray is already partitioned after Algorithm 3.1(i). Hence, Algorithm 3.1(ii) can be skipped. *
Using headers <vector>, <iterator>, <algorithm>, <functional>, and <numeric>, a C++11 implementation of QuickMark adapted to the observations of Remark 3.4 relying on routines from the standard library could read as follows.
Passing refinement indicators for all (eta) and an adaptivity parameter (theta) to the following adaption of Algorithm 3.1, then yields the desired value , such that the set is readily obtained; see Remark 3.4(iii).
**Remark 26. ***While QuickMark can be implemented such that its complexity is linear even in the worst case, the worst-case complexity of the given C++ function xStarKernel is (standard library-) implementation dependent:
The C++ standard requires std::nth_element to be of linear complexity only on average, while lacking any worst-case restriction [ISO17]. A quality introspective selection implementation of std::nth_element could be realized as proposed in [Mus97]: As fast as the Quickselect algorithm [Hoa61] in practice, maintaining linear worst-case complexity by relying on the median of medians algorithm from [BPT*+*73] as fallback strategy. *
We conclude by comparing the performance of the C++ standard library implementation std::sort to our implementation xStarKernel above. This is reasonable, since those two routines are the core components of Algorithm 2.2 and Algorithm 3.1 (adapted to the observations of Remark 3.4), respectively. The completing components of Algorithm 2.2 and Algorithm 3.1 are very similar for both approaches and in particular, make up for only a small fraction of the overall computational cost of the respective algorithm.
We consider adaptivity parameters and vectors of length . For each combination of and we generate vectors eta of length filled with uniformly distributed pseudorandom double-precision values between [math] and . The core routines std::sort and xStarKernel are called on (copies of) each of these vectors and the computational times are measured. The sources were compiled with GNU compiler g++ version 5.5.0, optimization flag -O3, and -std=c++11 enabled. All computations were performed on a machine with of RAM and an Intel Core i7-6700 CPU [Int] with a base frequency of .
For all test cases , the measured times for the fastest (Table 1), average (Table 2) and slowest (Table 3) run out of runs is given. To emphasize the improved complexity of Algorithm 3.1 over Algorithm 2.2, the measurements for are visualized in Figure 3: While the computational time spent per element increases logarithmically with the problem size for std::sort, it remains constant for xStarKernel. Hence, as expected, the QuickMark strategy clearly outperforms the approach of Algorithm 2.2 based on sorting. Moreover, while the measured time behaves like for sorting, it only grows linearly with respect to the problem size for QuickMark as predicted by Theorem 3.3. In accordance with Theorem 3.3, different values of do not influence the performance of the algorithm.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[BDD 04] Peter Binev, Wolfgang Dahmen, and Ron De Vore. Adaptive finite element methods with convergence rates. Numer. Math. , 97(2):219–268, 2004.
- 2[BPT + 73] Manuel Blum, Vaughan Pratt, Robert E. Tarjan, Robert W. Floyd, and Ronald L. Rivest. Time bounds for selection. J. Comput. System Sci. , 7:448–461, 1973. Fourth Annual ACM Symposium on the Theory of Computing (Denver, Colo., 1972).
- 3[CFPP 14] Carsten Carstensen, Michael Feischl, Marcus Page, and Dirk Praetorius. Axioms of adaptivity. Comput. Math. Appl. , 67(6):1195–1253, 2014.
- 4[CKNS 08] J. Manuel Cascon, Christian Kreuzer, Ricardo H. Nochetto, and Kunibert G. Siebert. Quasi-optimal convergence rate for an adaptive finite element method. SIAM J. Numer. Anal. , 46(5):2524–2550, 2008.
- 5[CN 12] J. Manuel Cascón and Ricardo H. Nochetto. Quasioptimal cardinality of AFEM driven by nonresidual estimators. IMA J. Numer. Anal. , 32(1):1–29, 2012.
- 6[Dör 96] Willy Dörfler. A convergent adaptive algorithm for Poisson’s equation. SIAM J. Numer. Anal. , 33(3):1106–1124, 1996.
- 7[FFP 14] Michael Feischl, Thomas Führer, and Dirk Praetorius. Adaptive FEM with optimal convergence rates for a certain class of nonsymmetric and possibly nonlinear problems. SIAM J. Numer. Anal. , 52(2):601–625, 2014.
- 8[FHPS 19] Thomas Führer, Alexander Haberl, Dirk Praetorius, and Stefan Schimanko. Adaptive BEM with inexact PCG solver yields almost optimal computational costs. Numer. Math. , 141(4):967–1008, 2019.
