Approximation of high-dimensional periodic functions with Fourier-based methods

Daniel Potts; Michael Schmischke

arXiv:1907.11412·math.NA·January 31, 2022

TL;DR

This paper introduces a Fourier-based approximation method for high-dimensional periodic functions using multivariate ANOVA decomposition, emphasizing sparsity and efficient algorithms for scattered data and black-box scenarios.

Contribution

It develops a novel high-dimensional approximation technique leveraging ANOVA decomposition, smoothness inheritance, and specialized Fourier algorithms for improved efficiency.

Findings

01

Effective importance ranking of dimensions and interactions.

02

Utilization of NFFT for fast Fourier matrix multiplication.

03

Properties of rank-1 lattices for black-box approximation.

Abstract

In this paper we propose an approximation method for high-dimensional $1$ -periodic functions based on the multivariate ANOVA decomposition. We provide an analysis on the classical ANOVA decomposition on the torus and prove some important properties such as the inheritance of smoothness for Sobolev type spaces and the weighted Wiener algebra. We exploit special kinds of sparsity in the ANOVA decomposition with the aim to approximate a function in a scattered data or black-box approximation scenario. This method allows us to simultaneously achieve an importance ranking on dimensions and dimension interactions which is referred to as attribute ranking in some applications. In scattered data approximation we rely on a special algorithm based on the non-equispaced fast Fourier transform (or NFFT) for fast multiplication with arising Fourier matrices. For black-box approximation we choose the…

Tables (6)

Table 1: Results of detection step for important ANOVA terms with $M=2.5\cdot 10^{6}$ uniformly distributed nodes ($\bm{N}=[N_{1},N_{2},N_{3}]$).

	$\bm{N}$	size of index set $|I(U_{d_{s}})|$	relative error $\varepsilon_{\ell_{2}}$	relative error $\varepsilon_{L_{2}}$	$I^{(1)}, I^{(2)}, I^{(3)}$
1	[256, 32, 8]	65704	$4.7 \cdot 10^{- 3}$	$4.8 \cdot 10^{- 3}$	$(0.0, 0.021)$
					$(3.0 \cdot 10^{- 8}, 0.019)$
					$(1.2 \cdot 10^{- 8}, 0.026)$
2	[256, 32, 16]	320392	$2.1 \cdot 10^{- 3}$	$2.4 \cdot 10^{- 3}$	$(0.0, 0.021)$
					$(7.2 \cdot 10^{- 9}, 0.019)$
					$(2.5 \cdot 10^{- 8}, 0.026)$
3	[256, 32, 32]	2539336	$2.6 \cdot 10^{- 2}$	$2.8 \cdot 10^{- 2}$	$(0.0, 0.016)$
					$(8.3 \cdot 10^{- 5}, 0.015)$
					$(2.5 \cdot 10^{- 3}, 0.023)$
4	[256, 64, 8]	173992	$4.4 \cdot 10^{- 3}$	$4.7 \cdot 10^{- 3}$	$(0.0, 0.021)$
					$(1.1 \cdot 10^{- 7}, 0.019)$
					$(1.1 \cdot 10^{- 8}, 0.026)$
5	[256, 64, 16]	428680	$1.6 \cdot 10^{- 3}$	$1.9 \cdot 10^{- 3}$	$(0.0, 0.021)$
					$(1.8 \cdot 10^{- 8}, 0.019)$
					$(1.6 \cdot 10^{- 8}, 0.026)$
6	[256, 64, 32]	2647624	$2.5 \cdot 10^{- 2}$	$3.2 \cdot 10^{- 2}$	$(0.0, 0.015)$
					$(4.0 \cdot 10^{- 4}, 0.015)$
					$(2.9 \cdot 10^{- 3}, 0.022)$
7	[512, 64, 8]	176296	$4.4 \cdot 10^{- 3}$	$4.7 \cdot 10^{- 3}$	$(0.0, 0.021)$
					$(1.1 \cdot 10^{- 7}, 0.019)$
					$(1.1 \cdot 10^{- 8}, 0.026)$
8	[512, 64, 16]	430984	$1.6 \cdot 10^{- 3}$	$1.9 \cdot 10^{- 3}$	$(0.0, 0.021)$
					$(1.8 \cdot 10^{- 8}, 0.019)$
					$(1.6 \cdot 10^{- 8}, 0.026)$
9	[512, 64, 32]	2649928	$2.5 \cdot 10^{- 2}$	$3.2 \cdot 10^{- 2}$	$(0.0, 0.015)$
					$(4.0 \cdot 10^{- 4}, 0.015)$
					$(2.9 \cdot 10^{- 3}, 0.022)$

Table 2: Results for approximation with active set $U^{\ast}$ and $M=2.5\cdot 10^{6}$ uniformly distributed nodes ($\bm{N}=[N_{1},N_{2},N_{3}]$).

	$\bm{N}$	size of index set $|I(U^{\ast})|$	relative error $\varepsilon_{\ell_{2}}$	relative error $\varepsilon_{L_{2}}$
1	[1024, 64, 64]	283069	$5.6 \cdot 10^{- 4}$	$6.3 \cdot 10^{- 4}$
2	[1024, 128, 32]	135773	$6.0 \cdot 10^{- 4}$	$6.4 \cdot 10^{- 4}$
4	[1024, 128, 64]	356029	$2.7 \cdot 10^{- 4}$	$3.1 \cdot 10^{- 4}$
5	[1024, 256, 64]	649405	$2.0 \cdot 10^{- 4}$	$2.7 \cdot 10^{- 4}$

Table 3: Results of detection step for important ANOVA terms with $M=2.5\cdot 10^{6}$ uniformly distributed nodes and superposition threshold $d_{s}=2$ ($\bm{N}=[N_{1},N_{2}]$).

	$\bm{N}$	size of index set $|I(U_{d_{s}})|$	relative error $\varepsilon_{\ell_{2}}$	relative error $\varepsilon_{L_{2}}$	$I^{(1)}, I^{(2)}$
1	[256, 16]	10396	$9.4 \cdot 10^{- 2}$	$9.4 \cdot 10^{- 2}$	$(0.0, 0.021)$
					$(3.0 \cdot 10^{- 6}, 0.020)$
2	[256, 32]	36892	$9.3 \cdot 10^{- 2}$	$9.4 \cdot 10^{- 2}$	$(0.0, 0.021)$
					$(3.0 \cdot 10^{- 6}, 0.020)$
3	[256, 64]	145180	$9.1 \cdot 10^{- 2}$	$9.6 \cdot 10^{- 2}$	$(0.0, 0.021)$
					$(4.8 \cdot 10^{- 5}, 0.020)$
4	[256, 128]	582940	$8.2 \cdot 10^{- 2}$	$1.1 \cdot 10^{- 1}$	$(0.0, 0.021)$
					$(2.3 \cdot 10^{- 4}, 0.020)$

Table 4: Approximation results for active set $U^{+}$ with $M=2.5\cdot 10^{6}$ uniformly distributed nodes ($\bm{N}=[N_{1},N_{2}]$).

	$\bm{N}$	size of index set $|I(U^{+})|$	relative error $\varepsilon_{\ell_{2}}$	relative error $\varepsilon_{L_{2}}$
1	[1024, 16]	10558	$9.3 \cdot 10^{- 2}$	$9.3 \cdot 10^{- 2}$
2	[1024, 32]	14974	$9.3 \cdot 10^{- 2}$	$9.3 \cdot 10^{- 2}$
3	[1024, 64]	33022	$9.3 \cdot 10^{- 2}$	$9.4 \cdot 10^{- 2}$
4	[1024, 128]	105982	$9.1 \cdot 10^{- 2}$	$9.5 \cdot 10^{- 2}$

Table 5: Results of detection step for important ANOVA terms ($\bm{N}=[N_{1},N_{2},N_{3}]$).

$\bm{N}$	size of index set $|I(U_{d_{s}})|$	relative error $\varepsilon_{\ell_{2}}$	relative error $\varepsilon_{L_{2}}$	$M$	$I^{(1)}, I^{(2)}, I^{(3)}$
$[10^{2}, 10^{2}, 10^{2}]$	3481	$2.8 \cdot 10^{- 2}$	$3.0 \cdot 10^{- 2}$	47351	$(0.0, 0.021)$
					$(3.4 \cdot 10^{- 5}, 0.019)$
					$(5.7 \cdot 10^{- 5}, 0.025)$
$[10^{3}, 10^{3}, 10^{3}]$	11203	$1.0 \cdot 10^{- 2}$	$1.0 \cdot 10^{- 2}$	490277	$(0.0, 0.021)$
					$(1.7 \cdot 10^{- 7}, 0.019)$
					$(5.7 \cdot 10^{- 7}, 0.026)$
$[10^{4}, 10^{4}, 10^{3}]$	16891	$7.0 \cdot 10^{- 3}$	$7.1 \cdot 10^{- 3}$	1114489	$(0.0, 0.021)$
					$(3.5 \cdot 10^{- 10}, 0.019)$
					$(5.3 \cdot 10^{- 8}, 0.026)$
$[10^{5}, 10^{4}, 10^{3}]$	17341	$7.0 \cdot 10^{- 3}$	$7.0 \cdot 10^{- 3}$	2349307	$(0.0, 0.021)$
					$(1.7 \cdot 10^{- 9}, 0.019)$
					$(2.8 \cdot 10^{- 9}, 0.026)$

Table 6: Results of detection step for important ANOVA terms ($\bm{N}=[N_{1},N_{2},N_{3}]$).

$\bm{N}$	size of index set $|I(U_{d_{s}})|$	relative error $\varepsilon_{\ell_{2}}$	relative error $\varepsilon_{L_{2}}$	$M$
$[10^{4}, 10^{4}, 10^{4}]$	2243	$2.4 \cdot 10^{- 3}$	$2.6 \cdot 10^{- 3}$	157243
$[10^{5}, 10^{5}, 10^{5}]$	6565	$8.3 \cdot 10^{- 4}$	$8.3 \cdot 10^{- 4}$	1346881
$[10^{6}, 10^{5}, 10^{5}]$	7591	$7.7 \cdot 10^{- 4}$	$7.7 \cdot 10^{- 4}$	883391
$[10^{6}, 10^{6}, 10^{5}]$	13495	$5.0 \cdot 10^{- 4}$	$5.0 \cdot 10^{- 4}$	5691109

Equations (363)

$$f(\bm{x})=\sum_{\bm{k}\in\mathbb{Z}^{d}}\mathrm{c}_{\bm{k}}(f)\,\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\bm{x}},$$

$$S_{I}f(\bm{x})=\sum_{\bm{k}\in I}\mathrm{c}_{\bm{k}}(f)\,\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\bm{x}}$$

$$\left\|\bm{x}\right\|_{p}=\begin{cases}\left|\{i\in\mathcal{D}\colon x_{i}\neq 0\}\right| & \colon p=0\\ \left(\sum_{i=1}^{d}\left|x_{i}\right|^{p}\right)^{1/p} & \colon 1\leq p<\infty\\ \max_{i\in\mathcal{D}}\left|x_{i}\right| & \colon p=\infty\end{cases}$$

$$\Lambda(\bm{z},M)\coloneqq\left\{\bm{x}_{j}\coloneqq\frac{j}{M}\bm{z}\bmod 1\colon j=0,1,\dots,M-1\right\}.$$

$$p\in\Pi_{I}\coloneqq\mathrm{span}\{\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\circ}\colon\bm{k}\in I\}$$

$$p(\bm{x}_{j})=p\left(\frac{j}{M}\bm{z}\bmod 1\right)=\sum_{l=0}^{M-1}\left(\sum_{\substack{\bm{k}\in I\\ \bm{k}\cdot\bm{z}\equiv l\bmod M}}\mathrm{c}_{\bm{k}}(p)\right)\mathrm{e}^{2\pi\mathrm{i}\frac{jl}{M}}.$$

$$\bm{m}\cdot\bm{z}\not\equiv 0\bmod M\quad\forall\,\bm{m}\in\mathcal{D}(I)\setminus\{\bm{0}\}$$

$$\mathcal{D}(I)\coloneqq\{\bm{k}-\bm{h}\colon\bm{k},\bm{h}\in I\}$$

$$\mathrm{c}_{\bm{k}}(p)=\frac{1}{M}\sum_{j=0}^{M-1}p\left(\frac{j}{M}\bm{z}\bmod 1\right)\mathrm{e}^{-2\pi\mathrm{i}j\frac{\bm{k}\cdot\bm{z}}{M}}.$$

$$\mathrm{c}_{\bm{k}}(f)\approx\hat{f}_{\bm{k}}\coloneqq\frac{1}{M}\sum_{j=0}^{M-1}f\left(\frac{j}{M}\bm{z}\bmod 1\right)\mathrm{e}^{-2\pi\mathrm{i}j\frac{\bm{k}\cdot\bm{z}}{M}}$$

$$\hat{f}_{\bm{k}}=\mathrm{c}_{\bm{k}}(f)+\sum_{\bm{h}\in\Lambda^{\bot}(\bm{z},M)\setminus\{\bm{0}\}}\mathrm{c}_{\bm{k}+\bm{h}}(f)$$

$$\Lambda^{\bot}(\bm{z},M)\coloneqq\{\bm{k}\in\mathbb{Z}^{d}\colon\bm{k}\cdot\bm{z}\equiv 0\bmod M\}.$$

$$\mathrm{P}_{\bm{u}}f(\bm{x}_{\bm{u}})\coloneqq\int_{\mathbb{T}^{d-\left|\bm{u}\right|}}f(\bm{x})\,\mathrm{d}\bm{x}_{\bm{u}^{\mathrm{c}}}$$

$$\mathbb{P}_{\bm{u}}^{(d)}\coloneqq\{\bm{k}\in\mathbb{Z}^{d}\colon\bm{k}_{\bm{u}^{\mathrm{c}}}=\bm{0}\}$$

$$\mathrm{c}_{\bm{\ell}}(\mathrm{P}_{\bm{u}}f)=\mathrm{c}_{\bm{k}}(f)$$

$$\mathrm{c}_{\bm{\ell}}(\mathrm{P}_{\bm{u}}f)=\int_{\mathbb{T}^{d}}f(\bm{x})\,\mathrm{e}^{-2\pi\mathrm{i}\bm{\ell}\cdot\bm{x}_{\bm{u}}}\,\mathrm{d}\bm{x}=\int_{\mathbb{T}^{d}}f(\bm{x})\,\mathrm{e}^{-2\pi\mathrm{i}\bm{k}\cdot\bm{x}}\,\mathrm{d}\bm{x}=\mathrm{c}_{\bm{k}}(f).$$

$$f_{\bm{u}}\coloneqq\mathrm{P}_{\bm{u}}f-\sum_{\bm{v}\subsetneq\bm{u}}f_{\bm{v}}.$$

$$\sum_{n=a}^{b-1}(-1)^{n-a+1}\binom{b-a}{n-a}=(-1)^{b-a}.$$

$$\sum_{n=0}^{b-a-1}(-1)^{n+a+1}\binom{b-a}{n}=(-1)^{b}.$$

$$\sum_{n=0}^{b-a-1}(-1)^{n+a+1}\binom{b-a}{n}=(-1)^{a+1}\underbrace{\sum_{n=0}^{b-a}(-1)^{n}\binom{b-a}{n}}_{=(-1+1)^{b-a}=0}+(-1)^{b}=(-1)^{b}.$$

$$f_{\bm{u}}=\sum_{\bm{v}\subseteq\bm{u}}(-1)^{\left|\bm{u}\right|-\left|\bm{v}\right|}\,\mathrm{P}_{\bm{v}}f.$$

$$(-1)^{0-0}(\mathrm{P}_{\emptyset}f)(\bm{x})=(\mathrm{P}_{\emptyset}f)(\bm{x})=(\mathrm{P}_{\emptyset}f)(\bm{x})-\sum_{\bm{v}\subsetneq\emptyset}f_{\bm{v}}(\bm{x}).$$

$$\delta_{\bm{w}\subseteq\bm{v}}=\begin{cases}1 & \colon\bm{w}\subseteq\bm{v}\\ 0 & \colon\text{otherwise.}\end{cases}$$

$$f_{\bm{u}}(\bm{x})=(\mathrm{P}_{\bm{u}}f)(\bm{x})-\sum_{\bm{v}\subsetneq\bm{u}}\sum_{\bm{w}\subsetneq\bm{u}}(-1)^{\left|\bm{v}\right|-\left|\bm{w}\right|}(\mathrm{P}_{\bm{w}}f)(\bm{x})\,\delta_{\bm{w}\subseteq\bm{v}}.$$

$$\begin{aligned}\sum_{\bm{v}\subsetneq\bm{u}}\sum_{\bm{w}\subsetneq\bm{u}}(-1)^{\left|\bm{v}\right|-\left|\bm{w}\right|}(\mathrm{P}_{\bm{w}}f)(\bm{x})\,\delta_{\bm{w}\subseteq\bm{v}}&=\sum_{\bm{w}\subsetneq\bm{u}}(\mathrm{P}_{\bm{w}}f)(\bm{x})\sum_{n=\left|\bm{w}\right|}^{m-1}\sum_{\substack{\bm{v}\subseteq\bm{u}\\ \left|\bm{v}\right|=n}}(-1)^{\left|\bm{v}\right|-\left|\bm{w}\right|}\delta_{\bm{w}\subseteq\bm{v}}\\&=\sum_{\bm{w}\subsetneq\bm{u}}(\mathrm{P}_{\bm{w}}f)(\bm{x})\sum_{n=\left|\bm{w}\right|}^{m-1}(-1)^{n-\left|\bm{w}\right|}\sum_{\substack{\bm{v}\subseteq\bm{u}\\ \left|\bm{v}\right|=n}}\delta_{\bm{w}\subseteq\bm{v}}.\end{aligned}$$

$$\mathbb{F}_{\bm{u}}^{(d)}\coloneqq\{\bm{k}\in\mathbb{Z}^{d}\colon\bm{k}_{\bm{u}^{\mathrm{c}}}=\bm{0},\,k_{j}\neq 0\ \forall j\in\bm{u}\}$$

$$\mathbb{Z}^{d}=\bigcup_{\bm{u}\subseteq\mathcal{D}}\mathbb{F}_{\bm{u}}^{(d)}.$$

Full text

Approximation of high-dimensional periodic functions with Fourier-based methods

Daniel Potts, Chemnitz University of Technology, Germany (http://www.tu-chemnitz.de/~potts/).

Michael Schmischke, Chemnitz University of Technology, Germany (http://www.tu-chemnitz.de/~mischmi/).

Abstract

In this paper we propose an approximation method for high-dimensional $1$ -periodic functions based on the multivariate ANOVA decomposition. We provide an analysis on the classical ANOVA decomposition on the torus and prove some important properties such as the inheritance of smoothness for Sobolev type spaces and the weighted Wiener algebra. We exploit special kinds of sparsity in the ANOVA decomposition with the aim to approximate a function in a scattered data or black-box approximation scenario. This method allows us to simultaneously achieve an importance ranking on dimensions and dimension interactions which is referred to as attribute ranking in some applications. In scattered data approximation we rely on a special algorithm based on the non-equispaced fast Fourier transform (or NFFT) for fast multiplication with arising Fourier matrices. For black-box approximation we choose the well-known rank-1 lattices as sampling schemes and show properties of the appearing special lattices.

Keywords: ANOVA decomposition, high-dimensional approximation, Fourier approximation

AMS subject classifications: 65T, 42B05

1 Introduction

The approximation of high-dimensional functions is an active topic of great interest in many applications. We consider periodic functions $f\colon\mathbb{T}^{d}\rightarrow\mathbb{C}$, $d\in\mathbb{N}$, over the torus $\mathbb{T}$ where certain data about the function is known. Here, we distinguish between a black-box setting, i.e., $f$ can be evaluated at points $\bm{x}\in\mathbb{T}^{d}$ at a certain cost, and a scattered data setting, i.e., sampling points $X\subseteq\mathbb{T}^{d}$ and function values $(f(\bm{x}))_{\bm{x}\in X}$ are given. Besides finding an approximation for $f$, we consider the question of interpretability, i.e., analyzing the importance of the dimensions and dimension interactions of the function. In applications this is sometimes referred to as an attribute ranking.

The main tool to achieve our goals is the analysis of variance (ANOVA) decomposition [6, 45, 38, 23] which is an important model in the analysis of dimension interactions of multivariate, high-dimensional functions. It has proved useful in understanding the reason behind the success of certain quadrature methods for high-dimensional integration [40, 4, 17] and also infinite-dimensional integration [1, 19, 34]. The ANOVA decomposition decomposes a $d$ -variate function in $2^{d}$ ANOVA terms where each term belongs to a subset of $\mathcal{D}\coloneqq\{1,2,\dots,d\}$ . The single term depends only on the variables in the corresponding subset and the number of these variables is the order of the ANOVA term. In this paper we study the classical ANOVA decomposition for periodic functions and how it acts on the frequency domain. The decomposition is referred to as classical since it is based on an integral projection operator. In this setting we find relationships between ANOVA terms and the support of the frequencies as subsets of $\mathbb{Z}^{d}$ . Moreover, we prove formulas for the representation of ANOVA terms and projections.

Classical approximation methods cannot be applied to high-dimensional functions in general since the amount of data required increases exponentially with the dimension (the curse of dimensionality). However, it has been observed that in practical applications with multivariate, high-dimensional functions, often only the ANOVA terms of a low order are enough to describe a function, see e.g. [6]. This leads to the notion of a superposition dimension $d_{s}\in\mathcal{D}$ that limits the order of the ANOVA terms involved. Using this as a sparsity assumption to circumvent the curse of dimensionality, we consider functions where the ANOVA decomposition is mostly supported on terms of a low order, i.e., the norm of the remaining decomposition relative to the norm of $f$ is small. This leads to a truncation of the decomposition through a superposition threshold. We consider how the previously described error can be related to the decay of Fourier coefficients and specifically the smoothness of $f$.

We present and analyze an approximation method that uses sensitivity analysis, cf. [47, 48, 38], on the truncated ANOVA decomposition which is able to identify important ANOVA terms and incorporate this information in finding an approximation. The goal is to simplify the approximation model which yields benefits in reducing the influence of overfitting regarding the amount of data. We determine approximations of the Fourier coefficients of the function (or learn them) by solving least-squares problems. This is done through exploiting the special structure of the system matrix by identifying submatrices with the corresponding ANOVA terms. In the case of black-box approximations we are using rank-1 lattices as a spatial discretization, see e.g. [24, 25, 26, 27], and for scattered data approximation we rely on the iterative LSQR method [42] and the fast matrix-vector multiplications for Fourier matrices provided by the non-equispaced fast Fourier transform (or NFFT) introduced in [31].

The paper is organized as follows. In Section 3 we introduce the classical ANOVA decomposition and study its behavior for periodic functions with regard to the Fourier system. We prove new formulas for the Fourier coefficients of projections in Lemma 3.1 and ANOVA terms in Lemma 3.9. Moreover, we prove that functions in Sobolev type spaces and the weighted Wiener algebra inherit their smoothness to the ANOVA terms, see Theorem 3.18 and Theorem 3.20. In Section 4 we consider the truncated ANOVA decomposition and prove formulas for their Fourier coefficients, see Lemma 4.1 and Corollary 4.3. We also give direct formulas for the truncated decomposition using the projections in Lemma 4.7 and Corollary 4.9. Furthermore, we relate Sobolev type spaces and the weighted Wiener algebra to the previously introduced functions of low-dimensional structure and compute the errors in Theorem 4.11 and Theorem 4.13. Specifically, we consider a class of product and order-dependent weights, see [35, 13, 33, 14], of functions with isotropic and dominating-mixed smoothness, cf. [16, 26, 5], to obtain specific error bounds, see Corollary 4.15 and Corollary 4.19. In Section 5 we present an approximation method for functions that are of a low-dimensional structure, cf. Algorithm 1. We start by considering a black-box approximation scenario with rank-1 lattices as sampling schemes and show properties of the arising lattices in Lemma 5.4, Lemma 5.7, and Corollary 5.9. Furthermore, we discuss scattered data approximation in Section 5.2.2. The arising approximation errors are considered in Section 6 with main results being Theorem 6.1, Theorem 6.3, Theorem 6.9, and Theorem 6.13. In Section 7 we perform numerical experiments with a specific benchmark function.

2 Prerequisites and Notation

We consider multivariate 1-periodic functions $f\colon\mathbb{T}^{d}\rightarrow\mathbb{C}$ with spatial dimension $d\in\mathbb{N}$ , which are square-integrable, i.e., elements of $\mathrm{L}_{2}(\mathbb{T}^{d})$ . Those functions have a unique representation with regard to the Fourier system $\{\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\bm{x}}\}_{\bm{k}\in\mathbb{Z}^{d}}$ as Fourier series

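$$f(\bm{x})=\sum_{\bm{k}\in\mathbb{Z}^{d}}\mathrm{c}_{\bm{k}}(f)\,\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\bm{x}},$$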

where $\mathrm{c}_{\bm{k}}\!\left(f\right)\coloneqq\int_{\mathbb{T}^{d}}f(\bm{x})\,\mathrm{e}^{-2\pi\mathrm{i}\bm{k}\cdot\bm{x}}\mathrm{d}\bm{x}\in\mathbb{C}$ , $\bm{k}\in\mathbb{Z}^{d}$ , are the Fourier coefficients of $f$ . Given a finite index set $I\subseteq\mathbb{Z}^{d}$ , we call the trigonometric polynomial

$$S_{I}f(\bm{x})=\sum_{\bm{k}\in I}\mathrm{c}_{\bm{k}}(f)\,\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\bm{x}}$$

the Fourier partial sum of $f$ with respect to the index set $I$ .

In this paper we make use of indexing with sets. First, for a given spatial dimension $d$ we denote by $\mathcal{D}=\{1,2,\dots,d\}$ the set of coordinate indices and write subsets as bold small letters, e.g., $\bm{u}\subseteq\mathcal{D}$. The complement of such a subset is always taken with respect to $\mathcal{D}$, i.e., $\bm{u}^{\mathrm{c}}=\mathcal{D}\setminus\bm{u}$. For a vector $\bm{x}\in\mathbb{C}^{d}$ we define $\bm{x}_{\bm{u}}=(x_{i})_{i\in\bm{u}}\in\mathbb{C}^{\left|\bm{u}\right|}$. There remains a small ambiguity regarding the order of the components of $\bm{x}_{\bm{u}}$, which is resolved by fixing a consistent ordering; ascending order is the natural choice.

Furthermore, we use the $p$ -norm which is defined as

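$$\left\|\bm{x}\right\|_{p}=\begin{cases}\left|\{i\in\mathcal{D}\colon x_{i}\neq 0\}\right| & \colon p=0\\ \left(\sum_{i=1}^{d}\left|x_{i}\right|^{p}\right)^{1/p} & \colon 1\leq p<\infty\\ \max_{i\in\mathcal{D}}\left|x_{i}\right| & \colon p=\infty\end{cases}$$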

for $\bm{x}\in\mathbb{R}^{d}$. Note that the case $1\leq p<\infty$ can be extended to $0<p<1$, but then $\left\|\cdot\right\|_{p}$ is only a quasi-norm. In the case $p=0$, $\left\|\cdot\right\|_{p}$ is not a norm at all.

2.1 Rank-1 lattice

In the case of black-box approximation we rely on rank-1 lattices as sampling schemes, see e.g. [24, 25, 26, 27]. For a given lattice size $M\in\mathbb{N}$ and a generating vector $\bm{z}\in\mathbb{Z}^{d}$ we define a rank-1 lattice

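$$\Lambda(\bm{z},M)\coloneqq\left\{\bm{x}_{j}\coloneqq\frac{j}{M}\bm{z}\bmod 1\colon j=0,1,\dots,M-1\right\}.$$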

These lattices are useful in the evaluation of trigonometric polynomials

$$p\in\Pi_{I}\coloneqq\mathrm{span}\{\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\circ}\colon\bm{k}\in I\}$$

over a finite index set $I\subseteq\mathbb{Z}^{d}$ for given Fourier coefficients $\mathrm{c}_{\bm{k}}\!\left(p\right)$ . As discussed in [37], we have

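$$p(\bm{x}_{j})=p\left(\frac{j}{M}\bm{z}\bmod 1\right)=\sum_{l=0}^{M-1}\left(\sum_{\substack{\bm{k}\in I\\ \bm{k}\cdot\bm{z}\equiv l\bmod M}}\mathrm{c}_{\bm{k}}(p)\right)\mathrm{e}^{2\pi\mathrm{i}\frac{jl}{M}}.$$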

The computation of the sum over $l$ can be realized through a one-dimensional FFT and therefore the evaluation of $p$ at all lattice nodes can be done using only this single FFT. The arithmetic complexity of this evaluation is in $\mathcal{O}(M\log M+d\left|I\right|)$.
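The single-FFT evaluation can be sketched in a few lines of Python. This is a minimal illustration assuming NumPy; the function names `eval_on_rank1_lattice` and `eval_direct` are hypothetical and not taken from the paper. The coefficients are first binned by the residue $\bm{k}\cdot\bm{z}\bmod M$, and one inverse FFT of length $M$ then yields all $M$ values $p(\bm{x}_{j})$.

```python
import numpy as np

def eval_on_rank1_lattice(index_set, coeffs, z, M):
    """Evaluate p(x_j), j = 0..M-1, at x_j = (j/M) z mod 1 with one length-M FFT."""
    index_set = np.asarray(index_set)            # shape (|I|, d), integer frequencies k
    residues = (index_set @ np.asarray(z)) % M   # l = k.z mod M for every k in I
    g = np.zeros(M, dtype=complex)
    np.add.at(g, residues, coeffs)               # g_l = sum of c_k(p) over k with k.z = l (mod M)
    # NumPy's ifft carries a factor 1/M, so multiply by M to get
    # p(x_j) = sum_l g_l exp(2 pi i j l / M).
    return M * np.fft.ifft(g)

def eval_direct(index_set, coeffs, z, M):
    """Reference evaluation straight from the definition of p (cost O(M |I| d))."""
    nodes = (np.outer(np.arange(M), z) / M) % 1.0            # lattice nodes, shape (M, d)
    phases = np.exp(2j * np.pi * nodes @ np.asarray(index_set).T)
    return phases @ np.asarray(coeffs)
```

Both routines return the same vector up to rounding; the binning step costs $\mathcal{O}(d\left|I\right|)$ and the FFT $\mathcal{O}(M\log M)$, which matches the stated complexity.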

However, using a special kind of rank-1 lattice, it is possible to reconstruct the Fourier coefficients $\mathrm{c}_{\bm{k}}\!\left(p\right)$ by sampling $p$ at the nodes in $\Lambda(\bm{z},M)$ in an exact and stable way. For an index set $I\subseteq\mathbb{Z}^{d}$ we define a reconstructing rank-1 lattice $\Lambda(\bm{z},M,I)$ as a rank-1 lattice $\Lambda(\bm{z},M)$ such that the condition

$$\bm{m}\cdot\bm{z}\not\equiv 0\bmod M\quad\forall\,\bm{m}\in\mathcal{D}(I)\setminus\{\bm{0}\}$$

is fulfilled with

$$\mathcal{D}(I)\coloneqq\{\bm{k}-\bm{h}\colon\bm{k},\bm{h}\in I\}$$

being the difference set for $I$ . Using the nodes of a reconstructing rank-1 lattice $\Lambda(\bm{z},M,I)$ , the Fourier coefficients can be calculated as

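$$\mathrm{c}_{\bm{k}}(p)=\frac{1}{M}\sum_{j=0}^{M-1}p\left(\frac{j}{M}\bm{z}\bmod 1\right)\mathrm{e}^{-2\pi\mathrm{i}j\frac{\bm{k}\cdot\bm{z}}{M}}.$$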

The calculation of all Fourier coefficients $\mathrm{c}_{\bm{k}}\!\left(p\right)$, $\bm{k}\in I$, can then be realized through a one-dimensional FFT and the computation of the products $\bm{k}\cdot\bm{z}$. Consequently, the arithmetic complexity of this reconstruction is again in $\mathcal{O}(M\log M+d\left|I\right|)$.
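As a hedged illustration of the reconstruction step, the following Python sketch (assuming NumPy; `is_reconstructing` and `reconstruct_coeffs` are hypothetical helper names) checks the reconstruction property and recovers the coefficients with one FFT. It uses the observation that the condition on the difference set holds exactly when the residues $\bm{k}\cdot\bm{z}\bmod M$ are pairwise distinct for $\bm{k}\in I$, assuming $I$ contains no repeated frequencies.

```python
import numpy as np

def is_reconstructing(index_set, z, M):
    """Check m.z != 0 (mod M) for all nonzero differences m = k - h with k, h in I."""
    residues = (np.asarray(index_set) @ np.asarray(z)) % M
    # Equivalent test: the residues k.z mod M are pairwise distinct on I.
    return len(set(residues.tolist())) == len(index_set)

def reconstruct_coeffs(samples, index_set, z, M):
    """Recover c_k(p), k in I, from the samples p(x_j), j = 0..M-1, with one FFT."""
    g = np.fft.fft(samples) / M                  # g_l = (1/M) sum_j p(x_j) exp(-2 pi i j l / M)
    residues = (np.asarray(index_set) @ np.asarray(z)) % M
    return g[residues]                           # c_k(p) = g_{k.z mod M}
```

If `is_reconstructing` returns `False`, at least two frequencies share a residue and `reconstruct_coeffs` returns the sum of their coefficients instead of the individual values.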

The principle of this reconstruction can be generalized to functions $f\in\mathcal{A}^{w}(\mathbb{T}^{d})\coloneqq\{f\in\mathrm{L}_{1}(\mathbb{T}^{d})\colon\left\|f\right\|_{\mathcal{A}^{w}(\mathbb{T}^{d})}\coloneqq\sum_{\bm{k}\in\mathbb{Z}^{d}}w(\bm{k})\left|\mathrm{c}_{\bm{k}}\!\left(f\right)\right|<\infty\}$ , $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ , by taking the Fourier partial sum $S_{I}f$ for a suitable index set $I\subseteq\mathbb{Z}^{d}$ and treating the evaluations of $f$ as the evaluations of the trigonometric polynomial $S_{I}f$ . Using the same idea as before, we get

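$$\mathrm{c}_{\bm{k}}(f)\approx\hat{f}_{\bm{k}}\coloneqq\frac{1}{M}\sum_{j=0}^{M-1}f\left(\frac{j}{M}\bm{z}\bmod 1\right)\mathrm{e}^{-2\pi\mathrm{i}j\frac{\bm{k}\cdot\bm{z}}{M}}$$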

with a reconstructing rank-1 lattice $\Lambda(\bm{z},M,I)$. The error for each coefficient is

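$$\hat{f}_{\bm{k}}=\mathrm{c}_{\bm{k}}(f)+\sum_{\bm{h}\in\Lambda^{\bot}(\bm{z},M)\setminus\{\bm{0}\}}\mathrm{c}_{\bm{k}+\bm{h}}(f)$$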

with the integer dual lattice

$$\Lambda^{\bot}(\bm{z},M)\coloneqq\{\bm{k}\in\mathbb{Z}^{d}\colon\bm{k}\cdot\bm{z}\equiv 0\bmod M\}.$$

Consequently, if $\sum_{\bm{h}\in\Lambda^{\bot}(\bm{z},M)\setminus\{\bm{0}\}}\left|\mathrm{c}_{\bm{k}+\bm{h}}\!\left(f\right)\right|$ is small, the approximations $\hat{f}_{\bm{k}}$ are close to the Fourier coefficients $\mathrm{c}_{\bm{k}}\!\left(f\right)$ . For further details on this topic we refer to [26, 27] and [43, Chapter 8].
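The aliasing effect behind this error can be observed numerically. The sketch below (assuming NumPy; `lattice_coefficient` is a hypothetical name) computes the lattice approximation of a single coefficient directly from its definition as a mean over the $M$ lattice nodes. Applied to a single exponential with frequency $\bm{k}+\bm{h}$, it returns $1$ when $\bm{h}$ lies in the integer dual lattice, although the true coefficient at $\bm{k}$ is zero, and $0$ otherwise.

```python
import numpy as np

def lattice_coefficient(f, k, z, M):
    """Approximate c_k(f) by the mean of f(x_j) exp(-2 pi i j k.z / M) over the lattice."""
    j = np.arange(M)
    nodes = (np.outer(j, z) / M) % 1.0           # x_j = (j/M) z mod 1, shape (M, d)
    return np.mean(f(nodes) * np.exp(-2j * np.pi * j * np.dot(k, z) / M))
```

For example, with $\bm{z}=(1,7)$ and $M=31$ the vector $\bm{h}=(-7,1)$ satisfies $\bm{h}\cdot\bm{z}=0$, so the frequency $\bm{k}+\bm{h}$ is indistinguishable from $\bm{k}$ on this lattice.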

3 The classical ANOVA decomposition of 1-periodic functions

In this section we introduce the ANOVA decomposition, see e.g. [6, 38, 23], and derive new results for the periodic setting specifically with regard to the decomposition acting on the frequency domain.

We start by defining the projection operator

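$$\mathrm{P}_{\bm{u}}f(\bm{x}_{\bm{u}})\coloneqq\int_{\mathbb{T}^{d-\left|\bm{u}\right|}}f(\bm{x})\,\mathrm{d}\bm{x}_{\bm{u}^{\mathrm{c}}}\tag{4}$$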

that integrates over the variables $\bm{x}_{\bm{u}^{\mathrm{c}}}$ . For $\left|\bm{u}\right|>0$ this operator maps a function from $\mathrm{L}_{2}(\mathbb{T}^{d})$ to $\mathrm{L}_{2}(\mathbb{T}^{\left|\bm{u}\right|})$ by the Cauchy-Schwarz inequality and the image $\mathrm{P}_{\bm{u}}f$ depends only on the variables $\bm{x}_{\bm{u}}\in\mathbb{T}^{\left|\bm{u}\right|}$ . In the case of $\bm{u}=\emptyset$ , the projection gives us the integral of $f$ . We define the index set

$$\mathbb{P}_{\bm{u}}^{(d)}\coloneqq\{\bm{k}\in\mathbb{Z}^{d}\colon\bm{k}_{\bm{u}^{\mathrm{c}}}=\bm{0}\}$$

which can be identified with $\mathbb{Z}^{\left|\bm{u}\right|}$ using the mapping $\bm{k}\mapsto\bm{k}_{\bm{u}}$ . Note that we use the convention $\mathbb{Z}^{\left|\emptyset\right|}=\{0\}$ . We now prove a relationship between the Fourier coefficients of $\mathrm{P}_{\bm{u}}f$ and $f$ .

Lemma 3.1.

Let $f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ and $\bm{\ell}\in\mathbb{Z}^{\left|\bm{u}\right|}$ . Then

$$\mathrm{c}_{\bm{\ell}}(\mathrm{P}_{\bm{u}}f)=\mathrm{c}_{\bm{k}}(f)$$

for $\bm{k}\in\mathbb{Z}^{d}$ with $\bm{k}_{\bm{u}}=\bm{\ell}$ and $\bm{k}_{\bm{u}^{\mathrm{c}}}=\bm{0}$.

Proof 3.2.

We consolidate the two integrals and derive

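$$\mathrm{c}_{\bm{\ell}}(\mathrm{P}_{\bm{u}}f)=\int_{\mathbb{T}^{d}}f(\bm{x})\,\mathrm{e}^{-2\pi\mathrm{i}\bm{\ell}\cdot\bm{x}_{\bm{u}}}\,\mathrm{d}\bm{x}=\int_{\mathbb{T}^{d}}f(\bm{x})\,\mathrm{e}^{-2\pi\mathrm{i}\bm{k}\cdot\bm{x}}\,\mathrm{d}\bm{x}=\mathrm{c}_{\bm{k}}(f).$$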

Using Lemma 3.1, we are able to write $\mathrm{P}_{\bm{u}}f$ both as a $d$-dimensional Fourier series $\mathrm{P}_{\bm{u}}f(\bm{x})=\sum_{\bm{k}\in\mathbb{P}_{\bm{u}}^{(d)}}\mathrm{c}_{\bm{k}}\!\left(f\right)\,\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\bm{x}}$ and as a $\left|\bm{u}\right|$-dimensional Fourier series $\mathrm{P}_{\bm{u}}f(\bm{x}_{\bm{u}})=\sum_{\bm{\ell}\in\mathbb{Z}^{\left|\bm{u}\right|}}\mathrm{c}_{\bm{\ell}}\!\left(\mathrm{P}_{\bm{u}}f\right)\,\mathrm{e}^{2\pi\mathrm{i}\bm{\ell}\cdot\bm{x}_{\bm{u}}}$.
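Lemma 3.1 can be checked numerically for trigonometric polynomials. The following sketch is an illustration under stated assumptions, not part of the paper: it assumes NumPy, stores the coefficients in a $d$-dimensional array in NumPy's FFT ordering (zero frequency at position 0 of each axis), labels coordinates 0-based, and requires a non-empty proper subset $\bm{u}$. The names `projection_coeffs` and `projection_by_quadrature` are hypothetical.

```python
import numpy as np

def projection_coeffs(c, u):
    """Fourier coefficients of P_u f: keep c_k(f) with k_{u^c} = 0 (Lemma 3.1).

    c is a d-dimensional coefficient array in NumPy FFT ordering;
    u lists the kept axes (0-based).
    """
    index = tuple(slice(None) if axis in u else 0 for axis in range(c.ndim))
    return c[index]

def projection_by_quadrature(c, u):
    """The same coefficients, obtained by integrating f over the variables in u^c."""
    values = np.fft.ifftn(c) * c.size            # samples of f on the uniform grid
    complement = tuple(a for a in range(c.ndim) if a not in u)
    averaged = values.mean(axis=complement)      # grid mean is the exact integral here
    return np.fft.fftn(averaged) / averaged.size
```

The first routine only slices the coefficient array, which reflects that projecting amounts to discarding all frequencies with a nonzero entry outside $\bm{u}$.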

Now, we recursively define the ANOVA term for $\bm{u}\subseteq\mathcal{D}$

$$f_{\bm{u}}\coloneqq\mathrm{P}_{\bm{u}}f-\sum_{\bm{v}\subsetneq\bm{u}}f_{\bm{v}}.\tag{6}$$

There exists a direct formula for the ANOVA terms $f_{\bm{u}}$ defined in (6).

Lemma 3.3.

Let $a\in\mathbb{N}_{0}$ and $b\in\mathbb{N}$ with $b>a$ . Then

$$\sum_{n=a}^{b-1}(-1)^{n-a+1}\binom{b-a}{n-a}=(-1)^{b-a}.$$

Proof 3.4.

We prove an equivalent form obtained through multiplication with $(-1)^{a}$ and an index shift

$$\sum_{n=0}^{b-a-1}(-1)^{n+a+1}\binom{b-a}{n}=(-1)^{b}.$$

Splitting the sum and applying the Binomial theorem yields

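$$\sum_{n=0}^{b-a-1}(-1)^{n+a+1}\binom{b-a}{n}=(-1)^{a+1}\underbrace{\sum_{n=0}^{b-a}(-1)^{n}\binom{b-a}{n}}_{=(-1+1)^{b-a}=0}+(-1)^{b}=(-1)^{b}.$$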
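The identity of Lemma 3.3, $\sum_{n=a}^{b-1}(-1)^{n-a+1}\binom{b-a}{n-a}=(-1)^{b-a}$, is easy to sanity-check by brute force. A minimal sketch using only the Python standard library (`lemma_lhs` is a hypothetical name):

```python
from math import comb

def lemma_lhs(a, b):
    """Left-hand side of Lemma 3.3: sum_{n=a}^{b-1} (-1)^(n-a+1) * binom(b-a, n-a)."""
    return sum((-1) ** (n - a + 1) * comb(b - a, n - a) for n in range(a, b))
```

For instance, $a=0$ and $b=2$ gives $-\binom{2}{0}+\binom{2}{1}=1=(-1)^{2}$.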

Lemma 3.5.

Let $f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ with $\bm{u}\subseteq\mathcal{D}$ . Then

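$$f_{\bm{u}}=\sum_{\bm{v}\subseteq\bm{u}}(-1)^{\left|\bm{u}\right|-\left|\bm{v}\right|}\,\mathrm{P}_{\bm{v}}f.\tag{7}$$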

Proof 3.6.

A proof based on properties of projection operators was given in [36] while we use combinatorial arguments. We prove this statement through structural induction over the cardinality of $\bm{u}$ . For $\left|\bm{u}\right|=0$ , i.e., $\bm{u}=\emptyset$ , we have

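$$(-1)^{0-0}(\mathrm{P}_{\emptyset}f)(\bm{x})=(\mathrm{P}_{\emptyset}f)(\bm{x})=(\mathrm{P}_{\emptyset}f)(\bm{x})-\sum_{\bm{v}\subsetneq\emptyset}f_{\bm{v}}(\bm{x}).$$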

Now, let (7) be true for $\bm{v}\subseteq\mathcal{D}$ , $\left|\bm{v}\right|=0,1,\dots,m-1$ , $m\in\{1,2,\dots,d\}$ , and take a subset $\bm{u}\subseteq\mathcal{D}$ with $\left|\bm{u}\right|=m$ . We use the notation

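$$\delta_{\bm{w}\subseteq\bm{v}}=\begin{cases}1 & \colon\bm{w}\subseteq\bm{v}\\ 0 & \colon\text{otherwise}\end{cases}$$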

and start from the recursive expression in (6) to obtain

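$$f_{\bm{u}}(\bm{x})=(\mathrm{P}_{\bm{u}}f)(\bm{x})-\sum_{\bm{v}\subsetneq\bm{u}}\sum_{\bm{w}\subsetneq\bm{u}}(-1)^{\left|\bm{v}\right|-\left|\bm{w}\right|}(\mathrm{P}_{\bm{w}}f)(\bm{x})\,\delta_{\bm{w}\subseteq\bm{v}}.$$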

We exchange the two sums and sum over the order of the ANOVA terms

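$$\begin{aligned}\sum_{\bm{v}\subsetneq\bm{u}}\sum_{\bm{w}\subsetneq\bm{u}}(-1)^{\left|\bm{v}\right|-\left|\bm{w}\right|}(\mathrm{P}_{\bm{w}}f)(\bm{x})\,\delta_{\bm{w}\subseteq\bm{v}}&=\sum_{\bm{w}\subsetneq\bm{u}}(\mathrm{P}_{\bm{w}}f)(\bm{x})\sum_{n=\left|\bm{w}\right|}^{m-1}\sum_{\substack{\bm{v}\subseteq\bm{u}\\ \left|\bm{v}\right|=n}}(-1)^{\left|\bm{v}\right|-\left|\bm{w}\right|}\delta_{\bm{w}\subseteq\bm{v}}\\&=\sum_{\bm{w}\subsetneq\bm{u}}(\mathrm{P}_{\bm{w}}f)(\bm{x})\sum_{n=\left|\bm{w}\right|}^{m-1}(-1)^{n-\left|\bm{w}\right|}\sum_{\substack{\bm{v}\subseteq\bm{u}\\ \left|\bm{v}\right|=n}}\delta_{\bm{w}\subseteq\bm{v}}.\end{aligned}$$

Applying Lemma 3.3 yields the formula.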
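The equivalence of the recursive definition (6) and the direct formula (7) can also be verified symbolically for small sets. In the sketch below (standard library only; all names are hypothetical), each ANOVA term is represented as a dictionary that maps a subset $\bm{v}$ to the integer coefficient of $\mathrm{P}_{\bm{v}}f$ in its expansion, so no concrete function $f$ is needed.

```python
from functools import lru_cache
from itertools import combinations

def subsets(u):
    """All subsets of the frozenset u, including the empty set and u itself."""
    items = sorted(u)
    return [frozenset(s) for r in range(len(items) + 1) for s in combinations(items, r)]

@lru_cache(maxsize=None)
def anova_term_recursive(u):
    """Expand f_u = P_u f - sum_{v proper subset of u} f_v as {v: coefficient of P_v f}."""
    expansion = {u: 1}
    for v in subsets(u):
        if v != u:
            for w, coeff in anova_term_recursive(v).items():
                expansion[w] = expansion.get(w, 0) - coeff
    return expansion

def anova_term_direct(u):
    """Coefficients predicted by the direct formula: (-1)^(|u| - |v|) for every v in u."""
    return {v: (-1) ** (len(u) - len(v)) for v in subsets(u)}
```

Comparing the two dictionaries for, say, $\bm{u}=\{1,2,3,4\}$ confirms that the recursion collapses to alternating signs.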

We proceed to present a relationship between the Fourier coefficients of $f_{\bm{u}}$ and $f$ . Furthermore, we prove $f_{\bm{u}}\in\mathrm{L}_{2}(\mathbb{T}^{\left|\bm{u}\right|})$ . Therefore, we define the index set

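$$\mathbb{F}_{\bm{u}}^{(d)}\coloneqq\{\bm{k}\in\mathbb{Z}^{d}\colon\bm{k}_{\bm{u}^{\mathrm{c}}}=\bm{0},\,k_{j}\neq 0\ \forall j\in\bm{u}\}$$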

which can be identified with $(\mathbb{Z}\setminus\{0\})^{\left|\bm{u}\right|}$ using the mapping $\bm{k}\mapsto\bm{k}_{\bm{u}}$ . Here, we use the convention $(\mathbb{Z}\setminus\{0\})^{\left|\emptyset\right|}=\{0\}$ .

Lemma 3.7.

Let $\bm{u},\bm{v}\subseteq\mathcal{D}$ with $\bm{u}\neq\bm{v}$ . Then $\mathbb{F}_{\bm{u}}^{(d)}\cap\mathbb{F}_{\bm{v}}^{(d)}=\emptyset$ . Moreover, we have

$$\mathbb{Z}^{d}=\bigcup_{\bm{u}\subseteq\mathcal{D}}\mathbb{F}_{\bm{u}}^{(d)}.$$

Proof 3.8.

Let $\bm{u},\bm{v}\subseteq\mathcal{D}$ , $\bm{u}\neq\bm{v}$ , and w.l.o.g. $\left|\bm{u}\right|\geq\left|\bm{v}\right|$ . We assume there exists a $\tilde{\bm{k}}\in\mathbb{F}_{\bm{u}}^{(d)}\cap\mathbb{F}_{\bm{v}}^{(d)}$ and first consider the case $\bm{u}\cap\bm{v}=\emptyset$ . Since $\tilde{\bm{k}}\in\mathbb{F}_{\bm{u}}^{(d)}$ we have $\tilde{\bm{k}}_{\bm{u}^{\mathrm{c}}}=\bm{0}$ and therefore $\tilde{\bm{k}}_{\bm{v}}=\bm{0}$ . This contradicts $\tilde{\bm{k}}\in\mathbb{F}_{\bm{v}}^{(d)}$ . In the case of $\bm{u}\cap\bm{v}\neq\emptyset$ there exists a $j\in\bm{v}^{\mathrm{c}}\cap\bm{u}$ . Then $\tilde{\bm{k}}\in\mathbb{F}_{\bm{v}}^{(d)}$ implies that $\tilde{k}_{j}=0$ which contradicts $\tilde{\bm{k}}\in\mathbb{F}_{\bm{u}}^{(d)}$ .

The inclusion $\bigcup_{\bm{u}\subseteq\mathcal{D}}\mathbb{F}_{\bm{u}}^{(d)}\subseteq\mathbb{Z}^{d}$ is trivial since $\mathbb{F}_{\bm{u}}^{(d)}\subseteq\mathbb{Z}^{d}$ for every $\bm{u}\subseteq\mathcal{D}$. To prove $\mathbb{Z}^{d}\subseteq\bigcup_{\bm{u}\subseteq\mathcal{D}}\mathbb{F}_{\bm{u}}^{(d)}$ we take a $\bm{k}\in\mathbb{Z}^{d}$ and define $\bm{u}=\{j\in\mathcal{D}\colon k_{j}\neq 0\}$. Then $\bm{k}\in\mathbb{F}_{\bm{u}}^{(d)}$ and therefore $\bm{k}\in\bigcup_{\bm{u}\subseteq\mathcal{D}}\mathbb{F}_{\bm{u}}^{(d)}$.
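The partition in Lemma 3.7 amounts to sorting each frequency by the set of its nonzero entries. A minimal Python sketch (standard library only; `support` and `partition_by_support` are hypothetical names, coordinates are labeled 1-based as in the text):

```python
def support(k):
    """Return u = {j : k_j != 0} (1-based), the unique u with k in F_u^(d)."""
    return frozenset(j + 1 for j, kj in enumerate(k) if kj != 0)

def partition_by_support(frequencies):
    """Group integer frequency tuples into the classes F_u^(d), keyed by u."""
    groups = {}
    for k in frequencies:
        groups.setdefault(support(k), []).append(k)
    return groups
```

Applied to the cube $\{-2,\dots,2\}^{3}$, this yields $2^{3}=8$ disjoint classes whose sizes add up to $125$; the class for $\bm{u}=\{1,3\}$ contains $4\cdot 4=16$ frequencies.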

Lemma 3.9.

Let $f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ with $\bm{u}\subseteq\mathcal{D}$ and $\bm{\ell}\in\mathbb{Z}^{\left|\bm{u}\right|}$ . Then

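$$\mathrm{c}_{\bm{\ell}}(f_{\bm{u}})=\begin{cases}\mathrm{c}_{\bm{k}}(f) & \colon\bm{\ell}\in(\mathbb{Z}\setminus\{0\})^{\left|\bm{u}\right|}\\ 0 & \colon\text{otherwise}\end{cases}$$

for $\bm{k}\in\mathbb{Z}^{d}$ with $\bm{k}_{\bm{u}}=\bm{\ell}$ and $\bm{k}_{\bm{u}^{\mathrm{c}}}=\bm{0}$. Furthermore, $f_{\bm{u}}\in\mathrm{L}_{2}(\mathbb{T}^{\left|\bm{u}\right|})$.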

Proof 3.10.

We begin by employing the direct formula (7) to obtain

[TABLE]

We go on to prove $\mathrm{c}_{0}\!\left(f_{\bm{u}}\right)=\delta_{\bm{u},\emptyset}\cdot\mathrm{c}_{0}\!\left(f\right)$ . In this case, $\bm{k}_{\bm{v}}=\bm{0}$ and $\delta_{\bm{k}_{\bm{u}\setminus\bm{v}},\bm{0}}=1$ for every $\bm{v}\subseteq\bm{u}$ . By the Binomial Theorem, we have

[TABLE]

For the second case, we consider an $\bm{\ell}$ with some but not all entries equal to zero, i.e., the set $\overline{\bm{v}}\coloneqq\{i\in\bm{u}\colon k_{i}=0\}$ satisfies $\emptyset\neq\overline{\bm{v}}\neq\bm{u}$. Then $\delta_{\bm{k}_{\bm{u}\setminus\bm{v}},\bm{0}}=1\Longleftrightarrow\overline{\bm{v}}^{\mathrm{c}}\coloneqq\bm{u}\setminus\overline{\bm{v}}\subseteq\bm{v}$ and with the Binomial Theorem we get

[TABLE]

For the case where the entries of $\bm{\ell}$ are all nonzero, only the addend with $\bm{v}=\bm{u}$ is nonzero, i.e., $\mathrm{c}_{\bm{\ell}}\!\left(f_{\bm{u}}\right)=\mathrm{c}_{\bm{k}}\!\left(f\right)$.

With Lemma 3.9 we have two equivalent series representations for the ANOVA term $f_{\bm{u}}$ , the $d$ -dimensional Fourier series $f_{\bm{u}}(\bm{x})=\sum_{\bm{k}\in\mathbb{F}_{\bm{u}}^{(d)}}\mathrm{c}_{\bm{k}}\!\left(f\right)\,\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\bm{x}}$ and the $\left|\bm{u}\right|$ -dimensional Fourier series $f_{\bm{u}}(\bm{x}_{\bm{u}})=\sum_{\bm{\ell}\in\mathbb{Z}^{\left|\bm{u}\right|}}\mathrm{c}_{\ell}\!\left(f_{\bm{u}}\right)\,\mathrm{e}^{2\pi\mathrm{i}\bm{\ell}\cdot\bm{x}_{\bm{u}}}$ with $\mathrm{c}_{\ell}\!\left(f_{\bm{u}}\right)$ as in Lemma 3.9. The ANOVA terms have the following important property.

Corollary 3.11.

Let $f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ and $\bm{u},\bm{v}\subseteq\mathcal{D}$ with $\bm{u}\neq\bm{v}$ . Then the ANOVA terms $f_{\bm{u}}$ and $f_{\bm{v}}$ are orthogonal, i.e.,

$$\left\langle f_{\bm{u}},f_{\bm{v}}\right\rangle_{\mathrm{L}_{2}(\mathbb{T}^{d})}=0.$$

Proof 3.12.

We employ Lemma 3.7 and Lemma 3.9 to deduce

[TABLE]

Having defined the ANOVA terms, we now go on to the ANOVA decomposition, cf. [6, 38].

Theorem 3.13.

Let $f\in\mathrm{L}_{2}(\mathbb{T}^{d})$, the ANOVA terms $f_{\bm{u}}$ as in (6) and the set of coordinate indices $\mathcal{D}=\{1,2,\dots,d\}$. Then $f$ can be uniquely decomposed as

$$f(\bm{x})=\sum_{\bm{u}\subseteq\mathcal{D}}f_{\bm{u}}(\bm{x}_{\bm{u}}),\tag{8}$$

which we call the analysis of variance (ANOVA) decomposition.

Proof 3.14.

We use that $\mathbb{Z}^{d}$ is the disjoint union of the sets $\mathbb{F}_{\bm{u}}^{(d)}$ for $\bm{u}\subseteq\mathcal{D}$ and obtain

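$$f(\bm{x})=\sum_{\bm{k}\in\mathbb{Z}^{d}}\mathrm{c}_{\bm{k}}(f)\,\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\bm{x}}=\sum_{\bm{u}\subseteq\mathcal{D}}\sum_{\bm{k}\in\mathbb{F}_{\bm{u}}^{(d)}}\mathrm{c}_{\bm{k}}(f)\,\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\bm{x}}=\sum_{\bm{u}\subseteq\mathcal{D}}f_{\bm{u}}(\bm{x}_{\bm{u}}).$$

Since the union is disjoint, the decomposition is unique.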
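For a function given by finitely many Fourier coefficients, the ANOVA decomposition can be computed by grouping the coefficients according to the support of their frequencies, following the series representation after Lemma 3.9. The sketch below is an illustration assuming NumPy and a dictionary `{k: c_k(f)}` with integer tuples as keys; `anova_terms` and `evaluate` are hypothetical names.

```python
import numpy as np

def anova_terms(coeffs):
    """Split {k: c_k(f)} into ANOVA terms {u: {k: c_k(f)}} with u = {j : k_j != 0} (1-based)."""
    terms = {}
    for k, ck in coeffs.items():
        u = frozenset(j + 1 for j, kj in enumerate(k) if kj != 0)
        terms.setdefault(u, {})[k] = ck
    return terms

def evaluate(coeffs, x):
    """Evaluate sum_k c_k exp(2 pi i k.x) at a single point x."""
    return sum(ck * np.exp(2j * np.pi * np.dot(k, x)) for k, ck in coeffs.items())
```

Summing the evaluated terms reproduces $f$, and each term $f_{\bm{u}}$ is unaffected by changes in the variables outside $\bm{u}$.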

Remark 3.15.

The ANOVA decomposition (8) depends strongly on the projection operator $\mathrm{P}_{\bm{u}}f$, see (4). The integral operator considered in this paper leads to the so-called classical ANOVA decomposition. Another important variant is the anchored decomposition where one chooses an anchor point $\bm{c}\in\mathbb{T}^{d}$ and the projection operator is then defined as

[TABLE]

This decomposition can for example be used in methods for the integration of high-dimensional functions such as the multivariate decomposition method, see e.g. [34, 11]. However, the error analysis may again be based on the classical ANOVA decomposition, see e.g. [12].

In Figure 1 we have visualized the different frequency index sets $\mathbb{F}_{\bm{u}}^{(d)}$ , $\bm{u}\subseteq\mathcal{D}$ , for a $3$ -dimensional example.

3.1 Variance and Sensitivity

In order to get a notion of the importance of single terms compared to the entire function, we define the variance of a function

[TABLE]

for real-valued $f$ . In this case, we have the equivalent formulation

[TABLE]

which yields a sensible definition for complex-valued functions $f$ . For the ANOVA terms $f_{\bm{u}}$ with $\emptyset\neq\bm{u}\subseteq\mathcal{D}$ we have $\mathrm{c}_{\bm{0}}\!\left(f_{\bm{u}}\right)=0$ and therefore

[TABLE]

Lemma 3.16.

Let $f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ . Then we obtain for the variance

[TABLE]

Proof 3.17.

We show that the right-hand side equals the left-hand side by employing Lemma 3.7 and Lemma 3.9

[TABLE]

The global sensitivity indices

[TABLE]

for $\emptyset\neq\bm{u}\subseteq\mathcal{D}$ provide a comparable score to rank the importance of ANOVA terms against each other, cf. [47, 48, 38]. Clearly, we have $\sum_{\emptyset\neq\bm{u}\subseteq\mathcal{D}}\varrho(\bm{u},f)=1$ by Lemma 3.16.
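The ranking by sensitivity indices can be illustrated with a small sketch. It assumes that the variance of an ANOVA term is the sum of $|\mathrm{c}_{\bm{k}}(f)|^{2}$ over the frequencies with support exactly $\bm{u}$, which is how we read Lemma 3.9 and Lemma 3.16, and that $\varrho(\bm{u},f)$ is this variance divided by the total variance. The function names and the toy coefficients are hypothetical and serve only as an example.

```python
from collections import defaultdict


def support(k):
    """Return supp k = {i : k_i != 0} as a tuple of 0-based coordinate indices."""
    return tuple(i for i, ki in enumerate(k) if ki != 0)


def sensitivity_indices(coeffs):
    """Map every nonempty u to sigma^2(f_u) / sigma^2(f).

    coeffs: dict from frequency tuples k to Fourier coefficients c_k(f).
    Assumption: sigma^2(f_u) is the sum of |c_k|^2 over all k with supp k = u.
    """
    var_u = defaultdict(float)
    for k, c in coeffs.items():
        u = support(k)
        if u:  # k = 0 carries the mean value and contributes no variance
            var_u[u] += abs(c) ** 2
    total = sum(var_u.values())
    return {u: v / total for u, v in var_u.items()}


# Toy trigonometric polynomial in d = 3; the coefficients are made up.
coeffs = {
    (0, 0, 0): 1.0,
    (1, 0, 0): 0.5,
    (-1, 0, 0): 0.5,
    (0, 2, 0): 0.25j,
    (1, 0, -1): 0.1,
}
rho = sensitivity_indices(coeffs)
for u, value in sorted(rho.items(), key=lambda item: -item[1]):
    print(u, round(value, 4))
```

In this toy example the one-dimensional term in the first coordinate dominates, and the indices add up to one, in line with the remark following Lemma 3.16.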

We now introduce one notion of effective dimensions as proposed in [6]. Given a fixed $\delta\in(0,1]$ , the general notion of superposition dimension is defined as the minimum

[TABLE]

If we consider a particular Hilbert space $H\subseteq\mathrm{L}_{2}(\mathbb{T}^{d})$ with norm $\left\|\cdot\right\|_{H}$ , we modify the superposition dimension in the sense of this space, see e.g. [41]. For $f\in H$ and $\delta\in(0,1]$ we define the modified superposition dimension as

[TABLE]

Finally, we investigate how the smoothness of $f$ translates to projections $\mathrm{P}_{\bm{u}}f$ and ANOVA terms $f_{\bm{u}}$ . For a different setting this has been discussed in [38, 18, 19] and therein called inheritance of smoothness. In our setting, we express smoothness through special subspaces of $\mathrm{L}_{2}(\mathbb{T}^{d})$ and how $f$ being an element of those spaces translates to the projections $\mathrm{P}_{\bm{u}}f$ and ANOVA terms $f_{\bm{u}}$ . In particular, we look at Sobolev type spaces, cf. [32],

[TABLE]

and the weighted Wiener algebra

[TABLE]

with a weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ for both cases.

Theorem 3.18 (Inheritance of smoothness for Sobolev type spaces).

*Let $f\in\mathrm{H}^{w}(\mathbb{T}^{d})$ with weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ . Then for any weight function $w_{\bm{u}}\colon\mathbb{Z}^{\left|\bm{u}\right|}\rightarrow[1,\infty)$ with*

[TABLE]

*we have $\mathrm{P}_{\bm{u}}f\in\mathrm{H}^{w_{\bm{u}}}(\mathbb{T}^{\left|\bm{u}\right|})$ and $f_{\bm{u}}\in\mathrm{H}^{w_{\bm{u}}}(\mathbb{T}^{\left|\bm{u}\right|})$ . *

Proof 3.19.

We show that the norm $\left\|\mathrm{P}_{\bm{u}}f\right\|_{\mathrm{H}^{w_{\bm{u}}}(\mathbb{T}^{\left|\bm{u}\right|})}$ is finite by using Lemma 3.1

[TABLE]

Analogously, we employ Lemma 3.9 to prove $f_{\bm{u}}\in\mathrm{H}^{w_{\bm{u}}}(\mathbb{T}^{\left|\bm{u}\right|})$

[TABLE]

Theorem 3.20 (Inheritance of smoothness for the weighted Wiener algebra).

Let $f\in\mathcal{A}^{w}(\mathbb{T}^{d})$ with weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ . Then for any weight function $w_{\bm{u}}\colon\mathbb{Z}^{\left|\bm{u}\right|}\rightarrow[1,\infty)$ with

[TABLE]

*we have $\mathrm{P}_{\bm{u}}f\in\mathcal{A}^{w_{\bm{u}}}(\mathbb{T}^{\left|\bm{u}\right|})$ and $f_{\bm{u}}\in\mathcal{A}^{w_{\bm{u}}}(\mathbb{T}^{\left|\bm{u}\right|})$ . *

Proof 3.21.

We use Lemma 3.1 to show that $\mathrm{P}_{\bm{u}}f\in\mathcal{A}^{w_{\bm{u}}}(\mathbb{T}^{\left|\bm{u}\right|})$

[TABLE]

We utilize Lemma 3.9 to prove $f_{\bm{u}}\in\mathcal{A}^{w_{\bm{u}}}(\mathbb{T}^{\left|\bm{u}\right|})$

[TABLE]

The inheritance of smoothness is of particular significance for the numerical realization of the method presented in Section 5. It ensures that the ANOVA terms $f_{\bm{u}}$ are at least as smooth as the function $f$ under consideration, which is relevant for the quality of the approximation produced by the method.

4 Truncated ANOVA decomposition

The number of ANOVA terms of a function is equal to the cardinality of the power set $\mathcal{P}(\mathcal{D})$ , i.e., $\left|\mathcal{P}(\mathcal{D})\right|=2^{d}$ , and therefore grows exponentially in the dimension. This reflects the curse of dimensionality in a certain way and poses a problem for the approximation of a function. In this section we consider truncating the ANOVA decomposition, i.e., removing certain terms $f_{\bm{u}}$ , and thereby creating a certain form of sparsity. We define a subset of ANOVA terms as a subset of the power set of $\mathcal{D}$ , i.e., $U\subseteq\mathcal{P}(\mathcal{D})$ , such that the inclusion condition

[TABLE]

holds, cf. [23, Chapter 3.2]. This is necessary due to the recursive definition of the ANOVA terms, see (6).

For any subset of ANOVA terms $U$ we then define the truncated ANOVA decomposition as

[TABLE]

A specific truncation idea can be obtained by relating to the superposition dimension $\mathrm{d}^{(\mathrm{sp})}$ , see (10). For a chosen superposition threshold $d_{s}\in\mathcal{D}$ (that may or may not be equal to the superposition dimension $\mathrm{d}^{(\mathrm{sp})}$ ), we define $U_{d_{s}}\coloneqq\{\bm{u}\subseteq\mathcal{D}\colon\left|\bm{u}\right|\leq d_{s}\}$ and $\mathrm{T}_{d_{s}}\coloneqq\mathrm{T}_{U_{d_{s}}}$ . We subsequently prove properties of both $\mathrm{T}_{U}$ in general and $\mathrm{T}_{d_{s}}$ in particular.

Lemma 4.1.

Let $f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ and $U\subseteq\mathcal{P}(\mathcal{D})$ be a subset of ANOVA terms. Then $\mathrm{T}_{U}f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ and for $\bm{k}\in\mathbb{Z}^{d}$ the Fourier coefficient is

[TABLE]

Proof 4.2.

Clearly, we have $\mathrm{T}_{U}f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ . Let now $\bm{k}\in\mathbb{Z}^{d}$ . Then there exists a $\bm{u}_{0}\subseteq\mathcal{D}$ such that $\bm{k}\in\mathbb{F}_{\bm{u}_{0}}^{(d)}$ . We employ Lemma 3.9 and obtain

[TABLE]

Corollary 4.3.

Let $f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ and $d_{s}\in\mathcal{D}$ a superposition threshold. Then $\mathrm{T}_{d_{s}}f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ and only the Fourier coefficients corresponding to $d_{s}$ -sparse frequencies are nonzero, i.e.,

[TABLE]

Proof 4.4.

*Since $U_{d_{s}}$ is a subset of ANOVA terms, $\mathrm{T}_{d_{s}}f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ follows directly from Lemma 4.1. Moreover, $\exists\bm{u}\in U_{d_{s}}\colon\bm{k}\in\mathbb{F}_{\bm{u}}^{(d)}\Longleftrightarrow\left\|\bm{k}\right\|_{0}\leq d_{s}$ . *

The following lemma shows that the number of terms in $U_{d_{s}}$ is polynomial in $d$ for a fixed $d_{s}$ and therefore allows us to circumvent the curse of dimensionality in terms of the number of sets.

Lemma 4.5.

The cardinality of $U_{d_{s}}$ can be estimated as follows

[TABLE]

*i.e., the number of terms in $U_{d_{s}}$ has polynomial growth in $d$ for fixed $d_{s}\in\mathcal{D}\setminus\{d\}$ . *

Proof 4.6.

We estimate the sum as follows

[TABLE]

*Estimating the sum by the Taylor series for $\mathrm{e}^{d_{s}}$ yields the statement. *
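The growth behaviour can be checked numerically. The sketch below counts the subsets of $\mathcal{D}$ with at most $d_{s}$ elements and compares the count with the bound $(\mathrm{e}\cdot d/d_{s})^{d_{s}}$, which is the form of the estimate quoted in Remark 5.6. The function names are hypothetical.

```python
import math


def num_terms(d, d_s):
    """Number of subsets u of {1, ..., d} with |u| <= d_s."""
    return sum(math.comb(d, n) for n in range(d_s + 1))


def lemma_bound(d, d_s):
    """Upper bound (e * d / d_s)^(d_s) on the number of terms."""
    return (math.e * d / d_s) ** d_s


for d in (10, 100, 1000):
    for d_s in (1, 2, 3):
        print(d, d_s, num_terms(d, d_s), round(lemma_bound(d, d_s)), 2 ** d)
```

For $d=100$ and $d_{s}=3$ the truncated decomposition has 166751 terms, whereas the full decomposition has $2^{100}$ terms.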

In the following we show direct formulas for the truncated ANOVA decomposition based on the projections similarly as for the ANOVA terms, see (7).

Lemma 4.7.

Let $f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ and $U\subseteq\mathcal{P}(\mathcal{D})$ a subset of ANOVA terms. Then we have the direct formula

[TABLE]

Proof 4.8.

We apply equation (7) and obtain immediately

[TABLE]

Corollary 4.9.

Let $f\in\mathrm{L}_{2}(\mathbb{T}^{d})$ and $d_{s}\in\mathcal{D}$ a superposition threshold. Then we have the direct formula

[TABLE]

Proof 4.10.

Since the equality

[TABLE]

*holds, we employ Lemma 4.7 and the formula is proven. *

The truncated ANOVA decomposition plays a major role in our approximation approach presented in Section 5. Therefore we are interested in functions that can be approximated well by a truncated ANOVA decomposition. Specifically, we are looking to characterize functions such that the truncation operation by $\mathrm{T}_{U}f$ for different sets $U$ retains most of the function, i.e., we have a relative error

[TABLE]

with $\varepsilon>0$ , and $H_{1},H_{2}$ certain subspaces of $\mathrm{L}_{2}(\mathbb{T}^{d})$ . It is especially interesting to characterize these functions by properties like the smoothness. To this end, we start by proving general bounds for Sobolev type spaces $\mathrm{H}^{w}(\mathbb{T}^{d})$ and the weighted Wiener algebra $\mathcal{A}^{w}(\mathbb{T}^{d})$ to later relate this to weight functions $w$ defined by specific kinds of smoothness.

Moreover, this can be related to the superposition dimension $\mathrm{d}^{(\mathrm{sp})}$ for a $\delta\in(0,1]$ , see (10). Let $H_{1}=\mathrm{L}_{2}(\mathbb{T}^{d})$ and $H_{2}\in\{\mathrm{H}^{w}(\mathbb{T}^{d}),\mathcal{A}^{w}(\mathbb{T}^{d})\}$ for a weight function $w$ . If we choose truncation by a superposition threshold $d_{s}\in\mathcal{D}$ then the bound on the right-hand side $\varepsilon(d_{s})\in(0,1)$ depends on $d_{s}$ . Moreover, we have

[TABLE]

which follows from $\left\|f-\mathrm{T}_{d_{s}}f\right\|_{\mathrm{L}_{2}(\mathbb{T}^{d})}^{2}=\sum_{\left|\bm{u}\right|>d_{s}}\left\|f_{\bm{u}}\right\|_{\mathrm{L}_{2}(\mathbb{T}^{d})}^{2}$ . The modified superposition dimension $\mathrm{d}^{(\mathrm{sp})}$ is then less than or equal to $\min\{d_{s}\in\mathcal{D}\colon\varepsilon(d_{s})\leq 1-\delta\}$ , i.e., truncation with this minimum as superposition threshold is guaranteed to be effective in relation to $\delta$ .
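On the level of Fourier coefficients, the truncation $\mathrm{T}_{d_{s}}$ keeps exactly the frequencies with $\left\|\bm{k}\right\|_{0}\leq d_{s}$, see Corollary 4.3. The following sketch applies this to a made-up coefficient set and evaluates the squared $\mathrm{L}_{2}$ error via Parseval's identity, once directly and once term by term as in the identity above. All names and numbers are hypothetical.

```python
from collections import defaultdict


def sparsity(k):
    """Number of nonzero entries of the frequency k, i.e. ||k||_0."""
    return sum(ki != 0 for ki in k)


def truncate(coeffs, d_s):
    """Keep only the coefficients of at most d_s-sparse frequencies."""
    return {k: c for k, c in coeffs.items() if sparsity(k) <= d_s}


def squared_error(coeffs, d_s):
    """||f - T_{d_s} f||^2 in L_2, computed from the dropped coefficients."""
    return sum(abs(c) ** 2 for k, c in coeffs.items() if sparsity(k) > d_s)


def squared_error_by_terms(coeffs, d_s):
    """The same quantity as a sum of ||f_u||^2 over all u with |u| > d_s."""
    norm_u = defaultdict(float)
    for k, c in coeffs.items():
        u = tuple(i for i, ki in enumerate(k) if ki != 0)
        norm_u[u] += abs(c) ** 2
    return sum(v for u, v in norm_u.items() if len(u) > d_s)


# Made-up coefficients of a trigonometric polynomial in d = 3.
coeffs = {
    (0, 0, 0): 2.0,
    (1, 0, 0): 0.5,
    (0, -2, 0): 0.5,
    (1, 1, 0): 0.2,
    (1, -1, 3): 0.1,
}
truncated = truncate(coeffs, 1)
err2 = squared_error(coeffs, 1)
print(sorted(truncated), err2)
```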

Theorem 4.11.

Let $f\in\mathrm{H}^{w}(\mathbb{T}^{d})$ with weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ . Then

[TABLE]

Proof 4.12.

We employ Parseval’s identity and Lemma 4.1 to derive

[TABLE]

Theorem 4.13.

Let $f\in\mathcal{A}^{w}(\mathbb{T}^{d})$ with weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ . Then

[TABLE]

For $f\in\mathrm{H}^{w}(\mathbb{T}^{d})$ with a weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ such that $\{1/w(\bm{k})\}_{\bm{k}\in\mathbb{Z}^{d}}\in\ell_{2}$ we have

[TABLE]

Proof 4.14.

We estimate the $\mathrm{L}_{\infty}$ -norm by the sum of the absolute values of the Fourier coefficients and then use Lemma 4.1

[TABLE]

Employing the Cauchy-Schwarz inequality in (14) instead of extracting the minimum yields

[TABLE]

*The condition $\{1/w(\bm{k})\}_{\bm{k}\in\mathbb{Z}^{d}}\in\ell_{2}$ assures that the sum which appears in the bound is finite. *

In the following, we relate the truncation of $f$ by the operator $\mathrm{T}_{d_{s}}$ with the smoothness of $f$ . To this end, we introduce the weights

[TABLE]

with $\operatorname*{supp}\bm{k}=\{i\in\mathcal{D}\colon k_{i}\neq 0\}$ and parameters $\beta\geq 0$ and $\alpha>-\beta$ . The parameters $\alpha,\beta$ and the weights $\gamma_{\bm{u}}$ , $\bm{u}\subseteq\mathcal{D}$ , regulate the decay of the Fourier coefficients. Specifically, the parameter $\alpha$ regulates the isotropic smoothness and $\beta$ the dominating mixed smoothness, cf. [7]. Moreover, $\bm{\gamma}$ controls the influence of the different dimensions. We choose a POD (product and order-dependent) structure for $\gamma_{\bm{u}}$ such that

[TABLE]

where $\bm{\Gamma}\in(0,1]^{d}$ is nonincreasing and $\bm{\gamma}=(\gamma_{i})_{i=1}^{d}\in(0,1]^{d}$ . The POD structure is motivated by the application of quasi-Monte Carlo methods for PDEs with random coefficients, cf. [35, 13, 33, 14]. Similar weights for isotropic and dominating mixed smoothness have been considered in [16, 26, 5]. Moreover, the Sobolev type spaces may also be referred to as weighted Korobov spaces, cf. [46] for product weights and [9] for general weights.

We now use the previously obtained bounds for general weight functions $w$ and derive results for the weights $w^{\alpha,\beta}$ from (15). We focus on the subsets of ANOVA terms $U_{d_{s}}$ defined by a superposition threshold $d_{s}\in\mathcal{D}$ .

Corollary 4.15.

Let $f\in\mathcal{A}^{w^{\alpha,\beta}}(\mathbb{T}^{d})$ with weight function from (15) with POD structure (16), $\beta\geq 0$ , $\alpha>-\beta$ , $\bm{\Gamma}\in(0,1]^{d}$ , and $\bm{\gamma}\in(0,1]^{d}$ . Then

[TABLE]

*where $\bm{\gamma}^{\ast}$ is the non-increasing rearrangement of $\bm{\gamma}$ . *

Proof 4.16.

We use Theorem 4.13 and calculate the bound for the weight function $w^{\alpha,\beta,\bm{\gamma}}$ by computing the minimum

[TABLE]

Since $\bm{\Gamma}$ is non-increasing by definition, $\Gamma_{d_{s}+1}^{-1}$ has to be equal to the smallest value. The frequencies in $\mathbb{F}_{\bm{u}}^{(d)}$ have exactly $\left|\bm{u}\right|$ nonzero entries, therefore we get

[TABLE]

*The remaining product becomes minimal for the product of the $d_{s}+1$ smallest entries in $\bm{\gamma}$ which yields the statement. *

Lemma 4.17.

Let $n\in\mathcal{D}$ and $\bm{\gamma}\in(0,1]^{d}$ . Then

[TABLE]

Proof 4.18.

We rewrite the sum as follows

[TABLE]

Then every single sum can be estimated by $\left\|\bm{\gamma}\right\|_{2}^{2}$ , i.e.,

[TABLE]

*for $j\in\{2,3,\dots,d\}$ with equality for $j=1$ . *

Corollary 4.19.

Let $f\in\mathrm{H}^{w^{\alpha,\beta}}(\mathbb{T}^{d})$ with weight function from (15) with POD structure (16), $\beta\geq 0$ , $\alpha>-\beta$ , $\bm{\Gamma}\in(0,1]^{d}$ , and $\bm{\gamma}\in(0,1]^{d}$ . Then

[TABLE]

where $\bm{\gamma}^{\ast}=(\gamma_{s}^{\ast})_{s=1}^{d}$ is the non-increasing rearrangement of $\bm{\gamma}$ . For functions with isotropic smoothness $\alpha=0$ and dominating mixed smoothness $\beta>1/2$ we have

[TABLE]

where $\zeta$ is the Riemann zeta function. Exponential decay for $\Gamma_{s}$ , i.e., $\Gamma_{s}=c^{s}$ , $0<c\leq 1$ , such that the condition

[TABLE]

holds, yields the bound

[TABLE]

Proof 4.20.

The bound from statement (18) is a consequence of Theorem 4.11 and can be calculated analogously to the proof of Corollary 4.15. For the second statement, we calculate the constant in the bound from Theorem 4.13. We use Lemma 3.7 and the product structure of the weights $w^{\alpha,\beta}(\bm{k})$ to obtain

[TABLE]

We find an explicit form by replacing the sums with the Riemann zeta function

[TABLE]

Applying Lemma 4.17 then gives us the upper bound

[TABLE]

If we choose an exponential decay for $\Gamma_{n}$ , i.e., $\Gamma_{n}\coloneqq c^{n},0<c\leq 1$ , the explicit upper bound becomes

[TABLE]

*where $q\coloneqq 2c^{2}\left(\zeta(2\beta)-1\right)\left\|\bm{\gamma}\right\|_{2}^{2}$ with $0<q<1$ because of the condition (19). *

The bound in Corollary 4.15 and the bound (18) in Corollary 4.19 are independent of the spatial dimension $d$ of the function $f$ as long as the superposition threshold and the norm stay the same. This allows us to circumvent the curse of dimensionality here and use the ANOVA terms in $U_{d_{s}}$ for a superposition threshold $d_{s}\in\mathcal{D}$ . The bound (20) can also be considered for $d\rightarrow\infty$ . The dependence on the dimension $d$ is contained within the norm $\left\|\bm{\gamma}\right\|_{2}^{2}$ . Choosing a square-summable sequence $\{\gamma_{\ell}\}_{\ell\in\mathbb{N}}$ results in an upper bound for $\left\|\bm{\gamma}\right\|_{2}$ that holds for every $d$ . In this case the bound can be made independent of $d$ by the condition (19).

Figure 2 shows the different bounds for weights $w^{\alpha,\beta}$ with $\bm{\gamma}=(1/s)_{s=1}^{9}$ and $\bm{\Gamma}=(\pi^{-s}\sqrt{3}^{s})_{s=1}^{9}$ , see (15). With regard to the superposition dimension $\mathrm{d}^{(\mathrm{sp})}$ for $\mathrm{H}^{w^{\alpha,\beta}}(\mathbb{T}^{d})$ , cf. (10), one may interpret this as follows: Given $f\in\mathrm{H}^{w^{\alpha,\beta}}(\mathbb{T}^{d})$ , the value $\varepsilon(\alpha,\beta)\in(0,1)$ of the bound in part (a) of Figure 2 tells us that for $\delta=1-\varepsilon(\alpha,\beta)^{2}$ the superposition dimension $\mathrm{d}^{(\mathrm{sp})}$ is less than or equal to the superposition threshold $d_{s}=3$ . For example, $\varepsilon(0,1)\approx 0.0008$ gives $\delta=1-0.0008^{2}=0.99999936$ .

5 ANOVA approximation method

We consider the general problem of approximating a periodic function $f\colon\mathbb{T}^{d}\rightarrow\mathbb{C}$ given certain function evaluations of $f$ . Specifically, we distinguish two approximation scenarios – black-box approximation and scattered data approximation. In the case of black-box approximation, we are able to evaluate $f$ at any given point $\bm{x}\in\mathbb{T}^{d}$ . Since the evaluations come at a certain cost, we aim to keep them minimal or require a certain trade-off. For scattered data approximation we have a finite set of nodes $X\subseteq\mathbb{T}^{d}$ and know the function values $\bm{y}=(f(\bm{x}))_{\bm{x}\in X}$ . Here, one cannot add more nodes to $X$ or choose the locations of the nodes. Both scenarios have a high relevance for problems in various applications.

In this section, we consider an approximation scheme for high-dimensional, periodic functions of a low-dimensional structure, i.e., functions with a small superposition dimension $\mathrm{d}^{(\mathrm{sp})}\in\mathcal{D}$ for a $\delta\in(0,1]$ that is close to one, cf. (10). In this case the truncation by $\mathrm{T}_{d_{s}}$ with a small superposition threshold $d_{s}\in\mathcal{D}$ will be effective. It has been observed that functions in many practical applications belong to such a class, see e.g. [6]. In Section 4 we have considered errors for functions of dominating-mixed and isotropic smoothness defined through the decay of the Fourier coefficients and therefore obtained an upper bound for the modified superposition dimension $\mathrm{d}^{(\mathrm{sp})}$ from (10). Considering Figure 2, we know that e.g. POD weights lead to a decay such that the functions are of a low-dimensional structure.

The approximation scheme can be viewed in both approximation scenarios although the details are different. We work for now with the node set $X$ as well as function evaluations $\bm{y}$ and keep in mind that $X$ may also be chosen if we are in the black-box case. The first step is to reduce the ANOVA decomposition to the terms in $U_{d_{s}}$ , i.e., we approximate

[TABLE]

The Fourier coefficients $\mathrm{c}_{\bm{k}}\!\left(\mathrm{T}_{d_{s}}f\right)$ can only be nonzero if the frequency $\bm{k}$ is at most $d_{s}$ -sparse, i.e., $\left\|\bm{k}\right\|_{0}\leq d_{s}$ , see Corollary 4.3. Based on this, we aim to approximate $f$ by a Fourier partial sum $S_{I}f$ with a finite index set

[TABLE]

The challenge is to determine an appropriate index set $I$ . To this end, we employ a special scheme to determine frequency locations based on the ANOVA terms and an importance ranking on them.

We call the first step active set detection and its aim is to determine an importance ranking on the terms $f_{\bm{u}}$ with $\bm{u}\in U_{d_{s}}$ based on the global sensitivity indices $\varrho(\bm{u},f)$ , cf. (9). This information is also highly relevant to interpret relations in our data $X$ and $\bm{y}$ .

Based on the sensitivity indices we build an active set of ANOVA terms $U\subseteq U_{d_{s}}$ . This relates to the importance of frequencies and therefore information on how to choose the index set $I$ from (21). Reducing the number of ANOVA terms and in turn the number of frequencies leads to a reduction of the model complexity. The effects of overfitting are therefore lessened. In Section 5.1 we consider the details of the active set detection and in Section 5.3 the approximation with an active set as well as approximation errors.

5.1 Active set detection

The method assumes that the underlying function $f$ is of a low-dimensional structure, i.e., $f\approx\mathrm{T}_{d_{s}}f$ for some superposition threshold $d_{s}\in\mathcal{D}$ . The goal in the active set detection step is to determine an importance ranking for the ANOVA terms. In order to do this, we choose an appropriate search index set. Since we have no a-priori knowledge about the importance of the ANOVA terms or the smoothness of the function $f$ , we work with order-dependent finite index sets $I_{0}=\{0\},I_{1}\subseteq(\mathbb{Z}\setminus\{0\}),\dots,I_{d_{s}}\subseteq(\mathbb{Z}\setminus\{0\})^{d_{s}}$ . This ensures that two ANOVA terms $f_{\bm{u}}$ and $f_{\bm{v}}$ with $\left|\bm{u}\right|=\left|\bm{v}\right|$ are supported on equivalent index sets. We then use the projection operator

[TABLE]

to project the index sets and obtain

[TABLE]

This leads to the approximation by a Fourier partial sum

[TABLE]

The Fourier coefficients $\mathrm{c}_{\bm{k}}\!\left(f\right)$ in (24) are unknown and we aim to determine approximations for them from the data $X$ and $\bm{y}$ . To this end, we consider the least-squares problem

[TABLE]

with Fourier matrix $\bm{F}_{I(U_{d_{s}})}=\left(\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\bm{x}}\right)_{\bm{x}\in X,\bm{k}\in I(U_{d_{s}})}$ . If the Fourier matrix has full rank, the elements of the solution vector $\hat{\bm{f}}_{\text{sol}}=(\hat{f}_{\bm{k}})_{\bm{k}\in I(U_{d_{s}})}$ are the unique least-squares approximation to the Fourier coefficients, i.e., $\hat{f}_{\bm{k}}\approx\mathrm{c}_{\bm{k}}\!\left(f\right)$ , with respect to $X$ and $\bm{y}$ . Depending on the approximation scenario, there are different methods of solving least-squares problems of the type (25). We refer to Section 5.2 for details.

We use the approximate Fourier coefficients $\hat{f}_{\bm{k}}$ to build the approximate Fourier partial sum

[TABLE]

which provides an initial approximation to the function $f$ . In order to achieve a Fourier matrix $\bm{F}_{I(U_{d_{s}})}$ with full rank and combat the effects of overfitting, we may need to severely limit the number of frequencies in the order-dependent sets $I_{1}$ , $I_{2}$ , …, $I_{d_{s}}$ . Details on this will be considered in the following subsections for the specific approximation scenarios.

In order to determine an importance ranking on the ANOVA terms, we assume that the global sensitivity indices of $S_{I(U_{d_{s}})}^{X}f$ and $f$ behave similarly, i.e., it holds that

[TABLE]

for $\bm{u}_{1},\bm{u}_{2}\in U_{d_{s}}$ . This allows us to use a threshold vector $\bm{\varepsilon}\in[0,1]^{d_{s}}$ to define an active set of ANOVA terms that only contains the important terms with respect to $\bm{\varepsilon}$

[TABLE]

The inclusion condition (11) is fulfilled by definition. We reduce the ANOVA decomposition to this set of terms to determine an approximation for $f$ in Section 5.3.
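The selection step can be sketched as follows. Since the displayed definition (28) is not reproduced here, the sketch assumes that a term $\bm{v}$ is selected if its sensitivity index exceeds the threshold $\varepsilon_{\left|\bm{v}\right|}$ and that all subsets of a selected term are added as well, which is one way to guarantee the inclusion condition (11). The function name and the sensitivity values are hypothetical.

```python
from itertools import combinations


def active_set(rho, eps):
    """Build a downward closed set of ANOVA terms from sensitivity indices.

    rho: dict mapping nonempty sorted tuples u to sensitivity indices.
    eps: thresholds, eps[n - 1] is applied to terms of order n.
    Assumption: every subset of a selected term is included as well.
    """
    active = {()}  # the constant term is always kept
    for v, value in rho.items():
        if value > eps[len(v) - 1]:
            for n in range(1, len(v) + 1):
                active.update(combinations(v, n))
    return active


# Made-up sensitivity indices for d = 3 and d_s = 2 (they sum to one).
rho = {
    (0,): 0.6,
    (1,): 0.001,
    (2,): 0.05,
    (0, 1): 0.3,
    (0, 2): 0.0005,
    (1, 2): 0.0485,
}
U = active_set(rho, eps=[0.01, 0.01])
print(sorted(U, key=lambda u: (len(u), u)))
```

Note that the term $(1,)$ falls below its threshold but is still included here because it is a subset of the selected two-dimensional terms.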

5.2 Least-squares approximation

In this section, we discuss the solution of least-squares problems of the form

[TABLE]

with a Fourier matrix $\bm{F}_{I(U)}=\left(\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\bm{x}}\right)_{\bm{x}\in X,\bm{k}\in I(U)}$ . Here, $U$ is an arbitrary subset of ANOVA terms and for each term we have a given finite frequency index set $I_{\bm{u}}\subseteq(\mathbb{Z}\setminus\{0\})^{\left|\bm{u}\right|}$ . The set

[TABLE]

is obtained through the projections (22).
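To make the construction of $I(U)$ concrete, the sketch below embeds each term-dependent set $I_{\bm{u}}$ into $\mathbb{Z}^{d}$ by placing its entries at the coordinates in $\bm{u}$ and zeros elsewhere, which is our reading of the projection (22), and then forms the union. The helper names and the chosen sets are hypothetical.

```python
from itertools import combinations, product


def embed(ell, u, d):
    """Place the entries of ell at the coordinates in u of a zero vector in Z^d."""
    k = [0] * d
    for i, li in zip(u, ell):
        k[i] = li
    return tuple(k)


def project_index_sets(index_sets, d):
    """Return a dict that maps every frequency k in I(U) to its term u."""
    I = {}
    for u, I_u in index_sets.items():
        for ell in I_u:
            I[embed(ell, u, d)] = u
    return I


# Order-dependent sets as in the detection step: d = 4, d_s = 2.
d, d_s = 4, 2
order_sets = {
    0: [()],
    1: [(-2,), (-1,), (1,), (2,)],
    2: list(product((-1, 1), repeat=2)),
}
index_sets = {
    u: order_sets[n]
    for n in range(d_s + 1)
    for u in combinations(range(d), n)
}
I = project_index_sets(index_sets, d)
print(len(I), sum(len(I_u) for I_u in index_sets.values()))
```

Because all entries of the frequencies in $I_{\bm{u}}$ are nonzero, the embedded sets do not overlap, so the size of $I(U)$ is the sum of the sizes of the sets $I_{\bm{u}}$.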

The following remark shows that the Fourier matrix can be structured with respect to the ANOVA terms. Moreover, we can decompose the matrix-vector multiplications with both $\bm{F}_{I(U)}$ and its adjoint $\bm{F}_{I(U)}^{\ast}$ .

Remark 5.1.

Let $\bm{F}_{I(U)}$ be a Fourier matrix with respect to a node set $X$ and an index set $I(U)$ with a subset of ANOVA terms $U\subseteq\mathcal{P}(\mathcal{D})$ and index sets $I_{\bm{u}}\subseteq(\mathbb{Z}\setminus\{0\})^{\left|\bm{u}\right|}$ , $\bm{u}\in U$ . Then

[TABLE]

where $\bm{u}_{1},\bm{u}_{2},\dots,\bm{u}_{n}$ with $n={\left|U\right|}$ is a numbering of the subsets of coordinate indices in $U$ such that $\hat{\bm{f}}=\left(\hat{\bm{f}}_{\bm{u}_{1}}\,\,\,\hat{\bm{f}}_{\bm{u}_{2}}\,\,\,\cdots\,\,\,\hat{\bm{f}}_{\bm{u}_{n}}\right)^{\top}$ . The Fourier matrices are $\bm{F}_{\bm{u}}=\left(\mathrm{e}^{2\pi\mathrm{i}\bm{\ell}\cdot\bm{x}_{\bm{u}}}\right)_{\bm{x}\in X,\bm{\ell}\in I_{\bm{u}}}$ . The matrix-vector product with $\bm{F}$ can therefore be decomposed as

[TABLE]

with vector components $\hat{\bm{f}}_{\bm{u}}$ . For the adjoint product $\bm{F}^{\ast}\bm{f}$ with a vector $\bm{f}\in\mathbb{C}^{\left|X\right|}$ we obtain the result $\hat{\bm{a}}\in\mathbb{C}^{\left|I(U)\right|}$ by computing the products

[TABLE]

*Then we have the result vector $\hat{\bm{a}}=\left(\hat{\bm{a}}_{\bm{u}_{1}}\,\,\,\hat{\bm{a}}_{\bm{u}_{2}}\,\,\,\cdots\,\,\,\hat{\bm{a}}_{\bm{u}_{n}}\right)^{\top}$ . *
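The decomposition in Remark 5.1 can be verified on a small example. The sketch below evaluates the product term by term, using only the sub-vectors $\bm{x}_{\bm{u}}$ of the nodes, and compares it with a dense evaluation over the embedded $d$-dimensional frequencies. It uses plain loops instead of a fast transform, and all names, nodes and coefficients are hypothetical.

```python
import cmath


def embed(ell, u, d):
    """Place the entries of ell at the coordinates in u of a zero vector in Z^d."""
    k = [0] * d
    for i, li in zip(u, ell):
        k[i] = li
    return tuple(k)


def basis(k, x):
    """Evaluate exp(2 pi i k.x) for an integer frequency k and a point x."""
    return cmath.exp(2j * cmath.pi * sum(ki * xi for ki, xi in zip(k, x)))


def matvec_by_terms(X, index_sets, fhat):
    """Compute the sum over u of F_u fhat_u from the sub-vectors x_u only."""
    out = [0j] * len(X)
    for u, I_u in index_sets.items():
        for j, x in enumerate(X):
            x_u = [x[i] for i in u]
            out[j] += sum(c * basis(ell, x_u) for ell, c in zip(I_u, fhat[u]))
    return out


def adjoint_by_terms(X, index_sets, f):
    """Compute a_u = F_u^* f separately for every term u."""
    result = {}
    for u, I_u in index_sets.items():
        X_u = [[x[i] for i in u] for x in X]
        result[u] = [
            sum(basis(ell, x_u).conjugate() * fj for x_u, fj in zip(X_u, f))
            for ell in I_u
        ]
    return result


d = 3
X = [(0.1, 0.7, 0.3), (0.55, 0.2, 0.9), (0.8, 0.45, 0.05), (0.33, 0.95, 0.6)]
index_sets = {(): [()], (0,): [(-1,), (1,)], (1, 2): [(1, 1), (-1, -1), (1, -1)]}
fhat = {(): [1.0], (0,): [0.5, 0.5], (1, 2): [0.1j, -0.1j, 0.2]}

blockwise = matvec_by_terms(X, index_sets, fhat)
dense = [
    sum(
        c * basis(embed(ell, u, d), x)
        for u, I_u in index_sets.items()
        for ell, c in zip(I_u, fhat[u])
    )
    for x in X
]
print(max(abs(a - b) for a, b in zip(blockwise, dense)))
```

Each block only involves a $\left|\bm{u}\right|$-dimensional transform, which is what makes fast low-dimensional algorithms applicable term by term.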

5.2.1 Black-box scenario

In the case of black-box approximation, i.e., the set $X$ can be chosen, we have to determine an appropriate special discretization for index sets of the type $I(U)$ . Here, we have different possibilities. One might think of rank-1 lattices that have been used for integration before, see e.g. [8], and approximation, see e.g. [26, 30]. For a general introduction to lattice rules, we refer to Section 2.1. Sparse grid sampling related to the Smolyak algorithm is a further possibility, cf. [15, 21, 22, 23].

In the following, we focus on using reconstructing single rank-1 lattices for function approximation. If we have a reconstructing single rank-1 lattice $\Lambda(\bm{z},M,I(U))\subseteq\mathbb{Z}^{d}$ for a generating vector $\bm{z}\in\mathbb{Z}^{d}$ and size $M\in\mathbb{N}$ with respect to an index set $I(U)$ , then

[TABLE]

with $\mathrm{I}$ the identity matrix, see [43, Chapter 8.2]. Then the solution to problem (29) is unique and given by the multiplication of the Moore-Penrose inverse $\bm{F}_{I(U)}^{\dagger}$ with $\bm{y}$ , see e.g. [3]. Through the property (31) the Moore-Penrose inverse is simplified to

[TABLE]

i.e., a multiplication with the adjoint matrix. This allows us to efficiently compute approximations for the Fourier coefficients of $f$ if the nodes form a reconstructing rank-1 lattice.
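A small numerical check may help here. The sketch assumes the usual conventions, namely lattice nodes $\bm{x}_{j}=(j\bm{z}\bmod M)/M$, $j=0,\dots,M-1$, and the reconstruction property that the residues $\bm{k}\cdot\bm{z}\bmod M$ are pairwise distinct for $\bm{k}$ in the index set. Under these assumptions we expect $\bm{F}^{\ast}\bm{F}=M\,\mathrm{I}$, so the coefficients follow from one adjoint multiplication scaled by $1/M$. The index set, the generating vector and the lattice size are chosen by hand for illustration and are not taken from the paper.

```python
import cmath
from itertools import product


def lattice_nodes(z, M):
    """Nodes x_j = (j * z mod M) / M of the rank-1 lattice, j = 0, ..., M - 1."""
    return [tuple((j * zi % M) / M for zi in z) for j in range(M)]


def is_reconstructing(z, M, I):
    """Check that k.z mod M is distinct for all frequencies k in I."""
    residues = {sum(ki * zi for ki, zi in zip(k, z)) % M for k in I}
    return len(residues) == len(I)


def fourier_matrix(X, I):
    """Dense Fourier matrix with entries exp(2 pi i k.x)."""
    return [
        [cmath.exp(2j * cmath.pi * sum(ki * xi for ki, xi in zip(k, x))) for k in I]
        for x in X
    ]


def gram(F):
    """Compute F^* F for a dense matrix given as a list of rows."""
    rows, cols = len(F), len(F[0])
    return [
        [sum(F[j][a].conjugate() * F[j][b] for j in range(rows)) for b in range(cols)]
        for a in range(cols)
    ]


I = list(product((-1, 0, 1), repeat=2))  # 9 frequencies in d = 2
z, M = (1, 3), 11
X = lattice_nodes(z, M)
F = fourier_matrix(X, I)
G = gram(F)

# Recover known coefficients from samples with the scaled adjoint.
c_true = [0.1 * (n + 1) for n in range(len(I))]
y = [sum(F[j][n] * c_true[n] for n in range(len(I))) for j in range(M)]
c_rec = [sum(F[j][n].conjugate() * y[j] for j in range(M)) / M for n in range(len(I))]
print(is_reconstructing(z, M, I), max(abs(a - b) for a, b in zip(c_rec, c_true)))
```

With the generating vector $(1,1)$ instead, the frequencies $(1,-1)$ and $(0,0)$ share the same residue, so the lattice is not reconstructing for this index set.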

What remains is to determine such a reconstructing rank-1 lattice for a given index set of type $I(U)$ . In [43, Theorem 8.16] it was shown that reconstructing lattices exist if the lattice size $M$ is sufficiently large. Since the evaluations of $f$ come at a certain cost, it is necessary to consider the lattice size for our special types of index sets, which we do in the following.

An important quantity to get estimations on the lattice size is the difference set $\mathcal{D}(I(U))$ from (2) since the result [43, Theorem 8.16] tells us that there exists a reconstructing rank-1 lattice with prime cardinality

[TABLE]

In the following, we prove properties and show estimates on the cardinality of both $I(U)$ and $\mathcal{D}(I(U))$ .

Lemma 5.2.

Let $U\subseteq\mathcal{P}(\mathcal{D})$ be a subset of ANOVA terms and $I_{\bm{u}}\subseteq\mathbb{Z}^{\left|\bm{u}\right|}$ , $\bm{u}\in U$ , finite symmetric frequency sets. Then we have

[TABLE]

Proof 5.3.

*It is easy to see that $\bigcup_{\begin{subarray}{c}\bm{u}\in U\\ \bm{v}\subseteq\bm{u}\end{subarray}}\{\bm{k}-\bm{h}\colon\bm{k}\in\mathrm{P}_{\bm{u}}I_{\bm{u}},\bm{h}\in\mathrm{P}_{\bm{v}}I_{\bm{v}}\}\subseteq\mathcal{D}(I(U))$ since $\mathrm{P}_{\bm{u}}I_{\bm{u}}\subseteq I(U)$ for every $\bm{u}\in U$ and $\bm{v}\in U$ for all $\bm{v}\subseteq\bm{u}\in U$ due to (11). In order to show the other inclusion we take an element $\bm{\ell}\in\mathcal{D}(I(U))$ . By the uniqueness property of the ANOVA decomposition we know that there exist $\bm{u},\bm{v}\in U$ such that $\bm{\ell}=\bm{k}-\bm{h}$ with $\bm{k}\in\mathrm{P}_{\bm{u}}I_{\bm{u}}$ and $\bm{h}\in\mathrm{P}_{\bm{v}}I_{\bm{v}}$ . Taking the symmetry of the index sets $I_{\bm{u}}$ into account, we have proven the statement.*

The following lemma gives an estimate for the size of the difference set of index sets of type $I(U)$ if there exists an upper bound on the cardinality of the term dependent sets $I_{\bm{u}}$ .

Lemma 5.4.

Let $U$ be a subset of ANOVA terms and $I_{\bm{u}}\subseteq(\mathbb{Z}\setminus\{0\})^{\left|\bm{u}\right|},\bm{u}\in U,$ symmetric frequency sets. Then the cardinality of the difference set of $I(U)$ is bounded by

[TABLE]

Proof 5.5.

We estimate the cardinality of the difference set by applying Lemma 5.2

[TABLE]

Here, we do not have equality since the union in Lemma 5.2 is not necessarily disjoint. Applying the upper bound on the cardinality of the sets $I_{\bm{u}}$ , we arrive at

[TABLE]

Remark 5.6.

The cardinality of $U_{d_{s}}$ is bounded by $(\mathrm{e}\cdot d/d_{s})^{d_{s}}$ , see Lemma 4.5. Therefore the estimate in (33) becomes

[TABLE]

In the following, we consider special term-dependent frequency index sets of the structure

[TABLE]

with a subset of coordinate indices $\emptyset\neq\bm{u}\subseteq\mathcal{D}$ , a weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ and cut-off $N_{\bm{u}}\in\mathbb{N}$ . For a given subset of ANOVA terms $U\subseteq\mathcal{P}(\mathcal{D})$ we estimate the cardinalities of both $I(U)$ and the difference set $\mathcal{D}(I(U))$ .

Lemma 5.7.

Let $U\subseteq\mathcal{P}(\mathcal{D})$ be a subset of ANOVA terms, $I_{\emptyset}=\{\bm{0}\}$ , and $I_{\bm{u}}$ , $\emptyset\neq\bm{u}\in U$ , finite frequency sets as in (34) for a weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ and $N_{\bm{u}}\in\mathbb{N}$ . Moreover, let $h_{\min}\colon\mathbb{N}\rightarrow[1,\infty)$ and $h_{\max}\colon\mathbb{N}\rightarrow[1,\infty)$ be functions such that

[TABLE]

with $\bm{u}_{\min}=\operatorname*{arg\,min}_{\bm{u}\in U\setminus\{\emptyset\}}\left|I_{\bm{u}}\right|$ , $\bm{u}_{\max}=\operatorname*{arg\,max}_{\bm{u}\in U}\left|I_{\bm{u}}\right|$ , and $0<c\leq C$ . Then we have for the asymptotic behavior of the cardinality of $I(U)$

[TABLE]

*The constants do not depend on the spatial dimension $d$ . *

Proof 5.8.

Since the projected sets $\mathrm{P}_{\bm{u}}I_{\bm{u}}$ , $\bm{u}\in U$ , are disjoint, we have

[TABLE]

In order to show the upper bound, we estimate the cardinality of each index set by $h_{\max}$

[TABLE]

*The lower bound follows with similar arguments. *

Corollary 5.9.

Let $U\subseteq\mathcal{P}(\mathcal{D})$ be a subset of ANOVA terms, $I_{\emptyset}=\{\bm{0}\}$ , and $I_{\bm{u}}$ , $\emptyset\neq\bm{u}\in U$ , finite symmetric frequency sets as in (34) for a weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ and $N_{\bm{u}}\in\mathbb{N}$ . Moreover, let $h_{\max}$ be a function as in Lemma 5.7. Then

[TABLE]

Proof 5.10.

*The corollary is a direct consequence of Lemma 5.4 and Lemma 5.7. *

We may apply [43, Algorithm 8.17] to construct the reconstructing rank-1 lattice $\Lambda(\bm{z},M,I(U))$ via a component-by-component approach. Choosing the set $X=\Lambda(\bm{z},M,I(U))$ as sampling nodes yields a Moore-Penrose inverse of type (32) and we are able to compute the solution to (29) by multiplying with the adjoint Fourier matrix. This computation can be done efficiently using a lattice fast Fourier transform or LFFT, see [43, Section 8.2.2].

5.2.2 Scattered data scenario

In this section, we consider the scenario of scattered data approximation, i.e., we have a fixed set of nodes $X\subseteq\mathbb{T}^{d}$ . Here, we aim to solve the least-squares problem (29) with the iterative LSQR method [42]. Specifically, we are interested in the matrix-free variant, i.e., we do not have to construct the system matrix $\bm{F}_{I(U)}\in\mathbb{C}^{\left|X\right|,\left|I(U)\right|}$ explicitly. The curse of dimensionality would quickly lead to the size of the matrix becoming intractable. The matrix-free variant requires two algorithms, one which takes a vector $\bm{a}\in\mathbb{C}^{\left|I(U)\right|}$ as an input and returns the result of the matrix-vector multiplication $\bm{F}_{I(U)}\bm{a}$ and one that takes $\hat{\bm{a}}\in\mathbb{C}^{\left|X\right|}$ as an input and returns the result of $\bm{F}^{\ast}_{I(U)}\hat{\bm{a}}$ . If we take Remark 5.1 into account, it is only necessary to provide algorithms for fast multiplication with Fourier matrices $\bm{F}_{I_{\bm{u}}}\in\mathbb{C}^{\left|X\right|,\left|I_{\bm{u}}\right|}$ , $\bm{u}\in U$ .
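The matrix-free idea can be sketched with two callables for the products with $\bm{F}_{I(U)}$ and its adjoint. To stay self-contained, the example below uses CGLS, i.e., the conjugate gradient method applied to the normal equations, as a simple stand-in for LSQR; in exact arithmetic both produce the same iterates, but this is not the LSQR implementation referenced above. The products are evaluated with plain loops instead of the NFFT, and the nodes, the index set and the coefficients are made up.

```python
import cmath
import random


def make_operators(X, I):
    """Return callables a -> F a and b -> F^* b without storing the matrix F."""
    def entry(k, x):
        return cmath.exp(2j * cmath.pi * sum(ki * xi for ki, xi in zip(k, x)))

    def matvec(a):
        return [sum(entry(k, x) * ak for k, ak in zip(I, a)) for x in X]

    def rmatvec(b):
        return [sum(entry(k, x).conjugate() * bj for x, bj in zip(X, b)) for k in I]

    return matvec, rmatvec


def norm2(v):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(vi) ** 2 for vi in v)


def cgls(matvec, rmatvec, y, n, iters=50, tol=1e-20):
    """Minimize ||y - F x||_2 using only products with F and F^*."""
    x = [0j] * n
    r = list(y)        # residual y - F x
    s = rmatvec(r)     # gradient direction F^* r
    p = list(s)
    gamma = norm2(s)
    for _ in range(iters):
        if gamma <= tol:
            break
        q = matvec(p)
        alpha = gamma / norm2(q)
        x = [xi + alpha * pv for xi, pv in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = rmatvec(r)
        gamma_new = norm2(s)
        p = [si + (gamma_new / gamma) * pv for si, pv in zip(s, p)]
        gamma = gamma_new
    return x


rng = random.Random(0)
X = [(rng.random(), rng.random()) for _ in range(40)]  # scattered nodes
I = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
c_true = [1.0, 0.5, 0.5, 0.25j, -0.25j]

matvec, rmatvec = make_operators(X, I)
y = matvec(c_true)  # noise-free samples of a trigonometric polynomial
c_est = cgls(matvec, rmatvec, y, n=len(I))
print(max(abs(a - b) for a, b in zip(c_est, c_true)))
```

In the setting of this section, the two callables would instead be assembled term by term as in Remark 5.1, with a fast transform for each block.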

The existence of such algorithms depends on the choice of the specific index sets $I_{\bm{u}}$ . For full grids, i.e., frequency sets of the type

[TABLE]

the non-equispaced fast Fourier transform (NFFT) was introduced in [31]. Moreover, for hyperbolic cross index sets of the form

[TABLE]

with $\hat{G}_{\bm{n}}=\times_{s=1}^{\left|\bm{u}\right|}\hat{G}_{n_{s}}$ and $\hat{G}_{n_{s}}=(-2^{n_{s}-1},2^{n_{s}-1}]^{\left|\bm{u}\right|}\cap\mathbb{Z}$ , we have the non-equispaced hyperbolic cross fast Fourier transform (NHCFFT), cf. [10].

5.3 Approximation with active set

Now that we have obtained the active set $U_{X,\bm{y}}^{(\bm{\mathrm{\varepsilon}})}$ from (28), we aim to construct an approximation using only these ANOVA terms. The global sensitivity indices $\varrho(\bm{u},S_{I(U_{d_{s}})}^{X}f)$ calculated from the approximation $S_{I(U_{d_{s}})}^{X}f$ in (26) provide us with a basis to choose term-dependent frequency index sets $I_{\bm{u}}\subseteq(\mathbb{Z}\setminus\{0\})^{\left|\bm{u}\right|}$ , $\emptyset\neq\bm{u}\in U_{X,\bm{y}}^{(\bm{\mathrm{\varepsilon}})}$ . A higher sensitivity index suggests that the term is more important to the function and therefore a larger corresponding index set could be advisable.
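As a minimal sketch of how the global sensitivity indices and an active set can be read off from approximate Fourier coefficients, consider the following Python snippet. It assumes that the variance of the term $f_{\bm{u}}$ is the sum of $|\hat{f}_{\bm{k}}|^{2}$ over all frequencies whose support is exactly $\bm{u}$, that the indices are normalized by the total variance, and that a term is kept when its index exceeds the order-dependent threshold. The function names, the zero-based coordinate indices, and the sample data are hypothetical.

```python
def sensitivity_indices(freqs, coeffs):
    """Global sensitivity index per ANOVA term from Fourier coefficients."""
    variance = {}
    for k, c in zip(freqs, coeffs):
        u = tuple(i for i, ki in enumerate(k) if ki != 0)  # support of k
        if u:  # the zero frequency carries the mean, not variance
            variance[u] = variance.get(u, 0.0) + abs(c) ** 2
    total = sum(variance.values())
    return {u: v / total for u, v in variance.items()}

def active_set(gsi, eps):
    """Keep terms whose index exceeds eps[|u| - 1] (order-dependent)."""
    return {u for u, rho in gsi.items() if rho > eps[len(u) - 1]}

# Hypothetical coefficients of a function in three variables.
freqs = [(1, 0, 0), (-1, 0, 0), (0, 2, 0), (1, 0, 1), (0, 0, 0)]
coeffs = [1.0, 1.0, 0.1, 0.5, 3.0]

gsi = sensitivity_indices(freqs, coeffs)
active = active_set(gsi, eps=[0.01, 0.05])
```

In this toy example the term for the second variable falls below the first-order threshold and is dropped, while the first variable and the interaction of the first and third variable are kept.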

We project the index sets as before to obtain $I(U_{X,\bm{y}}^{(\bm{\mathrm{\varepsilon}})})$ , see (30). Note that in general and depending on the threshold $\bm{\varepsilon}$ , we have reduced the number of frequencies significantly. This is a sensible measure to reduce the effects of overfitting. Now, we approximate $f$ by the Fourier partial sum

[TABLE]

The Fourier coefficients $\mathrm{c}_{\bm{k}}\!\left(f\right)$ are again unknown and we determine them by least-squares approximation from $X$ and $\bm{y}$ . The unique solution is given by

[TABLE]

if the Fourier matrix $\bm{F}_{I(U_{X,\bm{y}}^{(\bm{\mathrm{\varepsilon}})})}=\left(\mathrm{e}^{2\pi\mathrm{i}\bm{k}\cdot\bm{x}}\right)_{\bm{x}\in X,\bm{k}\in I(U_{X,\bm{y}}^{(\bm{\mathrm{\varepsilon}})})}$ has full rank. Details on how to solve this system for scattered data and black-box approximation can be found in Section 5.2. We use the elements of the solutions vector $\hat{\bm{f}}_{\text{sol}}=(\hat{f}_{\bm{k}})_{\bm{k}\in I(U_{X,\bm{y}}^{(\bm{\mathrm{\varepsilon}})})}$ to form the approximate Fourier partial sum and our solution

[TABLE]

The following algorithm summarizes the proposed method.

6 Error analysis

The error of our approximation method measured in the norm of some space $H\subseteq\mathrm{L}_{2}(\mathbb{T}^{d})$ can be decomposed into multiple components by the triangle inequality

[TABLE]

for an active set of ANOVA terms $U\subseteq U_{d_{s}}$ with superposition threshold $d_{s}\in\mathcal{D}$ . We distinguish between the ANOVA truncation error and the approximation error. Here, the analysis of the ANOVA truncation error is independent of the concrete approximation problem (35) and the scenario (scattered data or black-box).

6.1 ANOVA truncation error

The ANOVA truncation error is related to the truncation of the ANOVA decomposition to the set $U_{d_{s}}$ with superposition threshold $d_{s}\in\mathcal{D}$ and the active set $U\subseteq U_{d_{s}}$ . We can separate the ANOVA truncation error as follows

[TABLE]

Here, we bring $\mathrm{T}_{d_{s}}$ in with the aim to relate the error to our function class of low-order interactions, see (12). To control the second term, we require assumptions on the sensitivity indices of the ANOVA terms in $U_{d_{s}}\setminus U$ . Since the error is only related to the structure of the function it can be considered independently of any specific approximation scenario like black-box or scattered data approximation. We show bounds for this error in the case that $f$ is an element of a Sobolev type space $\mathrm{H}^{w}(\mathbb{T}^{d})$ or a Wiener algebra $\mathcal{A}^{w}(\mathbb{T}^{d})$ and $H$ is $\mathrm{L}_{2}(\mathbb{T}^{d})$ or $\mathrm{L}_{\infty}(\mathbb{T}^{d})$ .

Theorem 6.1.

Let $f\in\mathrm{H}^{w}(\mathbb{T}^{d})$ with a weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ and superposition dimension $\mathrm{d}^{(\mathrm{sp})}$ , see (10), for a $\delta\in(0,1)$ . If there exists a subset of ANOVA terms $U\subseteq U_{\mathrm{d}^{(\mathrm{sp})}}$ such that

[TABLE]

for every $\bm{u}\in U_{\mathrm{d}^{(\mathrm{sp})}}\setminus U$ then

[TABLE]

Proof 6.2.

The ANOVA truncation error can be separated as in (36). We prove an upper bound for the active set truncation. With Parseval’s equality and the assumption on the global sensitivity indices, we estimate

[TABLE]

Clearly, we have $\sigma^{2}(f)\leq\left\|f\right\|_{\mathrm{L}_{2}(\mathbb{T}^{d})}^{2}\leq\left\|f\right\|_{\mathrm{H}^{w}(\mathbb{T}^{d})}^{2}$ .

Theorem 6.3.

Let $f\in\mathcal{A}^{w}(\mathbb{T}^{d})$ with a weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ . If there exists a subset of ANOVA terms $U\subseteq U_{d_{s}}$ , $d_{s}\in\mathcal{D}$ , such that

[TABLE]

for every $\bm{u}\in U_{d_{s}}\setminus U$ and we have

[TABLE]

then

[TABLE]

Proof 6.4.

We split the ANOVA truncation error as in (36) and prove an upper bound for the second part. To this end, we estimate the $\mathrm{L}_{\infty}$ norm of $f$ by the absolute values of its Fourier coefficients and apply (38) to obtain

[TABLE]

Naturally, it holds that $\sum_{\bm{k}\in\mathbb{Z}^{d}}\left|\mathrm{c}_{\bm{k}}\!\left(f\right)\right|\leq\left\|f\right\|_{\mathcal{A}^{w}(\mathbb{T}^{d})}$ which leads to the desired estimate.

Note that in order to prove a bound for the error in $\mathrm{L}_{\infty}$ , we formulated a condition on an $\ell_{1}$ equivalent of the global sensitivity indices $\varrho(\bm{u},f)$ in accordance with the Wiener algebra norm.

6.2 Approximation error

In this section, we focus on the approximation error which we separate into two parts as well

[TABLE]

with $H\in\{\mathrm{L}_{2}(\mathbb{T}^{d}),\mathrm{L}_{\infty}(\mathbb{T}^{d})\}$ , a subset of ANOVA terms $U\subseteq\mathcal{P}(\mathcal{D})$ , and a finite frequency index set $I(U)\subseteq\mathbb{Z}^{d}$ of structure (30) with sets $I_{\bm{u}}$ as in (34). The truncation error remains independent of the approximation scenario and can be estimated by the norms in $\mathcal{A}^{w}$ and $\mathrm{H}^{w}$ .

Lemma 6.5.

Let $f\in\mathrm{H}^{w}(\mathbb{T}^{d})$ , $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ a weight function, and $I(U)\subseteq\mathbb{Z}^{d}$ a finite frequency index set of type (30) with $U\subseteq\mathcal{P}(\mathcal{D})$ . Then the relative truncation error can be estimated as

[TABLE]

If in addition we have $\sum_{\bm{k}\in\mathbb{Z}^{d}}\frac{1}{w^{2}(\bm{k})}<\infty$ , we can estimate

[TABLE]

Proof 6.6.

In order to prove (40) we employ Parseval’s identity and use the weight $w(\bm{k})$

[TABLE]

For the bound (41) we estimate the norm by the absolute sum of the Fourier coefficients and use the Cauchy-Schwarz inequality

[TABLE]
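The two steps of this proof can be sketched as follows, assuming the norm $\left\|f\right\|_{\mathrm{H}^{w}(\mathbb{T}^{d})}^{2}=\sum_{\bm{k}\in\mathbb{Z}^{d}}w^{2}(\bm{k})\left|\mathrm{c}_{\bm{k}}\!\left(f\right)\right|^{2}$ and writing $S_{I(U)}f$ for the Fourier partial sum over $I(U)$. This is the standard argument in generic form; the constants stated in (40) and (41) may be expressed differently, e.g., in terms of the parameters $N_{\bm{u}}$.

```latex
\begin{align*}
\left\|f-S_{I(U)}f\right\|_{\mathrm{L}_{2}(\mathbb{T}^{d})}^{2}
  &= \sum_{\bm{k}\in\mathbb{Z}^{d}\setminus I(U)} \left|\mathrm{c}_{\bm{k}}(f)\right|^{2}
   = \sum_{\bm{k}\in\mathbb{Z}^{d}\setminus I(U)} \frac{1}{w^{2}(\bm{k})}\, w^{2}(\bm{k}) \left|\mathrm{c}_{\bm{k}}(f)\right|^{2} \\
  &\leq \sup_{\bm{k}\in\mathbb{Z}^{d}\setminus I(U)} \frac{1}{w^{2}(\bm{k})}\,
        \left\|f\right\|_{\mathrm{H}^{w}(\mathbb{T}^{d})}^{2}, \\
\left\|f-S_{I(U)}f\right\|_{\mathrm{L}_{\infty}(\mathbb{T}^{d})}
  &\leq \sum_{\bm{k}\in\mathbb{Z}^{d}\setminus I(U)} \frac{1}{w(\bm{k})}\, w(\bm{k}) \left|\mathrm{c}_{\bm{k}}(f)\right|
   \leq \Bigg( \sum_{\bm{k}\in\mathbb{Z}^{d}\setminus I(U)} \frac{1}{w^{2}(\bm{k})} \Bigg)^{1/2}
        \left\|f\right\|_{\mathrm{H}^{w}(\mathbb{T}^{d})}.
\end{align*}
```

The first chain uses Parseval's identity, the second the Cauchy-Schwarz inequality together with the summability of $1/w^{2}$.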

Lemma 6.7.

Let $f\in\mathcal{A}^{w}(\mathbb{T}^{d})$ with $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ a weight function such that $\sum_{\bm{k}\in\mathbb{Z}^{d}}\frac{1}{w^{2}(\bm{k})}<\infty$ , and $I(U)\subseteq\mathbb{Z}^{d}$ a finite frequency index set of type (30) with $U\subseteq\mathcal{P}(\mathcal{D})$ and sets $I_{\bm{u}}$ as in (34). Then the relative truncation error can be estimated as

[TABLE]

Proof 6.8.

The proof requires similar steps to the proof of Lemma 6.5.

For the aliasing error in (39), we start by considering the black-box approximation case where we solve the least-squares problem as described in Section 5.2.

Theorem 6.9.

Let $f\in\mathrm{H}^{w}(\mathbb{T}^{d})$ with a weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ such that $\sum_{\bm{k}\in\mathbb{Z}^{d}}\frac{1}{w^{2}(\bm{k})}<\infty$ and $I(U)\subseteq\mathbb{Z}^{d}$ a finite frequency index set of type (30) with sets $I_{\bm{u}}$ as in (34). Moreover, we have a reconstructing rank-1 lattice $\Lambda(\bm{z},M,I(U))$ for a generating vector $\bm{z}\in\mathbb{Z}^{d}$ and lattice size $M\in\mathbb{N}$ . Then the aliasing error can be estimated as

[TABLE]

Furthermore, if $f\in\mathcal{A}^{w}(\mathbb{T}^{d})$ we get for the $\mathrm{L}_{\infty}$ -norm

[TABLE]

Proof 6.10.

We show the bound (42) by first applying Parseval’s identity and (3)

[TABLE]

We then incorporate the weight and utilize the Cauchy-Schwarz inequality to obtain

[TABLE]

From [43, Lemma 8.13] we know that for fixed $\bm{k}\in I(U)$ we have disjoint sets

[TABLE]

This means we are able to estimate

[TABLE]

such that

[TABLE]

Using that the sets $M_{\bm{k}}$ are disjoint and $\bigcup_{\bm{k}\in I(U)}M_{\bm{k}}\subseteq\mathbb{Z}^{d}\setminus I(U)$ yields

[TABLE]

The $\mathrm{L}_{\infty}$ -bound (43) is obtained similarly to the method used in the proof of [43, Theorem 8.14]. We proceed as follows

[TABLE]

The result is obtained by estimating the sum by $\left\|f\right\|_{\mathcal{A}^{w}(\mathbb{T}^{d})}$ .
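The aliasing mechanism behind these estimates can be observed numerically: on a rank-1 lattice, a frequency $\bm{h}\notin I(U)$ with $\bm{h}\cdot\bm{z}\equiv\bm{k}\cdot\bm{z}\pmod{M}$ is indistinguishable from $\bm{k}\in I(U)$. The following Python sketch shows this for a small two-dimensional example; the lattice $\bm{z}=(1,5)$, $M=25$, the index set, and the helper `lattice_coeffs` are illustrative assumptions.

```python
import numpy as np

z, M = np.array([1, 5]), 25
I = np.array([(a, b) for a in range(-2, 3) for b in range(-2, 3)])
residues = (I @ z) % M          # pairwise distinct: the lattice reconstructs I
j = np.arange(M)

def lattice_coeffs(h):
    """Coefficients on I computed from lattice samples of exp(2 pi i h.x)."""
    samples = np.exp(2j * np.pi * j * int(h @ z) / M)
    return (np.fft.fft(samples) / M)[residues]

inside = lattice_coeffs(np.array([1, -2]))    # frequency in I
outside = lattice_coeffs(np.array([1, 3]))    # frequency outside I

# Position of k = (1, -2) in I; (1, 3) differs from it by (0, 5), and
# (0, 5).z = 25 is a multiple of M, so both produce the same samples.
target = int(np.flatnonzero((I == [1, -2]).all(axis=1))[0])
```

Both calls return the same vector, a single unit entry at the position of $\bm{k}=(1,-2)$, which is the contribution that the aliasing error collects from frequencies outside the index set.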

In the following we consider the approximation error for scattered data approximation with a fixed node set $X\subseteq\mathbb{T}^{d}$ . Previously, we assumed that the index set $I(U)$ and the node set $X$ are such that the Fourier matrix $\bm{F}_{I(U)}$ has full rank. In this case the least-squares problem (29) has a unique solution. Assuming that the nodes in $X$ are i.i.d. random variables that are uniformly distributed in $\mathbb{T}^{d}$ , it is possible to achieve good bounds on the approximation error, see [2, 20, 29, 39].

Lemma 6.11.

Let $f\in\mathrm{H}^{w}(\mathbb{T}^{d})$ with a weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ such that $\sum_{\bm{k}\in\mathbb{Z}^{d}}\frac{1}{w^{2}(\bm{k})}<\infty$ , $X\subseteq\mathbb{T}^{d}$ a finite set of i.i.d. uniformly distributed points, $\bm{y}=(f(\bm{x}))_{\bm{x}\in X}$ , and $I(U)\subseteq\mathbb{Z}^{d}$ a finite frequency index set of type (30) with $U\subseteq\mathcal{P}(\mathcal{D})$ a subset of ANOVA terms and sets $I_{\bm{u}}$ as in (34). If for the number of frequencies we have $\left|I(U)\right|\leq\frac{\left|X\right|}{7r\log\left|X\right|},r>0$ , then

[TABLE]

with a probability of at least $1-3\left|X\right|^{1-r}$ for $\theta_{I(U)}=\left\|f-S_{I(U)}f\right\|_{\mathrm{L}_{2}(\mathbb{T}^{d})}$ and $\kappa=\frac{1+\sqrt{5}}{2}$ .
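To get a feeling for the condition on the number of frequencies, the following short Python sketch evaluates the largest admissible $\left|I(U)\right|$ and the probability bound for a given number of nodes. It assumes that $\log$ denotes the natural logarithm; the function names are hypothetical, and the node count is the one used in the numerical experiments.

```python
import math

def max_frequencies(num_nodes, r):
    """Largest |I(U)| with |I(U)| <= |X| / (7 r log|X|), natural log assumed."""
    return math.floor(num_nodes / (7.0 * r * math.log(num_nodes)))

def success_probability(num_nodes, r):
    """Lower bound 1 - 3 |X|^(1 - r) on the probability in the lemma."""
    return 1.0 - 3.0 * num_nodes ** (1.0 - r)

num_nodes = 2_500_000
for r in (1.5, 2.0, 3.0):
    print(r, max_frequencies(num_nodes, r), success_probability(num_nodes, r))
```

Larger values of $r$ increase the probability bound but shrink the admissible index set, and for $r\leq 1$ the bound carries no information.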

Proof 6.12.

The setting of this lemma is a special case of [39, Theorem 5.1].

The following theorem deals with the actual approximation error by incorporating the previous lemma.

Theorem 6.13.

Let $f\in\mathrm{H}^{w}(\mathbb{T}^{d})$ with a weight function $w\colon\mathbb{Z}^{d}\rightarrow[1,\infty)$ such that $\sum_{\bm{k}\in\mathbb{Z}^{d}}\frac{1}{w^{2}(\bm{k})}<\infty$ , and let $I(U)\subseteq\mathbb{Z}^{d}$ be a finite frequency index set of type (30) with $U\subseteq\mathcal{P}(\mathcal{D})$ and sets $I_{\bm{u}}$ as in (34). Moreover, let $S_{I(U)}^{X}f$ be the corresponding approximate Fourier partial sum obtained through the scattered data approximation method described in Section 5.2. If the elements of $X\subseteq\mathbb{T}^{d}$ are i.i.d. random variables uniformly distributed on $\mathbb{T}^{d}$ and for the number of frequencies we have $\left|I(U)\right|\leq\frac{\left|X\right|}{7r\log\left|X\right|},r>0$ , then

[TABLE]

with a probability of at least $1-3\left|X\right|^{1-r}$ for $\theta_{I(U)}=\left\|f-S_{I(U)}f\right\|_{\mathrm{L}_{2}(\mathbb{T}^{d})}$ and $\kappa=\frac{1+\sqrt{5}}{2}$ .

Proof 6.14.

We denote the Fourier coefficients with $\hat{\bm{c}}=(\mathrm{c}_{\bm{k}}\!\left(f\right))_{\bm{k}\in I(U)}$ and the approximate Fourier coefficients computed by Algorithm 1 with $\hat{\bm{f}}=(\hat{f}_{\bm{k}})_{\bm{k}\in I(U)}$ . With Parseval’s identity as well as the Moore-Penrose inverse we obtain

[TABLE]

We use the properties of the spectral norm and estimate further

[TABLE]

Applying [39, Theorem 2.3] yields

[TABLE]

Finally, we use Lemma 6.11 to obtain our bound

[TABLE]

with a probability of at least $1-3\left|X\right|^{1-r}$ .

This concludes the consideration of the error of the presented method in both approximation scenarios. We were able to achieve bounds for $\mathrm{L}_{2}$ and $\mathrm{L}_{\infty}$ for functions in weighted Wiener algebras and Sobolev type spaces.

7 Numerical Results

We present numerical results for the method described in Section 5 for a test function $f\colon[0,1)^{9}\rightarrow\mathbb{R}$ ,

[TABLE]

where $B_{2}$ , $B_{4}$ and $B_{6}$ are parts of univariate, shifted, scaled and dilated B-splines of order 2, 4, and 6, respectively, see Figure 3 for illustration. Their Fourier series is given by

[TABLE]

with $\mathrm{sinc}(x)\coloneqq\sin(x)/x$ and the constants $c_{2}\coloneqq\sqrt{3/4}$ , $c_{4}\coloneqq\sqrt{315/604}$ , $c_{6}\coloneqq\sqrt{277200/655177}$ such that $\left\|B_{j}\right\|_{\mathrm{L}_{2}(\mathbb{T}^{d})}=1$ . This allows the direct computation of the Fourier coefficients $\mathrm{c}_{\bm{k}}\!\left(f\right)$ and the norm $\left\|f\right\|_{\mathrm{L}_{2}(\mathbb{T}^{d})}$ . The ANOVA terms $f_{\bm{u}}$ are only nonzero for

[TABLE]

The function $f$ therefore has an exact low-dimensional structure for $d_{s}=3$ , i.e., $\mathrm{T}_{3}f=f$ . This leads to $d_{s}=3$ being the optimal choice for the superposition threshold with no error caused by ANOVA truncation since it corresponds to the superposition dimension $\mathrm{d}^{(\mathrm{sp})}$ for $\delta=1$ , see (13). In an approximation scenario with an unknown function $f$ this information is of course not known.

We consider two errors

[TABLE]

Here, the error $\varepsilon_{\ell_{2}}$ can be regarded as a training error since it is taken at the given sampling set $X$ and the error $\varepsilon_{\mathrm{L}_{2}}$ as a type of generalization error since it measures the error in the Fourier coefficients. Since our goal is to find the important ANOVA terms, i.e., the terms in $U^{\ast}$ , we expect to have an interval (or gap) in which to choose the order-dependent threshold $\bm{\varepsilon}\in[0,1]^{d_{s}}$ . Therefore, we define

[TABLE]

with $1\leq j\leq d_{s}$ and

[TABLE]

Here, the assumption (27) is to be understood for every order of terms, i.e., for $\bm{u}$ and $\bm{v}$ with $\left|\bm{u}\right|=\left|\bm{v}\right|=j$ .

Remark 7.1.

The norm occurring in the error $\varepsilon_{\mathrm{L}_{2}}$ can be calculated using Parseval’s identity

[TABLE]

which is possible since we know the exact Fourier coefficients and the norm of the function $f$ . In general, this error cannot be computed.
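A minimal Python sketch of this computation is given below. It relies on the splitting $\left\|f-S^{X}_{I}f\right\|^{2}_{\mathrm{L}_{2}}=\left\|f\right\|^{2}_{\mathrm{L}_{2}}+\sum_{\bm{k}\in I}|\mathrm{c}_{\bm{k}}(f)-\hat{f}_{\bm{k}}|^{2}-\sum_{\bm{k}\in I}|\mathrm{c}_{\bm{k}}(f)|^{2}$, which follows from Parseval's identity, and assumes that the error is reported relative to $\left\|f\right\|_{\mathrm{L}_{2}}$. The function name and the sample values are hypothetical.

```python
import numpy as np

def relative_l2_error(norm_f, c_exact, c_approx):
    """Relative L2 error from exact and approximate coefficients on I."""
    c_exact = np.asarray(c_exact, dtype=complex)
    c_approx = np.asarray(c_approx, dtype=complex)
    squared = (norm_f ** 2
               + np.sum(np.abs(c_exact - c_approx) ** 2)
               - np.sum(np.abs(c_exact) ** 2))
    # Rounding can push a tiny error slightly below zero.
    return np.sqrt(max(squared, 0.0)) / norm_f

# Hypothetical data: two coefficients inside I, energy 0.04 outside I.
c = np.array([1.0, 0.5j])
norm_f = np.sqrt(1.0 + 0.25 + 0.04)
err_exact_coeffs = relative_l2_error(norm_f, c, c)
err_zero_approx = relative_l2_error(norm_f, c, np.zeros(2))
```

Even with exact coefficients on the index set, the error does not vanish, since the energy of the frequencies outside the index set remains.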

7.1 Scattered Data Approximation

For our numerical experiments we use one sampling set $X\subseteq\mathbb{T}^{9}$ of uniformly distributed nodes with $M\coloneqq\left|X\right|=2.5\cdot 10^{6}$ , and an evaluation vector $\bm{y}=(f(\bm{x}))_{\bm{x}\in X}$ . We are going to start by choosing three as the superposition threshold $d_{s}$ while later reducing it to two which allows us to see the effect of truncating an ANOVA term. Our primary aim for now is to detect the ANOVA terms in $U^{\ast}$ which we achieve using the first step of our method, see Section 5.1. To this end, we choose a frequency index set $I(U_{d_{s}})\subseteq\mathbb{Z}^{9}$ , cf. (24), through order-dependent sets $I_{0}=\{0\}$ , $I_{1}=\{-N_{1}/2,\dots,N_{1}/2-1\}$ , $I_{2}=\{-N_{2}/2,\dots,N_{2}/2-1\}^{2}$ , and $I_{3}=\{-N_{3}/2,\dots,N_{3}/2-1\}^{3}$ with $N_{1},N_{2},N_{3}\in 2\mathbb{N}$ . The method gives us an approximation $S_{I(U_{d_{s}})}^{X}f$ .
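The order-dependent index set can be assembled explicitly for small parameters, which makes the growth of $\left|I(U_{d_{s}})\right|$ with the bandwidths visible. The Python sketch below assumes that each set $I_{j}$ is embedded into $\mathbb{Z}^{d}$ by placing its entries at the coordinates in $\bm{u}$ and zeros elsewhere, and that $I(U_{d_{s}})$ is the union of these embedded sets; the function name and the zero-based coordinates are hypothetical.

```python
from itertools import combinations, product

def order_dependent_index_set(d, bandwidths):
    """Union of embedded full grids; bandwidths[j-1] = N_j must be even."""
    index_set = {(0,) * d}                      # I_0 = {0}
    for order, N in enumerate(bandwidths, start=1):
        grid = range(-N // 2, N // 2)           # {-N/2, ..., N/2 - 1}
        for u in combinations(range(d), order):
            for local in product(grid, repeat=order):
                k = [0] * d
                for pos, val in zip(u, local):
                    k[pos] = val
                index_set.add(tuple(k))         # duplicates merge in the set
    return index_set

small = order_dependent_index_set(3, [4, 2])
print(len(small))
print(len(order_dependent_index_set(9, [8, 4, 2])))
```

Storing the frequencies in a set merges those that appear in several embedded grids, for instance frequencies with a zero entry inside a two-dimensional grid.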

Results of numerical experiments with the function $f$ from (44) and different choices for the bandwidths $N_{1},N_{2}$ , and $N_{3}$ are displayed in Table 1. They show that it is indeed possible to detect the ANOVA terms in $U^{\ast}$ using trigonometric polynomials of small degrees. Moreover, both errors are roughly of the same order. Since our number of samples $M$ is fixed, we are looking for values $\bm{N}$ that balance the effects of underfitting and overfitting. The experiments suggest that the choice in examples 5 and 8 is close to optimal. Figure 4 depicts the global sensitivity indices $\varrho(\bm{u},S_{I(U_{d_{s}})}^{X}f)$ , cf. Algorithm 1, for example 8 from Table 1. The one-dimensional sets $\{i\}$ , $i=1,\dots,9$ , all have large indices as they are all in $U^{\ast}$ while the two-dimensional sets

[TABLE]

are clearly separated from the two-dimensional sets in $U_{d_{s}}\setminus U^{\ast}$ . The same holds for the one three-dimensional term $\{4,8,9\}\in U^{\ast}$ . The size of the intervals $I^{(j)}$ suitable to choose the parameters $\varepsilon_{j}$ is especially relevant since it separates important from unimportant terms.

Since there exist $\bm{N}$ and $\bm{\varepsilon}$ such that we are able to recover the set of ANOVA terms $U^{\ast}$ , we set $U_{X,\bm{y}}^{(\bm{\varepsilon})}=U^{\ast}$ from now on. We aim to improve our approximation quality with the given data by solving the minimization problem (35). Here, we could choose individual index sets for every ANOVA term in $U^{\ast}$ to form $I(U^{\ast})$ based on the global sensitivity indices, but for our function order-dependence can be maintained. Table 2 shows the results of the approximation using the index set $I(U^{\ast})$ .

The number of terms in $U^{\ast}$ is significantly smaller than in $U_{d_{s}}$ such that we are able to increase $\bm{N}$ while balancing the effects of over- and underfitting. We observe that reducing the ANOVA terms to $U^{\ast}$ improves the approximation quality because the model complexity is reduced.

Now that we have experiments with no truncation error in the ANOVA decomposition, i.e., $\mathrm{T}_{3}f=f$ , we repeat the tests with a superposition threshold $d_{s}=2$ . In this case, it is not possible to detect the ANOVA term $f_{\{4,8,9\}}$ which results in the set $U^{+}\coloneqq U^{\ast}\setminus\{\{4,8,9\}\}$ being optimal for the detection step. For the following tests, we use the same nodes as we did previously.

The results of the experiments in Table 3 show that it is possible to determine the terms in $U^{+}$ . Since three-dimensional terms are not included, the term $f_{\{4,8,9\}}$ is not in the approximation which results in larger errors than in Table 1.

Since there exist $\bm{N}\in\mathbb{N}^{2}$ and $\bm{\varepsilon}$ such that $U_{X,\bm{y}}^{(\bm{\varepsilon})}=U^{+}$ , we use $U^{+}$ for the next approximation step with suitable index sets $I(U^{+})$ . The results for different choices of $N_{1}$ and $N_{2}$ are displayed in Table 4. We are able to achieve better errors with the smaller index sets. The cutoff error caused by the missing term $f_{\{4,8,9\}}$ dominates, so adding many more frequencies yields no large benefit.

7.2 Black-Box Approximation

In the following numerical experiments we aim to find a reconstructing rank-1 lattice, see Section 2.1, for the function $f$ . In the first step, our goal is to determine the set of ANOVA terms $U^{\ast}$ and later use it to improve our approximation quality. As discussed in [28], the function $f$ works well with hyperbolic cross index sets of dominating mixed smoothness $3/2$ . Therefore, we define

[TABLE]

We choose as order-dependent index sets $I_{0}=\{0\}$ , $I_{1}=\mathcal{H}_{1}^{N_{1}}$ , $I_{2}=\mathcal{H}_{2}^{N_{2}}$ , and $I_{3}=\mathcal{H}_{3}^{N_{3}}$ with $N_{1},N_{2},N_{3}\in\mathbb{N}$ to obtain $I(U_{d_{s}})$ as in (23). The method then gives us a reconstructing rank-1 lattice $X\coloneqq\Lambda(\bm{z},M,I(U_{d_{s}}))$ with generating vector $\bm{z}\in\mathbb{Z}^{9}$ and lattice size $M\in\mathbb{N}$ by employing the component-by-component construction from [43, Algorithm 8.17]. The approximation is defined as $S_{I(U_{d_{s}})}^{X}f$ .

Table 5 shows results of numerical experiments with $f$ , see (44), and different choices for the parameters $N_{1},N_{2}$ , and $N_{3}$ . We can see that there exists an $\bm{\varepsilon}$ such that it is possible to detect the active set of terms $U^{\ast}$ in every test scenario. The lattice size increases with the growing index set as expected. Note that it is sufficient to use an index set of 3481 frequencies and a lattice with only $46351$ evaluations in order to detect the active set of ANOVA terms.

Now, we set the active set $U_{X,\bm{y}}^{(\bm{\varepsilon})}=U^{\ast}$ . The aim is again to improve our approximation quality by solving the minimization problem (35). We also maintain the order-dependence of the set $I(U^{\ast})$ based on the structure of the function. Table 6 shows the results of the approximation using the index set $I(U^{\ast})$ . Larger cutoff parameters $N_{i}$ become possible such that we are able to achieve a good approximation error with relatively small lattice sizes in relation to our problem dimension. The sizes of our reconstructing lattices stay manageable as well.

8 Summary

In this paper we considered the classical ANOVA decomposition for periodic functions. We studied different index sets $\mathbb{P}_{\bm{u}}^{(d)}$ and $\mathbb{F}_{\bm{u}}^{(d)}$ for the projections $\mathrm{P}_{\bm{u}}f$ and ANOVA terms $f_{\bm{u}}$ , respectively, and proved their properties as well as formulas for the Fourier coefficients. For functions in Sobolev type spaces $\mathrm{H}^{w}(\mathbb{T}^{d})$ and the weighted Wiener algebra $\mathcal{A}^{w}(\mathbb{T}^{d})$ we showed that a function inherits its smoothness to both the projections and ANOVA terms.

Moreover, we related the smoothness of a function characterized by the decay of its Fourier coefficients to the class of functions of a low-dimensional structure and considered relative errors for $\mathrm{L}_{\infty}$ and $\mathrm{L}_{2}$ weighted by the corresponding Sobolev and Wiener algebra norms. This led to an upper bound for the modified superposition dimension $\mathrm{d}^{(\mathrm{sp})}$ in those spaces. For product and order-dependent weights $w^{\alpha,\beta}$ we were able to obtain specific bounds.

We introduced an approximation method for high-dimensional functions that are of a low-dimensional structure in Section 5. The method can be employed in black-box and scattered data approximation. In the former scenario one needs a special discretization for the index sets of type $I(U)$ , e.g., a rank-1 lattice, and in the latter an algorithm to realize an efficient multiplication with the Fourier matrices. We proved results for the error of the method, see Section 6, in $\mathrm{L}_{2}$ and $\mathrm{L}_{\infty}$ . An $\mathrm{L}_{\infty}$ bound in the scattered data case for the aliasing error in (39) is still open. Here, one needs to consider estimating the quantity

[TABLE]

Numerical experiments with a benchmark function were successfully performed in Section 7. The active set detection works well for this function in both approximation scenarios and even for small degrees of trigonometric polynomials. A definite goal is to perform experiments on real-world data sets and try to determine attribute rankings.

Moreover, it is possible to consider a similar analysis of the ANOVA decomposition in weighted Lebesgue spaces with orthogonal polynomials as bases, e.g., the Chebyshev system. This would also allow a generalization of the approximation method to a non-periodic setting, cf. [44].

Acknowledgments

First of all, we thank the referees for their valuable comments and suggestions. The authors also thank Lutz Kämmerer, Toni Volkmer, and Tino Ullrich for fruitful discussions and remarks on the contents of the paper. Moreover, we thank Tino Ullrich for comments on Theorem 6.13 and Lemma 6.11 which improved an earlier bound. DP acknowledges funding by Deutsche Forschungsgemeinschaft (German Research Foundation) – Project–ID 416228727 – SFB 1410. MS is supported by the BMBF grant 01 $|$ S20053A.

Bibliography

[1] J. Baldeaux and M. Gnewuch, Optimal randomized multilevel algorithms for infinite-dimensional integration on function spaces with ANOVA-type decomposition, SIAM J. Numer. Anal., 52 (2014), pp. 1128–1155, https://doi.org/10.1137/120896001.
[2] R. F. Bass and K. Gröchenig, Random sampling of multivariate trigonometric polynomials, SIAM J. Math. Anal., 36 (2004), pp. 773–795, https://doi.org/10.1137/S0036141003432316.
[3] Å. Björck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, PA, USA, 1996.
[4] H.-J. Bungartz and M. Griebel, Sparse grids, Acta Numer., 13 (2004), pp. 147–269, https://doi.org/10.1017/S0962492904000182.
[5] G. Byrenheid, L. Kämmerer, T. Ullrich, and T. Volkmer, Tight error bounds for rank-1 lattice sampling in spaces of hybrid mixed smoothness, Numer. Math., 136 (2017), pp. 993–1034, https://doi.org/10.1007/s00211-016-0861-7.
[6] R. Caflisch, W. Morokoff, and A. Owen, Valuation of mortgage-backed securities using Brownian bridges to reduce effective dimension, J. Comput. Finance, 1 (1997), pp. 27–46, https://doi.org/10.21314/jcf.1997.005.
[7] D. Dũng, V. N. Temlyakov, and T. Ullrich, Hyperbolic Cross Approximation, Advanced Courses in Mathematics – CRM Barcelona, Birkhäuser, Cham, 2018.
[8] J. Dick, F. Y. Kuo, and I. H. Sloan, High-dimensional integration: The quasi-Monte Carlo way, Acta Numer., 22 (2013), pp. 133–288, https://doi.org/10.1017/S0962492913000044.
