Measure Selection for Functional Linear Model
Su I Iao, Hans-Georg Müller

TL;DR
This paper introduces a flexible functional linear model that adaptively chooses the measure defining the function space, improving predictive accuracy over traditional models especially for complex data.
Contribution
It proposes a novel data-adaptive measure selection method for functional linear models, extending the framework beyond the standard Lebesgue measure.
Findings
Improved predictive performance with the adaptive measure approach.
Consistent outperformance over traditional models in simulations.
Effective application to COVID-19 and health survey data.
| Scenario | Sample size | Method | | | |
|---|---|---|---|---|---|
| I | 50 | FLM | 1.004 (0.215) | 1.000 (0.214) | 1.001 (0.217) |
| I | 50 | wFLM | 0.905 (0.218) | 0.813 (0.197) | 0.815 (0.200) |
| I | 100 | FLM | 0.975 (0.142) | 0.973 (0.142) | 0.973 (0.142) |
| I | 100 | wFLM | 0.846 (0.154) | 0.722 (0.134) | 0.703 (0.137) |
| I | 200 | FLM | 0.967 (0.100) | 0.967 (0.101) | 0.968 (0.100) |
| I | 200 | wFLM | 0.821 (0.118) | 0.668 (0.088) | 0.644 (0.082) |
| I | 500 | FLM | 0.958 (0.060) | 0.958 (0.061) | 0.958 (0.060) |
| I | 500 | wFLM | 0.796 (0.084) | 0.636 (0.042) | 0.614 (0.039) |
| II | 50 | FLM | 0.084 (0.029) | 0.078 (0.027) | 0.076 (0.025) |
| II | 50 | wFLM | 0.057 (0.026) | 0.044 (0.024) | 0.038 (0.021) |
| II | 100 | FLM | 0.065 (0.018) | 0.059 (0.014) | 0.058 (0.014) |
| II | 100 | wFLM | 0.041 (0.019) | 0.024 (0.014) | 0.022 (0.013) |
| II | 200 | FLM | 0.052 (0.011) | 0.049 (0.008) | 0.049 (0.008) |
| II | 200 | wFLM | 0.031 (0.016) | 0.015 (0.008) | 0.013 (0.007) |
| II | 500 | FLM | 0.045 (0.004) | 0.044 (0.004) | 0.043 (0.004) |
| II | 500 | wFLM | 0.022 (0.013) | 0.010 (0.003) | 0.008 (0.003) |

| Sample size | Method | Measurements per trajectory = 5-10 | Measurements per trajectory = 20 |
|---|---|---|---|
| 100 | FLM | 1000.49 (1267.99) | 782.28 (1063.72) |
| 100 | wFLM (Exp) | 116.09 (388.10) | 61.53 (435.21) |
| 100 | wFLM (HalfNorm) | 105.80 (478.56) | 52.83 (385.37) |
| 200 | FLM | 804.27 (1016.61) | 678.97 (956.70) |
| 200 | wFLM (Exp) | 82.08 (240.86) | 31.27 (330.86) |
| 200 | wFLM (HalfNorm) | 76.13 (187.65) | 39.82 (324.44) |

| wFLM (Step) | wFLM (Exp) | FLM (Lebesgue) |
|---|---|---|
| 0.513 (0.809) | 0.263 (0.328) | 0.779 (1.027) |

| wFLM (Exp) | FLM (Lebesgue) |
|---|---|
| 1.022 (1.633) | 1.398 (8.814) |

| Sample size | Method | | | |
|---|---|---|---|---|
| 50 | FLM | 0.032 (0.006) | 0.030 (0.004) | 0.033 (0.005) |
| 50 | wFLM | 0.180 (0.022) | 0.236 (0.017) | 0.422 (0.055) |
| 100 | FLM | 0.030 (0.004) | 0.030 (0.003) | 0.035 (0.006) |
| 100 | wFLM | 0.224 (0.024) | 0.323 (0.024) | 0.693 (0.065) |
| 200 | FLM | 0.031 (0.008) | 0.031 (0.004) | 0.040 (0.004) |
| 200 | wFLM | 0.342 (0.056) | 0.504 (0.051) | 1.223 (0.108) |
| 500 | FLM | 0.033 (0.005) | 0.037 (0.011) | 0.057 (0.010) |
| 500 | wFLM | 0.667 (0.071) | 1.106 (0.246) | 2.785 (0.213) |

| Sample size | Method | | |
|---|---|---|---|
| 100 | FLM | 0.076 (0.009) | 0.291 (0.041) |
| 100 | wFLM | 0.795 (0.125) | 4.395 (0.133) |
| 200 | FLM | 0.218 (0.041) | 1.298 (0.147) |
| 200 | wFLM | 3.208 (0.126) | 17.759 (0.438) |

| | Sample size | | | |
|---|---|---|---|---|
| 0.00 | 50 | 1.077 (0.168) | 1.077 (0.167) | 1.077 (0.166) |
| 0.00 | 100 | 0.998 (0.140) | 0.999 (0.140) | 0.999 (0.140) |
| 0.00 | 200 | 0.983 (0.159) | 0.983 (0.159) | 0.983 (0.158) |
| 0.00 | 500 | 0.968 (0.135) | 0.968 (0.135) | 0.968 (0.135) |
| 0.25 | 50 | 1.086 (0.171) | 1.083 (0.178) | 1.090 (0.170) |
| 0.25 | 100 | 1.009 (0.144) | 1.003 (0.143) | 1.008 (0.143) |
| 0.25 | 200 | 0.983 (0.155) | 0.983 (0.158) | 0.988 (0.157) |
| 0.25 | 500 | 0.971 (0.136) | 0.970 (0.136) | 0.971 (0.136) |
| 0.50 | 50 | 1.086 (0.172) | 1.082 (0.177) | 1.089 (0.169) |
| 0.50 | 100 | 1.010 (0.143) | 1.003 (0.143) | 1.008 (0.143) |
| 0.50 | 200 | 0.983 (0.155) | 0.983 (0.158) | 0.988 (0.157) |
| 0.50 | 500 | 0.971 (0.136) | 0.970 (0.136) | 0.971 (0.136) |
| 0.75 | 50 | 1.088 (0.173) | 1.082 (0.179) | 1.088 (0.168) |
| 0.75 | 100 | 1.010 (0.143) | 1.003 (0.143) | 1.009 (0.145) |
| 0.75 | 200 | 0.982 (0.155) | 0.983 (0.158) | 0.988 (0.157) |
| 0.75 | 500 | 0.971 (0.136) | 0.970 (0.135) | 0.971 (0.136) |
| 1.00 | 50 | 1.087 (0.173) | 1.082 (0.177) | 1.087 (0.167) |
| 1.00 | 100 | 1.010 (0.145) | 1.003 (0.142) | 1.009 (0.145) |
| 1.00 | 200 | 0.983 (0.156) | 0.984 (0.158) | 0.987 (0.157) |
| 1.00 | 500 | 0.972 (0.136) | 0.970 (0.135) | 0.971 (0.136) |

| | Sample size | | | |
|---|---|---|---|---|
| 0.00 | 50 | 0.986 (0.177) | 0.858 (0.265) | 0.784 (0.275) |
| 0.00 | 100 | 0.863 (0.172) | 0.634 (0.206) | 0.605 (0.197) |
| 0.00 | 200 | 0.825 (0.160) | 0.522 (0.126) | 0.511 (0.102) |
| 0.00 | 500 | 0.806 (0.145) | 0.496 (0.076) | 0.496 (0.076) |
| 0.25 | 50 | 0.987 (0.192) | 0.843 (0.247) | 0.814 (0.248) |
| 0.25 | 100 | 0.881 (0.179) | 0.661 (0.182) | 0.637 (0.182) |
| 0.25 | 200 | 0.836 (0.168) | 0.565 (0.092) | 0.564 (0.119) |
| 0.25 | 500 | 0.812 (0.148) | 0.553 (0.084) | 0.543 (0.084) |
| 0.50 | 50 | 0.993 (0.190) | 0.922 (0.208) | 0.938 (0.216) |
| 0.50 | 100 | 0.896 (0.175) | 0.768 (0.170) | 0.746 (0.179) |
| 0.50 | 200 | 0.840 (0.161) | 0.671 (0.103) | 0.651 (0.121) |
| 0.50 | 500 | 0.823 (0.147) | 0.645 (0.095) | 0.623 (0.095) |
| 0.75 | 50 | 1.009 (0.188) | 0.971 (0.193) | 0.977 (0.184) |
| 0.75 | 100 | 0.901 (0.165) | 0.862 (0.166) | 0.843 (0.169) |
| 0.75 | 200 | 0.855 (0.159) | 0.772 (0.131) | 0.749 (0.135) |
| 0.75 | 500 | 0.837 (0.145) | 0.719 (0.107) | 0.696 (0.104) |
| 1.00 | 50 | 1.017 (0.191) | 0.997 (0.188) | 0.995 (0.180) |
| 1.00 | 100 | 0.917 (0.165) | 0.888 (0.160) | 0.874 (0.162) |
| 1.00 | 200 | 0.867 (0.158) | 0.819 (0.127) | 0.806 (0.138) |
| 1.00 | 500 | 0.853 (0.144) | 0.775 (0.114) | 0.760 (0.110) |

| | Sample size | Method | | |
|---|---|---|---|---|
| 0.00 | 100 | FLM | 836.47 (1167.38) | 800.22 (1340.22) |
| 0.00 | 100 | wFLM | 42.70 (64.91) | 4.07 (3.83) |
| 0.00 | 200 | FLM | 814.38 (1384.70) | 651.86 (968.43) |
| 0.00 | 200 | wFLM | 32.38 (23.11) | 2.61 (2.10) |
| 0.25 | 100 | FLM | 788.45 (958.31) | 824.89 (1279.89) |
| 0.25 | 100 | wFLM | 96.52 (452.15) | 18.41 (111.21) |
| 0.25 | 200 | FLM | 898.71 (1457.29) | 671.07 (1042.29) |
| 0.25 | 200 | wFLM | 40.83 (20.09) | 5.37 (8.50) |
| 0.50 | 100 | FLM | 912.36 (1156.95) | 802.88 (1164.14) |
| 0.50 | 100 | wFLM | 81.98 (112.51) | 56.91 (412.74) |
| 0.50 | 200 | FLM | 898.25 (1160.19) | 669.86 (958.95) |
| 0.50 | 200 | wFLM | 66.28 (65.12) | 17.03 (73.01) |
| 0.75 | 100 | FLM | 942.42 (1182.26) | 873.84 (1357.36) |
| 0.75 | 100 | wFLM | 118.25 (220.73) | 57.63 (269.89) |
| 0.75 | 200 | FLM | 931.48 (1203.69) | 675.96 (924.48) |
| 0.75 | 200 | wFLM | 110.87 (292.86) | 35.00 (153.27) |
| 1.00 | 100 | FLM | 1065.91 (1327.04) | 860.24 (1311.72) |
| 1.00 | 100 | wFLM | 143.09 (289.21) | 86.20 (422.79) |
| 1.00 | 200 | FLM | 924.58 (1142.45) | 690.19 (934.90) |
| 1.00 | 200 | wFLM | 150.31 (503.18) | 54.13 (254.85) |

| | Sample size | Method | | |
|---|---|---|---|---|
| 0.00 | 100 | FLM | 559.87 | 372.53 |
| 0.00 | 100 | wFLM | 32.63 | 2.49 |
| 0.00 | 200 | FLM | 478.78 | 393.68 |
| 0.00 | 200 | wFLM | 25.68 | 1.93 |
| 0.25 | 100 | FLM | 546.97 | 421.66 |
| 0.25 | 100 | wFLM | 44.14 | 4.09 |
| 0.25 | 200 | FLM | 472.03 | 366.03 |
| 0.25 | 200 | wFLM | 36.48 | 3.16 |
| 0.50 | 100 | FLM | 648.08 | 423.55 |
| 0.50 | 100 | wFLM | 56.65 | 7.87 |
| 0.50 | 200 | FLM | 489.00 | 382.13 |
| 0.50 | 200 | wFLM | 50.19 | 5.69 |
| 0.75 | 100 | FLM | 697.27 | 441.59 |
| 0.75 | 100 | wFLM | 70.63 | 12.33 |
| 0.75 | 200 | FLM | 544.98 | 393.73 |
| 0.75 | 200 | wFLM | 63.05 | 10.10 |
| 1.00 | 100 | FLM | 797.91 | 454.03 |
| 1.00 | 100 | wFLM | 86.47 | 18.54 |
| 1.00 | 200 | FLM | 566.82 | 402.10 |
| 1.00 | 200 | wFLM | 78.71 | 15.21 |

| Method | | | | |
|---|---|---|---|---|
| FLM | n/a | 0.971 (0.136) | 0.970 (0.136) | 0.971 (0.136) |
| wFLM | 0.0 | 0.830 (0.143) | 0.645 (0.094) | 0.622 (0.094) |
| wFLM | 0.1 | 0.938 (0.136) | 0.781 (0.201) | 0.770 (0.199) |
| wFLM | 0.2 | 0.949 (0.134) | 0.948 (0.134) | 0.949 (0.134) |
| wFLM | 0.3 | 0.951 (0.133) | 0.949 (0.130) | 0.950 (0.134) |
| wFLM | 0.4 | 0.953 (0.135) | 0.950 (0.131) | 0.950 (0.134) |
| wFLM | 0.5 | 0.953 (0.135) | 0.950 (0.131) | 0.950 (0.134) |
| wFLM | 1.0 | 0.953 (0.135) | 0.950 (0.131) | 0.950 (0.134) |
| wFLM | 1.5 | 0.954 (0.134) | 0.950 (0.132) | 0.950 (0.134) |
| wFLM | 2.0 | 0.954 (0.134) | 0.950 (0.132) | 0.950 (0.134) |
| wFLM | 3.0 | 0.954 (0.134) | 0.950 (0.132) | 0.950 (0.134) |
| wFLM | 4.0 | 0.954 (0.134) | 0.950 (0.132) | 0.950 (0.134) |
| wFLM | 5.0 | 0.954 (0.134) | 0.950 (0.132) | 0.950 (0.134) |

| Method | | | | |
|---|---|---|---|---|
| FLM | n/a | 0.971 (0.136) | 0.970 (0.136) | 0.971 (0.136) |
| wFLM | 0.0 | 0.830 (0.143) | 0.645 (0.094) | 0.622 (0.094) |
| wFLM | 0.1 | 0.823 (0.146) | 0.645 (0.095) | 0.621 (0.095) |
| wFLM | 0.2 | 0.823 (0.148) | 0.645 (0.095) | 0.621 (0.095) |
| wFLM | 0.3 | 0.823 (0.147) | 0.645 (0.095) | 0.621 (0.095) |
| wFLM | 0.4 | 0.823 (0.147) | 0.645 (0.095) | 0.621 (0.095) |
| wFLM | 0.5 | 0.823 (0.147) | 0.645 (0.095) | 0.621 (0.095) |
| wFLM | 1.0 | 0.823 (0.147) | 0.645 (0.095) | 0.621 (0.095) |
| wFLM | 1.5 | 0.819 (0.145) | 0.645 (0.095) | 0.621 (0.095) |
| wFLM | 2.0 | 0.819 (0.145) | 0.645 (0.095) | 0.621 (0.095) |
| wFLM | 3.0 | 0.819 (0.145) | 0.645 (0.095) | 0.621 (0.095) |
| wFLM | 4.0 | 0.819 (0.145) | 0.645 (0.095) | 0.621 (0.095) |
| wFLM | 5.0 | 0.819 (0.145) | 0.645 (0.095) | 0.621 (0.095) |

Su I Iao and Hans-Georg Müller, Department of Statistics, University of California, Davis, One Shields Ave., Davis, CA 95616, U.S.A. E-mail: [email protected]
Abstract
Advancements in modern science have led to an increased prevalence of functional data, which are usually viewed as elements of the space of square-integrable functions $L^2$. Core methods in functional data analysis, such as functional principal component analysis, are typically grounded in the Hilbert structure of $L^2$ and rely on inner products based on integrals with respect to the Lebesgue measure over a fixed domain. A more flexible framework is proposed, where the measure can be arbitrary, allowing natural extensions to unbounded domains and prompting the question of optimal measure choice. Specifically, a novel functional linear model is introduced that incorporates a data-adaptive choice of the measure that defines the space $L^2$, alongside an enhanced functional principal component analysis. Selecting a good measure can improve the model's predictive performance, especially when the underlying processes are not well-represented when adopting the default Lebesgue measure. Simulations, as well as applications to COVID-19 data and the National Health and Nutrition Examination Survey data, show that the proposed approach consistently outperforms the conventional functional linear model.
Keywords: Functional data analysis, weighted functional principal component analysis, weighted functional linear model, optimal measures.
1 Introduction
Functional data have become increasingly prevalent with the advancement of modern data collection technologies. Typically, functional data are considered independent and identically distributed samples representing realizations of an underlying smooth stochastic process observed at discrete time points. Over the past decades, the field of functional data analysis (FDA) has garnered significant attention, particularly in connection with the successful deployment of methods such as functional principal component analysis (FPCA) (Kleffe, 1973; Castro et al., 1986; Yao et al., 2005a; Hall and Hosseini-Nasab, 2006; Chen and Lei, 2015) and functional linear models (FLM) (Ramsay and Silverman, 2005; Yao et al., 2005b; Hall and Horowitz, 2007). Comprehensive introductions and reviews can be found for example in Ramsay and Silverman (2005), Hsing and Eubank (2015), and Wang et al. (2016).
Both FPCA and FLM utilize the Hilbert space structure of $L^2$ over the continuum of interest for the functional data, which is conventionally equipped with the Lebesgue measure to facilitate computation. FLM is often implemented using an FPCA-based approach (Yao et al., 2005b; Hall and Hosseini-Nasab, 2006; Hilgert et al., 2013; Imaizumi and Kato, 2018), where FPCA is first applied to decompose functional predictors into orthogonal principal components. These principal components serve as low-dimensional representations that are subsequently used as covariates in a regression model. However, while this approach is widely used in applications (Liang et al., 2015; Chen et al., 2024; Iao et al., 2024; Zhou et al., 2024), it may not always provide the most effective representation of functional data. The success of FPCA-based FLM largely depends on the efficient representation of the coefficient function of FLM by the leading functional principal components (Cai and Yuan, 2012). In practice, the principal components of the predictor process may not align well with the structure of the coefficient function, leading to suboptimal predictive performance. This issue parallels limitations observed in principal component regression (Jolliffe, 1982) and singular value decomposition techniques for linear inverse problems (Donoho, 1995). Notably, when low-variance components of the predictor process carry non-negligible predictive power, discarding them can degrade model performance. These considerations motivate the development of alternative eigensystem constructions aimed at improving predictive accuracy and model interpretability.
A promising approach to address this issue is to introduce weighting schemes in functional data analysis. Prior research has explored weighted methods in FPCA (Leng, 2004; Talská et al., 2020), as well as in clustering and classification (Chen et al., 2014; Romano et al., 2020). By defining inner products with respect to an alternative measure, the resulting eigensystem can potentially yield a more effective representation of the coefficient function. Despite these developments, the application of weighted methodologies to functional linear models remains an open research question.
In this work, we propose a novel weighted functional linear model (wFLM) with functional predictors and scalar responses based on a data-driven measure. This framework is designed to operate in a Hilbert space equipped with a general measure, going beyond the classical default Lebesgue measure, and applies to both bounded and unbounded domains. By incorporating a general measure, the proposed approach enables a more flexible representation of functional data, leading to improved model interpretability and predictive performance. In addition to optimizing eigensystem alignment, the weighting approach that we propose here conveys additional benefits when dealing with infinite domains. When the domain is unbounded, the space of functions that are square integrable with respect to the Lebesgue measure imposes major constraints, as commonly used functions, like polynomials, are not situated in this space, while they are square-integrable when the domain is finite. If trajectories do not lie in this space, traditional functional data analysis techniques such as FPCA and FLM are not applicable. Adopting a weighting scheme might also reflect that not all regions of a function's domain are equally important or relevant for the analysis. Changing the uniform reference measure may be interpreted as emphasizing or downplaying the variability at some subdomains of the stochastic processes.
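As a worked illustration of the unbounded-domain point (our own example; the choice of $f(t) = t$ and of the standard exponential weight is an assumption made for convenience), consider the squared norm of $f$ on $[0, \infty)$ under the Lebesgue measure and under the weight $e^{-t}$:

```latex
\int_0^\infty t^2 \, dt = \infty,
\qquad
\int_0^\infty t^2 e^{-t} \, dt = \Gamma(3) = 2 .
```

The linear function is therefore excluded from the unweighted space but belongs to the exponentially weighted one.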
The rest of this paper is organized as follows. In Section 2, we introduce the weighted functional principal component analysis and the weighted functional linear model. Data-adaptive measure selection is developed in Section 3. Simulations are presented in Section 4. Applications to COVID-19 data and the National Health and Nutrition Examination Survey data are discussed in Section 5.
2 Methodology
In this section, we introduce weighted functional principal component analysis (wFPCA) and then proceed to the wFPCA-based weighted functional linear model (wFLM).
2.1 Weighted functional principal component analysis
Revisiting classical functional principal component analysis (FPCA), let be a square integrable stochastic process on for which one has independent copies , . Mean and covariance function of are
[TABLE]
By Mercer’s Theorem (see Theorem 4.6.5 in Hsing and Eubank (2015)), the spectral decomposition of the covariance function is
[TABLE]
where are the eigenvalues and are the corresponding eigenfunctions of the auto-covariance operator. The latter form an orthonormal system on with respect to inner products based on the Lebesgue measure. The Karhunen-Loève representation implies that the th random curve can be represented as
[TABLE]
where the principal component scores are uncorrelated random variables with zero mean and variances .
An extension of classical FPCA involves defining inner products with respect to more general measures,
[TABLE]
where is an absolutely continuous measure with respect to Lebesgue measure and are -square integrable functions on the domain , see, e.g., Leng (2004); Chen et al. (2014); Talská et al. (2020). The Radon-Nikodym theorem (Ash and Doléans-Dade, 2000) ensures the existence of a measurable function , , which we refer to as weight function. The weight function is assumed to reside in the space
[TABLE]
To define the weighted FPCA, we assume that is square integrable on with respect to , i.e., and introduce a new square integrable process with respect to Lebesgue measure,
[TABLE]
where has and covariance function . The spectral decomposition of the covariance with respect to Lebesgue measure is
[TABLE]
with eigenvalues and eigenfunctions . The Karhunen-Loève representation of the process is
[TABLE]
where is the th principal component score of and . To ensure is well-defined, we set if . This leads to the following proposition, which appeared previously in Leng (2004).
Proposition 1.
Given a probability measure that is absolutely continuous with respect to the Lebesgue measure, where , and a stochastic process , the process is mean zero and square integrable with respect to the Lebesgue measure. For the eigenvalues and eigenfunctions of the process with respect to Lebesgue measure as per (3) and the functions
[TABLE]
it holds that the form the eigensystem of the original process with respect to the probability measure . The Karhunen-Loève representation of is given by
[TABLE]
with principal component scores
[TABLE]
The scores can equivalently be interpreted as principal component scores of the process under the Lebesgue measure or as principal component scores of the process under the probability measure .
All proofs are provided in the Supplementary Material. Given a general measure , Proposition 1 yields an easily implementable approach to obtain the Karhunen-Loève expansion of processes and the weighted FPCA of a process in . With the measure and random samples , one can follow the estimation procedures outlined in Yao et al. (2005a) and Zhang and Wang (2016) to obtain estimates , and further derive estimates , , and for the corresponding targets indexed by (where is the number of included eigenfunctions, which can be chosen by leave-one-out cross-validation, see Section 3).
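To make the construction behind Proposition 1 concrete, the following is a minimal numerical sketch, not the estimation procedure of the paper. It assumes curves observed without noise on a dense, equally spaced grid, replaces the smoothing steps of the cited references by a plain sample covariance, approximates integrals by Riemann sums, and takes the transformed process to be the centered curve multiplied by the square root of the weight function. The function name `weighted_fpca` and all argument names are illustrative.

```python
import numpy as np

def weighted_fpca(X, t, w, K):
    """Sketch of weighted FPCA on a dense, equally spaced grid (assumed setting).

    X : (n, m) array, one curve per row, observed without noise on grid t
    t : (m,) equally spaced grid
    w : (m,) values of the weight function (density w.r.t. Lebesgue measure)
    K : number of eigen-components to keep
    """
    dt = t[1] - t[0]                              # Riemann quadrature step
    mu = X.mean(axis=0)                           # pointwise mean estimate
    sw = np.sqrt(w)
    Z = (X - mu) * sw                             # transformed process
    C = Z.T @ Z / X.shape[0]                      # sample covariance of Z on the grid
    evals, evecs = np.linalg.eigh(C * dt)         # discretized covariance operator
    order = np.argsort(evals)[::-1][:K]           # largest eigenvalues first
    lam = evals[order]
    phi_tilde = evecs[:, order].T / np.sqrt(dt)   # orthonormal w.r.t. Lebesgue measure
    # Map back to the weighted space; eigenfunctions are set to 0 where w == 0.
    phi_w = np.divide(phi_tilde, sw, out=np.zeros_like(phi_tilde), where=sw > 0)
    scores = Z @ phi_tilde.T * dt                 # scores of Z under Lebesgue measure
    return mu, lam, phi_w, scores
```

On simulated curves one can check that the rows of `phi_w` are orthonormal under the weighted inner product and that the returned scores coincide with the weighted integrals of the centered curves against `phi_w`, which is the equivalence stated at the end of the proposition.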
Proposition 1 relies on two key assumptions that are standard and well-motivated in functional data analysis: (1) The stochastic process resides in the weighted Hilbert space, i.e., it is square integrable with respect to the general measure. This assumption ensures the existence of well-defined mean and covariance functions and guarantees the applicability of the Karhunen-Loève expansion. The Hilbert space structure provides the basis for eigen-analysis, including completeness, a well-defined inner product, and the existence of an orthonormal basis. (2) The measure is absolutely continuous with respect to the Lebesgue measure, so that the Radon-Nikodym derivative exists. This is a mild condition, commonly satisfied in practical applications where the weighting function is derived from data or design considerations. In particular, it is satisfied by our proposed data-driven measure, which is constructed to be absolutely continuous by design; see Section 3 for further details. Intuitively, absolute continuity ensures that the measure does not assign positive mass to any set that has zero Lebesgue measure, so no information carried by the measure is lost when working with Lebesgue integrals, allowing for the analysis of the transformed process in the standard setting.
More generally, if we consider two measures and , where is absolutely continuous with respect to , with Radon-Nikodym derivative , one can conduct a weighted FPCA within the space by means of the space . The following proposition extends Proposition 1 to this more general setting.
Proposition 2.
Given two general measures , and a stochastic process , if we assume is absolutely continuous with respect to , then belongs to . Denote the eigenvalues and eigenfunctions of the process with respect to as and define a new function
[TABLE]
Then, the eigensystem of within is and the th principal component score of the process with respect to the measure is
[TABLE]
Proposition 2 generalizes Proposition 1 by establishing a mapping between eigensystems defined under two arbitrary measures, provided one of these measures is absolutely continuous with respect to the other. This result enables the construction of a weighted FPCA framework in one weighted space by leveraging the eigendecomposition of a rescaled process in a potentially simpler or more tractable space. The key idea is that the Radon-Nikodym derivative determines how the geometry of the space, and hence the structure of the principal components, transforms across different weighting schemes.
When the reference measure in Proposition 2 is the Lebesgue measure and the second measure is absolutely continuous with respect to it, with the weight function as Radon-Nikodym derivative, Proposition 1 emerges as a special case, where the transformed process and the reweighted eigenfunctions match those in Proposition 1.
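The mechanism shared by both propositions can be sketched in one line. The notation here is assumed for illustration and need not match the symbols of the paper: $\nu_1$ is absolutely continuous with respect to $\nu_2$ with Radon-Nikodym derivative $h = d\nu_1 / d\nu_2 \ge 0$, and $f$, $g$ are square integrable with respect to $\nu_1$.

```latex
\langle f, g \rangle_{\nu_1}
  = \int f(t)\, g(t)\, d\nu_1(t)
  = \int f(t)\, g(t)\, h(t)\, d\nu_2(t)
  = \big\langle f\sqrt{h},\; g\sqrt{h} \big\rangle_{\nu_2} .
```

Multiplication by $\sqrt{h}$ thus preserves inner products when passing from the $\nu_1$-weighted space to the $\nu_2$-weighted space, which is why an eigensystem computed for the rescaled process can be mapped back by dividing the eigenfunctions by $\sqrt{h}$ wherever $h > 0$.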
2.2 Weighted functional linear model
Consider a general measure which is absolutely continuous with respect to the Lebesgue measure, such that there exists a weight function . Let be a random pair in , where is a functional predictor and a scalar response, where and , and variance and covariance as per (1). Suppose are independent realizations of . In this section, we consider a weighted functional linear model (wFLM) in which are generated by the model
[TABLE]
Here the regression function is smooth and square integrable, i.e., .
Centering predictor processes , the functional linear regression model becomes
[TABLE]
Consider the transformed processes , the wFLM is equivalent to
[TABLE]
where . The regression parameter function can be represented as (Yao et al., 2005b; Hall and Horowitz, 2007)
[TABLE]
where is the eigensystem of the process as per (3), are the th principal component scores of the process as per (5), and . Transforming back to , we obtain the representation
[TABLE]
One can use a well-established local linear smoothing approach to obtain an estimate of the cross-covariance surface
[TABLE]
This leads to the estimators
[TABLE]
where is the number of included eigen-components, which is a tuning parameter, , . Further details about this smoothing approach to obtain estimates of the eigen-components and coefficient functions can be found, e.g., in Yao et al. (2005b).
To predict the scalar response from a new predictor trajectory , we utilize equation (6), the basis representation of as per (7) and the orthonormality of the . The prediction of the response can be obtained via the conditional expectation
[TABLE]
where
[TABLE]
is the th functional principal component score of the predictor trajectory . The quantities , , , can be estimated from the data, as described in Yao et al. (2005a, b) and Zhang and Wang (2016).
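The fitting and prediction steps can be sketched as follows. This is a simplified stand-in rather than the paper's estimator: it assumes noise-free curves on a dense, equally spaced grid, uses sample moments instead of local linear smoothing of the cross-covariance surface, and computes each regression coefficient as the sample covariance between a score and the response divided by the corresponding eigenvalue. The names `fit_wflm` and `predict_wflm` are hypothetical.

```python
import numpy as np

def fit_wflm(X, y, t, w, K):
    """Fit a weighted FLM with K components on a dense grid (illustrative sketch)."""
    dt = t[1] - t[0]
    n = len(y)
    mu, ybar = X.mean(axis=0), y.mean()
    sw = np.sqrt(w)
    Z = (X - mu) * sw                               # transformed, centered predictors
    evals, evecs = np.linalg.eigh(Z.T @ Z / n * dt) # discretized covariance operator
    idx = np.argsort(evals)[::-1][:K]
    lam = evals[idx]
    phi_tilde = evecs[:, idx].T / np.sqrt(dt)       # eigenfunctions of Z (Lebesgue)
    scores = Z @ phi_tilde.T * dt                   # principal component scores
    b = scores.T @ (y - ybar) / n / lam             # b_k = cov(score_k, Y) / lambda_k
    return {"mu": mu, "ybar": ybar, "sw": sw, "phi_tilde": phi_tilde, "b": b, "dt": dt}

def predict_wflm(fit, X_new):
    """Predict responses for new curves observed on the same grid."""
    Z_new = (X_new - fit["mu"]) * fit["sw"]
    scores_new = Z_new @ fit["phi_tilde"].T * fit["dt"]
    return fit["ybar"] + scores_new @ fit["b"]
```

When the predictor process has finite rank, the product of the coefficient function and the square root of the weight lies in the span of the retained eigenfunctions, and the responses are noise-free, this sketch reproduces the responses of new curves up to numerical error.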
3 Choosing the weight function for the functional linear model
So far, the weight function was assumed to be given. In practical applications, selecting a good weight function from the available data is crucial. Ideally, we aim to find the optimal weight function within a set of potential weight functions as per (2). The objective is to minimize the cross-validation error,
[TABLE]
where is the cross-validation prediction for the th subject and is the estimate of as per (7). Here, the superscript denotes leave-one-out estimation, where the th sample is omitted from the estimation process. To accomplish this goal, we present two practical approaches for selecting optimal weight functions tailored to different types of domain , aiming to up-weigh or down-weigh subdomains that are more or less important for obtaining good predictions when applying the functional linear model.
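The leave-one-out criterion can be sketched generically. Here `fit_predict` is a hypothetical callable standing in for any routine that fits the wFLM for a fixed weight function and number of components on the training subjects and returns a prediction for the held-out subject. Whether the criterion in the paper is normalized by the sample size is not visible in this text; the sketch averages, which does not change the minimizer.

```python
import numpy as np

def loocv_error(X, y, fit_predict):
    """Leave-one-out CV error: average squared error of held-out predictions.

    fit_predict(X_train, y_train, X_test) must return the prediction for the
    single held-out curve X_test (shape (1, m)).
    """
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                    # drop the i-th subject
        pred = fit_predict(X[keep], y[keep], X[i:i + 1])
        errs[i] = (y[i] - float(np.squeeze(pred))) ** 2
    return errs.mean()
```

Candidate weight functions can then be compared by evaluating `loocv_error` once per candidate and keeping the one with the smallest value.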
3.1 Step function approach on the finite domain
Finding an analytical solution for the optimal weight function in Equation (11) is challenging. To efficiently obtain approximate solutions for (11) in practical applications, we employ a dyadic splitting algorithm (Leng, 2004). For the sake of completeness, details about this algorithm are included in the Supplementary Material. This algorithm results in weight functions that are step functions.
We search for the optimal weight function within the subset , where
[TABLE]
Here, is the number of steps and is the number of times that we split the interval. To ensure that the resulting weight function is interpretable and has no abrupt jumps, we consider a penalized cross-validation score (13),
[TABLE]
where is the total variation of and , , are tuning parameters, denoting the number of included components.
For the selection of tuning parameters, we employed cross-validation to simultaneously select and . To ensure computational efficiency, we limit the number of candidate values for , , and to expedite the cross-validation procedure. For , we consider candidate values ranging from 1 to , where is the best value for in the FLM under the Lebesgue measure according to the cross-validation error; for and , we consider the values 0, 0.5 and 1. For a comprehensive sensitivity analysis for the choice of the tuning parameters and we refer to Section S.8 of the Supplementary Material.
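The two ingredients of the step function approach, a piecewise constant weight and its total variation, can be sketched as below. The equal-width partition, the rescaling of the heights so that the weight integrates to one, and the function names are assumptions made for illustration; the dyadic splitting search itself and the exact form of the penalized score are not reproduced.

```python
import numpy as np

def step_weight(heights, t, a=0.0, b=1.0):
    """Evaluate a step weight with len(heights) equal-width steps on [a, b].

    Heights are rescaled so that the step function integrates to 1 over [a, b].
    t must be a NumPy array of evaluation points in [a, b].
    """
    h = np.asarray(heights, dtype=float)
    M = len(h)
    h = h / (h.sum() * (b - a) / M)                 # sum_j h_j * step width = 1
    j = np.minimum(((t - a) / (b - a) * M).astype(int), M - 1)  # step index of each t
    return h[j]

def total_variation(heights):
    """Total variation of a step function: sum of absolute jumps between steps."""
    return float(np.abs(np.diff(heights)).sum())
```

A weight with two steps of raw heights 1 and 3 on the unit interval is rescaled to heights 0.5 and 1.5, and its total variation after rescaling equals 1. Penalizing this quantity discourages weight functions with large jumps between neighboring subintervals.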
3.2 Parametric density approach on the infinite domain
For infinite domains, we adopt a parametric approach for selecting the weight function . Specifically, we consider density functions whose support aligns with or to ensure that the resulting weighted space remains well-defined. We focus on the case , as extensions to the case are analogous. For , suitable choices include distributions from the exponential family, such as the exponential, half-normal, gamma and truncated normal distributions. These parametric choices incorporate prior knowledge or a desired emphasis on specific subregions of the domain. In our applications, we focus on the exponential density for due to its interpretability, single parameter which controls the decay rate and its strong empirical performance. The exponential density places more weight near the origin and decays monotonically, which is often appropriate in functional data where signal strength may diminish over time. We also considered the half-normal distribution in our simulation studies, where for . The half-normal distribution also defines decreasing weights over , and while its rate of decay differs from that of the exponential distribution, both densities asymptotically approach zero as . As demonstrated in Section 4.2, the predictive performances of exponential and half-normal weights are comparable, suggesting robustness to specific choices. For each choice we selected the optimal parameter via cross-validation using the criterion in Equation (11).
While the exponential density is emphasized in our applications and the half-normal is included in our simulations, the proposed framework is not restricted to these choices. Weight functions derived from other parametric distributions such as the gamma distribution could also be incorporated. These alternatives offer additional flexibility. Both gamma and truncated normal distributions may place more emphasis on mid- or late-domain regions rather than near the origin, which may be beneficial in settings where important information is concentrated away from . Although these densities differ in shape near the center, they all exhibit exponential decay as , ensuring stability over unbounded domains. Thus, the gamma distribution and truncated normal distribution may be suitable alternatives when emphasizing mid-to-late domain regions is desirable. Among these different options, the exponential density provides a computationally efficient and conceptually straightforward baseline. Nonetheless, the proposed framework is flexible and can accommodate weight functions derived from other parametrically specified distributions.
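The two parametric weight families used on the half-line can be written down directly from their standard density formulas. The selection helper is a generic sketch in which `cv_score` is a hypothetical callable that returns the cross-validation error of the wFLM fitted with a given parameter value.

```python
import numpy as np

def exp_weight(t, rate):
    """Exponential density on [0, inf): rate * exp(-rate * t)."""
    return rate * np.exp(-rate * t)

def halfnormal_weight(t, sigma):
    """Half-normal density on [0, inf) with scale parameter sigma."""
    return np.sqrt(2.0 / np.pi) / sigma * np.exp(-t**2 / (2.0 * sigma**2))

def select_parameter(candidates, cv_score):
    """Return the candidate parameter with the smallest CV score."""
    return min(candidates, key=cv_score)
```

Both functions are probability densities on the half-line, so each candidate parameter defines a probability measure that is absolutely continuous with respect to the Lebesgue measure, as required by Proposition 1.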
4 Simulation studies
4.1 Simulations on
We conducted simulation studies evaluating weight functions for two distinct measures: the Lebesgue measure (uniform density) and an optimal measure as approximated by a step function. These investigations comprised two separate simulation scenarios, each encompassing Monte Carlo runs. In each scenario, we considered settings with ranging from i.i.d. pairs, consisting of a response scalar and a predictor trajectory, as well as varying numbers of measurements per predictor trajectory ; the locations where these measurements were taken were equidistant within the interval .
The predictor trajectories, denoted as with corresponding noisy measurements , were generated as follows. For both scenarios, the simulated processes had mean function and covariance functions were constructed using 10 eigenfunctions such that
[TABLE]
For Scenario 1 we chose eigenvalues for . We generated functional principal component scores from and obtained the predictor measurements
[TABLE]
where the additional measurement errors followed a normal distribution with mean 0 and variance , and . The scalar responses were generated according to where , with for , and the additional measurement errors for the responses followed a normal distribution with mean 0 and variance . Processes and all errors were independent in both Scenario 1 and Scenario 2.
In Scenario 2 we chose eigenvalues for . We generated FPC scores from , and calculated the predictor measurements again as in (14), where all errors were obtained in the same way as in Scenario 1. Here the scalar responses were generated as , where , with as in Scenario 1 and the weight function was specified as
[TABLE]
For the th Monte Carlo run, we generated 100 new noisy predictors and 100 corresponding noise-free responses . We evaluated the predictive performance using the average mean squared prediction error (AMSPE)
[TABLE]
where represented the predicted responses estimated by either FLM or wFLM.
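A minimal sketch of the evaluation metric follows, assuming it is computed as the mean squared prediction error over the test responses within each Monte Carlo run, then averaged across runs, with the standard deviation across runs reported alongside. The array layout and the function name `amspe` are illustrative assumptions.

```python
import numpy as np

def amspe(y_true, y_pred):
    """Average mean squared prediction error across Monte Carlo runs.

    y_true, y_pred : (B, n_test) arrays, one row per Monte Carlo run.
    Returns the average of the per-run MSPEs and their standard deviation.
    """
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    mspe = np.mean(diff ** 2, axis=1)               # one MSPE per run
    return mspe.mean(), mspe.std(ddof=1)
```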
Table 1 presents the results for both scenarios. In Scenario 1, wFLM (step) consistently outperformed FLM (Lebesgue) in terms of AMSPE, with larger gains observed for increased sample sizes and measurement points. These results suggest that when the coefficient function cannot be efficiently represented using the leading functional principal components, using the default Lebesgue measure in the FLM may be suboptimal and a more general step function-approximated measure may entail a more suitable eigensystem to efficiently represent the regression parameter function , especially when sample size and number of measurement points are relatively large. Furthermore, in Scenario 2, wFLM (step) also demonstrated superior predictive performance across all settings, with notable improvements when , achieving reductions in MSPE of , and for numbers of measurements .
4.2 Simulations on
To evaluate the performance of the proposed weighted functional linear model (wFLM) on an unbounded domain, we conducted a simulation study over . In addition to the exponential density, which aligns with the true underlying measure used to generate the data, we also included the half-normal density to assess the robustness of the method regarding the choice of weighting function. We compared the predictive performance of wFLM with exponential weight function, wFLM with half-normal weight function and the classical FLM that utilizes the Lebesgue measure. We considered four settings with sample sizes ranging from and two choices for the number of measurements with Monte Carlo runs. The number of measurements was either set to or chosen randomly for each predictor trajectory with equal probability from . The locations of the measurements were exponentially distributed with a rate over the infinite interval , reflecting unbounded support with irregular and potentially sparse sampling.
The predictor trajectories and associated noisy measurements were generated as follows. The simulated processes had mean function and covariance function constructed using a set of 9 eigenfunctions (for more details we refer to Section S.4 in the Supplementary Material), which are orthonormal in where is the density of the standard exponential distribution. We chose the eigenvalues and , and as variance of the additional measurement errors , which were assumed to be normal with mean 0. For each sample , we generated FPC scores from and obtained predictor measurements, . The scalar responses were generated by , where and . As before, for each Monte Carlo run we generated 100 new noisy predictors and 100 corresponding noise-free responses .
Table 2 reports the average mean squared prediction errors and standard deviations across the simulations. Here both wFLM approaches dramatically outperform the classical FLM for the irregular and sparse measurement settings. For instance, when and , wFLM (Exp) reduces AMSPE by 88.4% compared to the basic FLM, while wFLM (Half-Normal) yields similar gains (89.0%), even though the true data-generating measure here is the standard exponential density. This indicates some robustness of our method regarding the specific choice of a weight function derived from a parametric distribution. Even in irregular but relatively dense settings, e.g., when , both weighting schemes substantially improve prediction accuracy. As expected, performance improves with larger sample size and both wFLM approaches maintain a clear advantage over FLM. These results highlight the flexibility and reliability of the proposed framework for unbounded domains and irregular measurement patterns.
5 Applications
5.1 Predicting COVID-19 new cases
We illustrate the performance of the proposed method with COVID-19 data. Functional data analyses for time-dynamic data of COVID-19 cases have been conducted previously (Carroll et al., 2020; Dubey et al., 2022). We obtained daily confirmed cases across countries from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. These data are publicly available at https://github.com/CSSEGISandData/COVID-19. The data feature the cumulative number of confirmed cases for each country from January 22, 2020, to March 3, 2023 and were accessed on April 11, 2023. For the analysis, we focused on the period from July 1, 2020, to December 31, 2022 (a total of 914 days), and used the seven-day moving average of daily confirmed cases per million as functional predictor. The scalar response was taken as the total confirmed cases from January 1 to January 31 in 2023.
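As a rough illustration of the preprocessing described above, the following Python sketch converts a cumulative case series into a seven-day moving average of daily new cases per million. The function name, the trailing-window convention and the absence of any correction for reporting revisions are assumptions made for illustration and are not taken from the analysis itself.

```python
import numpy as np

def seven_day_average_per_million(cumulative, population, window=7):
    """Moving average of daily new cases per million from cumulative counts.

    cumulative: 1-D array of cumulative confirmed cases, one entry per day.
    population: population size of the country (assumed known and constant).
    """
    cumulative = np.asarray(cumulative, dtype=float)
    # Daily new cases are first differences of the cumulative series.
    daily = np.diff(cumulative)
    per_million = daily / population * 1e6
    # Equal-weight window; mode="valid" keeps only fully covered windows.
    kernel = np.ones(window) / window
    return np.convolve(per_million, kernel, mode="valid")
```

With `mode="valid"` the output is shorter than the input by `window` days, so the first few days of a raw series do not receive a smoothed value.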
The seven-day moving averages of daily confirmed cases per million people from July 1, 2020 to December 31, 2022 for 29 countries are displayed in Fig 1. The 29 countries for which data were included were located in America or Europe, as they exhibited similar COVID-19 response policies and relatively low bias in reported cases, and countries with zero cases in the 914-day trajectory were excluded.
We investigated the performance of two choices of weight function: a step function and an exponential density. We applied the infinite domain method here as the functional predictor spans a fairly long duration and as it is reasonable to assume that the closer the data points are to the end of 2022, the more crucial their influence becomes for the subsequent total confirmed cases in January 2023. Therefore, it is reasonable to assign higher importance to the domain at the end of 2022 while assigning relatively lower weight to the data for earlier periods. To implement this strategy within the framework of a weight function derived from the exponential distribution, we encoded December 31, 2022, as and July 1, 2020, as . This coding scheme ensured that data with measurement times closer to received more weight than those measured earlier. In contrast, when implementing weight functions as step functions we retained the original domain, with representing July 1, 2020 and representing December 31, 2022.
To compare the performance of the FLM with weight function selection with the original FLM that uses the Lebesgue measure and therefore a constant weight function, we employed a leave-one-out cross-validation score, see Table 3. The leave-one-out cross-validation score is
[TABLE]
where represents the predicted value from the model after omitting the th country from the training data.
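A leave-one-out loop of this kind can be sketched as follows. The helper names `loocv_score`, `fit_ols` and `predict_ols` are hypothetical, and an ordinary least squares fit on a matrix of scores stands in for the actual FLM or wFLM fitting step. The normalization used here (a plain average of squared errors) is also an assumption and may differ from the displayed formula.

```python
import numpy as np

def loocv_score(X, y, fit, predict):
    """Average squared leave-one-out prediction error.

    fit(X_train, y_train) returns a fitted model; predict(model, X_new)
    returns predictions. Both are user-supplied stand-ins for FLM/wFLM.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    sq_err = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i              # omit the i-th unit
        model = fit(X[keep], y[keep])
        sq_err[i] = (y[i] - predict(model, X[i:i + 1])[0]) ** 2
    return sq_err.mean()

def fit_ols(X, y):
    # Least squares with intercept, used only as a placeholder model.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef
```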
The optimal number of principal components, i.e., the minimizer of the cross-validation score, for the classical FLM was found to be , while the optimal M for the step weight function and for the weight function derived from the exponential distribution were 3 and 2, respectively. Table 3 shows that wFLM (step) and wFLM (exp) achieve better prediction performance, resulting in a and improvement in prediction accuracy, respectively, compared to the classical FLM with the Lebesgue measure. Thus wFLM achieves better prediction accuracy in this application while using fewer principal components than FLM.
5.2 National Health and Nutrition Examination Survey
Behavioral scientists are interested in analyzing the association between cardiovascular risk factors (such as systolic blood pressure and total cholesterol) and physical activity (Luke et al., 2011; Gerage et al., 2015; Ledbetter et al., 2022; Ge et al., 2024). We apply the proposed method to model the effect of physical activity intensity on systolic blood pressure, utilizing data from the National Health and Nutrition Examination Survey (NHANES) 2005-2006; these data are publicly available at https://wwwn.cdc.gov/nchs/nhanes/ContinuousNhanes/Default.aspx?BeginYear=2005. NHANES assesses the health and nutrition status of U.S. adults and children through comprehensive interviews and physical examinations. The survey collects information on demographic, socioeconomic, dietary, and health-related variables, along with medical, dental and physiological measurements. As part of NHANES, participants aged six and older were asked to wear an Actigraph 7164 accelerometer on a waist belt for seven consecutive days, capturing physical activity intensity every minute throughout the day. These accelerometer data have been widely used by researchers to explore the relationship between activity patterns and various health indicators (Troiano et al., 2014; Tudor-Locke et al., 2012). Worn on the right hip, the accelerometer began recording at 12:01 am the day after the participant’s health examination and was removed only during sleep, swimming or bathing.
We restricted our analysis to a subset of male participants who were married, aged over 20 and had four complete blood pressure measurements. This led to a sample size of participants. Denoting the physical activity intensity function at minute of the th participant by , we observe that its domain is , where is in minutes and stands for the total number of minutes over the days on which the signal was recorded. We transform the to define the predictor for the th subject as for a given physical activity intensity level . This function represents the total time in minutes during which the physical activity intensity equals over the days of observation. This is possible since the activity levels are discrete. Similar transformations of the physical activity intensity have been considered previously (Chang and McKeague, 2022; Lin et al., 2023), as this transformed function provides a better reflection of the actual activity than does.
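A minimal sketch of this transformation, assuming the minute-level intensity readings of one participant are stored as non-negative integer counts, is given below. The function name `minutes_at_each_level` and the optional `max_level` argument are hypothetical and serve only to illustrate the counting step.

```python
import numpy as np

def minutes_at_each_level(activity_counts, max_level=None):
    """Total minutes spent at each discrete activity intensity level.

    activity_counts: 1-D integer array, one intensity reading per minute.
    Returns an array whose entry s is the number of minutes with intensity s.
    """
    counts = np.asarray(activity_counts)
    if counts.ndim != 1 or not np.issubdtype(counts.dtype, np.integer):
        raise ValueError("expected a 1-D array of integer intensity counts")
    if (counts < 0).any():
        raise ValueError("intensity counts must be non-negative")
    # Pad with zeros up to max_level so all participants share one grid.
    minlength = 0 if max_level is None else max_level + 1
    return np.bincount(counts, minlength=minlength)
```

Because each entry counts minutes, the transformed predictor is a function of intensity level rather than of time, which is why its domain has no natural upper bound.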
The response of interest is the average systolic blood pressure, averaged over the four available measurements. The potential for large values of physical activity intensity , which serves as argument of the , means that the domain has no clear upper bound, which motivates considering an infinite domain for the functional predictor. We investigate the performance of the proposed wFLM with a weight function derived from the standard exponential distribution in comparison with the ordinary FLM. Here it is reasonable to implement the exponential density weight, as the majority of the physical activity intensity values are small. Most of the time, people engage in sedentary behavior or light physical activity and rarely reach high intensity values, so low levels of physical activity should receive more weight. By cross-validation we found the optimal number of principal components for the ordinary FLM to be . With , Table 4 reveals that wFLM (exp) achieves better prediction performance, resulting in a improvement in prediction accuracy compared to FLM (Lebesgue).
6 Discussion
In this paper, we introduced a weighted functional linear model that generalizes the conventional functional linear model by incorporating a data-driven, optimal measure for defining the Hilbert space. This modified model is shown to achieve better predictive performance by emphasizing more relevant regions of the functional domain and thus improving the representation of the coefficient function. Furthermore, this framework naturally extends to infinite domains, addressing challenges in functional data analysis for unbounded domains where traditional methods may struggle.
Through simulation studies and real data applications, we demonstrated that the approach consistently outperforms the standard functional linear model, offering a more flexible and powerful framework for functional linear regression. We also provided basic representations and relationships of eigensystems for weighted and unweighted functional principal component analysis.
The proposed method could also be harnessed for other tasks in functional data analysis, such as generalized functional linear regression and functional classification (Müller and Stadtmüller, 2005), among others. Future work may explore alternative methods for selecting optimal measures and expanding the model to accommodate more complex functional data structures.
Acknowledgments
This research was supported in part by NSF grant DMS-2310450. We thank the referees for helpful comments.
S.1 Proof of Proposition 1
Proof.
We show that form an orthonormal system with respect to ,
[TABLE]
We denote the auto-covariance operator of with respect to , i.e., in the space , by .
[TABLE]
Next we show that the and are the eigenfunctions and eigenvalues of . For this we observe
[TABLE]
Finally, it is easy to show that the functional principal component scores of process in are the same as those of process in ,
[TABLE]
∎
S.2 Proof of Proposition 2
Proof.
Let be . We first show that form an orthonormal system with respect to ,
[TABLE]
since are eigenfunctions of the process . We denote the auto-covariance operator of with respect to , i.e., in the space , by .
[TABLE]
Next we show that the and are the eigenfunctions and eigenvalues of . For this we observe
[TABLE]
Finally, it is easy to show that the functional principal component scores of process in are the same as those of process in ,
[TABLE]
∎
S.3 Dyadic splitting algorithm
Initialization: In the first step, we divide the interval into two subintervals: and . We seek a constant weight for that minimizes the cross-validation mean square prediction error. The weight for is determined automatically based on the constraints imposed on the weight function.
Refinement: Following the initialization, we possess weights for both and . We further split into two equal subintervals: and . While keeping unchanged on , we search for a weight on as in the first step, with automatically determined.
We then perform a similar procedure for from the initialization step, splitting it into and , while retaining the weights on the other intervals. This results in weights on and on , automatically adjusted based on the constraints.
Updating Step: At the th step, where there are intervals, we iteratively split each interval from the previous step at its midpoint. We determine the corresponding weights as constants on the left and right subintervals, aiming to minimize the cross-validation mean square prediction error.
Termination: The iteration continues until further splitting fails to reduce the cross-validation mean square prediction error or until it reaches the maximum allowable number of splitting steps (typically set to 3). At this point, we conclude the algorithm, and the current weight function is designated as the output.
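The steps above can be sketched in Python as a greedy search over step heights on a dyadic grid. This is an illustrative sketch under stated assumptions, not the implementation used for the reported results: the callable `cv_error` (mapping step heights to a cross-validation mean square prediction error) is user supplied, the constraints on the weight function are assumed to be non-negativity and unit integral over the unit interval, and the one-dimensional search over each left subinterval is replaced by a simple grid of mass fractions (`fractions`, a hypothetical parameter).

```python
import numpy as np

def dyadic_split_weights(cv_error, max_steps=3, fractions=None):
    """Greedy dyadic search for a step weight function on [0, 1].

    cv_error: callable taking an array of step heights (one per equal-width
        subinterval) and returning a cross-validation prediction error.
    Returns (heights, error). The heights average to 1, i.e. the step
    function integrates to 1 over [0, 1] (assumed constraint).
    """
    if fractions is None:
        fractions = np.linspace(0.05, 0.95, 19)
    heights = np.array([1.0])              # start from the uniform weight
    best_err = cv_error(heights)
    for _ in range(max_steps):
        # Split every interval at its midpoint; the function is unchanged.
        current = np.repeat(heights, 2)
        current_err = best_err
        for j, parent in enumerate(heights):
            for f in fractions:
                # Give a fraction f of the parent's mass to the left half;
                # the right half is then determined by the constraint.
                trial = current.copy()
                trial[2 * j] = 2.0 * f * parent
                trial[2 * j + 1] = 2.0 * (1.0 - f) * parent
                err = cv_error(trial)
                if err < current_err:
                    current, current_err = trial, err
        if current_err >= best_err:        # no improvement: stop splitting
            break
        heights, best_err = current, current_err
    return heights, best_err
```

Preserving the mass of each parent interval when it is split is one way to keep the weights on all other intervals unchanged, which mirrors the refinement step described above. With `max_steps=3` the final weight function has at most eight pieces.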
S.4 Orthonormal basis functions in with
[TABLE]
S.5 Training time and computational complexity
We present a detailed comparison of the training time for FLM and wFLM under various sample sizes and numbers of measurement points. Tables S.1 and S.2 report average runtime (in minutes) over 100 repetitions, for wFLM (step) under a bounded domain and wFLM (Exp) under an unbounded domain, respectively. All computations were performed on a local machine equipped with an Apple M2 processor running macOS Sequoia.
Table S.1 reports the average training time for classical FLM and the step-based wFLM under Scenario 1 of Section 4.1, where the domain is bounded, . The results demonstrate that the step-based wFLM method introduces additional computational cost compared to FLM, but the increase is moderate and scales reasonably with both the sample size and the number of measurement points . Table S.2 presents training times for FLM and wFLM with exponential weights in the unbounded domain setting, , described in Section 4.2. Here, the computational cost is higher. There are two main reasons for this. First, the predictor measurements are irregularly spaced over an infinite interval, which increases computational burden compared to evenly spaced and bounded designs. Second, wFLM (Exp) involves grid searching over a set of candidate parameters, which adds further cost.
Although wFLM requires more computational resources, it consistently outperforms classical FLM in all simulation settings and applications. In particular, for unbounded domains, FLM suffers from fundamental theoretical limitations. The space excludes many commonly used functions, such as polynomials, and thus cannot adequately support standard FPCA or FLM procedures. As shown in Section 4.2 and Section S.7 of the Supplementary Material, FLM performs poorly in these scenarios, whereas wFLM remains stable and accurate. Moreover, even in bounded domains where the true underlying measure is the Lebesgue measure, as in Scenario 1 of Section 4.1 and Supplementary Section S.6, wFLM still yields superior prediction accuracy when the leading functional principal components fail to capture the signal structure effectively. The additional computational cost of wFLM is therefore justified by its substantial gains in predictive performance and theoretical soundness.
S.6 Sensitivity analysis regarding measurement error variance for step-based wFLM
To evaluate the robustness of the proposed weighted functional linear model with step function weight under varying levels of measurement error, we conducted additional simulations based on Scenario 1 in Section 4.1. The functional predictors were generated using the same mean and covariance structure as described in the main simulation setting. Specifically, the mean function was defined as
[TABLE]
and the covariance function was constructed using 10 eigenfunctions with corresponding eigenvalues . The eigenfunctions were given by
[TABLE]
For each subject , the functional trajectory was constructed as
[TABLE]
where and represent i.i.d. measurement errors. We examined five levels of measurement error variance: . The scalar responses were generated via the functional linear model:
[TABLE]
where and .
We compared FLM and wFLM across sample sizes and grid resolutions . Prediction performance was measured using average mean squared prediction error (AMSPE), averaged over 200 Monte Carlo replicates.
The results in Tables S.3 and S.4 show that wFLM consistently achieves lower prediction error than FLM across all levels of measurement error, sample sizes, and numbers of measurements. While FLM exhibits relatively stable performance as measurement error increases, its overall accuracy remains limited. In contrast, wFLM demonstrates strong predictive performance in low-noise settings and retains its advantage even as noise levels grow. These findings highlight the robustness of the proposed weighting scheme and underscore the benefit of adopting an optimized measure rather than the default Lebesgue measure across many settings.
S.7 Sensitivity analysis regarding measurement error variance for weight functions derived from parametric distributions
To further assess the robustness of the proposed wFLM (Exp) method on the unbounded domain , we conducted additional simulations based on the setting in Section 4.2, now incorporating varying levels of measurement error. Specifically, we varied the standard deviation of the additive noise with . All other components of the data-generating process, including the eigenbasis functions , the exponential measurement locations and the scalar response model are the same as described in Section 4.2.
We evaluated performance under two settings for the number of measurement points per trajectory: either fixed at or randomly sampled from with equal probability. The locations of the measurements were exponentially distributed with a rate over the infinite interval . Average mean squared prediction error (AMSPE) was computed over Monte Carlo runs for each setting and method.
Table S.5 reports the average mean squared prediction error (AMSPE) and associated standard deviations. Across all settings, wFLM (Exp) consistently outperforms classical FLM, often by a large margin. Notably, while the prediction performance of FLM remains relatively stable across different noise levels, it is uniformly worse than wFLM, especially in low-noise or denser sampling regimes. In contrast, wFLM exhibits strong gains when the signal is recoverable, and degrades modestly under increasing noise. To assess the impact of potential outliers in AMSPE due to heavy noise or extreme trajectories, we also report the median mean squared prediction error in Table S.6. The results for median errors confirm and strengthen the trends observed in the results for mean errors: wFLM achieves drastically lower median errors than FLM, particularly in the setting, where the signal is better captured. These results demonstrate that the exponential weighting scheme not only improves prediction but also enhances robustness to measurement noise and sparsity.
S.8 Sensitivity analysis regarding the choice of tuning parameters and
To evaluate the sensitivity of the step-based wFLM regarding the choice of tuning parameters and , we conducted additional simulations under Scenario 1 of Section 4.1 in the main text. We fixed the sample size at and considered three settings for the number of measurements . The measurement locations were equally spaced over , and the predictor trajectories were generated with the same mean and covariance structure as described in the main text.
The parameter penalizes the total variation of the step function weights, encouraging smoother transitions between adjacent intervals. A higher value of flattens the weight function and for large values forces it to move closer to a uniform weight function and thus towards the classical FLM. The parameter penalizes the number of non-zero subintervals in the weight function. This promotes sparsity by allowing parts of the domain to be entirely down-weighted, which can help isolate the most informative regions and improve prediction.
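To make the roles of the two penalties concrete, the following Python sketch evaluates a penalized criterion for a given vector of step heights. The additive form of the objective and the names `tv_penalty` and `support_penalty` are assumptions introduced for illustration, since the exact criterion is defined in the main text.

```python
import numpy as np

def penalized_objective(cv_mspe, heights, tv_penalty=0.0, support_penalty=0.0):
    """Cross-validation error plus two penalties on a step weight function.

    cv_mspe: cross-validation mean square prediction error for these heights.
    heights: step heights on adjacent subintervals of the domain.
    """
    heights = np.asarray(heights, dtype=float)
    # Total variation: sum of absolute jumps between adjacent subintervals.
    total_variation = np.abs(np.diff(heights)).sum()
    # Number of subintervals that receive non-zero weight.
    n_nonzero = np.count_nonzero(heights)
    return cv_mspe + tv_penalty * total_variation + support_penalty * n_nonzero
```

Under this form, a uniform weight function has zero total variation, so a large `tv_penalty` favors it, while a positive `support_penalty` favors weight functions that vanish on some subintervals.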
We conducted two separate experiments. In the first, we fixed and varied from [math] to . In the second, we fixed and similarly varied . The results are summarized in Tables S.7 and S.8, showing the average mean squared prediction error (AMSPE) and standard deviations over Monte Carlo replications.
When is fixed at zero, increasing leads to a noticeable increase in prediction error between and , after which the performance stabilizes. This pattern suggests that even a small amount of total variation penalization can quickly push the step-function weight toward uniformity, diminishing its ability to adapt to the underlying signal. In this simulation setting, the Lebesgue measure is known to be suboptimal because the coefficient function cannot be effectively represented by the leading principal components. As increases, the weight function becomes less adaptive and more uniform, resembling the classical FLM. This rigidity prevents the model from exploiting beneficial flexibility in weight function selection, which explains the degradation in predictive performance. Conversely, when is fixed at zero, increasing leads to small gains in prediction accuracy. While the true underlying measure for this simulation is the Lebesgue measure, the inability of the leading principal components to capture motivates alternative weighting. By allowing the model to concentrate weight on more relevant subregions, the step-based wFLM is able to adapt better to the signal structure.
Importantly, for users whose primary goal is predictive accuracy, we recommend setting , which removes the variation penalty and reduces computational burden. This configuration allows the model to explore more flexible weight structures without constraints. However, if interpretability of the learned measure is also a concern, such as avoiding abrupt shifts between adjacent intervals, a small positive can help smooth the estimated weights. Overall, these findings suggest that the model is reasonably robust to tuning choices within a practical range, and the penalization framework provides users with the flexibility to balance prediction performance and interpretability depending on their analytical goals.
- Ash, R.B., Doléans-Dade, C.A., 2000. Probability and Measure Theory. Academic Press.
- Cai, T.T., Yuan, M., 2012. Minimax and adaptive prediction for functional linear regression. Journal of the American Statistical Association 107, 1201–1216.
- Carroll, C., Bhattacharjee, S., Chen, Y., Dubey, P., Fan, J., Gajardo, Á., Zhou, X., Müller, H.G., Wang, J.L., 2020. Time dynamics of COVID-19. Scientific Reports 10, 21040.
- Castro, P.E., Lawton, W.H., Sylvestre, E.A., 1986. Principal modes of variation for processes with continuous sample curves. Technometrics 28, 329–337.
- Chang, H.w., McKeague, I.W., 2022. Empirical likelihood-based inference for functional means with application to wearable device data. Journal of the Royal Statistical Society Series B: Statistical Methodology 84, 1947–1968.
- Chen, H., Müller, H.G., Rodovitis, V.G., Papadopoulos, N.T., Carey, J.R., 2024. Daily activity profiles over the lifespan of female medflies as biomarkers of aging and longevity. Aging Cell 23, e14080.
- Chen, H., Reiss, P.T., Tarpey, T., 2014. Optimally weighted L2 distance for functional data. Biometrics 70, 516–525.
- Chen, K., Lei, J., 2015. Localized functional principal component analysis. Journal of the American Statistical Association 110, 1266–1275.
