Integrating dark diversity, functional traits, and diagnostic species: a framework to diagnose bottlenecks in forest recovery
Ming-Hui Wang, Jian-Rong Su, Wan-De Liu, Shuai-Feng Li, Xiao-Bo Huang, Jia-Yan Shen, Rui-Guang Shang

TL;DR
This paper introduces a new framework combining dark diversity, functional traits, and diagnostic species to assess and guide forest recovery in subtropical China.
Contribution
The study integrates dark diversity with functional traits and diagnostic species to diagnose recovery bottlenecks in forest ecosystems.
Findings
The Community Completeness Index (CCI) increased during recovery and stabilized as the forest approached a stable state.
Stage-specific early-warning species were identified, with their absence signaling successful or problematic recovery stages.
Functional traits like seed mass and mycorrhizal type were stronger predictors of species absence than environmental conditions.
Abstract
Accurately assessing the natural recovery processes of forest ecosystems remains a key challenge in restoration ecology. The concept of dark diversity, the set of species absent from a site but belonging to its habitat-specific species pool, provides a novel lens for this assessment. In this study, we developed and applied an integrated diagnostic framework that synthesizes dark diversity, functional traits, and diagnostic species. We applied this framework to a chronosequence of recovering forest ecosystems in subtropical China, representing early, middle, and late recovery stages. Our results demonstrated that the Community Completeness Index (CCI), derived from dark diversity, increased significantly during recovery, with its stabilization indicating the approach to a stable state. The framework identified stage-specific early-warning species: the absence of light-demanding,…
Taxonomy
Topics: Ecology and Vegetation Dynamics Studies · Forest Ecology and Biodiversity Studies · Mycorrhizal Fungi and Plant Interactions
1 Introduction
Forests represent one of the most important ecosystems on earth and play an irreplaceable role in biodiversity protection, climate regulation, and maintenance of carbon and water cycles (Brockerhoff et al., 2017; Chen et al., 2022). In recent decades, population growth and social development have disturbed and damaged the environment, leading to the degradation of forest ecosystems (Hosseini et al., 2024; Liu et al., 2024; Riedl et al., 2024). Forest degradation weakens the ecosystem’s service functions, thus affecting people’s living environment (Liu et al., 2024). Therefore, it is important to reconstruct and restore degraded forest ecosystems. In terms of current conditions, natural recovery and artificial restoration are the two most suitable methods (Qin et al., 2022; Zhang et al., 2024a). Among them, natural recovery mainly relies on natural forces to restore degraded forest ecosystems. Compared to costly and technically complex artificial restoration, it is usually more cost-effective and easier to apply on a large scale (Zhang et al., 2024a). Although natural recovery has great potential, it still faces many problems in practical application, such as the uncertainty of recovery effects and processes, and the difficulty in establishing dominant species during natural recovery (Zhao et al., 2023; Zhang et al., 2024a). Given this, most strategies for restoring forest ecosystems are primarily based on natural recovery, with artificial restoration used to assist natural recovery (Atkinson and Bonser, 2020).
Assessing the natural recovery potential of forest ecosystems is the first step in ecological restoration work (Gatica-Saavedra et al., 2017). Such assessment determines whether recovery objectives are being met, identifies facilitating or limiting factors, and informs decisions on the necessity, timing, and methods of artificial restoration. Traditional assessments often rely on observed diversity metrics, which require a suitable reference ecosystem (usually the pre-degradation state) and may overlook the full spectrum of potential biodiversity (Helm et al., 2015; Shackelford et al., 2013; Chollet et al., 2025). The concept of “dark diversity”, the set of species that are absent from a site but belong to its habitat-specific species pool under suitable conditions (Pärtel et al., 2011), offers a novel and more comprehensive perspective. Integrating dark diversity into restoration ecology can enhance the assessment of natural recovery potential, identify factors limiting species occurrence, and refine management priorities (Moeslund et al., 2017; Deschênes et al., 2024; Feng et al., 2025).
To date, several studies have successfully applied the concept of dark diversity to restoration and conservation issues. First, the Community Completeness Index (CCI), which is based on dark diversity, can be used to assess whether a recovering natural ecosystem is approaching successful restoration (Cam et al., 2000; De Bello et al., 2012; Pärtel et al., 2013). A lower CCI indicates that the current ecosystem lacks a large number of suitable species and still requires further restoration, while a higher CCI suggests that the ecosystem’s recovery has been relatively successful (Pärtel et al., 2013; Moeslund et al., 2017; Deschênes et al., 2024). Second, combining the set of species with high dark diversity with the diagnostic species (species that indicate specific habitat types, ecological conditions, or specific characteristics of biological communities) yields a set of early warning species for the current ecosystem. By analyzing the characteristics of these early warning species, potential problems in the natural recovery process can be inferred and reasonable restoration guidance can be provided (Caro and Girling, 2010; Nicod et al., 2019; Rodríguez-Rojo et al., 2020; Zanzottera et al., 2020). Finally, linking dark diversity to plant functional traits and environmental characteristics helps identify limiting factors in the natural recovery process and assists in re-establishing the species that make up dark diversity (Violle et al., 2007; Hemrová and Münzbergová, 2015; Boussarie et al., 2018; Carlucci et al., 2020; Hostens et al., 2023; Aubin et al., 2024).
Despite these conceptual advancements, a critical knowledge gap remains: the lack of a unified, operational framework that synthesizes these diagnostic tools into actionable guidance for forest restoration. To address this gap, we integrate dark diversity theory into a comprehensive framework for assessing forest natural recovery (Supplementary Figure 1, which outlines the conceptual structure and operational stages of our integrated diagnostic tool). We applied this framework to forest ecosystems at different recovery stages in the southern subtropical region of Yunnan Province, China. Pinus kesiya forests, as prevalent secondary successional communities with dynamic understory turnover, serve as an ideal system for investigating species re-establishment and the role of dark diversity in community development. Our specific research objectives are: (1) to assess the natural recovery of forest ecosystems using CCI, without requiring a reference community; (2) to identify key problems arising during the recovery of forest ecosystems through early warning species; and (3) to clarify the formation mechanisms of dark diversity and identify the key factors hindering forest ecosystems at different recovery stages. By synthesizing these objectives, this study aims to translate complex ecological diagnostics into precise, stage-specific restoration strategies, thereby advancing the development of resilient and high-biodiversity forest ecosystems.
2 Materials and methods
2.1 Study region
This study was conducted in Simao District, Pu’er City, Yunnan Province (22°34′–22°53′N, 100°56′–100°09′E), China (Figure 1). This region lies in the transition zone between the tropical zone and the southern subtropical zone. The climate is humid tropical, with an average annual temperature of 21.7°C and an average annual precipitation of 1490 mm. Influenced by the southwestern Indian Ocean monsoon, rainfall is concentrated in summer and autumn, while winter and spring are relatively dry. The main soil type in the study region is acidic laterite soil (Shang et al., 2021). The main vegetation types in the study area are Pinus kesiya forest, mixed coniferous and broadleaf forest, and monsoon evergreen broad-leaved forest. Most P. kesiya forests are secondary forests formed after human disturbance (e.g., logging and farming) of original monsoon evergreen broad-leaved forests. Through natural regeneration, these gradually develop into mixed coniferous and broadleaf forests. This well-documented successional trajectory for southwest Yunnan forms the empirical basis for our classification of forest recovery stages (Tang, 2010). The dominant tree species include Castanopsis hystrix, Castanopsis echidnocarpa, and Schima wallichii. The shrub layer consists mainly of tree saplings, and the herb layer is dominated by Scleria levis, Dicranopteris pedata, and some ferns (Tang, 2010).
Figure 1. Distribution diagram of the study region and sampling points.
2.2 Sample site setting and investigation
To investigate forest natural recovery across different stages, plots were established within three distinct forest types in Simao District, Pu’er City: Natural Secondary Pinus kesiya var. langbianensis Forest (PKF), Natural Mixed Coniferous and Broadleaf Forest (MCB), and Natural Monsoon Evergreen Broadleaf Forest (MEB). These three forest types were selected to represent empirically recognized early, middle, and late stages of natural recovery, respectively (Tang, 2010). Based on tree-ring evaluation and historical documents, the three forest community types are approximately 40, 70, and 90 years old, respectively. The forest types were determined from an integrated assessment of canopy species and species with relative importance greater than 10% in the community. To ensure comparability and minimize confounding environmental factors, all 24 plots were situated within a contiguous geographical area characterized by consistent climatic conditions and the same acidic laterite soil type. Plots were chosen to have similar general topographical conditions and comparable historical land-use backgrounds within each successional category, and were restricted to an altitude range of 1317–1648 m and slopes between 6° and 18°. Detailed topographic data for survey plots at different stages of forest restoration, including specific ranges of altitude and slope, are provided in Supplementary Table 1. Following the Center for Tropical Forest Science (CTFS) standards (Condit, 1995), eight 30 m × 30 m plots were established for each forest type, with a minimum distance of 100 m between plots, totaling 24 plots. The sample size of eight replicates per stage was chosen to ensure adequate representation of intra-stage structural variability while maintaining logistical feasibility for high-intensity woody plant censuses, consistent with previous successional studies in this region (Shang et al., 2021).
Geographic coordinates, altitude, slope, and aspect were recorded for each plot. Within each plot, all woody plants with a diameter at breast height (DBH) ≥ 1 cm were identified and counted, including lianas, shrubs, and trees. Based on the survey results, we calculated the forest density (FD) as the ratio of the number of trees to plot area. We also used an allometric growth equation (Equation 1) to calculate the total aboveground biomass (AGB) of woody plants with DBH ≥ 1 cm in each 30 m × 30 m plot (Zhang et al., 2024b).
where AGB represents total aboveground biomass (kg), DBH is the diameter at breast height (cm), H is the actual measured height (m), and WD is the stem wood density of the species (g·cm⁻³).
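The plot-level quantities described above can be sketched in a few lines. This is a minimal illustration, not the authors' Equation 1: it assumes a power-law allometry of the form AGB = a · (WD · DBH² · H)^b, which is a common form for models using these three predictors, and it leaves the coefficients `a` and `b` as required inputs. Expressing forest density in stems per hectare is also an assumption, and the function names are hypothetical.

```python
def forest_density(n_stems: int, plot_side_m: float = 30.0) -> float:
    """Stems per hectare for a square plot (FD = number of stems / plot area)."""
    area_ha = (plot_side_m ** 2) / 10_000.0
    return n_stems / area_ha


def tree_agb(dbh_cm: float, height_m: float, wood_density: float,
             a: float, b: float) -> float:
    """Aboveground biomass (kg) of one stem under an ASSUMED power-law form.

    AGB = a * (WD * DBH^2 * H)^b, with WD in g/cm^3, DBH in cm, H in m.
    The coefficients a and b must come from the allometric model actually used.
    """
    return a * (wood_density * dbh_cm ** 2 * height_m) ** b


def plot_agb(stems, a: float, b: float) -> float:
    """Sum stem-level AGB over an iterable of (DBH, H, WD) tuples."""
    return sum(tree_agb(d, h, wd, a, b) for d, h, wd in stems)


# Illustration only: a = 0.0673 and b = 0.976 are the pantropical coefficients
# of Chave et al. (2014), used here as placeholders, not as the study's values.
example_stems = [(10.0, 10.0, 0.5), (25.0, 18.0, 0.6)]
total_kg = plot_agb(example_stems, a=0.0673, b=0.976)
```

Keeping the coefficients as arguments makes it easy to swap in the region-specific allometry cited in the text.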
During the community survey, soil samples (0–20 cm depth, below the litter layer) were collected using a stainless-steel soil corer. Five cores per plot were composited and sieved (2 mm mesh) to remove roots and stones. The indicators analyzed included soil pH, soil water content (WCOS), soil nutrients, and enzyme activities. Soil nutrients included soil organic carbon (SOC), total nitrogen (TN), total phosphorus (TP), total potassium (TK), alkali-hydrolysable nitrogen (HN), available phosphorus (AP), and available potassium (AK). Enzyme activities assessed were urease (Ure), β-glucosidase (BG), β-1,4-N-acetylglucosaminidase (NAG), cellulase (CBH), and acid phosphatase (ACP). Analytical methods followed Shang et al. (2021, 2023) and Deforest (2009). Urease activity is expressed in mg·g⁻¹·24 h⁻¹, and the other enzyme activities in nmol·g⁻¹·h⁻¹.
Nine plant functional traits related to the plant’s morphology, reproduction, dispersal, population attributes, resource strategy, and life history were compiled (Li and Prentice, 2024): mycorrhizal associations, leaf carbon content (LOC), leaf nitrogen content (LTN), leaf phosphorus content (LTP), leaf area (LA), specific leaf area (SLA), leaf dry matter content (LDMC), stem wood density (WD), and seed mass (SM). Mycorrhizal types were determined based on the published literature and online datasets (http://mycorrhizas.info/index.html) (Soudzilovskaia et al., 2020). To ensure that trait values accurately reflect local environmental conditions and to minimize the impact of regional-scale intraspecific variation, the quantitative functional traits (except seed mass) were measured directly within the study area (Pu’er region). For seed mass, data were primarily obtained from the Germplasm Bank of Wild Species (http://www.genobank.org); although these data came from different study plots, the sampling sites were also located within the Pu’er region. Since the number of sampling points for seed mass per species was typically small (ranging from 1 to 3), average values were used for calculations. Public databases, such as the TRY Plant Trait Database, LEDA Traitbase, and Seed Information Database (SID), were used only for a few rare species with limited locally available records (Kleyer et al., 2008; Kattge et al., 2011; Liu et al., 2019).
2.3 Data analysis
2.3.1 Dark diversity and community completeness
We constructed a metacommunity using the combined species from all 24 plots across the three recovery stages. Each observation site was treated as a local community, and the combination of all sites formed the metacommunity. The metacommunity matrix consisted of observation sites (rows) and species (columns), and included every species observed at least once in the metacommunity. Dark diversity was estimated from species co-occurrence patterns within each metacommunity dataset using the R package ‘DarkDiv’ (De Bello et al., 2016; Carmona and Pärtel, 2021; Hostens et al., 2023; Paganeli et al., 2024). Specifically, we used the hypergeometric method available in the package. This method uses the hypergeometric distribution to calculate the number of co-occurrences of each species pair expected under randomness (given by the mean of the hypergeometric distribution), standardizes the difference between the observed co-occurrence frequency and this expected value, and then averages the standardized values and converts them to probabilities. The result is a probability value that is independent of regional frequencies and represents the specific ecological suitability of each species at a particular location (Carmona and Pärtel, 2021). The summed suitability probabilities of all species missing from a location are referred to as the site dark diversity, and the average probability of each species being missing within a given recovery stage is referred to as the species dark diversity (Pärtel et al., 2013; Tang et al., 2023; Chollet et al., 2025). The CCI is the ratio of observed species richness to total site dark diversity, expressed in logarithmic form, ln(observed diversity/dark diversity) (Pärtel et al., 2013; Tang et al., 2023). To compare CCI with traditional indicators of recovery success, we used ANOVA and Tukey’s post-hoc test to examine differences in observed diversity, dark diversity, and community completeness across recovery stages.
We also assessed the relationship between CCI and species richness using linear regression.
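The hypergeometric estimation and the CCI described in this subsection can be illustrated with a small pure-Python sketch. It is a simplified reading of the procedure as described in the text, not the ‘DarkDiv’ implementation: the function names are hypothetical, indicator species are weighted equally when averaging, and the standardized values are converted to probabilities with the standard normal cumulative distribution, which is an assumption about the "probabilize" step.

```python
from math import log, sqrt
from statistics import NormalDist


def indication_ses(sites):
    """Standardized effect size of co-occurrence for every (target, indicator) pair.

    sites: list of sets, each holding the species present at one site
    (at least two sites are needed for the variance to be defined).
    """
    n = len(sites)
    species = sorted(set().union(*sites))
    freq = {s: sum(s in site for site in sites) for s in species}
    ses = {}
    for target in species:
        for indicator in species:
            if target == indicator:
                continue
            observed = sum(target in site and indicator in site for site in sites)
            # Mean and variance of the hypergeometric distribution
            expected = freq[target] * freq[indicator] / n
            variance = (freq[target] * freq[indicator]
                        * (n - freq[target]) * (n - freq[indicator])) / (n ** 2 * (n - 1))
            ses[(target, indicator)] = (
                (observed - expected) / sqrt(variance) if variance > 0 else 0.0
            )
    return species, ses


def dark_probabilities(site, species, ses):
    """Suitability probability of each species absent from a non-empty site."""
    probs = {}
    for target in species:
        if target in site:
            continue
        scores = [ses[(target, indicator)] for indicator in site]
        probs[target] = NormalDist().cdf(sum(scores) / len(scores))
    return probs


def completeness(site, probs):
    """CCI = ln(observed richness / site dark diversity)."""
    return log(len(site) / sum(probs.values()))


# Toy metacommunity: A, B, C tend to co-occur; D and E form a separate group.
toy_sites = [{"A", "B", "C"}, {"A", "B", "C"}, {"A", "B"},
             {"D", "E"}, {"D", "E"}, {"D"}]
toy_species, toy_ses = indication_ses(toy_sites)
toy_probs = dark_probabilities(toy_sites[2], toy_species, toy_ses)
toy_cci = completeness(toy_sites[2], toy_probs)
```

In the toy data, species C is absent from the third site but co-occurs with A and B elsewhere, so it receives a high suitability probability there, whereas D and E receive low ones.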
2.3.2 Identification of early warning species
First, we identified the diagnostic species for each stage. Instead of using traditional abundance-based IndVal, we used the Functional Association Index (ϕ) (Ricotta et al., 2020). We calculated the abundance-weighted Community Weighted Mean (CWM) for each stage to represent its functional center. Before distance calculation, all traits were log-transformed and standardized using Z-scores to ensure equal weighting. The functional similarity between each species and the functional center was calculated as 1−dnorm, where dnorm is the Euclidean distance in multidimensional trait space normalized by the maximum observed distance (Ricotta et al., 2020). The index ϕ was then calculated as the product of a species’ relative abundance in a stage and its functional similarity to that stage’s center. This approach ensures that diagnostic species are those that are both functionally compatible with and characteristic of a specific recovery stage. Significance was tested through 999 permutations (p< 0.05). Second, we defined the dark diversity set. Following Dalle Fratte et al. (2022), species with a dark diversity probability ≥ 0.6 were selected. This specific threshold was used to minimize the statistical uncertainty associated with a probability of 0.5, ensuring that the missing species have a clear ecological affinity for the site. Finally, the intersection of the diagnostic species set and the dark diversity set was defined as the “early-warning species set”. To ensure the robustness of our results, we verified the assumption that dark diversity is not biased toward diagnostic species through a permutation-based null model. Specifically, the null model was constructed by randomly shuffling species identities in the trait matrix 999 times while keeping the community’s functional center constant, thereby testing whether the observed association between species and specific recovery stages was significantly greater than expected by chance.
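The two set-building steps above can be sketched as follows. This is a simplified illustration under stated assumptions: trait values are taken as already log-transformed and Z-scored, the maximum distance used for normalization is taken over the species in the stage, the permutation test is omitted, and all function names are hypothetical.

```python
from math import sqrt


def functional_association(traits, abundance):
    """phi = relative abundance x functional similarity to the stage's CWM center.

    traits: dict species -> list of standardized trait values
    abundance: dict species -> abundance within one recovery stage
    """
    total = sum(abundance.values())
    rel = {s: a / total for s, a in abundance.items()}
    n_traits = len(next(iter(traits.values())))
    # Abundance-weighted community weighted mean (functional center)
    cwm = [sum(rel[s] * traits[s][k] for s in rel) for k in range(n_traits)]
    dist = {
        s: sqrt(sum((traits[s][k] - cwm[k]) ** 2 for k in range(n_traits)))
        for s in rel
    }
    d_max = max(dist.values())
    similarity = {s: 1.0 - dist[s] / d_max if d_max > 0 else 1.0 for s in rel}
    return {s: rel[s] * similarity[s] for s in rel}


def early_warning_species(diagnostic, dark_probability, threshold=0.6):
    """Intersection of the diagnostic set and species with dark diversity >= threshold."""
    dark_set = {s for s, p in dark_probability.items() if p >= threshold}
    return set(diagnostic) & dark_set


toy_traits = {"A": [0.0, 0.0], "B": [1.0, 0.0], "C": [3.0, 4.0]}
toy_abundance = {"A": 5, "B": 3, "C": 2}
toy_phi = functional_association(toy_traits, toy_abundance)
toy_warning = early_warning_species({"A", "B"}, {"A": 0.7, "B": 0.3, "C": 0.9})
```

Here species C lies farthest from the functional center and therefore gets a ϕ of zero, while only species A is both diagnostic and above the 0.6 dark diversity threshold.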
2.3.3 Dark diversity affinity
Because species and sites can each independently shape dark diversity patterns, directly associating dark diversity with environmental characteristics or functional traits may confound their effects. We therefore used the dark diversity affinity (DDA) index proposed by Fujinuma and Pärtel (2023). This index measures the tendency of species and sites to increase dark diversity. DDA can be quantified and decomposed into species dark diversity affinity (dda_sp) and site dark diversity affinity (dda_site), allowing them to be separately associated with functional traits and environmental characteristics, respectively.
The dataset required for this method includes species occurrence data, location-specific suitability (the probability values of the specific ecological suitability of each species at a particular location, obtained with the hypergeometric method), environmental characteristics of the sites, and functional traits of the species. The environmental characteristics of the sites include stand density (SDI), total aboveground biomass (AGB), elevation, slope, soil pH (PH), soil water content (WCOS), and the first and second principal components (PC1 and PC2) generated through principal component analysis (PCA) of the environmental variables related to soil nutrients and enzyme activity (see Supplementary Figure 2). The functional traits of the species include mycorrhizal type, seed mass (SM), stem wood density (WD), leaf nitrogen content (LTN), leaf phosphorus content (LTP), leaf organic carbon (LOC), leaf dry matter content (LDMC), specific leaf area (SLA), and leaf area (LA).
First, we constructed a unified species-site model consisting of three operational components:
(1) Using a logistic regression model, the estimates of dda_sp and dda_site were combined to form a unified DDA metric, see Equation 2:
where DDA_ij represents the DDA of species i at location j.
(2) The estimated DDA was used to adjust the suitability of the site (suit) to predict the probability of species occurrence (p). The adjusted occurrence probability was calculated using Equation 3:
where p_ij denotes the probability of occurrence of species i at location j and δ is a constant that adjusts the predicted occurrence probability to the level of the observed occurrence rate. Equation 4 for obtaining δ is as follows:
(3) The observed species presence/absence data (prab) were linked to the predicted occurrence probability (p) by assuming that prab follows a Bernoulli distribution with probability p, thus allowing the inference of parameters (Lemoine, 2019).
We then applied a Bayesian method to estimate the parameters of the unified species-site model: a, b, dda and DDA. To address the separation problem in logistic regression, we used weakly informative priors. Specifically, we used centered Cauchy distributions as priors to keep the inference conservative, with scale parameters following Gelman et al. (2008). All numeric predictor variables were standardized before model fitting (mean adjusted to 0, standard deviation adjusted to 0.5). The scale parameter was set to 0.5 for the intercept parameter a and to 2.5 for the regression coefficient parameter b. Bayesian MCMC sampling (an iterative algorithm) was performed using the Gibbs sampler JAGS 4.3.0, with three MCMC chains of 4,000 iterations each to estimate the DDA parameters. Convergence was deemed satisfactory when the Gelman-Rubin statistic R-hat was ≤ 1.1 (Gelman and Rubin, 1992), the differences among the three chains were very small and stable, and at least 333 post-burn-in posterior samples had been retained for each chain. Refer to Fujinuma and Pärtel (2023) for technical details. To assess the sensitivity of our model to MCMC configurations, we performed a robustness check on a randomly selected representative subset of the data. We compared the results of the initial protocol-based settings (3 chains, 4,000 iterations) against a more conservative configuration (4 independent chains without seed resetting, 20,000 burn-in, and 20,000 sampling iterations). This step was taken to ensure that the chosen iteration length was sufficient to capture the posterior distributions accurately.
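Two of the numerical steps in this paragraph, rescaling predictors to mean 0 and standard deviation 0.5 and checking convergence with the Gelman-Rubin statistic, can be sketched in plain Python. This is an illustrative sketch rather than the JAGS workflow used in the study: the function names are hypothetical, the sample standard deviation is assumed for the rescaling, and the basic (non-split) form of R-hat is used.

```python
from math import sqrt
from statistics import mean, stdev, variance


def standardize_half_sd(values):
    """Rescale a numeric predictor to mean 0 and standard deviation 0.5.

    Subtracting the mean and dividing by two standard deviations follows
    the rescaling recommended by Gelman et al. (2008).
    """
    m, s = mean(values), stdev(values)
    return [(v - m) / (2 * s) for v in values]


def gelman_rubin(chains):
    """Basic Gelman-Rubin R-hat for one parameter.

    chains: list of equally long lists of post-burn-in samples.
    """
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])    # W: mean within-chain variance
    between = n * variance(chain_means)             # B: between-chain variance
    pooled = (n - 1) / n * within + between / n     # pooled variance estimate
    return sqrt(pooled / within)


z = standardize_half_sd([1.0, 2.0, 3.0, 4.0, 5.0])
rhat_mixed = gelman_rubin([[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3]])
rhat_stuck = gelman_rubin([[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23]])
```

Chains that explore the same region give an R-hat close to or below 1, while chains stuck in different regions give values far above the 1.1 threshold.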
3 Results
3.1 Dark diversity, observed diversity and CCI change patterns at different recovery stages
The results (Figure 2; Supplementary Table 2) showed that observed diversity increased significantly from PKF (25.4 ± 1.28) to MCB (35.2 ± 2.43) and from PKF to MEB (33.9 ± 1.94). However, the changes in site dark diversity from PKF (40.1 ± 0.61) to MCB (37.5 ± 1.43) and then to MEB (37.9 ± 1.30) were not significant (F = 1.405, p = 0.268). This consistently high level of dark diversity across all stages suggested that, despite the accumulation of observed species, a large and stable pool of potential colonizers remained excluded from the local community throughout the chronosequence. It also indicated that observed diversity alone was insufficient to assess natural recovery. CCI, as the log ratio of observed diversity to dark diversity, was significantly correlated with species richness (Supplementary Figure 3) while also accounting for potential species in the recovery process. It increased rapidly and significantly from PKF (-0.44 ± 0.06) to MCB (-0.07 ± 0.10) and then stabilized at MEB (-0.12 ± 0.09), reflecting a transition in the recovery dynamics: an early phase of rapid stochastic reassembly followed by a later phase of more stringent, niche-based filtering. This pattern allowed us to quantify the recovery process without the need for an external reference ecosystem.
Figure 2. ANOVA results of dark diversity, observed diversity and community completeness at different natural recovery stages. Dark diversity refers to site dark diversity, and community completeness refers to CCI. The box plots show the median (line in the middle of the box), the interquartile range (box), and 1.5 times the interquartile range (whiskers).
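As a rough consistency check, the CCI definition ln(observed diversity/dark diversity) can be applied to the stage means reported in Section 3.1. This is only an approximation: the reported CCI values are presumably averages of plot-level CCI rather than ratios of stage means, so small differences are expected.

```latex
\begin{align*}
\mathrm{CCI}_{\mathrm{PKF}} &\approx \ln\!\left(\frac{25.4}{40.1}\right) \approx -0.46 && \text{(reported: } -0.44 \pm 0.06\text{)}\\
\mathrm{CCI}_{\mathrm{MCB}} &\approx \ln\!\left(\frac{35.2}{37.5}\right) \approx -0.06 && \text{(reported: } -0.07 \pm 0.10\text{)}\\
\mathrm{CCI}_{\mathrm{MEB}} &\approx \ln\!\left(\frac{33.9}{37.9}\right) \approx -0.11 && \text{(reported: } -0.12 \pm 0.09\text{)}
\end{align*}
```

All three values are negative because observed richness is smaller than site dark diversity at every stage, and the approximations fall within the reported standard errors.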
3.2 Early warning species in different recovery stages
A method based on species functional traits was used to detect diagnostic species and non-diagnostic species for three recovery stages. The results show that a different number of species had significant associations with the habitats of each recovery stage (p< 0.05). Among them, the number of diagnostic species for MEB (n=21) was the highest, while that for PKF (n=9) and MCB (n=7) was lower (Figure 3; Supplementary Table 3).
Figure 3. Relationship between the standardized effect size of functional association (SES-ϕ) for diagnostic species based on functional traits (green triangles) and their average probability of dark diversity (circles) for each natural recovery stage ((a) PKF; (b) MCB; (c) MEB). The dark circles represent the species with a probability of dark diversity higher than 0.6.
Combining the diagnostic species with the species dark diversity calculated above, we identified early warning species for each stage. There was no significant difference in dark diversity between diagnostic and non-diagnostic species in any recovery stage (p > 0.05; Supplementary Figure 5), supporting the method’s feasibility. In PKF, the dark diversity of all diagnostic species ranged from 0 to 0.4, so no early warning species were identified. In MCB and MEB, a small portion of diagnostic species had dark diversity within the range of 0 to 0.4, while most (4 in MCB and 14 in MEB) exceeded the 0.6 threshold. These species were all considered early warning species for their recovery stage. They included light-demanding transitional species in MCB, such as Dalbergia yunnanensis, Quercus acutissima, Tarennoidea wallichii and Vaccinium mandarinorum, and shade-tolerant climax species in MEB, such as Anneslea fragrans, Castanopsis echinocarpa and Lithocarpus fenestratus. In summary, the identification of early-warning species provides a stage-specific diagnostic tool, and these species-specific patterns can directly inform management.
3.3 The relationship of dark diversity with functional traits and environmental characteristics at different recovery stages
The methodological rigor of our Bayesian decomposition is further supported by a sensitivity analysis conducted on a data subset. The comparison revealed (Supplementary Figure 6) that extending the MCMC chains from 4,000 to 20,000 iterations yielded virtually identical posterior modes and parameter signs. This consistency demonstrates that the 4,000-iteration length adopted from the Fujinuma and Pärtel (2023) protocol is sufficient for the model to reach satisfactory convergence in our study context, and that the reported ecological patterns are robust to MCMC sampling depth.
Based on the specific indicators (early warning species) identified in different recovery stages, we further employed a DDA model to quantify the general assembly rules across the entire species pool. The posterior distributions of DDA model parameters are shown in Figure 4. Subsets where species were present generally had lower DDA than subsets where they were absent, confirming DDA as a valid proxy for dark diversity. The distribution of dda_sp (orange boxplot) shows a significant shift in position and a large variation. In contrast, the distribution of dda_site (blue boxplot) changes much less, with a less obvious shift in median and high overlap between boxplots. Thus, dda_sp contributes more to DDA than dda_site, and DDA is mainly regulated by dda_sp.
Figure 4. The estimated parameters of the three recovery stages ((a) PKF; (b) MCB; (c) MEB), divided into four subsets based on site-specific suitabilities below/above 0.5 (left and right panels) and then by presence/absence (upper and lower panels). The distribution of estimated parameters in each subset is shown using box plots. The included model parameters are: p (presence likelihood), suit (suitability value), DDA (unified DDA), dda_sp (species DDA), and dda_site (site DDA). δ represents the constant parameter of each metacommunity. Each estimate is the median value of 999 Bayesian posterior samples.
At the site level (dda_site), environmental characteristics showed almost no significant correlation with DDA across the successional stages, with the sole exception of a negative correlation with slope during the MCB stage (Figure 5a). This striking lack of abiotic influence suggested that habitat-level environmental filters were not the primary drivers of species exclusion in this system. At the species level (dda_sp), the DDA model revealed a progressive shift in the types of functional traits influencing dark diversity as recovery proceeded (Figure 5b). In PKF, DDA was primarily associated with regeneration and wood-structure traits. Specifically, species with larger seed mass (SM) and higher stem wood density (WD) showed a significantly higher propensity to belong to the dark diversity (i.e., they were more likely to be missing). Mycorrhizal type also played a role, with AM species being more likely and EcM species being less likely to be in the dark diversity. In MCB, the influence of seed mass persisted, but leaf-based physiological traits began to emerge as significant factors. For instance, specific leaf area (SLA) exhibited a significant positive correlation with dda_sp, while leaf phosphorus content (LTP) showed a negative correlation. In MEB, the influence of traits related to initial colonization (e.g., seed mass and wood density) disappeared completely. Instead, dark diversity affinity was determined exclusively by resource acquisition traits, including leaf dry matter content (LDMC), SLA, and leaf area. In summary, the model results indicate a clear successional turnover in the functional filters of dark diversity, shifting from an emphasis on seed and wood traits in early stages to leaf and resource-related traits in the late stage.
Figure 5. Posterior distributions of the logistic regression coefficients of dda_sp and dda_site in the model at different natural recovery stages ((a) PKF; (b) MCB; (c) MEB). The left and right columns show the estimated parameters of the species and the site explanatory variables, respectively. The density distribution is colored according to the 95% interval (0.025–0.975): black denotes significantly positive (higher propensity for dark diversity; the explanatory variable is significantly positively correlated with dark diversity), white denotes significantly negative (lower propensity for dark diversity; the explanatory variable is significantly negatively correlated with dark diversity), and gray denotes a non-significant correlation (the explanatory variable is not significantly correlated with dark diversity). Before model fitting, all numerical explanatory variables were standardized to mean = 0 and standard deviation = 0.5.
4 Discussion
4.1 Measuring the success of forest natural recovery based on dark diversity
Our analysis of CCI dynamics across recovery stages and of the association between CCI and species richness reveals its utility as a recovery metric. Although CCI is significantly correlated with species richness, it is a relative index: it reflects the size of the realized community relative to the ecological potential of the area, providing a more detailed and specific perspective on restoration integrity. As Deschênes et al. (2024) pointed out, although CCI is expected to correlate with species richness as species accumulate, it accounts for the dynamic nature of the potential species pool, which simple richness counts ignore. CCI is lowest in the early stage (PKF), indicating that many suitable species have not yet colonized even though environmental conditions may allow their existence. Significant increases in CCI during mid-recovery (MCB) align with the reappearance of “hidden” species from seed banks or surrounding areas as habitat structure and resources improve (Wilson et al., 2011; Lewis et al., 2017; Dong et al., 2019). During the late recovery (MEB) stage, there was a slight, non-significant decline in CCI, which may reflect a shift in the driving factors of community assembly. In late-successional forests, stronger biological filtering (such as niche occupation by dominant late-successional species or intensified light competition under closed canopies) may limit additional species entering from the potential pool (Zhai et al., 2017; Huang et al., 2024). This trend of niche stability may lead to a higher proportion of dark diversity in mature communities, as environmental filters become more restrictive (Riibak et al., 2024). Overall, these trends indicate that CCI rises significantly during natural forest recovery and then levels off; when CCI reaches a state of non-significant fluctuation, it can be inferred that natural recovery is nearing completion.
A notable additional finding emerged from our study: the decrease in dark diversity from the early to the late stage of natural recovery did not mirror the increase in observed diversity. This discrepancy arises because ongoing environmental amelioration during natural recovery attracts a broader array of species from more extensive surrounding areas, thereby expanding the local potential species pool (Riibak et al., 2024; Wan and Wang, 2025). Therefore, even when observed diversity increases significantly during recovery, the corresponding dark diversity does not necessarily decrease significantly. This further indicates that CCI calculated from dark diversity is a better indicator of community integrity than the simple ratio of observed species richness to the total species pool (Pärtel et al., 2013). However, when CCI is used to infer the state of natural recovery, it is also crucial to establish a reference baseline under homogeneous environmental conditions, in order to eliminate the influence of environmental background noise.
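The pattern described above can be illustrated with a small numerical sketch. It assumes the log-ratio form of the index, ln(observed richness / dark diversity), following Pärtel et al. (2013); the function name `cci` and the richness values are hypothetical and are not data from this study.

```python
import math

def cci(observed_richness: int, dark_diversity: int) -> float:
    """Community Completeness Index as ln(observed richness / dark diversity)."""
    if observed_richness <= 0 or dark_diversity <= 0:
        raise ValueError("both richness values must be positive")
    return math.log(observed_richness / dark_diversity)

# Hypothetical plots (not data from this study): observed richness more than
# doubles, but the site-specific pool expands too, so dark diversity barely falls.
stages = {
    "early": {"observed": 20, "dark": 30},
    "late": {"observed": 45, "dark": 28},
}

for name, s in stages.items():
    pool = s["observed"] + s["dark"]  # site-specific species pool
    print(f"{name}: pool={pool}, dark={s['dark']}, "
          f"CCI={cci(s['observed'], s['dark']):.2f}")
```

In this toy case dark diversity changes by only two species while the pool grows from 50 to 73, yet CCI moves from negative (more species absent than present) to positive (more species present than absent).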
In forest ecosystem restoration projects, assessing the natural recovery status of forests often requires an undisturbed ecosystem as a reference (Deschênes et al., 2024). Where ideal reference sites are unavailable, CCI can serve as a new indicator of the success of forest natural recovery. In practice, CCI can be monitored continuously from a chosen starting point during natural recovery, and calculated and compared across time points. A CCI that continues to increase indicates that the forest is undergoing natural recovery, and this is also the optimal time for artificial intervention to accelerate it. When CCI stops increasing significantly after a period of growth, this usually means that the natural recovery of the community has reached its optimal state and the community is becoming stable. If, at this point, the restorer still judges that recovery has not met the expected goals, the plateau can serve as a new trigger for active adjustment measures in restoration management (Moeslund et al., 2017; Deschênes et al., 2024).
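The monitoring rule described above can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the function `first_plateau`, the `min_gain` tolerance, and the `patience` parameter are hypothetical, and the fixed tolerance is a simple stand-in for a formal significance test between surveys.

```python
from statistics import mean

def first_plateau(surveys, min_gain=0.05, patience=2):
    """Return the index of the survey at which a CCI plateau is declared.

    surveys  : list of surveys in time order, each a list of plot-level CCI values
    min_gain : smallest rise in mean CCI still counted as an increase (assumed value)
    patience : number of consecutive non-increasing surveys required (assumed value)
    Returns None if CCI never rises and then levels off.
    """
    means = [mean(plots) for plots in surveys]
    has_grown = False
    flat_run = 0
    for t in range(1, len(means)):
        gain = means[t] - means[t - 1]
        if gain >= min_gain:
            has_grown = True   # still recovering
            flat_run = 0
        elif has_grown:
            flat_run += 1      # gain below min_gain, including slight declines
            if flat_run >= patience:
                return t
    return None

# Hypothetical plot-level CCI values from five successive surveys.
surveys = [
    [-0.6, -0.5, -0.4],
    [-0.2, -0.1, 0.0],
    [0.2, 0.3, 0.4],
    [0.31, 0.30, 0.32],
    [0.30, 0.33, 0.30],
]
print(first_plateau(surveys))
```

With replicated plots per survey, the tolerance could be replaced by a paired test between consecutive surveys; the choice of test would depend on the sampling design.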
Identification of early warning species based on dark diversity
4.2
Early warning species are related to the functional centers of their habitats, but often occupy ecologically marginal positions (Dalle Fratte et al., 2022). They can therefore highlight, to some extent, the current status of ecosystem natural recovery and the problems it faces. The method of identifying early warning species by combining diagnostic species with species-level dark diversity has been successfully tested in some habitat types, such as alpine habitats (Dalle Fratte et al., 2022). In this study, we applied the method to a structurally complex forest ecosystem. The results showed no early warning species in the early stage of natural forest recovery. At this stage, diagnostic species are typically pioneer species with high dispersal ability and strong environmental tolerance. Once the environment becomes suitable, they quickly occupy habitats, so their probability of absence (dark diversity) is very low and no early warning species emerge. In the middle stage of natural recovery, early warning species included Dalbergia yunnanensis, Quercus acutissima, Tarennoidea wallichii, and Vaccinium mandarinorum. These species are shrubs or small trees with a high specific leaf area (SLA); they are resource-acquisitive plants that mainly grow in high-light environments. When they become early warning species, the functional center of the habitat at the current stage has moved beyond their ecological niche. Their absence is not a sign of failed recovery but evidence of niche displacement, confirming that the plant community is successfully developing towards a denser canopy and a stable state, with species of this strategy being filtered out. Moreover, some studies indicate that these species are transitional species in monsoon evergreen broad-leaved forests (Tang, 2015).
In the late stage of natural recovery, early warning species included Anneslea fragrans, Castanopsis echinocarpa, and Lithocarpus fenestratus. These species have a lower specific leaf area, are resource-conservative plants, and mainly grow in moist, shaded forests under closed canopies. As representatives of climax monsoon evergreen broad-leaved forest communities, their decline may indicate that the forest is beginning to regress from a stable, complex mature stage to an unstable, structurally simple early stage (Tang, 2015).
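The logic of combining diagnostic species with dark diversity can be sketched as a per-stage set operation. This is a hedged illustration only: the function `early_warning_species`, the probability `threshold`, the placeholder species codes, and all probability values are hypothetical and do not reproduce the criteria or results of this study.

```python
def early_warning_species(diagnostic, dark_probability, threshold=0.7):
    """Diagnostic species whose probability of being in dark diversity
    (i.e. suitable but absent) is at least `threshold` (assumed cut-off)."""
    return sorted(
        sp for sp in diagnostic
        if dark_probability.get(sp, 0.0) >= threshold
    )

# Placeholder species codes and made-up probabilities, one set per stage.
diagnostic_by_stage = {
    "early": {"sp_a", "sp_b"},
    "middle": {"sp_c", "sp_d", "sp_e"},
    "late": {"sp_f", "sp_g"},
}
dark_probability_by_stage = {
    "early": {"sp_a": 0.05, "sp_b": 0.10},   # pioneers: rarely absent
    "middle": {"sp_c": 0.85, "sp_d": 0.30, "sp_e": 0.75},
    "late": {"sp_f": 0.90, "sp_g": 0.20},
}

warnings = {
    stage: early_warning_species(diagnostic_by_stage[stage],
                                 dark_probability_by_stage[stage])
    for stage in diagnostic_by_stage
}
print(warnings)
```

The empty early-stage result in this toy input mirrors the situation described above, where pioneer diagnostic species have a very low probability of absence.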
Ecosystems continue to be affected by human activities and environmental change during natural recovery. The structure and function of the current ecosystem, and early changes in environmental conditions, therefore need to be monitored in real time so that problems arising during natural recovery are identified promptly, gaining a critical time window for intervention (such as reducing grazing intensity or restoring disturbance regimes). The framework adopted here combines functional diagnostic species with dark diversity, providing a repeatable and standardized method for ecosystem assessment. On the one hand, replacing qualitative expert intuition with algorithmic procedures, in particular the combination of functional trait clustering and statistical occupancy probability (Beals index), reduces the subjectivity of species identification. On the other hand, dark diversity turns otherwise overlooked potential species into critical ecological information and better pinpoints diagnostic species that have disappeared because of changing environmental conditions. This compensates for the sampling bias and time lag of traditional observation methods and allows sensitive, forward-looking assessment of ecosystem degradation risks and gaps in functional space (Kovač et al., 2016; Rodríguez-Rojo et al., 2020; Delbosc et al., 2021). As a subset of indicator species, early warning species are tied to the functional core of their environment, which gives them two uses. First, once they are identified during restoration, analyzing their ecological characteristics (such as their specific requirements for humidity, soil nutrients, and pollinators) can pinpoint problems in the natural restoration process.
For example, if all early warning species are shade-tolerant plants and they are disappearing, this indicates that the canopy layer is too dense during natural restoration, leading to a mismatch in light conditions. Second, if the early warning species are the dominant species of the naturally restored community, the ecosystem may be facing degradation. Identifying them allows potential issues to be detected promptly, before the system fully degrades back to its original state, and informs the decision on whether to move from passive natural restoration to active human intervention.
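To make the "statistical occupancy probability" step concrete, the sketch below computes a Beals index from a site-by-species presence matrix. It assumes the common variant that excludes the target species from its own calculation; the function name `beals_index` and the toy matrix are hypothetical, and the exact variant and settings used in this study are not restated here.

```python
def beals_index(presence):
    """Beals index for every site-species pair.

    presence : list of sites, each a list of 0/1 values (one per species).
    For site i and target species j, the index is the mean, over the other
    species k present at site i, of (joint occurrences of j and k) / (occurrences of k).
    """
    n_species = len(presence[0])
    # Number of sites occupied by each species.
    occ = [sum(site[k] for site in presence) for k in range(n_species)]
    # Number of sites where species j and k occur together.
    joint = [
        [sum(site[j] * site[k] for site in presence) for k in range(n_species)]
        for j in range(n_species)
    ]
    result = []
    for site in presence:
        row = []
        for j in range(n_species):
            others = [k for k in range(n_species) if site[k] and k != j]
            if not others:
                row.append(0.0)  # no co-occurring species to inform the estimate
            else:
                row.append(sum(joint[j][k] / occ[k] for k in others) / len(others))
        result.append(row)
    return result

# Toy matrix: rows are sites, columns are species A, B, C (hypothetical).
presence = [
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 1],
]
b = beals_index(presence)
# At the third site only A is present; B co-occurs with A more often than C does,
# so B receives the higher probability of belonging to that site's dark diversity.
print(b[2])
```

Species that are absent from a site but receive a high index there are the candidates for that site's dark diversity; the cut-off applied to the index is a separate methodological choice.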
Identifying obstacles to natural forest restoration based on dark diversity
4.3
Understanding why some species that should be present are missing from certain sites (the drivers of dark diversity) can clarify the ecological barriers to species establishment, such as dispersal limitation, disturbance, or competition. Fujinuma and Pärtel (2023) analyzed various assemblages (from plants to mammals) from different regions (from Central Europe to New Zealand, and from eastern North America to central South America) as meta-communities, confirming the hypothesis that the mechanisms shaping dark diversity operate at both the species and the site level. However, our results showed that site-level environmental characteristics, including forest structure and topography, had almost no significant relationship with DDA (Figure 5a). This suggests that, within the studied landscape, environmental constraints are not the primary bottleneck for species re-establishment. Instead, our findings indicate that dispersal limitation, biotic interactions, or historical contingencies play a much more decisive role in shaping community assembly than environmental factors. On the one hand, high abiotic homogeneity across our study sites likely shifts the filtering pressure to biotic interactions; on the other hand, PCA-based compression of environmental information may overlook the influence of specific, fine-scale variables (e.g., micro-nutrients or micro-climate). Nevertheless, the overall lack of distinct environmental signals strongly indicates the dominance of species-level functional filtering processes.
Our study considers temporal dynamics when analyzing the drivers of dark diversity, allowing us to identify the factors hindering recovery at different stages of natural restoration. The early stage of natural recovery is the phase in which plants disperse to the site and rapidly establish. The results show that, during this stage, seed mass, mycorrhizal type, stem density, and some leaf functional traits are all related to the formation of dark diversity. Specifically, species with larger seeds are more likely to be part of dark diversity. Seed size usually reflects both the dispersal ability and the establishment ability of a species (Myers and Harms, 2009). Although larger seeds favor the successful establishment of seedlings, they are produced in smaller numbers and disperse less well, making it difficult for them to reach suitable sites for population establishment over large areas (Cain et al., 2000; Moles and Westoby, 2004; Riibak et al., 2015; Moeslund et al., 2017). Compared with other mycorrhizal types, AM species are more likely to be part of dark diversity, whereas EcM species are less likely. Mycorrhizal type is a key factor in the success of plant establishment. AM species have lower drought resistance and tolerance than EcM species, so they are less likely to establish successfully at a given site (Krauss et al., 2004; Hempel et al., 2013). The relationship between stem density and dark diversity may reflect the ability of species to adapt to environmental stress and to differentiate their niches. Plants with higher stem density usually grow more slowly, making them more likely to be part of dark diversity in the early and middle stages of natural recovery (Hodáňová, 1981). In addition, species with higher nutrient content have certain advantages during establishment and are less likely to become part of dark diversity (Cornelissen et al., 2003; Moeslund et al., 2017).
It is worth noting that species capable of rapidly acquiring resources (large leaf area, low leaf dry matter content) are less likely to become part of dark diversity during the early stages of natural recovery. Species with these characteristics are more likely to colonize early successional habitats after disturbance (Gelman and Rubin, 1992). In the middle to late stages of natural recovery, the persistence of species in a given habitat becomes more important. Once a species has reached a site and established successfully, its competitive ability against other plants and its tolerance of environmental change determine whether it can survive there in the long term. In the middle and late stages of natural recovery, the forest canopy gradually closes, reducing light intensity and intensifying competition among plants. At this point, species with low SLA act as “resource conservers” and are less likely to become part of dark diversity (Deschênes et al., 2024).
In restoration projects, identifying specific restoration barriers helps restorers develop targeted interventions to alleviate these obstacles and improve the success of the project, rather than blindly carrying out large-scale vegetation reconstruction (Moeslund et al., 2017; Deschênes et al., 2024). Based on our results, we propose the following general recommendations: (1) Dispersal limitation is a key reason why most species become part of a habitat’s dark diversity, and promoting their return through assisted dispersal can enhance biodiversity, provided that the introduced species conform to the natural distribution patterns of the local ecosystem; (2) Mycorrhizal fungi are important for the establishment and long-term survival of plant species, and inoculation with appropriate mycorrhizal fungi can contribute to the successful restoration of plant communities; (3) A plant’s ability to acquire resources partly determines its competitive ability within the community, so ensuring the establishment and survival of important species in resource-scarce habitats requires creating suitable conditions tailored to the characteristics of those plants. In summary, forest management should also focus on facilitating migration, creating dispersal corridors, or protecting animal dispersers to promote the arrival of these “functionally suitable but absent” species, rather than focusing solely on abiotic habitat modification.
Limitations of assessing forest natural recovery based on dark diversity
4.4
Our research integrates dark diversity, functional traits, and diagnostic species, providing a powerful tool for assessing the natural recovery of forest ecosystems. However, some limitations must be acknowledged to place the findings in context. These fall into study-specific design limitations and the inherent methodological challenges of dark diversity methods. Regarding study design, the chronosequence (space-for-time substitution) approach carries inherent assumptions. Although we selected sites with similar parent materials and climates, the potential for site-specific history and environmental heterogeneity to influence successional trajectories cannot be entirely ruled out. Furthermore, our analysis relied on species-averaged trait values obtained from databases or the literature, and therefore ignored intraspecific trait variation. Given that species can exhibit significant plastic responses to local environmental changes during forest recovery, ignoring this variation may lead to an underestimation of the adaptive capacity of certain species. Beyond these design constraints, the conceptual application of dark diversity presents inherent methodological challenges, particularly in the definition of the habitat species pool. The accuracy of dark diversity estimates depends heavily on the quality and scope of the regional species pool. As Deschênes et al. (2024) pointed out, choosing a reference target is increasingly difficult under global change, where historical baselines may no longer be realistic. Furthermore, using probability values to quantify dark diversity at the species level involves a logical ambiguity. For example, a species may have a low dark diversity probability either because it is a common species already present in the sample plot, or because it is a rare species not accepted by the environment. As a result, species of very different status can overlap in the probability distribution.
Despite these limitations, our framework serves as a proactive diagnostic tool. Future research should incorporate field trait measurements and long-term monitoring data, and should further refine the probabilistic dark diversity method for practical application.
Conclusion
5
This study developed and applied an integrated dark diversity framework to assess forest natural recovery across different stages. We demonstrated that the Community Completeness Index (CCI) accounts for the dynamic character of the potential species pool, providing a more comprehensive and detailed measure of recovery success. Early warning species for different recovery stages were identified as the intersection of diagnostic species and dark diversity, allowing potential issues in the natural recovery process to be anticipated and addressed in a timely manner. The main obstacles to natural recovery shift from dispersal and establishment limitations in the early and middle stages (related to seed mass, mycorrhizal type, and stem density) to competitive interactions in later stages (related to resource acquisition traits). Crucially, we found that species traits were a stronger driver of absence than local environmental conditions, highlighting dispersal and biotic interactions as primary restoration targets. Although based on a single regional chronosequence, our framework offers a replicable, mechanism-based approach to diagnosing recovery barriers. For practitioners, it argues for precision restoration: shifting from broad-scale planting to targeted strategies such as assisted dispersal or mycorrhizal inoculation in early stages, and canopy management in later stages. Future work should validate this framework across diverse ecosystems and integrate it with long-term monitoring to test the efficacy of interventions designed to reduce dark diversity.
References
- 1. Atkinson J., Bonser S. P. (2020). “Active” and “passive” ecological restoration strategies in meta-analysis. Restor. Ecol. 28, 1032–1035. doi: 10.1111/rec.13229
- 2. Aubin I., Deschênes É., Santala K. R., Emilson E. J. S., Schoonmaker A. L., McIntosh A. C. S., et al. (2024). Restoring forest ecosystem services through trait-based ecology. Environ. Rev. 32, 498–524. doi: 10.1139/er-2023-0130
- 3. Boussarie G., Bakker J., Wangensteen O. S., Mariani S., Bonnin L., Juhel J.-B., et al. (2018). Environmental DNA illuminates the dark diversity of sharks. Sci. Adv. 4, eaap9661. doi: 10.1126/sciadv.aap9661, PMID: 29732403, PMCID: PMC5931749
- 4. Brockerhoff E. G., Barbaro L., Castagneyrol B., Forrester D. I., Gardiner B., González-Olabarria J. R., et al. (2017). Forest biodiversity, ecosystem functioning and the provision of ecosystem services. Biodivers. Conserv. 26, 3005–3035. doi: 10.1007/s10531-017-1453-2
- 5. Cain M. L., Milligan B. G., Strand A. E. (2000). Long-distance seed dispersal in plant populations. Am. J. Bot. 87, 1217–1227. doi: 10.2307/2656714, PMID: 10991892
- 6. Cam E., Nichols J. D., Sauer J. R., Hines J. E., Flather C. H. (2000). Relative species richness and community completeness: birds and urbanization in the mid-Atlantic states. Ecol. Appl. 10, 1196–1210. doi: 10.1890/1051-0761(2000)010[1196:RSRACC]2.0.CO;2
- 7. Carlucci M. B., Brancalion P. H. S., Rodrigues R. R., Loyola R., Cianciaruso M. V. (2020). Functional traits and ecosystem services in ecological restoration. Restor. Ecol. 28, 1372–1383. doi: 10.1111/rec.13279
- 8. Carmona C. P., Pärtel M. (2021). Estimating probabilistic site-specific species pools and dark diversity from co-occurrence data. Glob. Ecol. Biogeogr. 30, 316–326. doi: 10.1111/geb.13203
