The Main Factors Affecting the Minimum Sampling Area Determination Method: Based on Research of the Shrub Layer in Island Pinus massoniana Forests
Jihong Xiao, Qingyan Wen, Zhifei Zhong, Yanqiu Xie, Yingxue Wang, Xing Cai, Yuchen Lin, Feifan Weng, Guochang Ding, Chuanyuan Deng

TL;DR
This study identifies factors affecting the minimum sampling area for shrub layer biodiversity in Pinus massoniana forests on islands.
Contribution
The study introduces a method for determining the minimum sampling area using species richness and evenness in island shrub ecosystems.
Findings
The power function model best fits the species-area relationship in the study.
Species richness and evenness significantly influence the minimum sampling area.
A 142 m² sampling area achieves 70% accuracy in biodiversity assessments.
Abstract
What factors influence the size of the minimum sampling area? We addressed this question by investigating the shrub layers of island Pinus massoniana forests. The results show that the power function model was the best fit for the species-area relationship. Species richness and the evenness of species distribution were the main factors affecting the determination of the minimum sampling area. Repeated sampling from the four corners of the plot proved advantageous. Considering both cost and accuracy, a minimum sampling area of 142 m^2^ could achieve 70% sampling accuracy. The research provides a method for analyzing the minimum sampling area and valuable insights for biodiversity studies in island ecosystems and similar forest communities worldwide. Determining the minimum sampling area is crucial for accurate and cost-effective biodiversity surveys. This study identifies the main factors…
Funding
- Special Fund Project for Scientific Research of Marine Public Welfare Industry
- Science and Technology Project Plan for Regional Development of Fujian
- Special Fund Project for Science and Technology Innovation of Fujian Agriculture and Forestry University
Taxonomy
Topics: Ecology and Vegetation Dynamics Studies · Forest Ecology and Biodiversity Studies · Rangeland and Wildlife Management
1. Introduction
As the sampling area increases, the number of species within the sample plot also increases. Initially, the increase is rapid, but it gradually slows down, forming a curve known as the species–area curve [1]. The main purpose of studying the species–area relationship is to determine the minimum sampling area of a community, which refers to the smallest area that can reflect the species composition and structural characteristics of the community [2]. Confirming the minimum sampling area can reduce expensive survey costs, which is of great significance for ecological research.
The determination of the minimum sampling area is influenced by various factors, with species richness being a key consideration. Previous studies have highlighted that species richness can significantly impact the results of the minimum sampling area determination. For instance, some researchers argue that higher species richness may not always lead to an increase in the minimum sampling area, especially when comparing communities with similar species richness levels [3,4]. This raises the following question: (1) How does species richness specifically affect the minimum sampling area? Methods for determining the minimum sampling area can generally be categorized into two approaches: the precision method and the inflection point method. The precision method typically defines the minimum area based on the proportion of species within a community. For example, some studies use a threshold where a specific percentage (e.g., 70.4% [5]) of species with an occurrence frequency greater than 1 is considered sufficient for sampling precision. Similarly, the Braun-Blanquet School often adopts a standard where the minimum area includes 80% of the total species in the community [2]. Other researchers have proposed that the minimum area should encompass more than 90% of the community species [2]. In contrast, the inflection point method relies on the slope of the species–area curve to determine the minimum area. For instance, some studies identify the minimum sampling area as the point where the slope of the curve stabilizes [6], while others define it as the inflection point where the slope increases by 10% and the number of species increases by no more than 5% [7]. This leads to the second question: (2) What are the differences between the minimum sampling areas obtained by the precision method and the inflection point method? 
Given the variability in methods and standards, it is essential to identify the main factors influencing the minimum sampling area determined by these two approaches. This raises the third question: (3) What are the primary factors affecting the minimum sampling area obtained by the precision method and the inflection point method?
Due to their unique geographical location, islands are typically considered fragile ecosystems. Many islands have employed strategies like afforestation to mitigate ecological vulnerability [8]. However, plantations on islands often consist of monocultures, leading to simple community structures with limited resistance to pests, diseases, and natural disasters and a relative lack of understory vegetation, other ground cover plants, and microorganisms [8]. P. massoniana, known for its strong adaptability and drought tolerance, serves as an important species for restoring degraded lands [9]. In the Sandu Gulf area of Ningde, P. massoniana forests, especially on local islands, are widely distributed, occupying the largest forest area and representing the most characteristic forest type with irreplaceable ecological value [10]. These forests are semi-natural communities with a rich understory species composition [10]. However, the herb layer of local P. massoniana communities is significantly affected by the allelopathic effects of Dicranopteris pedata, resulting in very few species, which makes the shrub layer the most stable stratum within the community [10,11,12]. Shrub layer plants play a crucial role in maintaining the ecological balance and stability of the local ecosystem, underscoring their significant ecological importance [10,11,12]. Therefore, conducting research on the shrub layer of semi-natural P. massoniana forests, which are relatively species-rich, can help address the shortcomings of understory plant scarcity and monotonous community structure in island plantations [8]. Shrub plants serve as a crucial link between the tree layer and the herb layer, playing significant roles in substitution and connection [13]. Numerous studies have shown that the minimum sampling area varies significantly across different communities, and the shrub and herb layers are affected by variations in tree layer characteristics. 
Consequently, researchers often divide a community into three layers—tree, shrub, and herb (or more)—for separate analysis [14,15,16,17].
In this study, we focus on the shrub layer species of P. massoniana communities, utilizing the nested plot method suggested by Liu et al. [18] for the species–area curve survey. We apply three commonly used fitting models—logarithmic, power, and logistic functions—to fit the species–area curves [19,20,21,22,23]. The Akaike Information Criterion (AIC) is used to test the goodness-of-fit [24] to identify the optimal equation. We explore differences in the minimum sampling area determined by the optimal equation using the precision and inflection point methods and analyze the main factors influencing the results of these methods. This approach aims to address scientific questions (1), (2), and (3) and to scientifically establish the minimum sampling area for studying the diversity of the shrub layer in P. massoniana communities on Sandu Island, Ningde, improve the accuracy of research results, and reduce survey costs, providing a reference for future related studies on minimum sampling area.
2. Materials and Methods
2.1. Research Area Overview
The Sandu Gulf is located in the southeast of Ningde City, Fujian Province, at the midpoint of China’s 18,400 km “Golden Coastline”, about 30 km from the Ningde urban area. It is a world-class natural deep-water harbor. The area comprises 126 islands, 17 of which are inhabited. The largest is Sandu Island, covering approximately 27.74 km^2^ and serving as the seat of the Sandu Town government. The study area is characterized by a typical subtropical maritime monsoon climate with hilly terrain. The primary soil types are red soil and yellow soil. The island is predominantly covered by secondary P. massoniana coniferous and broad-leaved mixed forest [10].
2.2. Community Survey
Based on preliminary field inspections, a typical plot method was employed from June to July 2022 to establish sample plots on Sandu Island. Eight forest plots, each 20 m × 20 m, were set up with P. massoniana as the dominant species in the tree layer (Figure 1). The basic information for the plots is provided in Appendix A. The selection criteria for the plots included consideration of varying slopes, positions, and elevations, minimal disturbance, moderate community canopy density, a buffer zone of over 30 m, and representativeness of the community. Additionally, the selected plots were mature stands protected by local policies, with no human management for many years, ensuring they represented semi-natural communities. Following the study by Ren et al. [25], an adjacent grid method was used to divide each 20 m × 20 m plot into sixteen 5 m × 5 m subplots (Figure 2a). In each plot, the first, fourth, thirteenth, and sixteenth subplots (the four corners) were further subdivided into nested squares of 0.5 m × 0.5 m, 1 m × 1 m, 2 m × 2 m, 3 m × 3 m, 4 m × 4 m, and 5 m × 5 m (Figure 2b). Using a nested sampling method, the 5 m × 5 m subplots were then combined into plots of 5 m × 10 m, 10 m × 10 m, 10 m × 20 m, and 20 m × 20 m (Figure 2c). This configuration resulted in 10 area gradients ranging from a minimum of 0.25 m^2^ to a maximum of 400 m^2^. The species present in the shrub layer were recorded for the sixteen 5 m × 5 m subplots and for the six area gradients within subplots 1, 4, 13, and 16. To minimize interference from species distribution variations, area expansion began from the four corners, specifically from the 0.25 m^2^ squares of subplots 1, 4, 13, and 16. Each corner was expanded to 25 m^2^ and 100 m^2^, with datasets being further expanded in two consistent directions, resulting in eight groups of different plot data per site and a total of 64 groups of 20 m × 20 m plot data. Specific data can be found in Appendix B.
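The nested design described above can be tallied with a few lines of arithmetic. The sketch below only restates the plot dimensions given in this section; the variable names are illustrative and not taken from the paper.

```python
# Side lengths (m) of the nested squares inside each corner subplot.
square_sides = [0.5, 1, 2, 3, 4, 5]
# Rectangles (m x m) formed by merging adjacent 5 m x 5 m subplots.
merged_rects = [(5, 10), (10, 10), (10, 20), (20, 20)]

# Area gradients in m^2, from the smallest square to the full plot.
areas = [s * s for s in square_sides] + [w * h for w, h in merged_rects]

# Eight plots, four starting corners, two expansion directions per corner.
n_plots, n_corners, n_directions = 8, 4, 2
datasets_per_plot = n_corners * n_directions
total_datasets = n_plots * datasets_per_plot

print(areas)
print(datasets_per_plot, total_datasets)
```

The list reproduces the 10 area gradients from 0.25 m^2^ to 400 m^2^, and the two counts match the eight groups per site and 64 groups overall stated in the text.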
2.3. Species–Area Relationship Models and Minimum Sampling Area Equations
Three commonly used species–area relationship models—logarithmic, power function, and logistic models [19,20,21,22,23]—were selected for functional model fitting and related calculations in nested sampling methods (Table 1).
2.4. Evaluation of Fit for Species–Area Relationship Models and Determination of Minimum Sampling Area
The least squares method was used to fit the logarithmic model, while the Gauss–Newton algorithm was applied to fit the power function and logistic models [20]. Because the R^2^ value alone cannot reliably determine goodness-of-fit, further testing is required [26]. Therefore, the goodness-of-fit for the three models was evaluated using the Akaike Information Criterion (AIC) [24]. A smaller AIC value indicates a better model fit and more accurate predictions.
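As a rough illustration of the AIC comparison, the sketch below fits a logarithmic and a power curve and ranks them by AIC. Several things here are assumptions rather than the authors' procedure: the model forms S = a + b·ln(A) and S = c·A^z are the common textbook forms (Table 1 is not reproduced here), the species counts are hypothetical, the power curve is fitted by linear regression on the log-log scale instead of Gauss–Newton, and the AIC is the least-squares form n·ln(RSS/n) + 2k, which differs from the value reported by R only by a constant that is identical for models fitted to the same data.

```python
import math

def linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def aic_least_squares(observed, predicted, n_params):
    """AIC up to an additive constant: n * ln(RSS / n) + 2 * k."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(rss / n) + 2 * n_params

# Hypothetical species counts for illustration only (NOT the Appendix B data).
areas = [0.25, 1, 4, 9, 16, 25, 50, 100, 200, 400]
species = [2, 4, 7, 10, 12, 15, 20, 26, 34, 46]
log_areas = [math.log(a) for a in areas]

# Logarithmic model (assumed form): S = a + b * ln(A).
a, b = linear_fit(log_areas, species)
pred_log = [a + b * la for la in log_areas]

# Power model (assumed form): S = c * A**z, fitted on the log-log scale.
ln_c, z = linear_fit(log_areas, [math.log(s) for s in species])
c = math.exp(ln_c)
pred_power = [c * area ** z for area in areas]

aic_log = aic_least_squares(species, pred_log, 2)
aic_power = aic_least_squares(species, pred_power, 2)
print(f"logarithmic AIC = {aic_log:.2f}, power AIC = {aic_power:.2f}")
```

The model with the smaller value is preferred. Only the difference between the two AIC values is meaningful, since the dropped constant cancels.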
In this study, two methods, the precision method and the inflection point method, were used to analyze the minimum sampling area. Based on the optimal model, eight estimation accuracies (50%, 60%, 70%, 75%, 80%, 85%, 90%, and 95%) were selected to calculate the minimum sampling area [26,27]. In parallel, the inflection point method was applied, with a curve slope of <0.1 taken as the criterion for identifying the inflection point [28,29], from which the minimum sampling area was determined. The inflection point was calculated as follows: the first derivative of the optimal fitting model was derived, the minimum sampling area A for each precision value was substituted into the first derivative, and the resulting slope of the curve at that area was used to locate the inflection point.
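The two steps above can be sketched as follows, assuming the optimal model is a power curve S = c·A^z. The parameter values are the pooled fit and mean total species count reported in the Results (c = 4.053, z = 0.404, 42.88 species); the function and variable names are illustrative, not the authors' code.

```python
def min_area_for_precision(c, z, total_species, rho):
    """Area at which the power curve S = c * A**z reaches rho * total_species."""
    return (rho * total_species / c) ** (1.0 / z)

def curve_slope(c, z, area):
    """First derivative dS/dA = c * z * A**(z - 1) of the power curve."""
    return c * z * area ** (z - 1.0)

# Pooled fit and mean total species count reported in the Results.
C, Z, TOTAL = 4.053, 0.404, 42.88
PRECISIONS = [0.50, 0.60, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95]
SLOPE_LIMIT = 0.1  # inflection criterion: slope < 0.1

rows = []
for rho in PRECISIONS:
    area = min_area_for_precision(C, Z, TOTAL, rho)
    rows.append((rho, area, curve_slope(C, Z, area)))

# First precision level whose slope falls below the criterion.
first_below = next(rho for rho, _, slope in rows if slope < SLOPE_LIMIT)

for rho, area, slope in rows:
    print(f"precision {rho:.0%}: area {area:6.1f} m^2, slope {slope:.3f}")
print("slope first drops below the limit at precision", first_below)
```

With these inputs the slope crosses 0.1 between the 60% and 70% precision levels, which is how a sampling interval rather than a single value arises when the two methods are combined.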
For data analysis, the c value in the logistic function was obtained using Mathematica 12.2 [30], and AIC values were calculated using the “MuMIn” package in R 4.1.2; all other data analysis and plotting were performed using SPSS 26, Excel 2019, and similar software.
3. Results
3.1. Species–Area Relationship in the Shrub Layer
The mean values from 8 quadrats in each of the 8 sampling plots on Sandu Island (64 quadrats in total), together with the pooled data, gave 9 groups of field survey data (Table 1), which were used for curve fitting with functions (1) to (3). Each parameter of the resulting species–area curve equations was obtained (Table 2). With R^2^ values greater than 0.8 and p < 0.001, the species–area relationships of all nine data groups were consistent with the three fitted models. Among the nine data groups, the power function exhibited the highest R^2^. The goodness-of-fit was further assessed using the Akaike Information Criterion (AIC) to identify the best species–area curve for each of the nine data groups. The power function model had the lowest AIC value, indicating the best fit, followed by the logistic function model, while the logarithmic model had the poorest fit. These results were consistent with the R^2^ findings. Consequently, the optimal fitting models for the species–area relationship in all nine data groups were power function models, expressed as follows: S1: S = 4.976A^0.348^; S2: S = 2.438A^0.412^; S3: S = 4.353A^0.448^; S4: S = 4.352A^0.411^; S5: S = 3.965A^0.401^; S6: S = 3.956A^0.428^; S7: S = 4.195A^0.415^; S8: S = 4.043A^0.37^; and all plots: S = 4.053A^0.404^.
The overall mean values of the dataset were fitted with the optimal power function model (Figure 3). As shown in Figure 3 and Appendix B, at a sampling area of 200 m^2^, the average number of species was 34.09, constituting 79.5% of the total species count. When the sampling area doubled to 400 m^2^, the average number of species rose to 42.88, an increase of only 20.5% of the total species count. This indicates that, as the sampling area expands, the number of plant species in the sample also increases, initially rapidly and then more gradually. The fitted values closely matched the observed values, indicating a successful curve-fitting process. Thus, the species–area curve for the shrub layer in the P. massoniana community on Sandu Island was determined as S = 4.053A^0.404^.
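The percentages quoted above follow directly from the two observed means, and the fitted curve can be evaluated at the same areas for comparison. This is a small check, assuming (as the text does) that the mean count at 400 m^2^ is treated as the total species count; the names are illustrative.

```python
def fitted_species(area, c=4.053, z=0.404):
    """Species count predicted by the pooled power curve S = c * A**z."""
    return c * area ** z

# Observed mean species counts reported in the text (area in m^2 -> count).
observed = {200: 34.09, 400: 42.88}
total = observed[400]  # assumption: the 400 m^2 mean is the total count

share_200 = observed[200] / total   # share already found at 200 m^2
gain_to_400 = 1.0 - share_200       # share added by doubling the area

for area, s_obs in observed.items():
    print(f"{area} m^2: observed {s_obs:.2f}, fitted {fitted_species(area):.2f}")
print(f"share at 200 m^2: {share_200:.1%}, gain to 400 m^2: {gain_to_400:.1%}")
```

Doubling the area from 200 m^2^ to 400 m^2^ adds roughly one fifth of the species, which is the flattening the species–area curve describes.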
3.2. Minimum Sampling Area in the Shrub Layer
The minimum sampling area equation (5), derived from the power function model (2) that performed best in the goodness-of-fit test, was selected to determine the minimum sampling area for the species diversity survey (Table 1). The minimum sampling area equations for the nine groups of data, where A is the minimum sampling area and ρ is the estimation accuracy, are as follows: S1: A = (38ρ/4.976)^1/0.348^; S2: A = (27ρ/2.438)^1/0.412^; S3: A = (52ρ/4.353)^1/0.448^; S4: A = (52ρ/4.352)^1/0.411^; S5: A = (41ρ/3.965)^1/0.401^; S6: A = (50ρ/3.956)^1/0.428^; S7: A = (42ρ/4.195)^1/0.415^; S8: A = (41ρ/4.043)^1/0.37^; and all plots: A = (42.88ρ/4.053)^1/0.404^.
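The form of these equations follows from inverting the power function. The derivation below is a sketch in notation chosen here for illustration: S_T denotes the total number of species in a plot (the constant multiplying ρ in each equation above) and A_min the minimum sampling area; neither symbol is taken from Table 1.

```latex
\begin{align}
S &= cA^{z} && \text{(power function model)} \\
\rho\,S_{T} &= cA_{\min}^{z} && \text{(the sample must contain a fraction } \rho \text{ of all species)} \\
A_{\min} &= \left(\frac{\rho\,S_{T}}{c}\right)^{1/z}
\end{align}
```

For the pooled data, with c = 4.053, z = 0.404, S_T = 42.88, and ρ = 0.70, this gives A_min = (30.016/4.053)^1/0.404^ ≈ 142 m^2^.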
Eight estimation accuracies (50%, 60%, 70%, 75%, 80%, 85%, 90%, and 95%) were selected to determine the minimum sampling area for the nine groups of data (Table 3). The results showed that, as the estimation accuracy increased, the minimum sampling area for the species diversity survey of all nine groups of shrub layer data gradually expanded, and the corresponding slope of the species–area curve gradually decreased. According to the inflection point method, for accuracies between 50% and 95%, when the accuracy of sampling plot S1 rose from 60% to 70%, the slope of the curve changed from 0.1 to 0.08; at 70% accuracy, the slope was 0.08 < 0.1, so the minimum sampling area for the shrub survey in sampling plot S1 was 79.4 m^2^~123.6 m^2^. When the sampling accuracy was 50%, the curve slope was steep, and the minimum sampling area was 47 m^2^, which is inconsistent with the actual situation. Similarly, for accuracies between 50% and 95%, the minimum sampling areas or sampling intervals of plots S2 to S8 and of all sampling plots combined were 63.7 m^2^, 200.6 m^2^~226.3 m^2^, 120.6 m^2^~175.5 m^2^, 94.8 m^2^~139.2 m^2^, 113.7 m^2^~163 m^2^, 128.8 m^2^~150.4 m^2^, 80.5 m^2^, and 97 m^2^~142 m^2^, respectively.
As presented in Table 4, using only the inflection point method without considering accuracy—specifically when the slope of the fitting curve was less than 0.1—the minimum sampling areas for plots S1 to S8, and the overall plots, were 79.4 m^2^, 50.6 m^2^, 217.4 m^2^, 133.9 m^2^, 101.4 m^2^, 140.7 m^2^, 132.2 m^2^, 73.3 m^2^, and 109 m^2^, respectively. During this analysis, the estimated number of species according to the fitting curve closely matched the actual observed species count, indicating a high degree of curve-fitting accuracy and more precise determination of the minimum sampling area.
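For a power curve the slope criterion can also be solved in closed form: setting dS/dA = c·z·A^(z−1) equal to 0.1 gives A = (c·z/0.1)^(1/(1−z)). The sketch below applies this to the nine fitted curves and compares the output with the areas listed above. It is an illustrative reconstruction, not the authors' procedure, and the function name is hypothetical.

```python
# Fitted power-curve parameters (c, z) for each data group, from the Results.
FITS = {
    "S1": (4.976, 0.348), "S2": (2.438, 0.412), "S3": (4.353, 0.448),
    "S4": (4.352, 0.411), "S5": (3.965, 0.401), "S6": (3.956, 0.428),
    "S7": (4.195, 0.415), "S8": (4.043, 0.37), "All": (4.053, 0.404),
}
# Minimum sampling areas (m^2) reported for the inflection point method.
REPORTED = {
    "S1": 79.4, "S2": 50.6, "S3": 217.4, "S4": 133.9, "S5": 101.4,
    "S6": 140.7, "S7": 132.2, "S8": 73.3, "All": 109.0,
}

def area_at_slope(c, z, slope=0.1):
    """Area where the slope of S = c * A**z equals `slope`."""
    return (c * z / slope) ** (1.0 / (1.0 - z))

computed = {name: area_at_slope(c, z) for name, (c, z) in FITS.items()}
for name, area in computed.items():
    print(f"{name}: computed {area:6.1f} m^2, reported {REPORTED[name]:6.1f} m^2")
```

The computed areas agree with the reported ones to within a few tenths of a square metre; the small gap is consistent with rounding of the published parameters.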
Using the data from this study (Table 4) together with earlier research on the minimum sampling area of the shrub layer in Acacia confusa, Eucalyptus citriodora, and Celtis sinensis communities on Langqi Island, Fujian, China [26], which employed the same plot setup and sampling methods as this study, a regression analysis was conducted between the minimum sampling area obtained by the inflection point method and the total number of species in the plots (Figure 4). The results indicate a highly significant positive correlation (p < 0.001) between the total number of species in the plots and the minimum area obtained by the inflection point method.
4. Discussion
Many scholars have noted that species–area curves are dependent on plot scale, with medium-scale plots fitting the power function model more accurately [31]. In this study, the fitting results for all plots showed that the power function was optimal, likely because the plot sizes were of medium scale [26]. As illustrated in Figure 3, the power function curve initially increases rapidly and then slows down, with the corresponding curve slope starting large and then gradually decreasing. The minimum sampling area analysis at different precision levels (Table 3) indicates that, as estimation accuracy increases, the required sampling area also increases, and the slope of the species–area curve gradually decreases. These results align with the typical behavior of species–area curves: as the sampling area expands, the number of species increases rapidly at first, then more slowly [32,33]. This suggests that the power function model indeed fits well at medium scales.
Keeley [3] suggested that different species abundance distributions might lead to varying species–area curve outcomes, subsequently affecting the determination of minimum sampling area. Xiao et al. [26] argued that species–area curves, which only account for species richness, may overemphasize the role of rare species, leading to an increase in the calculated minimum sampling area. This study, through an analysis of shrub layer plants in different island communities—P. massoniana (Table 4), A. confusa, E. citriodora, and C. sinensis [26]—using the same sampling method, shows (Table 4 and Figure 4) that the minimum sampling area increases with the total number of species in a plot, indicating a positive correlation (p < 0.001) between these variables. It is evident that species richness influences the minimum sampling area across different study regions and communities. Thus, when calculating the minimum area, including rare species increases the species count and thus the minimum sampling area determined by the inflection point method. Conversely, excluding rare species reduces the minimum sampling area. However, this is not absolute. For instance, plot S6 (50 species) has a larger minimum sampling area than plot S4 (52 species), and plot S3 (52 species) has a larger area than S4 (52 species), mainly due to uneven species distribution within the plots [4,34]. From Appendix B, it is evident that plots S3 and S6 exhibit higher species counts at sampling areas of 200 m^2^, 100 m^2^, and 50 m^2^ compared to plot S4, resulting in a larger opening of their power function fitting curves relative to the x-axis, thereby increasing the slope and the minimum area determined by the inflection point method [21]. In contrast, minimum areas determined by accuracy criteria differ. At the same precision levels (75–95%), plots S3 and S7 show the smallest minimum sampling areas. 
This is because, at sampling areas of 100 m^2^ to 200 m^2^, these plots already encompass a large proportion of the total species count, resulting in a smaller growth rate in species count when the area doubles to 400 m^2^. The accuracy method calculates minimum sampling area based on species proportion [2]; thus, under identical precision conditions, plots S3 and S7 have relatively smaller minimum areas, with similar conclusions applicable to other sampling areas and precision levels. It is worth mentioning that, in our previous research [11], we found that the species richness of the shrub layer was significantly influenced by soil pH, organic matter, and wind speed of the Pinus massoniana community on Sandu Island. The higher the soil pH and wind speed, the greater the species richness in the island community. On islands, soil stability primarily depends on soil organic matter [11]. A lack of soil organic matter can impair soil nutrients, disrupt island habitats, and reduce species richness, ultimately destabilizing the community. Additionally, factors such as distance from the coastline and slope aspect can influence the levels of pH and organic matter, thereby leading to variations in species richness [10]. Therefore, when setting up the sample plots in this study, these factors were taken into account, resulting in significant differences in the total number of species among different plots (Table 4).
Appendix B also indicates that, within the same plot, using a nested sampling method from different sampling points results in varying species counts for the same sampling area. This is particularly evident in plots with uneven species distribution [4], such as plot S5, where the species count variance at 25 m^2^ reached 13. This finding aligns with the results of large-scale studies by He et al. [35], who analyzed plots ranging from 25 to 60 ha by varying sampling directions and found that topographic heterogeneity and species variability significantly influence the outcomes of minimum sampling area determinations. Therefore, when conducting species–area relationship studies using the nested sampling method, it is insufficient to randomly sample from just one corner of the plot. Instead, repeated sampling from all four corners is recommended to effectively minimize bias caused by uneven species distribution.
From the accuracy method perspective, the minimum sampling areas required for precisions of 50%, 60%, 70%, 75%, 80%, 85%, 90%, and 95% are 61.8 m^2^, 97 m^2^, 142 m^2^, 168.5 m^2^, 197.7 m^2^, 229.7 m^2^, 264.6 m^2^, and 302.5 m^2^, respectively, corresponding to 3, 4, 6, 7, 8, 10, 11, and 13 shrub plots of 5 m × 5 m each. According to the inflection point method alone, the minimum sampling area is 109 m^2^ or five shrub plots of 5 m × 5 m. By combining both the accuracy and inflection point methods, the optimal minimum sampling area should be set at 142 m^2^, equivalent to six shrub plots of 5 m × 5 m, achieving 70% sampling accuracy. Given the study area is an inland island with richer species diversity than offshore islands, this minimum area can be widely applied to P. massoniana forests on islands.
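The plot counts above are the minimum areas divided by the 25 m^2^ area of one shrub plot, rounded up to whole plots. A short check using the values quoted in this paragraph (variable names are illustrative):

```python
import math

PLOT_AREA = 5 * 5  # one 5 m x 5 m shrub plot, in m^2

# Minimum sampling areas (m^2) for precisions of 50% to 95%, from the text.
precision_areas = [61.8, 97, 142, 168.5, 197.7, 229.7, 264.6, 302.5]
inflection_area = 109  # inflection point method alone

# Round up, since a fractional plot cannot be surveyed.
plots_needed = [math.ceil(a / PLOT_AREA) for a in precision_areas]
plots_inflection = math.ceil(inflection_area / PLOT_AREA)

print(plots_needed)
print(plots_inflection)
```

Rounding up means six plots (150 m^2^) are surveyed to cover the 142 m^2^ target, so the realized area slightly exceeds the computed minimum.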
5. Conclusions
This study highlights the importance of species richness, distribution uniformity, and sampling methods in determining the minimum sampling area for the shrub layer of the P. massoniana community on Sandu Island. The power function model (S = 4.053A^0.404^) was identified as the optimal model for medium-scale species–area relationships, while repeated sampling from the four corners of plots effectively minimized bias. For future research, extending the analysis to smaller and larger scales is recommended to compare differences in species–area relationships. Additionally, while current studies on shrub layer plants in inland forest communities often use 5 m × 5 m plots, larger-scale sampling should be adopted in regions with high biodiversity to ensure comprehensive and accurate assessments. These findings provide a methodological framework for biodiversity studies in island ecosystems and contribute to international research on similar forest communities.
References
- [1] Zhang J.T. Quantitative Ecology, 2nd ed.; Science Press: Beijing, China, 2011; pp. 51–57.
- [2] Song Y.C. Vegetation Ecology, 2nd ed.; Higher Education Press: Beijing, China, 2017; pp. 44–45, 73–84.
- [3] Keeley J.E. Relating species abundance distributions to species area curves in two Mediterranean-type shrublands. Divers. Distrib. 2003, 9, 253–259. doi:10.1046/j.1472-4642.2003.00017.x
- [4] Yang Z.S.; Li Y.X.; Liu W.; Wu T.D.; Wang Z.G. Fitting of species-area relationship and determination of minimum sampling area for plant communities on barren hillsides of arid valleys in the upper reaches of the Minjiang River. J. Ecol. Rural Environ. 2010, 26, 227–230. doi:10.3969/j.issn.1673-4831.2010.03.007
- [5] Liu Z.L.; Sun X.; Gao J.L.; Ye X.H. Determination of minimum sampling area for Artemisia ordosica community in Ordos Plateau, Inner Mongolia. Res. Soil Water Conserv. 2013, 20, 185–189+195.
- [6] Jiang J.; Zhang C.Y.; Zhao X.H. Plant species-area relationship in a 42 hm^2^ research plot of coniferous and board-leaved mixed forest in Jiaohe, Jilin Province, China. Chin. J. Plant Ecol. 2012, 36, 30–38. doi:10.3724/SP.J.1258.2012.00030
- [7] Yang L.M.; Han M.; Zhao S.C. Studies on sampling intensity of plant diversity of grassland communities in the Songnen Plain of China. J. Jilin Agric. Univ. 1996, 18, 33–36. doi:10.1007/BF02951625
- [8] Wang X.L.; Wang A.; Shi H.H.; Peng S.T.; Gong L.P.; Qin X.B. Carbon storage of Pinus thunbergii and Robinia pseudoacacia plantations on Nanchangshan Island, Changdao County of Shandong Province, China. Chin. J. Appl. Ecol. 2013, 24, 1263–1268. doi:10.13287/j.1001-9332.2013.0287; PMID: 24015542
