Rapid Identification of Candidate SNPs and QTLs for Capsicum annuum Chili Fruit Size and Capsaicin Content Using ddRAD-Sequencing and Bulk Segregant Analysis
Misbah Naseem, Adrian Christopher Brennan, Rashid Mehmood Rana, Christophe Patterson, Waqas Iqbal

TL;DR
This study identifies genetic markers linked to chili fruit size and pungency using sequencing and genetic analysis, offering targets for breeding.
Contribution
The study combines ddRAD-sequencing and bulk segregant analysis to identify SNPs and QTLs influencing multiple chili traits.
Findings
BSA identified SNPs associated with pungency, fruit length, and fruit weight.
Genetic mapping revealed overlapping QTL regions on chromosome 6 influencing multiple traits.
The findings suggest potential pleiotropy and provide targets for multi-trait breeding.
Abstract
Fruit size and pungency are key yield and quality traits in chili. This study combines high-throughput genotyping with bulk segregant analysis (BSA) to identify candidate SNPs and quantitative trait loci (QTLs) by analyzing extreme phenotypes from a Ghotki × Chakwal-4 F2 population. The traits were fruit length, diameter, length-to-diameter ratio, and weight, along with capsaicin content. Significant correlations were observed among length, diameter, and length-to-diameter ratio. A total of 534 single nucleotide polymorphisms (SNP) markers were used to develop genetic maps from 4315 to 6607 cM long. The SNP frequency data was pooled for the 25% of individuals showing extreme values for each measured trait, and bulk segregant analysis (BSA) was performed. BSA identified high-scoring SNPs associated with pungency (SNP 1_41308232; SNP 12_104377148), fruit length (SNP 1_92509300; SNP…
- Higher Education Commission of Pakistan
- King Saud University, Riyadh, Saudi Arabia
1. Introduction
Chili (Capsicum sp.), a prominent tropical crop, is cultivated worldwide and consumed as both a fresh vegetable and a dried spice [1]. Chili exhibits a broad spectrum of morphological and physiological attributes encompassing color, size, shape, and bioactive compounds such as capsaicin, antioxidants, and ascorbic acid. This inherent diversity contributes significantly to overall fruit quality and finds extensive applications within the pharmaceutical domain [2]. Over the past three decades, considerable attention has been directed towards the identification and characterization of genes governing these traits and their variation [3]. The construction of genetic maps for quantitative trait locus (QTL) analysis has predominantly focused on commercially significant traits such as fruit characteristics (shape, length, size, and diameter) [4,5]. Notably, QTL analysis, in conjunction with marker-assisted selection (MAS), provides a powerful route to genetic improvement. The availability of chili reference genomes is further facilitating candidate gene analysis, genetic mapping, marker discovery, and effective molecular breeding [6]. The success of QTL detection depends on several factors, including the choice of parental lines, where genetically divergent lines prove advantageous [7]. Moreover, segregating population size and experimental error affect QTL identification: larger populations can detect QTLs of weaker effect, albeit at the cost of more phenotypic screening. Several high-density maps have been constructed in chili pepper, assisted by the advent of the reference genome sequence [5]. Genomic tools such as next-generation sequencing (NGS) and single nucleotide polymorphism (SNP) identification aid the mapping process and enhance the ability to pinpoint genomic regions that control important traits.
High-throughput sequencing can be applied to sequence the genome partially or completely, as SNPs are distributed throughout the whole genome [8,9]. Despite these advancements, the expense of sequencing entire mapping populations remains relatively high, posing a barrier to economically assessing large numbers of recombinant progeny. Genome reduction techniques such as restriction site-associated DNA (RAD) sequencing offer a solution by lowering genotyping costs for large populations. RAD sequencing subsamples the genome so that sequencing effort is concentrated on a reduced set of loci, yielding high-coverage data across many individuals.
Phenotyping and genotyping large collections of organisms can be both labor-intensive and costly. To address this challenge, bulk segregant analysis (BSA) utilizing whole genome sequencing data has been proposed as a cost-effective method for identifying quantitative trait loci (QTL) associated with genetically complex traits [10]. The BSA approach typically involves creating a segregating F2 population from an initial cross between two parents with distinct phenotypes. Individuals within this population are then assessed for the trait of interest. Bulked DNA or RNA samples are prepared from groups showing extreme phenotypic differences. Genetic markers are subsequently used to detect differences between these bulked samples that correlate with the trait of interest. BSA has predominantly been applied in crop species to identify major-effect QTL, such as disease resistance genes, and to map qualitative mutations [11].
The main objective of this research was to combine RAD sequencing with BSA and linkage mapping to pinpoint SNPs and QTL associated with two quantitative traits: fruit size (measured as fruit length, fruit diameter, fruit length to diameter ratio, fruit weight) and capsaicin contents. This involved analyzing extreme phenotypes from a segregating chili F2 (Ghotki × Chakwal-4) population.
2. Materials and Methods
2.1. Experimental Layout, Growth Conditions, and Phenotyping
The research was conducted in the research area of Pir Mehr Ali Shah Arid Agriculture University, Rawalpindi, Pakistan (33.65° N, 73.08° E) during the 2020–2022 (parent to F_2_ generation) growing seasons. The study involved the evaluation of an F2 chili plant population obtained from a hybrid cross between the varieties Ghotki and Chakwal-4, which show contrasting fruit size and pungency characteristics (see Results Section 3.1). A collection of 82 progeny, including parents, was used. The experiment was carried out in a single year under uniform seasonal and environmental conditions. While this provides clear insight into genetic influences under those conditions, future multi-year studies could help confirm trait stability across environments.
The collected seeds were halo-primed by soaking chili seeds in three percent KNO_3_ solution for 24 h to enhance the germination ability of seeds. After priming, seeds were then air-dried in shade to remove the surface moisture at room temperature. Primed seeds were sown in trays filled with coco-peat in February, and seedlings were then transplanted to the field for morphological characterization and for screening experiments in May. Plants were transplanted individually. Cultural practices including irrigation, hoeing, removal of weeds, and application of NPK fertilizer were applied as per recommendations [12]. The crop was fertilized with N:P:K (20:20:20) solution in split doses.
2.2. Traits Measurement
In F2 populations, each plant represents a distinct genotype, so typical genotype-based replication methods are not applicable. In this study, replication was obtained by recording measurements from several fruits of each plant. Fruits were harvested at the stage of physiological maturity and data regarding fruit length (FL, cm), fruit diameter (FD, cm), and fruit weight (FW, g) were recorded. The ratio of fruit length to fruit diameter (LDR) was employed as an index of fruit shape. Fruit morphology was evaluated for the purpose of categorization, employing a visual assessment conforming to the criteria stipulated by “The World Vegetable Centre (AVRDC)” in Taiwan, China.
Capsaicin content (PUN) was measured at the ripening stage using a standard high-performance liquid chromatography (HPLC) protocol. Three grams of pepper sample were crushed in a mortar with quartz sand, 50 mL of methanol (analytical grade) was added to the macerate, and the mixture was transferred to a 100 mL Erlenmeyer flask. The mixture was sonicated for 4 min and then filtered through filter paper. The filtrate was further purified by passing it through a 0.45 µm PTFE syringe filter before injection onto the HPLC column. After suitable dilution, the extract was injected into a Nucleodur C18, Cross-Linked HPLC machine (Macherey-Nagel, Düren, Germany). The separation was performed with isocratic elution of 50:50 water–acetonitrile at a flow rate of 0.8 mL/min. Fluorometric detection of capsaicinoids was carried out at a wavelength of 280 nm, and values were converted to Scoville Heat Units (SHU) based on American Spice Trade Association 1985 pungency units [13].
2.3. Statistical Analysis
The observed data were subjected to analysis of variance (ANOVA). Because only one season was involved, “year” was not included as a factor in the statistical model. The environmental coefficient of variation (ECV, %), phenotypic coefficient of variation (PCV, %), genotypic coefficient of variation (GCV, %), broad-sense heritability (H^2^, %), genetic advance (GA), genetic advance as a percentage of the mean (GAM, %), and Spearman correlation [14] were calculated using R v4.2.2 (https://cran.r-project.org/bin/windows/base/, accessed on 30 October 2022). GA and GAM were calculated to assess the expected improvement from selection, using standard formulas that incorporate heritability (H^2^), the phenotypic standard deviation (√σ^2^), and selection intensity. Since no exact selection proportion was specified, the calculation used the theoretical selection intensity corresponding to selecting the top 5% of individuals. This assumption keeps the GA estimates consistent across traits. All calculations were based on trait data collected during a single growing season under uniform conditions.
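The variability statistics listed above can be illustrated with a minimal sketch. It assumes the common textbook definitions (phenotypic variance as the sum of genotypic and environmental variance, GA = k · H^2^ · σ_p, and k = 2.06 for selecting the top 5%); the function name and the input values are hypothetical and are not taken from the study's data or R scripts.

```python
import math

def genetic_parameters(var_g, var_e, mean, k=2.06):
    """Return variability statistics from variance components.

    var_g, var_e: genotypic and environmental variance (hypothetical inputs)
    mean: trait mean
    k: selection intensity; 2.06 is the textbook value for the top 5%
    """
    var_p = var_g + var_e            # assumed: phenotypic = genotypic + environmental
    sd_p = math.sqrt(var_p)          # phenotypic standard deviation
    h2 = var_g / var_p               # broad-sense heritability as a fraction
    ga = k * h2 * sd_p               # expected genetic advance, in trait units
    return {
        "PCV": 100 * sd_p / mean,
        "GCV": 100 * math.sqrt(var_g) / mean,
        "ECV": 100 * math.sqrt(var_e) / mean,
        "H2": h2,
        "GA": ga,
        "GAM": 100 * ga / mean,      # genetic advance as a percentage of the mean
    }

# Illustrative values only, not measurements from this population.
params = genetic_parameters(var_g=9.0, var_e=1.0, mean=10.0)
```

Because GA scales with both heritability and the phenotypic standard deviation, a highly heritable trait with a narrow phenotypic range can still show a small expected advance.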
2.4. Genome DNA Extraction
The genotyping was conducted in the Dept. of Biosciences, Durham University, Durham, U.K. The study involved an F2 chili plant population obtained from a hybrid cross between varieties Ghotki × Chakwal-4. A collection of 82 progeny, including parents, were utilized.
DNA was isolated following the protocol of [15]. Newly emerged fresh leaves (2 g) were collected from each genotype. Leaves for each sample were ground to powder with liquid nitrogen using a mortar and pestle. Ground samples were suspended in 2 mL of CTAB lysis buffer (1.4 M NaCl, 2% (w/v) cetyltrimethylammonium bromide (CTAB), 100 mM Tris HCl at pH 8.0, 20 mM ethylenediaminetetraacetic acid (EDTA), and 2% (v/v) 2-mercaptoethanol) and incubated at 65 °C for 30 min. The samples were then mixed with an equal volume of chloroform–isoamyl alcohol (24:1), followed by centrifugation at 10,000 rpm for 10 min. Chilled isopropanol was added to the aqueous phase, which was centrifuged at 10,000 rpm for 10 min. The collected precipitate was washed with 70% ethanol and air-dried. After drying, the pellet was dissolved in double distilled water, treated with RNase A enzyme (5 µg/mL, New England Biolabs, Ipswich, MA, USA), and kept at 37 °C for 30 min. Then, 5 M sodium chloride (1/10 volume) and cold ethanol (3 volumes) were added to the solution, which was centrifuged at 10,000 rpm for 20 min. Finally, after washing the collected pellets with 70% ethanol, the DNA was dissolved in double distilled deionized water. The quality of genomic DNA was evaluated with agarose gel electrophoresis. The concentration and purity were assessed from the 260:280 nm absorbance ratio with a NanoDrop 1000 Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) using 1 µL of each sample.
2.5. ddRAD Sequencing Library Construction
The extracted DNA was subjected to the double-digest restriction site-associated DNA sequencing (ddRAD-seq) protocol according to [16]. A quantity of 100 ng of DNA from each sample was digested using the restriction enzymes PstI and EcoRI, and a unique paired combination of two inner barcodes was ligated using T4 DNA ligase at 30 °C for 3 h in a thermal cycler. Samples were cleaned with Sera-Mag magnetic beads to recover DNA according to [17]. After quantification with a Qubit 4 fluorometer (Thermo Fisher Scientific, Waltham, MA, USA), the samples were pooled into 15 groups according to similar concentration, and each group was indexed by polymerase chain reaction (PCR) to add the second, pool-specific paired combination of Illumina outer barcodes. Each pool was cleaned again with magnetic beads to recover DNA. Each group was quantified and pooled equimolarly, and then DNA fragments between 250 and 600 bp were size-selected with Pippin Prep (Sage Science, Beverly, MA, USA). The library was sequenced on an Illumina NovaSeq platform (Illumina Inc., San Diego, CA, USA) with paired-end 150 bp reads by Azenta Life Sciences (Manchester, UK), resulting in 750.8 M reads in total.
2.6. Quality Checking, SNP Calling and SNP-Indexing Analysis
The FASTQ-format reads for each sample were quality checked for adapter sequences and low-quality reads using FastQC (Babraham Bioinformatics, Cambridge, UK) and cleaned with Trimmomatic 0.39 (http://www.usadellab.org/cms/?page=trimmomatic, accessed on 30 January 2023) using the recommended quality settings.
The first step for SNP calling was demultiplexing of samples on the basis of inner and outer barcode pairs using the “process_radtags” module of Stacks v2.60 (https://catchenlab.life.illinois.edu/stacks/, accessed on 6 February 2023). The reads of each sample were then aligned to the chili reference genome CM334 v1.55 (Sol Genomics Network, https://solgenomics.net/ftp/genomes/Capsicum_annuum/, accessed on 21 January 2026) using the Burrows-Wheeler Aligner (BWA) v0.7.17 “mem” function (https://bio-bwa.sourceforge.net/, accessed on 6 February 2023). The programs samtools v1.21 (https://github.com/samtools/samtools, accessed on 9 February 2023) and bcftools v1.21 (https://github.com/samtools/bcftools, accessed on 9 February 2023) were used to index the read alignments and generate SNP data, which were filtered for base quality ≥ 20, mapping quality ≥ 20, and depth ≥ 3 and written to variant call format (VCF), yielding 44,084 SNP loci across all samples. The SNPs that segregated in the samples were then extracted using a customized R v4.3.0 script (https://www.r-project.org/, accessed 3 April 2023). SNP indexing was performed by identifying SNPs that were homozygous in the parental lines, and high-quality segregating SNPs with a minimum sequence read depth of >100 across all samples were selected. Each SNP genotype was coded 0 or 1 relative to the CM334 reference. Following these criteria, 534 SNPs were retained for subsequent analysis.
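The selection criteria described above can be sketched as a simple filter. This is an illustrative Python sketch, not the authors' customized R script: the record layout, the function name, the requirement that the two parents carry different alleles, and the reading of "depth across all samples" as a summed depth are all assumptions made for the example.

```python
def select_informative_snps(snps, min_total_depth=100):
    """Keep SNPs with homozygous, contrasting parents and sufficient depth.

    Each record is a hypothetical dict holding parental genotypes as allele
    pairs (0 = reference allele, 1 = alternate allele) and per-sample depths.
    """
    kept = []
    for snp in snps:
        p1, p2 = snp["parent1"], snp["parent2"]
        homozygous = p1[0] == p1[1] and p2[0] == p2[1]
        contrasting = p1[0] != p2[0]          # assumed: parents must differ to segregate
        deep_enough = sum(snp["depths"]) > min_total_depth  # assumed: summed depth
        if homozygous and contrasting and deep_enough:
            kept.append(snp["id"])
    return kept

# Made-up records for illustration only.
example = [
    {"id": "snp_a", "parent1": (0, 0), "parent2": (1, 1), "depths": [40, 35, 50]},
    {"id": "snp_b", "parent1": (0, 1), "parent2": (1, 1), "depths": [60, 70, 80]},
    {"id": "snp_c", "parent1": (0, 0), "parent2": (1, 1), "depths": [10, 20, 30]},
]
selected = select_informative_snps(example)
```

Here `snp_b` is dropped because one parent is heterozygous and `snp_c` because its summed depth does not exceed the threshold.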
2.7. Bulk Segregant Analysis and QTL Mapping
Bulk Segregant Analysis (BSA) was applied to each measure using the selected SNPs. The uppermost (top) and lowermost (bottom) 25% of individuals were chosen from population with respect to the given trait. The differentiation statistic (Z′) was calculated using the following formulae [18]:
For each SNP (j), in the total number of samples (d), the estimated allele frequencies (pj(top) and pj(bottom)) were derived from the observed number of reads (nj(top) and nj(bottom)) across the pool corresponding to the traits measured. The expected allele frequency (p’j) was computed using the counts of reads for the allele (xj(top) and xj(bottom)) divided by the total observed reads (nj(top) and nj(bottom)) for that SNP. These selected individuals were then subjected to a two-sided Z-test, resulting in the computation of Z-values. These Z-values were subsequently translated into corresponding p-values.
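The test described above can be illustrated with the textbook pooled two-proportion form, Z = (p_top − p_bottom) / sqrt(p′(1 − p′)(1/n_top + 1/n_bottom)), where p′ = (x_top + x_bottom) / (n_top + n_bottom). The sketch below is an assumption about the general shape of the calculation; the Z′ statistic of [18] may include further steps (for example smoothing across neighbouring SNPs) that are not reproduced here. The function name and read counts are hypothetical.

```python
import math
from statistics import NormalDist

def bulk_z_test(x_top, n_top, x_bottom, n_bottom):
    """Two-sided two-proportion Z-test on allele read counts from two bulks.

    x_*: reads carrying the allele; n_*: total reads at the SNP.
    Returns (z, p_value).
    """
    p_top = x_top / n_top
    p_bottom = x_bottom / n_bottom
    p_pooled = (x_top + x_bottom) / (n_top + n_bottom)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_top + 1 / n_bottom))
    if se == 0:
        # Both bulks are fixed for the same allele: no evidence of a difference.
        return 0.0, 1.0
    z = (p_top - p_bottom) / se
    # Lower-tail form avoids loss of precision for large |z|.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# Made-up read counts for illustration only.
z, p = bulk_z_test(x_top=90, n_top=100, x_bottom=40, n_bottom=100)
neg_log10_p = -math.log10(p)
```

The `neg_log10_p` value is the quantity that would be compared against a per-trait significance threshold on the negative log10 scale.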
A genetic map was developed and used for QTL mapping in IciMapping software version 3.3 [19]. A permuted LOD threshold was established for each trait by performing 1000 permutations of a single-QTL model without covariates. To define the boundaries of significant regions, 1.5-LOD support intervals were applied. Genetic positions were used for mapping, and these were converted to physical positions for comparison with the BSA sequencing approach. The study then identified overlapping regions between quantitative trait loci (QTL) detected for different traits. A region was considered overlapping if the physical positions of the 1.5-LOD intervals for two traits in the population fell within the boundaries identified by the BSA sequencing method.
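The 1.5-LOD interval and the overlap check can be sketched as follows. This is a simplified illustration on a discrete grid of scan positions, assuming the interval extends outward from the peak while the LOD stays within 1.5 units of the maximum; IciMapping's internal procedure may differ, and the function names and LOD profile are hypothetical.

```python
def lod_support_interval(positions, lods, drop=1.5):
    """Return (left, right) positions bounding the peak's LOD-drop interval.

    positions: sorted map positions along one linkage group
    lods: LOD score at each position
    The interval extends outward from the peak while LOD >= peak - drop.
    """
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[right]

def intervals_overlap(a, b):
    """True when closed intervals a=(start, end) and b=(start, end) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Made-up LOD profile for illustration only.
positions = [0, 10, 20, 30, 40, 50]
lods = [0.5, 2.0, 3.8, 4.1, 2.9, 1.0]
interval = lod_support_interval(positions, lods)
```

The same `intervals_overlap` check applies whether the interval bounds are expressed in cM or, after conversion, in base pairs, as long as both intervals use the same coordinate system.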
3. Results
3.1. Phenotypic Evaluation
We evaluated the phenotypic variation in five important traits in the mapping population and their parental lines: fruit length (FL), fruit diameter (FD), length-to-diameter ratio (LDR), fruit weight (FW), and pungency (PUN) (Table 1, Figure 1, Table S1). A comparative analysis of phenotypic and genetic variability parameters for the two parental genotypes, Ghotki and Chakwal-4, revealed substantial differences across most measured traits. Mean performance indicated a clear phenotypic contrast, with Ghotki producing markedly larger fruits. Fruit length (FL) averaged 6.95 cm in Ghotki compared to 5.65 cm in Chakwal-4, whereas fruit diameter (FD) was similar (5.05 vs. 5.00 mm). This translated into higher fruit weight (FW), where Ghotki recorded 7.25 g per fruit, heavier than Chakwal-4 (5.58 g). Chakwal-4 also possessed considerably lower pungency, averaging 8,077 SHU-equivalent units compared to 21,102 in Ghotki, highlighting the expected hot × mild parental contrast for capsaicin inheritance.
All traits showed transgressive segregation, meaning some offspring exceeded the parents, which indicates a strong genetic potential for improvement (Figure 1). FL varied widely from 1.86 to 15.55 cm, with an average of 6.99 cm. Its broad-sense heritability (H^2^ = 0.98) and high expected genetic advance (5.25 cm) suggest that this trait is largely controlled by genetics and can be effectively selected in breeding. FD ranged from 0.8 to 17.5 cm, biased towards smaller values (mean 4.5 cm), and also showed very high heritability (0.99) and expected genetic advance (6.35 cm), indicating strong genetic control. LDR was highly variable and biased towards smaller values (0.69–9.75, mean 2.39), with H^2^ of 0.96, suggesting that fruit shape can be reliably selected. FW ranged from 3 to 7.67 g (mean 5.32 g), with moderate heritability (0.79) and expected genetic advance (1.2 g), reflecting both genetic and environmental influence. PUN showed wide variation biased towards high values (5,100–20,632 SHU, mean 15,474), very high heritability (0.99), and expected genetic advance (9.28 × 10^3^ SHU), indicating it is highly heritable and can be efficiently improved through selection (Table 1). In all traits, genotypic and phenotypic variances were higher than environmental variance, confirming that the differences were mainly due to genetic factors rather than growing conditions. ANOVA results showed significant variation among genotypes (p < 0.05) and significant genotype × environment interaction (p = 0.01), highlighting that both genetics and environment influence trait expression.
3.2. Trait Correlation
Spearman correlation analysis was used to examine the relationships among key fruit traits, including fruit length (FL), fruit diameter (FD), length-to-diameter ratio (LDR), fruit weight (FW), and capsaicin content (PUN) (Figure 2). Overall, most traits showed weak correlations, indicating that changes in one trait do not strongly or consistently affect the others. This suggests that fruit size, shape, and pungency are largely regulated by different genetic factors. FD was negatively associated with LDR (r = −0.74), indicating that fruit diameter plays a greater role in determining fruit shape than length. A positive relationship was observed between FL and FD (r = 0.46, p < 0.01), showing that longer fruits are often broader as well. In contrast, fruit size traits showed no correlation with PUN. This pattern may be explained by fruit anatomy, as capsaicin is mainly produced in placental tissues, which may not increase proportionally with fruit size. LDR and FW likewise showed little to no meaningful relationship with PUN, further supporting the idea that capsaicin content is largely independent of external fruit dimensions. Collectively, these results suggest that fruit morphology and pungency can be selected largely independently, offering valuable insights for breeding programs targeting specific combinations of size, shape, and heat level.
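For readers unfamiliar with the statistic, a Spearman coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a standard-library Python illustration, not the R code used in the study; the function names and the data values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Made-up values: diameter rises while the length-to-diameter ratio falls.
fruit_diameter = [1.0, 2.0, 3.0, 4.0, 5.0]
length_diameter_ratio = [9.0, 7.5, 4.0, 2.0, 1.2]
rho = spearman(fruit_diameter, length_diameter_ratio)
```

Because only the ordering matters, the coefficient is insensitive to the skewed trait distributions reported for FD, LDR, and PUN.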
3.3. Bulk Segregant Analysis and Genetic Maps
The SNP genotypes for the mapping family are presented in Table S2. The BSA methodology was applied to the Ghotki × Chakwal-4 F_2_ population with a negative logarithm (base 10) p value threshold ranging from 3.50 to 4.1 per trait, leading to the identification of a total of 39 genomic regions that showed substantial associations with the target traits. Notably, the evaluation of FL identified 99 SNPs of high significance, distributed along a genetic map 6607.48 cM long. Similarly, the examination of FD identified 100 highly significant SNPs on a genetic map measuring 4315.63 cM. The analysis of LDR revealed 101 SNPs of high significance on a genetic map 5997.67 cM long. The evaluation of FW resulted in 139 significant SNPs positioned on a genetic map 5335.37 cM long. Lastly, the genetic map for PUN had a length of 5982.34 cM, with a total of 291 highly significant SNPs (Table 2).
3.4. QTL Identification
Three FL QTLs were found on linkage groups 1 and 6. qFL-1-1 was detected on linkage group 1 with an LOD value of 2.87, explaining 9.7% of the phenotypic variation in the trait. It was located at 175 cM, near the telomeric region. Two QTLs, qFL-6-1 and qFL-6-2, were identified on linkage group 6 with LOD values of 2.6 and 2.5, contributing 15.4% and 13.6% of the phenotypic variation, respectively. These QTLs mapped to positions 1356.1 cM and 1573.1 cM (Table 3, Figure 3A).
Two FD QTLs were found on chromosomes 3 and 12, explaining 21.7% and 26.9% of the phenotypic variation, respectively. qFD-3-1 and qFD-12-1 were mapped on linkage groups 3 and 12 at map positions of 337.7 and 949 cM, respectively. The LOD values were 3.96 and 2.99, respectively (Table 3, Figure 3A).
Two LDR QTLs were present on linkage group 2, with LOD values of 5.4 and 5.0, respectively. Their map positions were 2735.4 cM and 2777.4 cM, respectively. The QTLs contributed 15.7% and 15.6% of the phenotypic variation (Table 3, Figure 3A).
There were 15 FW QTLs, with LOD values between 2.6 and 4.6 and contributions of 10.5% to 17.3% of the phenotypic variation. qFW-2-1 and qFW-2-2 on linkage group 2 had map positions of 302.3 cM and 3050.3 cM. Six QTLs were found on linkage group 6: qFW-6-1, qFW-6-2, qFW-6-3, qFW-6-4, qFW-6-5, and qFW-6-6. They were located between 210.7 cM and 940.7 cM. qFW-7-1 and qFW-7-2 were mapped on linkage group 7 at map positions of 899.4 cM and 960.4 cM. qFW-8-1 was mapped on linkage group 8 at 57.5 cM. Two QTLs, qFW-12-1 and qFW-12-2, were mapped on linkage group 12, at positions between 208.0 cM and 507.0 cM (Table 3, Figure 3B).
There were 12 PUN QTLs detected for capsaicin contents. qPUN-1-1 and qPUN-1-2 were found on linkage group 1 at map positions of 3786.8 cM and 3869.8 cM, respectively. qPUN-5-1 and qPUN-8-1 were found on linkage groups 5 and 8 at map positions of 1523 and 1100.1 cM, respectively. qPUN-6-1 and qPUN-6-2 were detected on linkage group 6 at map positions of 5710.1 cM and 5802.1 cM, respectively. Four QTLs, qPUN-11-1, qPUN-11-2, qPUN-11-3, and qPUN-11-4, were mapped along linkage group 11 at positions between 1460 cM and 2327 cM. qPUN-12-1 and qPUN-12-2 were mapped along linkage group 12 at map positions of 2917.1 cM and 2992.1 cM. These QTLs each explained 11.2 to 16.1% of the phenotypic variation, with LOD values from 3.6 to 4.1 (Table 4, Figure 4).
Overlapping QTL region analysis found that qFL-6-1 and qFL-6-2 for FL overlapped with qFW-6-6 for FW. They shared the genomic region between positions 218780813 and 221102478 on chromosome 6. Similarly, qFW-6-2 and qFW-6-3 for FW overlapped with qPUN-6-1 and qPUN-6-2 for capsaicin contents, sharing the genomic region from 100989762 to 138660974 on chromosome 6. Based on this criterion, three overlapping regions on chromosome 6 were identified for FL, FW, and capsaicin content, potentially reflecting a pleiotropic effect.
3.5. BSA Strategy and QTL Mapping Analysis
In this BSA (Bulked Segregant Analysis), a range of markers were associated with fruit traits. The QTL analysis identified key genomic regions linked to important traits in chili, characterized by their ΔSNP-index values and specific SNP positions (Table 5). For PUN, multiple SNPs across chromosomes 1, 5, 6, 8, 11, and 12 exhibited strong associations. For example, SNP 41308232 showed a ΔSNP-index of −1, indicating a strong negative influence, while SNPs like 70429871 and 104377148 revealed highly positive values (0.97 and 1.0), reflecting potent trait-enhancing alleles. Fruit length-related QTLs included SNP 92509300 with a notably negative index of −0.9375 and SNP 218780813 with a mild positive value around 0.072, suggesting subtle enhancement effects. SNPs related to fruit diameter showed moderate positive indices, while those influencing the length-to-diameter ratio—such as SNP 160989038—revealed a negative ΔSNP-index of −0.29, promoting elongated fruit shape. The fruit weight trait exhibited several QTLs, with SNPs like 58288681, 100989762, and 138660974 showing consistent positive ΔSNP values between 0.035 and 0.347, indicating their utility in selecting heavier fruit genotypes. Importantly, several SNPs including 218780813 and 138660974 appeared across multiple traits, suggesting pleiotropic roles or tightly linked genomic effects, which could be valuable for multi-trait breeding strategies. This combined BSA and QTL analysis provides critical insight into the genetic architecture of chili traits, supporting targeted marker-assisted selection for crop improvement.
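The ΔSNP-index values discussed above can be illustrated with a short sketch. It assumes the usual definition (SNP-index is the fraction of reads in a bulk carrying the non-reference allele, and ΔSNP-index is the difference between the two bulks) and assumes the subtraction runs high-trait bulk minus low-trait bulk; the text does not state the direction, and the function names and read counts are hypothetical.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads in a bulk that carry the non-reference allele."""
    return alt_reads / total_reads

def delta_snp_index(alt_top, total_top, alt_bottom, total_bottom):
    """SNP-index of the high-trait bulk minus that of the low-trait bulk.

    The top-minus-bottom direction is an assumption made for this sketch.
    """
    return snp_index(alt_top, total_top) - snp_index(alt_bottom, total_bottom)

# Made-up read counts for illustration only.
delta = delta_snp_index(alt_top=48, total_top=64, alt_bottom=16, total_bottom=64)
```

Under this definition the value is bounded by −1 and +1, reached only when the two bulks are fixed for opposite alleles, and it stays near 0 where the bulks share similar allele frequencies.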
4. Discussion
Fruit size and pungency are the major traits contributing to the yield and quality of chili. Fruit size is a complex trait and polygenic in nature [3]. Fruit size depends on fruit length, fruit diameter, fruit length to diameter ratio, and fruit weight. Continued selection and utilization in the domestication of chili have given rise to a wide array of evolutionary transformations, resulting in the divergence of cultivated chili from their wild counterparts [6,7,20]. Previous research has extensively explored fruit traits, such as weight, length, diameter, and capsaicin contents, identifying numerous QTLs associated with fruit and plant developmental characteristics that have significantly advanced our understanding of their underlying mechanisms [4,5,7,21].
The feasibility of BSA with whole genome sequencing has been applied to other agronomic systems like rice [22] and cotton [11]. These studies have demonstrated the detection of QTL using next-generation sequencing on pooled samples. Methods employing BSA with other next-generation sequencing technologies, such as RNAseq, have also successfully mapped genes contributing to quantitative traits in important crops [23].
This study employed BSA sequencing to detect QTL associated with fruit length (FL), fruit diameter (FD), fruit length to diameter ratio (LDR), fruit weight (FW), and capsaicin contents (PUN) in an F2 population derived from Ghotki × Chakwal-4. The extreme 25% of the individuals for each trait were used to identify the associated SNPs and to provide a detailed overview of the genomic regions associated with key traits in chili, showing how changes at specific SNPs contribute to diversity in fruit characteristics. The strong negative ΔSNP-index at SNP 41308232 suggests this variant may significantly reduce pungency, making it a useful marker for breeding milder chili cultivars. On the other hand, SNPs 70429871 and 104377148, with high positive ΔSNP-index values, appear to enhance pungency, indicating genetic influences on capsaicinoid accumulation. Fruit length was also linked to SNPs with both negative and positive effects. SNP 92509300 showed a strong negative value, pointing to shorter fruit development, while SNP 218780813 showed a mild positive effect, suggesting it could help produce longer fruits. Interestingly, SNP 218780813 was also associated with fruit weight, indicating it might affect multiple traits and could be especially useful for breeders. Fruit diameter was influenced by SNPs like 5578366 and 13324952, which had moderate positive values, suggesting gradual increases in fruit width. For fruit shape, SNP 160989038 had a clear negative effect on the length-to-diameter ratio, resulting in longer, narrower fruits, a shape that may suit certain market preferences. Fruit weight displayed the most diverse SNP signals, with several markers such as 100989762 and 138660974 consistently showing positive ΔSNP values. These could help in selecting heavier fruits. Meanwhile, SNPs with negative indices, like 154826709, may be valuable for producing lighter fruit types suited to specific consumer demands.
Several SNPs, especially 218780813 and 138660974, were associated with more than one trait, suggesting they influence multiple characteristics and could be powerful tools in multi-trait selection. Their consistent presence reinforces their potential use in marker-assisted breeding for chili improvement. Overall, these results offer strong molecular evidence for targeting specific SNPs in chili to develop varieties with desired traits. These insights can guide efficient breeding programs focused on fruit size, shape, weight, and pungency.
In this study, QTL mapping was also performed in combination with phenotypic evaluation for fruit size and capsaicin content. The contrasting parents generated wide phenotypic variation in fruit length, fruit width, fruit weight, and capsaicin content, with high heritability and limited environmental influence. The transgressive inheritance patterns of fruit length to diameter ratio underscored the prominence of recessive genes, aligning with prior findings in chili and numerous other crops [7].
Previous research has demonstrated that for traits influenced by numerous genes with moderate effects, larger mapping populations enhance statistical power and reduce the risk of overestimating QTL effects [10]. Furthermore, populations with greater recombination rates between genotypes are expected to disrupt linkage phases of repulsion and coupling leading to more sensitive QTL detection. These findings align with recent research involving large association panels of chili species, which highlight the complex, polygenic nature of traits like fruit length, fruit diameter, fruit length to diameter ratio, fruit weight, and capsaicin contents [24,25]. The use of large populations combined with recombination enhances the ability to elucidate these genetic complexities.
Chili traits, including plant height, fruit weight, fruit shape, and flowering, may exhibit variation directly influenced by selective pressures during domestication [6,7]. However, for certain traits, the observed variation might have arisen indirectly, as seen in organ pigmentation, plant architecture, and leaf size/shape traits. The initial examination of fruit-related QTLs dates back to 2001, when a cross between two C. annuum genotypes, Maor (bell-type chili) and Perennial (small-fruited line), revealed multiple QTLs distributed on linkage groups 2, 3, 4, 8, and 10, indicating their pivotal role in developmental processes [26]. Subsequent studies using advanced techniques have detected more fruit-related QTLs in different chili species. Notably, QTLs were most dense on linkage groups 3, 6, 8, 11, and 12 [4,5]. In the present study, we identified 25 QTLs associated with traits related to fruit size and shape, with 8 of them clustered within a shared region on chromosome 6 (210.7–1573.1 cM).
When comparing these QTLs for different traits, two overlapping regions were found for fruit length and two for capsaicin content. Some markers in our study were linked to the control of more than one trait, indicating pleiotropic effects of some QTLs, in agreement with prior studies [4,5,7]. Given the documented phenotypic correlations among chili fruit traits in both the current and past studies, the identification of QTLs with pleiotropic effects was anticipated. However, no QTL was discovered that was linked to both fruit size and shape index, despite the observed correlation between these traits. This discrepancy might be attributed to the limited number of QTLs for fruit size and fruit shape in our study. Chili is often consumed at the green stage as a vegetable, so accumulating major QTLs that act at this early stage could be desirable in a breeding strategy.
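The comparison of QTL regions across traits described above amounts to an interval-overlap check on each chromosome. The following minimal sketch shows one way to do it, assuming each QTL is summarized by a chromosome and a support interval in cM. The `QTL` class, the function name, and the toy intervals are hypothetical and do not reproduce the QTLs reported in this study.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class QTL:
    name: str        # hypothetical label
    trait: str       # e.g. "FL", "FD", "LDR", "FW", "PUN"
    chrom: int       # chromosome / linkage group
    start_cm: float  # left bound of the support interval (cM)
    end_cm: float    # right bound of the support interval (cM)


def overlapping_qtls(qtls: List[QTL]) -> List[Tuple[str, str, float, float]]:
    """Pairs of QTLs for different traits that share a region on one chromosome."""
    pairs = []
    for a, b in combinations(qtls, 2):
        if a.chrom != b.chrom or a.trait == b.trait:
            continue  # only cross-trait comparisons on the same chromosome
        start = max(a.start_cm, b.start_cm)
        end = min(a.end_cm, b.end_cm)
        if start <= end:  # the two intervals intersect
            pairs.append((a.name, b.name, start, end))
    return pairs


# Toy intervals (hypothetical values, for illustration only).
toy_qtls = [
    QTL("q1", "FL", 6, 100.0, 300.0),
    QTL("q2", "FW", 6, 250.0, 400.0),
    QTL("q3", "PUN", 6, 500.0, 600.0),
    QTL("q4", "FD", 3, 100.0, 300.0),
]
overlaps = overlapping_qtls(toy_qtls)
print(overlaps)  # [('q1', 'q2', 250.0, 300.0)]
```

A shared interval found this way is only a candidate for pleiotropy. Tightly linked but distinct genes would give the same result, which is why higher marker density and fine mapping are needed to separate the two cases.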
Extensive genetic studies in Capsicum annuum have identified multiple loci and candidate genes controlling fruit size and capsaicin accumulation, providing an important framework for interpreting the loci detected in the present study. Fruit size-related traits such as fruit length, diameter, and weight are polygenic and are mainly regulated by genes involved in cell division, cell expansion, and fruit shape determination [27,28]. Major loci reported across different populations include IQ domain family and Ovate Family Protein (OFP) genes, such as CaSUN and CaOVATE, which influence fruit elongation and overall shape [29,30]. The fruit weight QTL qFW2.2 of this study appears to be close to CaOVATE, similar to the findings of other genetic association studies [4,21,27], while the fruit diameter QTL qFD3.1 is close to CaKLUH, as reported by others [4]. Stable QTLs for fruit weight have been reported across multiple chromosomes, suggesting the presence of several conserved genomic regions underlying fruit morphology. Several fruit size-related QTLs identified in the present study fall within or near these previously reported regions on chromosomes 2, 3, 6, 8, and 12 [27,31,32]. Differences between studies might be due to population-specific effects or the need to resolve sub-regions through higher marker density.
Capsaicin content is primarily governed by genes involved in the capsaicinoid biosynthetic pathway and its transcriptional regulation. In this study, we identified 12 QTLs for capsaicin content that are closely linked to capsaicinoid synthesis pathway genes. One hypothesis suggests that capsaicinoids evolved as a deterrent to mammalian herbivory; birds, lacking vanilloid receptors, do not experience a painful reaction to these compounds [33]. The SNP markers identified in this investigation could be applied through hybridization to further improve chili cultivars. Notably, the independent selection of capsaicin and dihydrocapsaicin QTLs is pivotal for the development of highly pungent chili cultivars [34]. Some of the 12 identified QTLs associated with capsaicinoids may have been previously reported. Major QTLs for chili pungency include C/pun1/AT3 on chromosome 2 [6,35,36], cap/pun2 on chromosome 7 [37,38], and CaMYB31/pun3 on chromosome 7 [39]. These QTLs were not detected in the current study, which is consistent with reports that several genes at other locations, such as BCAT on chromosome 4 [36], pAMT on chromosome 3 [40], and CaKRI on chromosome 10 [41], have also been implicated in pungency variation in chili. Further chili pungency QTLs have been identified on chromosomes 1, 6, and 12 that could match QTLs found in this study [36,42]. Our mapping study revealed some potentially novel QTLs associated with capsaicin on chromosomes 5, 8, and 11.
Loci consistently detected by both analytical approaches in this study represent high-confidence genomic regions for downstream applications. These stable loci can be prioritized for the development of tightly linked SNP or InDel markers, facilitating efficient marker-assisted selection (MAS) for improved fruit size and tailored pungency levels [38]. Such markers will enable breeders to combine desirable fruit morphology with targeted capsaicin content while minimizing unfavorable trade-offs. Further validation of these loci in independent populations and diverse genetic backgrounds will be essential to confirm their stability and utility in chili molecular breeding programs.
Pungency traits exhibited no significant correlations with fruit traits. Nevertheless, as in the most pungent parent, the most pungent plants tended to have narrow fruits and small leaves. This tendency raises the possibility that the genetic elements influencing fruit size may also play a role in controlling capsaicin content. Under equal conditions, it is conceivable that a narrow fruit with a thin pericarp might contain more concentrated capsaicinoids than a wide fruit with a thick pericarp.
5. Conclusions
Increasing fruit size and pungency is important for both yield and quality in chili cultivation. Fruit size is a complex polygenic trait that depends on fruit length, fruit diameter, and their ratio. Through extensive utilization of chili species and ongoing domestication, substantial evolutionary changes from wild progenitors have been achieved, shaping the trajectory of chili farming. This research combined high-throughput genotyping with BSA and comprehensive phenotypic evaluation to examine variation in fruit size and capsaicin content. Positive correlations were found among fruit-related attributes, including length, diameter, and shape. By selecting the individuals with the most extreme values for each trait, this study demonstrated significant variation in fruit dimensions and capsaicin content, facilitated by distinct parental varieties. The 534 SNP markers identified were used to construct genetic maps ranging from 4315 to 6607 cM in length.
The study pinpointed 25 QTLs linked to fruit characteristics and 12 QTLs linked to capsaicin content, concentrated in specific linkage groups. BSA identified high ΔSNP-index markers linked to key traits such as fruit length, diameter, weight, and pungency, underscoring their potential applicability through hybridization for further advancement of chili cultivars. The loci that overlap between the two approaches offer opportunities for trait enhancement. These findings reinforce previous reports and provide solid molecular targets for efficient breeding strategies in chili. Genotypic investigations revealed markers governing multiple traits, uncovering pleiotropic effects of some QTLs. Thus, this research contributes valuable insights into the complex nature of fruit size and pungency, paving the way for targeted breeding efforts and for follow-up work such as fine mapping, functional validation, and gene editing in chili.
References
- 1. Pavani N., Barik S., Ponnam N., Reddy M.K. Genetic Diversity for Fruit Quality Traits in Chili (Capsicum spp.). In Peppers; Variyar P.S., Singh I.P., Adiani V., Suprasanna P., Eds.; CRC Press: Boca Raton, FL, USA, 2025; pp. 16–24.
- 2. Hernández-Pérez T., Gómez-García M.D., Valverde M.E., Paredes-López O. Capsicum annuum (hot pepper): An ancient Latin-American crop with outstanding bioactive compounds and nutraceutical potential. A review. Comp. Rev. Food Sci. Food Saf. 2020, 19, 2972–2993. doi:10.1111/1541-4337.12634. PMID: 33337034.
- 3. Islam M.S., Akter N., Jui S. Variability of chili (L.) genotypes for yield and yield attributes. Asian J. Res. Bot. 2020, 3, 33–37.
- 4. Chunthawodtiporn J., Hill T., Stoffel K., Van Deynze A. Quantitative trait loci controlling fruit size and other horticultural traits in bell pepper (Capsicum annuum). Plant Genome 2018, 11, 160125. doi:10.3835/plantgenome2016.12.0125. PMID: 29505638. PMCID: PMC12899126.
- 5. Han K., Jeong H.J., Yang H.B., Kang S.M., Kwon J.K., Kim S., Choi D., Kang B.C. An ultra-high-density bin map facilitates high-throughput QTL mapping of horticultural traits in pepper (Capsicum annuum). DNA Res. 2016, 23, 81–91. doi:10.1093/dnares/dsv038. PMID: 26744365. PMCID: PMC4833416.
- 6. Qin C., Yu C., Shen Y., Fang X., Chen L., Min J., Cheng J., Zhao S., Xu M., Luo Y. Whole-genome sequencing of cultivated and wild peppers provides insights into Capsicum domestication and specialization. Proc. Natl. Acad. Sci. USA 2014, 111, 5135–5140. doi:10.1073/pnas.1400975111. PMID: 24591624. PMCID: PMC3986200.
- 7. Lopez-Moreno H., Basurto-Garduño A.C., Torres-Meraz M.A., Diaz-Valenzuela E., Arellano-Arciniega S., Zalapa J., Sawers R.J.H., Cibrián-Jaramillo A., Diaz-Garcia L. Genetic analysis and QTL mapping of domestication-related traits in chili pepper (Capsicum annuum L.). Front. Genet. 2023, 14, 1101401. doi:10.3389/fgene.2023.1101401. PMID: 37255716. PMCID: PMC10225550.
- 8. He J., Zhao X., Laroche A., Lu Z.-X., Liu H., Li Z. Genotyping-by-sequencing (GBS), an ultimate marker-assisted selection (MAS) tool to accelerate plant breeding. Front. Plant Sci. 2014, 5, 484. doi:10.3389/fpls.2014.00484. PMID: 25324846. PMCID: PMC4179701.
