Limitations of Single Prediction Tools in miRNA Profiling of Grapevine Viral Coinfection
Katja Jamnik, Hana Šinkovec, Jernej Jakše, Vanja Miljanić, Nataša Štajner

TL;DR
This study compares different tools for predicting miRNA expression in grapevines infected with multiple viruses and finds that combining tools improves results.
Contribution
The study reveals how viral coinfections affect miRNA expression in grapevines and highlights the limitations of single prediction tools.
Findings
Three miRNA prediction tools identified largely different sets of differentially expressed miRNAs.
An integrated approach combining all three tools yielded more consistent and biologically meaningful results.
Experimental validation confirmed the differential expression of several miRNAs.
Abstract
Background/objectives: Grapevine (Vitis vinifera L.) is one of the most economically and culturally important fruit crops worldwide and hosts more than 100 viruses. Viral infections can cause severe yield losses, but plants can adapt to infection through changes in miRNA-mediated regulatory pathways. MicroRNAs are key regulators of plant development and stress responses. Several prediction tools are available for miRNA detection from small RNA sequencing data, each relying on different algorithms. The aim of this study was to compare miRNA predictions generated by three widely used tools (miRador, ShortStack, and miRDeep2) and to evaluate how viral coinfections influence miRNA expression in grapevine. Methods: Two grapevine cultivars, Refošk (“Terrano”) and Zeleni Sauvignon (“Sauvignon Vert”), were analyzed. Small RNA sequencing was performed on virus-free plants and plants coinfected…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7- —Slovenian Research and Innovation Agency (ARIS)
- —young researcher scholarship grant
- —Genetics and Modern Technologies of Crops awarded to the Chair of Genetics, Biotechnology, Statistics and Plant breeding at the University of Ljubljana, Biotechnical Faculty
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsPlant Molecular Biology Research · Plant Virus Research Studies · Plant and Fungal Interactions Research
1. Introduction
Small RNAs (sRNAs) are important regulators of gene expression in plants and play essential roles in plant development, reproduction and defence responses. Plant sRNAs are classified into two major groups: small interfering RNAs (siRNAs) and microRNAs (miRNAs). MicroRNAs are non-coding RNAs, typically 20–24 nucleotides long, that function by guiding post-transcriptional silencing of target genes [1,2]. Plant miRNAs are encoded by MIRNA (MIR) genes and transcribed by RNA polymerase II into primary miRNA transcripts (pri-miRNAs) that fold into characteristic hairpin structures. Pri-miRNAs are then processed by a dicing complex, composed of nuclear endonuclease DICER-LIKE 1 (DCL1), the double-stranded RNA-binding protein HYPONASTIC LEAVES 1 (HYL1) and the zinc-finger protein SERRATE (SE), into precursor miRNA (pre-miRNA). The pre-miRNA is further cleaved by the dicing complex to release miRNA/miRNA* duplex, which are then methylated by the methyltransferase HUA ENHANCER 1 (HEN1). The mature miRNA strand is loaded into ARGONAUTE 1 (AGO1) to form the RNA-induced silencing complex (RISC). MicroRNAs guide the RISC to complementary target mRNA, leading to either transcript cleavage or translation inhibition [3,4].
Plant miRNAs are recognized not only as regulators of plant development, but they also play crucial roles in stress responses. Numerous studies have demonstrated that various abiotic stress conditions, such as water stress, temperature extremes and heavy metal exposure, can lead to significant changes in miRNA expression across different species [5,6,7,8,9,10,11,12,13,14,15,16]. Similarly, biotic stressors including bacterial, fungal and viral infections have all been shown to alter the expression of specific miRNAs involved in plant defence and immune regulation [17,18,19,20,21,22,23]. Many studies have explored the impact of viral infections on miRNA expression in several plant species. These studies demonstrate that viral infections can lead to both upregulation and downregulation of various miRNAs involved in essential biological functions, including the regulation of genes associated with the plant immune response [24,25,26,27,28,29].
Grapevine (Vitis vinifera L.) is one of the most important fruit crops in the world. According to the International Organisation of Vine and Wine (OIV), it is cultivated on over 7 million hectares globally and plays a central role in both agriculture and cultural heritage. The production of grapevines is threatened by many pathogens, including viruses. To date, over 100 viruses have been identified in grapevine, and many of them can cause significant economic losses due to reduced yield and fruit quality [30,31]. In grapevine, miRNA expression has been shown to respond to viral infection. Several studies have investigated miRNA responses to viruses such as grapevine leafroll-associated virus-3 (GLRaV-3), grapevine vein clearing virus (GVCV) and grapevine rupestris stem pitting-associated virus (GRSPaV). These studies reported numerous miRNAs to be differentially expressed upon infection, suggesting that miRNA-mediated regulation plays an important role in grapevine’s response to viral pathogens [32,33,34,35].
High-throughput sequencing of small RNAs (sRNA-seq) has become the standard method for profiling miRNA expression; however, the computational prediction and annotation of plant miRNAs remain complex [36,37]. Multiple bioinformatic pipelines exist, each with distinct algorithms and selection criteria, often producing different results. These discrepancies can arise from differences in read mapping strategies, criteria for hairpin structure prediction, or the handling of low-abundance sequences. As a result, comparative analyses often reveal only partial overlap between pipelines, making experimental validation of candidate miRNAs essential to accurately characterize their expression patterns and functional roles in stress responses.
In this study, we analyzed miRNA expression in two grapevine cultivars, Refošk (“Terrano”) and Zeleni Sauvignon (“Sauvignon Vert”), coinfected with grapevine Pinot gris virus (GPGV), grapevine rupestris stem pitting-associated virus (GRSPaV), and grapevine rupestris vein feathering virus (GRVFV). Although most previous studies have investigated host responses to single viral infection, the present work provides insight into regulatory processes occurring in more complex infection conditions. Since miRNA detection in sRNA sequencing data depends strongly on the prediction tool used, we applied three widely used tools—miRador v1.0 [38], ShortStack v4.0.2 [39] and miRDeep2 v2.0.0.8 [40]—to analyze the same data and compare their outputs. Differential expression analysis was carried out separately for the miRNA prediction output of each tool and also on the combined dataset integrating all outputs. This approach enabled us to evaluate how tool selection and data integration influence the detection and differential expression of miRNAs.
2. Materials and Methods
2.1. Plant Material
Young leaves from three biological replicates of virus-free plants and three biological replicates of virus-infected plants for each cultivar, Zeleni Sauvignon (“Sauvignon Vert”) and Refošk (“Terrano”), were collected and ground in liquid nitrogen. Samples were stored at −80 °C until further analysis. The two cultivars were chosen to represent traditional Slovenian grapevine germplasm. Importantly, grapevine plants free of viral infections are difficult to obtain under natural conditions. Virus-free plants of these two cultivars were available from previous studies, allowing high-quality genomic analysis without interference from virus-associated variation; they were obtained through in vivo thermotherapy and in vitro meristem-tip micrografting, followed by acclimatization, as described by Miljanić et al. [41]. Virus-infected plants were collected from vineyards of the clonal center in Vrhpolje (STS; Vipava Valley, Primorska wine-growing region, Slovenia). Woody cuttings were harvested in early spring, forced to budburst in water, and young leaves were subsequently collected for analysis.
2.2. RNA Extraction and Virus Detection
Total RNA was isolated from 100 mg of frozen plant material using Monarch Total RNA Miniprep Kit (New England Biolabs, Ipswich, MA, USA). The concentration of each RNA sample was determined using a Nano Vue Plus spectrophotometer (GE Healthcare Life Science, Little Chalfont, Buckinghamshire, UK), and RNA integrity was assessed on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Based on previous work showing that 6 viruses and 2 viroids were present in cultivars Refošk and Zeleni Sauvignon [42], we screened for the same set of pathogens. Two-step RT-PCR was performed to test for the presence of viruses, including GRSPaV, GPVG, GRVFV, GLRaV-3, grapevine fanleaf virus (GFLV), and grapevine fleck virus (GFkV), as well as the viroids, hop stunt viroid (HSVd) and grapevine yellow speckle viroid 1 (GYSVd-1). Total RNA was reverse transcribed to cDNA using High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer’s instructions, followed by PCR with specific primers (listed in Table S1). The PCR mixture contained 4 µL of 5× PCR buffer (Promega Corporation, Madison, WI, USA), 1.6 µL of 25 mM MgCl_2_ (Kapa Biosystems, Wilmington, MA, USA), 1.6 µL of dNTP (10 mM each dNTP; Promega, Madison, WI, USA), 0.5 µL of each specific primer (10 µM), 0.1 µL of KAPA Taq Polymerase (Kapa Biosystems, Wilmington, MA, USA), 10.7 µL of nuclease-free water and 1 µL of cDNA. PCR products were analyzed on 1.2% agarose gel.
2.3. sRNA Library Construction and Sequencing
Small RNAs were extracted from 100 µg of leaf tissue of virus-free and virus-infected plants of both cultivars using mirVana™ miRNA Isolation Kit (Life Technologies, Carlsbad, CA, USA) according to the manufacturer’s instructions. The quantity and quality of extracted sRNAs were assessed on an Agilent 2100 Bioanalyzer using Bioanalyzer Agilent^®^ Small RNA Kit (Agilent, Santa Clara, CA, USA). Small RNA libraries were prepared using Ion Total RNA-Seq Kit v2 (Ion Torrent, Life Technologies, Carlsbad, CA, USA) following the manufacturer’s instructions, and the quality of barcode-labeled cDNA libraries was analyzed on Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Small RNA sequencing was conducted with an Ion Proton Sequencer according to the manufacturer’s instructions (Ion Torrent, Life Technologies, Carlsbad, CA, USA).
2.4. Bioinformatic Analysis of sRNA Data and miRNA Prediction
Prior to bioinformatics analysis, barcodes and adapters, as well as low-quality raw sequence reads, were removed to retain only high-quality sequencing reads for further analysis. The VirusDetect tool v1.7 [43] was applied to identify viruses present in the samples based on sRNA sequencing data. For miRNA prediction, three miRNA prediction tools were used: miRador v1.0 [38], ShortStack v4.0.2 [39] and miRDeep2 v2.0.0.8 [40]. All analyses were conducted using default analysis settings. Annotation was performed using miRBase v.22.1 database [44], and only known miRNAs present in the database were retained for further analysis.
2.5. Exploratory Data Analysis
Outputs from the three prediction tools were analyzed both separately and in combination using an integrated approach. The overlap of miRNAs identified by the different tools was visualized using Venn diagrams.
To explore the structure of the combined dataset, we first applied a variance-stabilizing transformation. Data quality was assessed by graphical inspection of the transformed expression levels using heatmaps. Pre-filtering was performed by retaining miRNAs with at least 1 count-per-million reads in a minimum of 5 samples. To assess sample similarity, we generated a heatmap of the Euclidean distance matrix using complete linkage hierarchical clustering for both rows and columns. To remove systematic differences because different tools that were used for miRNA prediction, we applied the removeBatchEffect function from the R v4.5.1 [45] package limma v3.66.0 [46] to correct for this source of variation. Sample-to-sample distances were then re-evaluated after batch correction. Spearman correlation matrices were computed separately for each cultivar and visualized using correlograms. Finally, principal component analysis (PCA) was performed to investigate the overall variation in miRNA expression with respect to the effects of condition and cultivar.
2.6. Differential Expression Analysis
Differential expression analysis was performed after pre-filtering, both separately for each prediction tool and on the combined dataset. For the separated analyses, miRNAs were retained if they had at least 1 count-per-million reads in a minimum of 3 samples; for the integrated analysis, the threshold was increased to at least 1 count-per-million reads in a minimum of 5 samples. For the separate analyses, we used the DESeq2 v1.50.2 package [47] in R v4.5.1 to identify differentially expressed miRNAs between infected and virus-free samples in each cultivar, as well as to test for the interaction between cultivar and condition. The integrated analysis followed the edgeR v4.8.2 [48]–limma v3.66.0 [46] workflow, controlling for systematic differences among prediction tools and accounting for the correlation arising from multiple prediction-tool outputs generated from the same biological samples. Calculated p-values were adjusted for multiple testing using the Benjamini–Hochberg procedure to control the false discovery rate (FDR). miRNAs with an FDR < 0.05 were considered statistically significant and considered differentially expressed. For exploratory purposes, we also examined miRNAs from the integrated analysis using a less stringent threshold of non-adjusted p-value < 0.01.
2.7. RT-qPCR for miRNA Expression
For the detection and expression of miRNAs, stem-loop RT-qPCR was performed as described by Varkonyi-Gacis et al. [49]. Stem-loop RT primers and forward qPCR primers were designed based on mature miRNA sequences (Table S2). Before reverse transcription, 200 µg of total RNA, nuclease-free water (up to 10 µL), 1 µL of specific stem-loop primer (1 µM) and 0.5 µL of dNTP mixture (10 mM) were combined. The mixture was heated to 65 °C for 5 min and incubated on ice for 2 min. Then, 8.5 µL of RT mixture (4 µL of 5× First-strand buffer (Invitrogen™,Thermo Fisher Scientific, Waltham, MA, USA), 0.25 µL of SuperScript™ III Reverse Transcriptase (Invitrogen™, Thermo Fisher Scientific, Waltham, MA, USA), 2 µL of 0.1 M DTT (Invitrogen™, Thermo Fisher Scientific, Waltham, MA, USA), 2 µL of 25 mM MgCl_2_ and 0.25 µL of RNase Inhibitor (Applied Biosystems™, Thermo Fisher Scientific, Waltham, MA, USA) were added. Reverse transcription was performed in a thermo cycler: 30 min at 16 °C, followed by 60 cycles at 30 °C for 30 s, 42 °C for 30 s and 50 °C for 1 s. To inactivate the enzyme, the mixture was incubated at 85 °C for 5 min. Four reference genes were used for data normalization: U6, actin, α-tubulin and glyceraldehyde 3-phosphate dehydrogenase (GAPDH) (Table S1). For reverse transcription of reference genes, the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Waltham, MA, USA) was used as described earlier. The qPCR reactions were performed using the QuantStudio™ 5 Real-Time PCR System (Thermo Fisher Scientific Inc, Waltham, MA, USA) in a 6 µL final reaction volume containing 0.18 µL of each primer (10 µM), 1.64 µL of nuclease-free water (Integrated DNA Technologies Inc., Coralville, IA, USA), 1 µL of cDNA and 3 µL of 2× Fast SYBR™ Green Master Mix (Thermo Fisher Scientific Inc., Waltham, MA, USA). The conditions for amplification were as follows: an initial step at 95 °C for 20 s, followed by 40 cycles of 95 °C for 15 s and 58 °C for 30 s. The melt curve stage included an initial step at 95 °C for 15 s, a decrease in temperature to 60 °C for 60 s at a rate of 1.77 °C/s, and a subsequent increase to 95 °C for 15 s at a rate of 0.075 °C/s. All the reactions were performed in triplicate with three independent biological replicates.
Relative expression levels of miRNAs were calculated using the 2^−ΔΔCt^ method as described by Livak and Schmittgen [50]. The homogeneity of variances between infected and virus-free samples was evaluated for each miRNA (p > 0.05). The Student t-test was used for comparisons with equal variances, whereas Welch’s test was applied when variances differed. Adjustment for multiple comparisons across all miRNAs was performed using the Benjamini–Hochberg method to control the FDR. Differences were considered statistically significant at FDR-adjusted p-values < 0.05.
3. Results
3.1. Small RNA Sequencing and Viral Status of Samples
Twelve sRNA libraries were sequenced (BioProject PRJNA1380347). Technical issues occurred with library number 5, resulting in a very low number of reads; consequently, this sample was excluded from further analyses. In the remaining eleven libraries, between 6.3 million and 25 million cleaned reads were obtained. In infected samples, 67.34–84.77% and 39.28–76.14% of cleaned reads from sRNA libraries of the cultivars Refošk and Zeleni Sauvignon, respectively, were mapped to the reference grapevine genome (GCF_030704535.1). In virus-free samples, the mapping rates ranged from 70.39% to 83.99% in Refošk and from 72.19% to 94.18% in Zeleni Sauvignon (Table 1).
Small RNA sequencing analysis employing the VirusDetect pipeline [43] confirmed the absence of viruses in the virus-free samples; as expected, two viroids, HSVd and GYSVd-1, were present. In all infected samples, three viruses (GRSPaV, GPGV and GRVFV) and two viroids (HSVd and GYSVd-1) were detected. RT-qPCR was performed to validate sRNA sequencing results. Viruses GPGV, GRSPav and GRVFV were detected in all infected samples. In virus-free samples, no virus was detected using RT-PCR.
Small RNA length distribution was analyzed for both cultivars. In Refošk, 21 nt and 24 nt reads were most abundant in both virus-free and virus-infected samples (Figure 1A). However, in Zeleni Sauvignon, the distribution differed between conditions. In infected samples, 21 nt reads were most frequent, followed by 22 nt reads, whereas in virus-free samples, the 24 nt reads were most abundant, followed by 21 nt reads (Figure 1B).
3.2. miRNA Prediction
Three miRNA prediction tools were used to detect miRNAs in sRNA sequencing samples. The prediction was done only for known miRNAs available in the miRbase database for V. vinifera. In Refošk, 181 miRNAs were detected by miRDeep2 in at least one of six samples, whereas by miRador and ShortStack, 61 and 89 miRNAs were detected, respectively (Table 2 and Table S3). In total, 44 known miRNAs were detected in at least one of six samples by all three tools, and the rest of them were shared between two of the tools or detected only by miRDeep2 (Figure 2A, Table S4). In the cultivar Zeleni Sauvignon, miRDeep2 identified 178 miRNAs in at least one of five samples, while miRador and ShortStack detected 61 and 89 miRNAs, respectively (Table 2 and Table S3). Similarly, 44 miRNAs were commonly detected by all three tools (Figure 2B), of which 41 overlapped with those identified by all three tools in Refošk (Table S4). Across all samples, only four and five miRNAs were present in virus-infected and virus-free samples, respectively. Regarding the overlap of miRNA detection across all analyzed samples, most miRNAs identified in at least one sample were consistently detected in all samples of each cultivar in the ShortStack and miRDeep2 datasets, whereas the overlap among samples was lower for miRador (Table S5).
Seventy-six miRNAs were detected exclusively by miRDeep2 and not by miRador or ShortStack. However, most of these belonged to families that were otherwise identified by all three tools. In addition, nine miRNA families (vvi-miR3625, vvi-miR3626, vvi-miR3628, vvi-miR3630, vvi-miR3636, vvi-miR3637, vvi-miR3639, vvi-miR3640, and vvi-miR845) were detected solely by miRDeep2. In both cultivars, 45 miRNAs were detected by miRDeep2 and ShortStack, but not by miRador, while 17 were detected by miRador and miRDeep2, but not by ShortStack. Most of the miRNAs shared between ShortStack and miRDeep2 belonged to the families that were also detected by miRador (Table S4).
3.3. miRNA Expression Analysis
3.3.1. Analysis of Individual Datasets
In the separate analyses, pre-filtering removed 0, 13 and 9 low-abundance sequences from the ShortStack, miRDeep2 and miRador datasets, respectively. The number of differentially expressed miRNAs in both cultivars varied depending on the tool used. For the cultivar Refošk, ten miRNAs were identified as differentially expressed in miRador and ShortStack datasets, while 36 were identified in the miRDeep2 dataset (Table 3 and Table S6). No miRNA was detected as differentially expressed across all three tools (Figure 3A).
In the cultivar Zeleni Sauvignon, 13 differentially expressed miRNAs were detected in the miRador dataset and the ShortStack dataset, and 44 in the miRDeep2 dataset (Table 3 and Table S7). Two miRNAs (vvi-miR394b and vvi-miR169c) were identified as differentially expressed in the datasets of all three tools (Figure 3B).
3.3.2. Integrated Approach for Differential Expression Analysis
As the overlap in differentially expressed miRNAs identified by individual tools was low, the same datasets were also analyzed using a combined miRNA expression prediction approach integrating the outputs of miRador, ShortStack, and miRDeep2 for downstream analysis. Pre-filtering led to the removal of 13 low-abundance sequences from the datasets.
A heatmap of hierarchical clustering based on 33 sample-to-sample distances showed that the primary driver of clustering before correction was the prediction tool (Figures S1 and S2). After removing this effect, samples clustered predominantly according to biological condition (and cultivar) (Figure 4).
The PCA analysis provides an overview of global variation in miRNA expression across all samples and incorporates the outputs of all three prediction tools. After batch correction, the remaining variation no longer reflects prediction-tool-specific structure, indicating that tool-related biases were effectively minimized. PC1 explains 17% of the total variance and clearly separates infected from virus-free samples in Zeleni Sauvignon. In contrast, Refošk samples do not form well-defined clusters by infection status along PC1 or along PC2, which accounts for an additional 12% of the variance. PC3, explaining 9% of the variance, suggests a possible separation of miRador-predicted samples by infection status along the PC1–PC3 plane (Figure 5).
In the integrated analysis, no miRNAs were differentially expressed in Refošk, whereas 10 miRNAs were differentially expressed in Zeleni Sauvignon at FDR < 0.05. Using a less stringent threshold of p < 0.01 (unadjusted for multiplicity), the integrated approach identified six differentially expressed miRNAs in Refošk, of which two were downregulated and four upregulated. Of these six, two were also detected as differentially expressed in individual analysis of miRador dataset, three in ShortStack dataset and four in miRDeep2 dataset (Table S6, Figure 6).
In Zeleni Sauvignon, the integrated analysis identified 10 differentially expressed miRNAs at FDR < 0.05 and 16 at p < 0.01, of which twelve were downregulated and four upregulated (Table S7, Figure 6). Most of the miRNAs identified as differentially expressed in the integrated dataset were also detected as differentially expressed in at least one individual analysis performed with miRador, ShortStack, or miRDeep2 datasets. Seven differentially expressed miRNAs from the miRador dataset and five from the ShortStack dataset were also identified as differentially expressed by the integrated approach, as well as thirteen miRNAs from the miRDeep2 dataset. For all overlapping miRNAs in both cultivars, the direction of regulation (log_2_FC) was consistent between integrated and individual analyses.
Additionally, we performed an integrated analysis combining samples from both cultivars (virus-infected vs. virus-free). Using this approach, we identified seven differentially expressed miRNAs (Table S8). Each of these miRNAs had also been detected as differentially expressed in at least one of the individual dataset analyses in either cultivar. All seven were detected as differentially expressed by at least one analysis in Zeleni Sauvignon, and four were detected by at least one analysis in Refošk. For all of the predicted differentially expressed miRNAs, the direction of regulation was consistent between the integrated and individual analyses.
3.4. Stem-Loop RT-qPCR Expression of miRNA
Stem-loop RT-qPCR was performed for selected 12 miRNAs in Refošk and 15 miRNAs in Zeleni Sauvignon to compare their expression patterns with sequencing-based predictions (Tables S9 and S10). Among the 12 tested miRNAs in Refošk, three were identified as differentially expressed in the miRador, four in ShortStack, nine in miRDeep2 and four by the integrated approach at p-val < 0.01. Using RT-qPCR, differential expression was observed for 10 miRNAs (vvi-miR399a/h, vvi-miR477a, vvi-miR398a, vvi-miR3635-3p, vvi-miR156b/c/d, vvi-miR160c/d/e, vvi-miR535a/b/c, vvi-miR394a/c, vvi-miR482 and vvi-miR399i) (Figure 7A, Table S9). Six of them were downregulated, and four were upregulated. Eight miRNAs for which qPCR analysis showed differential expression were also predicted as differentially expressed in the analysis of at least one of the datasets, three in the miRador dataset, two in the ShortStack dataset, seven in the miRDeep2 dataset and four in the combined dataset. For most of the differentially expressed miRNAs, the RT-qPCR supported the expression observed in the sRNA-Seq data. However, for vvi-mi160c/d/e, which was upregulated according to ShortStack, miRDeep2 and the combined dataset, RT-qPCR indicated downregulation.
In Zeleni Sauvignon, we chose 15 miRNAs for RT-qPCR analyses, and 13 of them were differentially expressed in miRDeep2, nine in the integrated analysis, six in ShortStack and three in the miRador dataset. Out of 15 miRNAs tested with stem-loop RT-qPCR, five resulted as differentially expressed (vvi-miR3635-3p, vvi-miR171f, vvi-miR156f/g/i, vvi-miR319b/c/f and vvi-miR3624-3p), three of which were downregulated and two upregulated (Figure 7B, Table S10). Of these five, four were predicted in the miRDeep2 dataset and two by integrated analysis at p-val < 0.01 and ShortStack. The remaining tested miRNAs did not show statistically significant differences between infected and virus-free samples. However, all five miRNAs showed the same expression pattern as in sRNA-Seq analysis.
Six miRNAs included in stem-loop RT-qPCR analysis were common to both cultivars (vvi-miR3635-3p, vvi-miR398a, vvi-miR399a/h, vvi-miR394a/c, vvi-miR399i, and vvi-miR482). In Refošk, all six were differentially expressed, whereas in Zeleni Sauvignon, only one miRNA (vvi-miR3635-3p) showed a significantly different expression.
4. Discussion
Grapevine is an economically important fruit crop, and it is host to numerous viruses. Grapevine plants in our study were coinfected with three viruses, GPGV, GRSPaV and GRVFV, each of which is known to have an influence on plant physiology and health. GPGV is a (+) ssRNA virus, belonging to the genus Trichovirus. It was first detected in grapevines in Italy and has since been reported across Europe, North and South America, Asia and Australia. The infection with GPGV can cause leaf chlorotic mottling, deformations, stunted growth, and reduced yield and fruit quality, although many infected plants remain asymptomatic, and symptom severity often depends on the cultivar [51,52,53]. GRSPaV is a member of the genus Foveavirus and is a very common virus in grapevines worldwide. It is associated with Rupestris stem-pitting disease, even though its infections are often asymptomatic or very mild [54]. GRVFV was first described in Greece and has been associated with vein feathering disease. Since its discovery, the virus has been detected in grapevines worldwide [55,56].
In this study, we focused on evaluating how different miRNA prediction tools affect the identification of known miRNAs in small RNA sequencing datasets from two grapevine cultivars, Refošk and Zeleni Sauvignon, both naturally infected with three viruses (GPGV, GRSPaV, and GRSPV). We initially used three widely used tools in miRNA analyses, miRador, ShortStack and miRDeep2 [38,39,40]. All three tools implement profiles of small RNA sequencing data for the prediction of miRNAs, but they differ in algorithms, sensitivity, and emphasis on structural vs. read-pattern features. miRador identifies candidate miRNAs by first detecting inverted-repeat structures in the reference genome, whereas miRDeep2 first maps small RNA reads to the genome and subsequently reconstructs potential precursor sequences based on read alignment patterns. In contrast, ShortStack is designed as a comprehensive small RNA annotation tool that primarily relies on clustering of small RNA reads to define small RNA-producing loci, including miRNAs. Such differences in analytical strategy substantially influence which miRNAs are detected and how they are quantified [36,38,39,40].
The number of detected miRNAs differed considerably across the tools. Among 186 mature miRNAs annotated in miRBase 22.1 for V. vinifera, miRDeep2 identified 181 (97%) miRNAs in Refošk and 178 (96%) in Zeleni Sauvignon, miRador identified 61 miRNAs in each cultivar and ShortStack identified 89 miRNAs (Figure 2, Table S3). For both cultivars, most of the detected miRNAs were either shared among all three tools or were identified only by miRDeep2. The differences in the number and overlaps of miRNAs detected by miRador, ShortStack and miRDeep2 show how strongly the choice of miRNA prediction tool influences miRNA detection results.
The largest differences in miRNA detection were observed for miRNAs that share the same mature sequence but originate from different MIR genes (e.g., vvi-miR156b/c/f and vvi-miR160b/c/d, Table S11). Such miRNAs are particularly challenging for miRNA prediction algorithms, as their mature sequences are identical, and only their precursor sequences can indicate their origin. ShortStack and miRador distinguish between loci of origin for identical mature miRNAs based on their precursor sequences, whereas miRDeep2 typically reports the same number of counts for all such miRNAs. For example, in the case of vvi-miR160c/d/e, ShortStack identified vvi-miR160e as the one with the highest expression level, while miRDeep2 reported a similar number of counts for all three miRNAs (vvi-miR160c/d/e). These differences can substantially affect the expression level of predicted miRNAs.
Similar observations were reported by Pawlina-Tyszko and Szmatoła [57], who benchmarked multiple miRNA identification tools and found that results can vary between algorithms. They reported that certain miRNAs were consistently identified by all tested prediction tools, while others were detected by only one or two. Comparable trends were also reported in another comparative study of plant miRNA prediction tools, which observed a subset of miRNAs consistently identified across tools and a number of tool-specific miRNA predictions [58]. As in this study, Pawlina-Tyszko and Szmatoła [57] used default analysis settings for each tool; however, they noted that adjusting these parameters could potentially improve their performance. In our analysis, we tested modified parameters for the miRador tool, but the changes in the number and identity of predicted miRNAs were minimal, suggesting that differences in algorithms between tools are the main reason for the observed variability.
Differential expression analysis revealed some differences between infected and virus-free samples; however, the number of differentially expressed miRNAs varied considerably depending on which miRNA prediction tool results were used (Figure 3). For both cultivars, the largest number of differentially expressed miRNAs was obtained from the miRDeep2 dataset, while analyses based on the miRador and ShortStack datasets resulted in fewer differentially expressed miRNAs. Considering overlaps between the tools, no miRNA was detected as differentially expressed in the datasets of all three tools in cultivar Refošk, while in cultivar Zeleni Sauvignon, only two miRNAs showed such consistent differential expression. Because of this limited overlap among the tools, no miRNAs were identified as differentially expressed in the integrated analysis in Refošk at FDR < 0.05. Therefore, we additionally used a less stringent threshold of an unadjusted p-val of 0.01. Even under this relaxed criterion, only six differentially expressed miRNAs were detected in Refošk, which is fewer than those detected in individual dataset analyses. Notably, all six miRNAs were also differentially expressed in at least one of the individual analyses. For Zeleni Sauvignon, the integrated analysis identified a number of differentially expressed miRNAs (10 at FDR < 0.05 and 16 at p-val 0.01) that were comparable to the results from the separate analysis of the miRador and ShortStack datasets, but substantially lower than the number detected using miRDeep2 (Table S7). Seven miRNAs from the miRador dataset, five from ShortStack and thirteen from the miRDeep2 dataset overlapped with those identified by the integrated approach at p-val < 0.01. Importantly, two miRNAs were identified as differentially expressed only in the combined dataset, suggesting that an integrated approach can enhance the sensitivity of miRNA analysis.
The limited overlap in differentially expressed miRNAs identified by individual prediction tools might reflect both algorithmic sensitivity and the inherent biological complexity of miRNA-mediated regulation under mixed viral infections. Viral coinfections are known to induce complex changes in miRNA expression [59,60], which may be captured differently by tools with varying sensitivity and stringency. Sensitivity-driven miRDeep2 detects the largest number of miRNAs, including many unique ones; miRador, as an intermediate, leaning stringency-driven, detects a moderate number with some unique predictions but fewer than miRDeep2; and stringency-driven ShortStack detects the fewest unique miRNAs, mostly overlapping with miRDeep2. In this context, miRNAs detected by multiple tools likely represent more robust components of antiviral response. By integrating results across multiple prediction approaches, our analysis aimed to emphasize biologically consistent miRNA expression patterns while acknowledging the complexity of host–viral interactions.
To further explore the impact of dataset integration, we performed an analysis combining samples from both cultivars (virus-infected vs. virus-free). Using this approach, seven differentially expressed miRNAs were identified, all of which were detected in at least one of the individual dataset analyses in either cultivar. All seven were detected in Zeleni Sauvignon, while four were detected in Refošk. For all seven miRNAs, the direction of regulation was consistent between the integrated and individual analyses, indicating that increasing sample size through integrated analysis can enhance statistical power in detecting differentially expressed miRNAs.
Stem-loop RT-qPCR was used to experimentally validate the expression patterns of selected miRNAs and to compare them with sequencing-based analyses. For robust candidate selection, we defined a set of miRNAs that were either detected as differentially expressed in at least one of three individual analyses or by the integrated approach or not identified as differentially expressed in any dataset. In cultivar Zeleni Sauvignon, differential expression was confirmed for five out of 15 tested miRNAs, three of which were downregulated (vvi-miR3635-3p, vvi-miR171f and vvi-miR319b/c/f) and two upregulated (vvi-miR156f/g/i and vvi-miR3624-3p). These results are consistent with those of sequencing-based analysis. Most of the miRNAs, differentially expressed in RT-qPCR, were among those predicted as differentially expressed in the miRDeep2 dataset, although not all miRNAs identified as differentially expressed by miRDeep2 showed differential expression in the RT-qPCR analysis. Similar partial agreement between sRNA sequencing and RT-qPCR has been reported in other plant studies, including analyses of miRNA expression during seed development in Phaseolus vulgaris and stress responses in Solanum lycopersicum, where only a subset of sequencing-identified miRNAs were confirmed by RT-qPCR. Comparable discrepancies were also observed in Cymbidium ensifolium, where only seven of twelve tested miRNAs identified as differentially expressed by sRNA sequencing were confirmed by RT-qPCR [61,62,63].
In cultivar Refošk, the expression of 12 miRNAs was assessed by stem-loop RT-qPCR, and differential expression was confirmed for 10 of them. Interestingly, one miRNA (vvi-miR160c/d/e) showed opposite regulation between the sequencing and qPCR datasets. Sequencing data suggested upregulation, whereas the qPCR data indicated downregulation. Similar inconsistencies have been reported in previous studies, potentially arising from methodological differences as well as biological factors [34]. For the remaining differentially expressed miRNAs, qPCR results support the sequencing data. Specifically, five tested miRNAs (vvi-miR399a/h, vvi-miR535a/b/c, vvi-miR3635-3p, vvi-miR399i and vvi-miR156b/c/d) were downregulated, while four were upregulated (vvi-miR477a, vvi-miR394a/c, vvi-miR482 and vvi-miR398a).
The limitation of the stem-loop qPCR approach is that it cannot distinguish between miRNAs with identical mature sequences derived from different MIR genes [64,65]. This can be observed for vvi-miR160c/d/e, which share the same mature sequences. As a result, qPCR amplifies these three members of the miR160 family together, whereas sequencing-based analysis can differentiate individual precursors. There are also other examples, e.g., vvi-miR399a/h and vvi-miR319b/c/f, where mature sequences are identical, but precursor sequences differ (Table S11). Although prediction tools such as miRador and ShortStack can occasionally separate these family members based on the precursor sequence, this is not always possible. For example, vvi-miR535a and vvi-miR535b share identical precursor sequences, which prevents both qPCR and computational approaches from reliably distinguishing between them. These methodological and biological factors may in part explain why some miRNAs are differentially expressed based on sequencing data but are not confirmed as differentially expressed by qPCR.
Comparison with previous studies revealed that several of the differentially expressed miRNAs identified in our study exhibited expression patterns consistent with earlier reports on grapevine–virus interactions. For example, for four differentially expressed miRNAs in Refošk, we found patterns consistent with earlier reports on grapevine–virus interactions. Singh et al. [35] showed that members of the miR399 family were downregulated in grapevine in response to grapevine vein clearing virus (GVCV). Similarly, in our study, vvi-miR399a/h and vvi-miR399i were downregulated in infected plants. The upregulation of vvi-miR394a/c in our dataset is consistent with the findings reported by Singh et al. [35] and Pantaleo et al. [34], where the expression of the miR394 family was upregulated in response to GVCV or GRSPaV. Another example is vvi-miR156b/c/d, which were downregulated in our dataset, similarly to the study of Alabi et al. [66], who reported downregulation of the miR156 family upon infection with GLRaV-3.
Interestingly, unlike miR156a/b/c, which were downregulated in Refošk and in the reports of Singh et al. [35], miR156f/g/i was upregulated in Zeleni Sauvignon according to both sequencing and qPCR analyses. Such inconsistencies are not unexpected, since different members of the same miRNA families can be differentially regulated in response to viral infection [66]. Although both groups belong to the miR156 family, vvi-miR156b/c/d and vvi-miR156f/g/i differ by three nucleotides in their mature sequences (UGACAGAAGAGAGUGAGCAC vs. UUGACAGAAGAUAGAGAGCAC), which may affect their target specificity and potentially distinct regulation.
Additionally, in Zeleni Sauvignon, vvi-miR3624-3p was upregulated, which is consistent with the findings of Singh et al. [35] and Pantaleo et al. [34]. Nevertheless, it is important to emphasize that both studies analyzed grapevine responses to a single viral infection, while our research focuses on coinfection with three viruses. These results suggest that some miRNAs may respond consistently to viral infection, regardless of the virus, while others show cultivar-specific or virus-specific responses. Of six miRNAs included in stem-loop RT-qPCR analysis common to both cultivars, only one miRNA (vvi-miR3635-3p) was consistently differentially expressed, highlighting the complexity of miRNA regulation and its potential contribution to the variability of grapevine responses to viral infections.
Cultivar-specific regulation is further supported by differences in small RNA length distribution of total reads in sRNA libraries of Refošk and Zeleni Sauvignon (Figure 1). In Refošk, the relative abundance of 21 and 24 nt sRNAs remained largely stable between virus-free and virus-infected samples, while Zeleni Sauvignon displayed a clear condition-dependent shift, with virus-free samples dominated by 24 nt sRNAs, whereas infected samples showed enrichment of 21 nt sRNAs. This suggests more dynamic reprogramming of sRNA pathways in Zeleni Sauvignon compared to a more stable response in Refošk. Such observations in a shift in the profile of endogenous small RNAs from 24 nt being predominant in healthy plants to 21 nt in infected plants were reported in wheat [67] and Arabidopsis [68].
Beyond differences in small RNA size profiles, cultivar-specific miRNA responses likely reflect deeper variation in genetic background and antiviral defense strategies. Grapevine cultivars differ substantially in their regulation of immune-related pathways, including the transcription and processing of MIR genes, as well as the activity of Dicer-like proteins [33,35]. Such variation can influence both the magnitude and flexibility of miRNA-mediated regulation during viral infection. In this context, the relatively stable miRNA landscape observed in Refošk may indicate a more constitutive or tolerant regulatory strategy, whereas the broader reprogramming detected in Zeleni Sauvignon suggests a more dynamic response to viral coinfection. Together, these observations suggest that Refošk and Zeleni Sauvignon engage distinct miRNA-mediated regulatory strategies when responding to viral coinfection, which may contribute to cultivar-dependent differences in physiological responses. In grapevine, GPGV infection can result in variable symptom expression, with many infections remaining latent [52]. In Slovenia, the disease seems to be spreading in the Primorska region, where it is causing considerable economic losses, but it was also found in other regions of Slovenia, and among the most affected cultivars was Zeleni Sauvignon, while there are no reports about symptoms for cultivar Refošk [69], indicating cultivar-dependent responses to infection. This is consistent with reports that GPGV symptom development depends on both viral variant and host genotype [52,70], and it complements our observation of distinct miRNA regulatory patterns between Refošk and Zeleni Sauvignon under viral coinfection. Several of the differentially expressed miRNAs identified here are known regulators of processes that are central to plant–virus interactions, supporting the biological relevance of the observed expression patterns. For example, miR399 is involved in phosphate homeostasis and nutrient allocation, which are processes that are frequently altered during viral infections [71,72]. miR398 regulates Cu/Zn superoxide dismutase expression and plays an important role in oxidative stress response, which are closely linked to antiviral defence [73,74]. Likewise, miR156 and miR319 are well-known regulators of plant development that also participate in stress- and defense-associated networks [75,76,77,78]. Members of the miR482 family are widely recognized for targeting NLR-type immune receptors, suggesting a connection between their differential expression and modulation of antiviral immunity during viral coinfection [79,80].
In the context of these cultivar-dependent responses, it is important to consider that the grapevine plants in our study were naturally infected with a mixed infection of viruses GPGV, GRSPaV and GRVFV. Among these, GPGV causes the most severe symptoms and economic losses. To date, the effect of GPGV infection on miRNA expression has not been analyzed. Although studying single GPGV infection could provide a better understanding of how this virus influences miRNA regulation and plant–virus interactions at the sRNA level, it is possible that the impact of single and mixed infections might not differ substantially. Similar observations were reported by Islam et al. [81], where no major differences in miRNA expression between single and mixed infections of tomato spotted wilt virus (TSWV) and impatiens necrotic spot virus (INSV) in Nicotiana benthamiana were found. However, it is important to mention that these two viruses are closely related, whereas the viruses in our study are taxonomically distinct, which may lead to more complex interactions. Further studies focusing on single infections could therefore provide valuable insights into the specific effect of each virus on miRNA regulation in grapevine.
5. Conclusions
This study demonstrates that miRNA identification and differential expression analysis in grapevine are strongly influenced by the choice of bioinformatic prediction tool. By comparing three widely used miRNA prediction pipelines—miRador, ShortStack, and miRDeep2—we showed that substantial differences exist not only in the number of detected miRNAs but also in the sets of miRNAs identified as differentially expressed in response to viral coinfection. These discrepancies directly affected downstream biological interpretation and limited the overlap of results obtained from individual analyses.
To address this variability, we implemented an integrated analytical framework that combined the outputs of all three tools while accounting for tool-specific biases. Although this approach reduced statistical power under stringent multiple-testing correction, it increased confidence in the detected signals by prioritizing miRNAs supported by more than one analytical strategy. Importantly, most miRNAs identified as differentially expressed in the integrated analysis were also detected in at least one individual dataset, and their direction of regulation was consistent across approaches. Experimental validation using stem-loop RT-qPCR further supported the expression trends of several miRNAs, underscoring the value of combining computational and experimental methodologies.
From a biological perspective, our results provide insight into miRNA-mediated regulation in grapevine under mixed-virus infection, revealing both cultivar-specific and shared responses. The limited overlap of differentially expressed miRNAs between cultivars highlights the complexity of miRNA regulation in perennial crops and suggests that host genetic background plays a major role in shaping miRNA responses to viral stress. Overall, this work emphasizes that miRNA profiling based on a single prediction tool may provide an incomplete or biased view of miRNA expression dynamics. Integrated bioinformatic analyses, together with targeted experimental validation, represent a more robust strategy for characterizing miRNA regulation in complex biological systems such as virus-infected grapevine. These findings are relevant not only for grapevine virology but also for broader applications of small RNA sequencing in plant stress biology.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Liu H. Yu H. Tang G. Huang T. Small but Powerful: Function of Micro RN As in Plant Development Plant Cell Rep.20183751552810.1007/s 00299-017-2246-529318384 · doi ↗ · pubmed ↗
- 2Wang C. Shangguan L. Kibet K.N. Wang X. Han J. Song C. Fang J. Characterization of Micrornas Identified in a Table Grapevine Cultivar with Validation of Computationally Predicted Grapevine Mi RN As by Mi R-RACEP Lo S ONE 20116 e 2125910.1371/journal.pone.002125921829435 PMC 3145640 · doi ↗ · pubmed ↗
- 3Wang J. Mei J. Ren G. Plant Micro RN As: Biogenesis, Homeostasis, and Degradation Front. Plant Sci.20191036010.3389/fpls.2019.0036030972093 PMC 6445950 · doi ↗ · pubmed ↗
- 4Xu Y. Chen X. Micro RNA Biogenesis and Stabilization in Plants Fundam. Res.2023370771710.1016/j.fmre.2023.02.02338933298 PMC 11197542 · doi ↗ · pubmed ↗
- 5Zhou L. Liu Y. Liu Z. Kong D. Duan M. Luo L. Genome-Wide Identification and Analysis of Drought-Responsive Micro RN As in Oryza Sativa J. Exp. Bot.2010614157416810.1093/jxb/erq 23720729483 · doi ↗ · pubmed ↗
- 6Liu H.H. Tian X. Li Y.J. Wu C.A. Zheng C.C. Microarray-Based Analysis of Stress-Regulated Micro RN As in Arabidopsis Thaliana RNA 20081483684310.1261/rna.89530818356539 PMC 2327369 · doi ↗ · pubmed ↗
- 7Kantar M. Lucas S.J. Budak H. Mi RNA Expression Patterns of Triticum Dicoccoides in Response to Shock Drought Stress Planta 201123347148410.1007/s 00425-010-1309-421069383 · doi ↗ · pubmed ↗
- 8Wang B. Sun Y.F. Song N. Wei J.P. Wang X.J. Feng H. Yin Z.Y. Kang Z.S. Micro RN As Involving in Cold, Wounding and Salt Stresses in Triticum aestivum L.Plant Physiol. Biochem.201480909610.1016/j.plaphy.2014.03.02024735552 · doi ↗ · pubmed ↗
