Small RNA sequencing analysis identified miR159a as a novel candidate for activity in plant-derived nanovesicles from limon, hassaku, and sudachi
Hideki Takakura, Shingo Miyamoto, Tetsushi Yamamoto, Toshimasa Nakao, Atsushi Taga, Michihiro Mutoh, Keisuke Oda

TL;DR
This study identifies miR159a as the most abundant microRNA in nanovesicles from three citrus plants, suggesting it may play a key role in their biological effects.
Contribution
The study reports the first identification of miR159a as the most abundant miRNA in citrus-derived nanovesicles.
Findings
A total of 158 miRNAs were identified in citrus-derived nanovesicles, including 109 known and 49 novel miRNAs.
miR159a was the most abundantly expressed miRNA across all citrus species tested.
Abstract
Owing to their safety, plant-derived nanovesicles (PDNVs), particularly those derived from edible plants, are expected to be natural delivery carriers for microRNAs (miRNAs). Among PDNVs, Citrus limon-derived extracellular vesicle-like nanovesicles have been shown to exert antitumor effects. We previously reported that PDNVs derived from C. limon L. exert selective inhibitory effects on the proliferation of p53-inactivated colon cancer cells. However, miRNAs in C. limon L.-derived nanovesicles have not yet been identified, and it remains unclear which miRNAs are contained in these PDNVs. We herein extracted PDNVs from C. limon L., C. hassaku, and C. sudachi and identified and compared the miRNAs present within them via next-generation sequencing. A total of 158 miRNAs were identified, of which 109 were known miRNAs and 49 were novel miRNAs. Comparisons of miRNAs expressed by C. limon…
the Scholarship Fund for Young/Women Researchers from The Promotion and Mutual Aid Corporation for Private Schools of Japan
Taxonomy
Topics: Extracellular vesicles in disease · MicroRNA in disease regulation · RNA Interference and Gene Delivery
Introduction
Extracellular vesicles (EVs) are stable carriers of microRNAs (miRNAs) as bioactive cargo for intercellular communication. EVs are broadly defined as membrane-bound particles released from cells that cannot replicate and are distinguished from free macromolecules; consensus definitions and recommended characterization criteria are provided by the MISEV2023 guidelines^1^. In 2007, EVs were shown to encapsulate miRNAs^2^. Furthermore, in 2010, miRNAs were found to move between cells via EVs and have functions for recipient cells^3^. A recent study reported the secretion of extracellular vesicle-like nanovesicles from plant cells through the cell wall and, thus, they have been attracting increasing attention as carriers of interspecies communication^4^. The interspecies functions of plant miRNAs have been demonstrated, and plant-derived miRNA sequences have been detected in mammalian blood^5^. Plant-derived nanovesicles (PDNVs) were shown to be stable in the intestines, and diet is considered to be a route for the uptake of plant miRNAs^6^. However, limited information is currently available on the miRNAs present in PDNVs, and the effects of plant-derived miRNAs on humans remain largely unknown.
miRNAs are short noncoding RNAs of approximately 21 to 25 bases, and are a representative group of small RNAs that play a role in the endogenous RNA silencing mechanism^7,8^. miRNAs are involved in the regulation of a number of cellular processes, including proliferation, differentiation, immunity, and metabolism^7^. In recent years, plant miRNAs have been detected in mammalian blood and are suspected to be related to diseases. Western donor serum was found to contain plant miR159a, and its abundance in serum extracellular vesicles inversely correlated with the incidence and progression of breast cancer in patients^9^. Furthermore, miR159a was shown to inhibit the proliferation of breast and colorectal cancer cells by targeting the TCF7 gene and suppressing the expression of the oncogene MYC, which is downstream of the Wnt signaling pathway^9,10^. In addition, miR159a was found to exert anti-inflammatory effects by targeting TNF-α receptors^11^ and functioned as a regulator of cellular cholesterol efflux in vitro^12^. However, it currently remains unknown whether miR159a is present in PDNVs derived from citrus fruits, including lemons.
PDNVs, particularly those derived from edible plants, are expected to be natural delivery carriers for miRNAs because of their safety. Previous studies have examined the effects of PDNVs on intestinal bacteria^13^, the intestinal epithelium^6^, and immune cells^14^. miRNAs in PDNVs have been shown to exert antitumor^15^, anti-inflammatory^11^, and immunostimulatory effects^16^ and are being investigated for medical applications and as previously unknown plant bioactive substances. Among PDNVs, Citrus limon L.-derived nanovesicles reportedly exert antitumor effects^17^, LDL receptor-reducing effects^18,19^, and antioxidant effects^20,21^, increase the tolerance of lactobacilli to bile^22^, inhibit Clostridioides difficile infections^23^, and have been used to treat Alzheimer’s disease^24^.
We previously demonstrated that PDNVs derived from C. limon L. were taken up by p53-inactivated colon cancer cells via the macropinocytosis pathway and selectively inhibited cell proliferation^25^.
The active component for these effects may be bioactive miRNAs within PDNVs, such as miR159a; however, the miRNAs present in PDNVs derived from citrus fruits, such as C. limon L., have yet to be identified. The selection of C. limon L., C. hassaku, and C. sudachi was based on their phylogenetic proximity and regional cultivation patterns. This enabled a comparative analysis of PDNV miRNA expression profiles aimed at distinguishing species-specific miRNA signatures from those conserved across citrus species. Therefore, we herein extracted PDNVs from C. limon L., C. hassaku, and C. sudachi and identified and compared the miRNAs present within them via next-generation sequencing.
Results
Isolation and identification of nanovesicles from C. limon L., C. hassaku, and C. sudachi
We cut C. limon L., C. hassaku, and C. sudachi into pieces and made juice from the edible parts (Fig. 1A). Nanovesicles were isolated from C. limon L., C. hassaku, and C. sudachi juices via ultracentrifugation. The size distribution and concentration of nanovesicles determined via nanoparticle tracking analysis (NTA) are shown in Fig. 1B. The results revealed that the diameters of citrus fruit-derived nanovesicles (CDNVs) ranged between 100 and 200 nm.
Fig. 1. (A) Photographs of C. limon L., C. hassaku, and C. sudachi from which juice was extracted. A 1-cm scale bar is shown. (B) Size distribution of nanovesicles. Size distributions were analyzed by nanoparticle tracking analysis. (C) RNA in the nanoparticles analyzed using the Agilent 2100 Bioanalyzer. The molecular weight marker is indicated at 25 nucleotides (nts). A trace showing the amount of RNA versus size in nts is shown.
CDNVs were extracted from fruit juice at the same concentration ratio (80-fold). We then measured the protein and RNA concentrations in the nanovesicles. The nanoparticle concentration was the highest in C. limon L. and the lowest in C. sudachi (Table 1). The protein quantification results were not consistent with the nanoparticle concentrations. In contrast, the total RNA concentrations were in accordance with the nanoparticle concentrations (Supplementary Fig. 1). We confirmed the presence of detectable RNA from all CDNVs via a bioanalyzer (Fig. 1C). On the basis of these results, we confirmed that the three types of CDNVs obtained had similar particle sizes to those of typical extracellular vesicles derived from mammals and retained detectable RNA.
Table 1. Citrus-derived nanovesicle concentrations and components.
Analysis of sequences from three libraries
We generated a library for each of the three CDNVs to investigate the miRNAs they contained. After sequencing via Illumina NovaSeq 6000 with single-end 50 bp (SE50) reads, we removed low-quality reads and adapter-contaminated sequences. High-throughput sequencing generated 12,348,390 reads (96.09%) for C. limon L., 12,013,778 reads (98.00%) for C. hassaku, and 10,665,932 reads (97.80%) for C. sudachi. To control the quality of the libraries, we examined the error rate and GC content of the sequencing results, which are shown in Supplementary Tables 1 and 2. The error rate distribution was also established (Supplementary Fig. 2).
We also analyzed the number and distribution of small RNA (18–30 nucleotides (nt)) reads. The number of small RNA reads for each sample was 4,926,724 for C. limon L., 7,756,137 for C. hassaku, and 4,208,646 for C. sudachi. The read length distributions of the three CDNVs revealed that the majority of the reads were 19–24 nt in size (Fig. 2), with additional peaks at 21 or 24 nt.
Fig. 2. Distribution ratios of small RNA lengths in the three libraries.
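A length distribution of this kind can be tabulated directly from the read sequences. The following is a minimal sketch, assuming reads are available as plain strings; the function name `length_distribution` and the toy reads are hypothetical and are not data from this study.

```python
from collections import Counter


def length_distribution(reads, min_len=18, max_len=30):
    """Return {length: fraction of reads} for reads within [min_len, max_len]."""
    lengths = [len(r) for r in reads if min_len <= len(r) <= max_len]
    counts = Counter(lengths)
    total = len(lengths)
    if total == 0:
        return {}
    return {n: counts[n] / total for n in sorted(counts)}


# Hypothetical toy reads (not data from this study): five 21-nt reads,
# three 24-nt reads, two 22-nt reads, and one 12-nt read that is filtered out.
toy_reads = ["A" * 21] * 5 + ["C" * 24] * 3 + ["G" * 22] * 2 + ["T" * 12]
dist = length_distribution(toy_reads)
peak = max(dist, key=dist.get)  # length with the largest share of reads
print(dist, peak)
```

In this toy input the 21-nt class holds the largest share, which mirrors the kind of plant-typical peak described in the text.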
Mapping of small RNAs to a reference genome and analysis
Since sequence information for C. limon L., C. hassaku, and C. sudachi is not publicly available, small RNAs were mapped to the sequence information registered in miRBase 22 and the Kyoto Encyclopedia of Genes and Genomes (KEGG) for the closely related species C. sinensis (NCBI taxonomy ID: 2711). The mapping rate of small RNAs in all the libraries exceeded 80%, indicating that the sequencing data were of sufficient quality for downstream analyses. The mapped small RNAs were then unambiguously assigned in the following order of priority: known miRNA > rRNA > tRNA > snRNA > snoRNA > repeats > genes > NAT-siRNA > ta-siRNA > novel miRNA. In general, an rRNA fraction of less than 60% in plant samples and less than 40% in animal samples indicates high quality. For all three libraries, the rRNA ratio was < 60%, indicating that high-quality RNA was recovered and analyzed. These data are summarized in Table 2.
Table 2. Numbers of reads for each small RNA classification identified.
Known miRNAs
Matched small RNAs were initially checked against known miRNAs. A total of 14,376,407 reads were measured for known miRNAs (across all the libraries), with 11,308 reads (0.23%) for C. limon L., 15,490 reads (0.20%) for C. hassaku, and 6,668 reads (0.16%) for C. sudachi. A total of 109 known miRNA sequences were identified, of which 79 were identified from C. limon L., 91 from C. hassaku, and 72 from C. sudachi. In all the libraries, miR159a had the highest number of reads, and the same result was obtained when read counts were converted to transcripts per million (TPM) (Fig. 3A, Supplementary Fig. 3A).
Fig. 3. Distribution of member numbers and read count of each known miRNA (A) and novel miRNA (B) in the three libraries.
Novel miRNAs
A total of 8,420 reads were measured for novel miRNAs (across all the libraries), with 3,375 reads (0.07%) for C. limon L., 2,905 reads (0.04%) for C. hassaku, and 2,140 reads (0.05%) for C. sudachi. A total of 49 novel miRNA sequences were identified, of which 39 were identified from C. limon L., 42 from C. hassaku, and 25 from C. sudachi (Fig. 3B, Supplementary Fig. 3B).
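The percentages quoted for known and novel miRNAs can be checked with simple arithmetic. The sketch below assumes that each percentage is the miRNA read count divided by the total number of small RNA reads in the same library; the text does not state the denominator, but the reported values are reproduced under this assumption. The function name `percent_of_small_rna` is hypothetical.

```python
# Read counts taken from the text; the choice of denominator is an assumption.
small_rna_reads = {"C. limon L.": 4_926_724, "C. hassaku": 7_756_137, "C. sudachi": 4_208_646}
known_mirna_reads = {"C. limon L.": 11_308, "C. hassaku": 15_490, "C. sudachi": 6_668}
novel_mirna_reads = {"C. limon L.": 3_375, "C. hassaku": 2_905, "C. sudachi": 2_140}


def percent_of_small_rna(counts, totals):
    """Percentage of each library's small RNA reads assigned to a category."""
    return {lib: round(100 * counts[lib] / totals[lib], 2) for lib in counts}


known_pct = percent_of_small_rna(known_mirna_reads, small_rna_reads)
novel_pct = percent_of_small_rna(novel_mirna_reads, small_rna_reads)
novel_total = sum(novel_mirna_reads.values())  # total novel miRNA reads across libraries

print(known_pct)
print(novel_pct)
print(novel_total)
```

Under this reading, miRNAs make up well under 1% of the small RNA reads in every library.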
In this study, we analyzed miRNAs (known and novel miRNAs) and provided detailed information for future studies, including nucleotide sequence information of rRNA, tRNA, snRNA, snoRNA, repeats, genes, NAT-siRNA, and ta-siRNA as well as the structures of precursor miRNAs (Additional Information Files).
Comparative analysis of three types of libraries containing known and novel miRNAs
The expression of known and novel miRNAs in each library was compared and is shown in Fig. 4. We identified 158 miRNAs in the three CDNVs, comprising 109 known and 49 novel miRNAs. Of these, 77 miRNAs were expressed by all three CDNVs, of which 58 were classified as known miRNAs and 19 as novel miRNAs. Pairwise comparisons between species revealed distinct miRNA expression patterns (Supplementary Fig. 4).
Fig. 4. Venn diagram of known and novel miRNAs identified in nanovesicles derived from three types of citrus fruits.
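The region sizes of a three-set Venn diagram follow from plain set operations. The following is a minimal sketch with hypothetical placeholder identifiers (`mirA`, `mirB`, and so on); it illustrates the bookkeeping only and does not reproduce the membership reported in this study.

```python
def venn3(a, b, c):
    """Sizes of the seven regions of a three-set Venn diagram."""
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "a_b": len((a & b) - c),
        "a_c": len((a & c) - b),
        "b_c": len((b & c) - a),
        "a_b_c": len(a & b & c),
    }


# Hypothetical placeholder miRNA identifiers, not the study's lists.
limon = {"mirA", "mirB", "mirC", "mirD"}
hassaku = {"mirA", "mirB", "mirD", "mirE"}
sudachi = {"mirA", "mirB", "mirF"}

regions = venn3(limon, hassaku, sudachi)
union_size = len(limon | hassaku | sudachi)
print(regions, union_size)
```

The seven regions always sum to the size of the union, which gives a quick consistency check on totals such as the number of miRNAs shared by all three libraries.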
We compared the expression levels of known and novel microRNAs in the three libraries and listed the top 10 (Table 3). In all the libraries, miR159a-3p was the most highly expressed miRNA, with more than twice the expression of the second most highly expressed miRNA. The top expressed microRNAs were common among all the libraries. The structural features of the miRNAs were visualized by predicting the secondary structure of miR159a-3p (Fig. 5).
Table 3. Number of reads and sequences of the top 10 known or novel microRNAs with the highest read counts identified in the three libraries.
Fig. 5. The predicted hairpin structures. The figures illustrate the predicted hairpin structures of csi-MIR159a, and the red parts reflect mature sequences.
We created a heatmap of the miRNA sets based on log2-transformed TPM values, sorting them by the average expression level of all expressed microRNAs and the 77 commonly expressed microRNAs (Supplementary Fig. 5). This heatmap visualization showed that the three species share considerable similarities in their miRNA expression profiles, while also exhibiting distinct expression patterns between species.
Pairwise comparisons between species (Limon vs. Hassaku, Limon vs. Sudachi, and Hassaku vs. Sudachi) were performed on the known and novel miRNA expression levels in the three libraries. Spearman’s correlation ρ values were 0.60–0.65 for all pairs; the correlations were similar to one another, and no pair showed a near-identical expression profile (Supplementary Table 3).
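Spearman’s ρ is the Pearson correlation of the rank-transformed values, with tied values given their average rank. The following is a minimal standard-library sketch; the expression vectors are hypothetical toy values, and the source does not state which software was used for this step.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy expression values for five miRNAs in two libraries.
lib_a = [10, 20, 30, 40, 50]
lib_b = [1, 3, 2, 5, 4]
rho = spearman(lib_a, lib_b)
print(round(rho, 3))
```

Because only ranks enter the calculation, the statistic is insensitive to the large dynamic range of read counts, which may be one reason a rank-based measure suits this kind of comparison.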
We focused on microRNAs showing large interspecies differences in expression and calculated log2TPM fold changes for each pairwise comparison. To ensure reproducibility, microRNAs with fewer than 100 reads were excluded, and the top 10 microRNAs with the largest log2TPM fold changes were extracted and are presented in Supplementary Table 4.
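The filtering and ranking step can be sketched as follows. Several details are assumptions that the text does not specify: TPM is taken as read count scaled per million miRNA reads in the library, a pseudocount of 1 is added before taking the ratio to avoid division by zero, and a miRNA is excluded when it has fewer than 100 reads in both libraries of a pair. The function names and toy counts are hypothetical.

```python
import math


def tpm(counts):
    """Scale read counts to counts per million miRNA reads (assumed definition)."""
    total = sum(counts.values())
    return {m: c * 1_000_000 / total for m, c in counts.items()}


def top_fold_changes(counts_a, counts_b, min_reads=100, top_n=10, pseudo=1.0):
    """Rank miRNAs by absolute log2 TPM fold change between two libraries."""
    tpm_a, tpm_b = tpm(counts_a), tpm(counts_b)
    rows = []
    for m in set(counts_a) | set(counts_b):
        # Assumed filter: drop miRNAs with fewer than min_reads in both libraries.
        if max(counts_a.get(m, 0), counts_b.get(m, 0)) < min_reads:
            continue
        ratio = (tpm_a.get(m, 0.0) + pseudo) / (tpm_b.get(m, 0.0) + pseudo)
        rows.append((m, math.log2(ratio)))
    rows.sort(key=lambda r: abs(r[1]), reverse=True)
    return rows[:top_n]


# Hypothetical toy counts, not data from this study.
counts_a = {"mirA": 800, "mirB": 150, "mirC": 45, "mirD": 5}
counts_b = {"mirA": 700, "mirB": 10, "mirC": 285, "mirD": 5}
ranked = top_fold_changes(counts_a, counts_b)
names = [name for name, _ in ranked]
print(ranked)
```

A positive value means higher relative expression in the first library, a negative value means higher relative expression in the second.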
The top 10 microRNAs exhibited marked interspecies variation in expression. Notably, novel_63 was highly expressed specifically in Hassaku, whereas members of the csi-miR167 and csi-miR482 families presented preferential expression in Limon. These microRNAs may be involved in species-specific physiological traits or stress responses and represent promising candidates for future functional studies.
The miRNA family analysis identified the same number of miRNA families in Glycine max (soybean) and Solanum tuberosum (potato) as in the reference genome (Supplementary Fig. 6). This analysis focuses on miRNA gene families: miRNAs belonging to the same family generally have similar regulatory functions, and analyzing these families (clusters) will help elucidate the cooperative regulatory mechanisms of multiple miRNAs. Although not all the identified miRNAs are listed in the database yet, the relationships between the above two plants and the three types of citrus fruits may become clear in the future.
Differences in the expression levels of known and novel miRNAs between the libraries, together with their target genes, were analyzed. These results may be important for clarifying the role of PDNVs in future plant research.
Nanovesicles derived from C. limon L., C. hassaku, and C. sudachi inhibit human colorectal cancer cell proliferation
We previously demonstrated that nanovesicles derived from C. limon L. suppress the proliferation of human colorectal cancer cells^25^, suggesting that citrus‑derived nanovesicles may contain components with conserved anti‑proliferative activity. Notably, several microRNAs were abundantly preserved across all three citrus species, raising the possibility that shared miRNAs may contribute to common biological effects. To further explore this possibility, we extended our analysis to nanovesicles isolated from C. hassaku and C. sudachi. This approach allowed us to examine whether the inhibitory effect observed in C. limon L. is a species‑specific property or a broader characteristic shared among CDNVs. Consistent with this hypothesis, nanovesicles from all three species inhibited colorectal cancer cell proliferation in a dose‑dependent manner, although the degree of inhibition varied among species (Fig. 6). The observation that all three CDNVs exhibited similar inhibitory activities supports the possibility that commonly preserved miRNAs may underlie this shared effect.
Fig. 6. Inhibitory effects of citrus-derived nanovesicles (CDNVs) from limon, hassaku, and sudachi on the proliferation of human colorectal cancer cells. SW480 cells were treated with the indicated concentrations of CDNVs for 96 h. The data are presented as the means ± standard errors; n = 3; *, P < 0.05, relative to the control. P values were calculated via one-way ANOVA followed by Dunnett’s post hoc test.
Discussion
The present results revealed the miRNAs present in PDNVs from three types of citrus fruits: C. limon L., C. hassaku, and C. sudachi. Previous studies reported that the diameter of PDNVs from C. limon L. was approximately 100 nm^21^. Furthermore, the majority of studies on PDNVs have shown that the diameter ranges between 100 and 200 nm. Since the number of CDNVs obtained in the present study was proportional to the amount of RNA, we initially speculated that RNA might represent a major component of these preparations. However, because we did not perform nuclease protection assays (RNase ± detergent) or membrane-permeabilization controls, we cannot determine whether the detected RNA is encapsulated within vesicles or is externally associated. Therefore, the conclusion that RNA is a major internal component cannot be rigorously supported. PDNVs may deliver plant-derived small RNAs to mammalian cells; however, given this limitation, any assumption regarding RNA stability or delivery mechanisms should be interpreted with caution. We previously imaged nanoscale vesicles from citrus-derived fractions via transmission electron microscopy and demonstrated that their diameters are approximately 100–200 nm^25^. However, definitive characterization of these fractions as extracellular vesicles (EVs) requires additional analyses, such as immunoblotting for established plant EV markers and proteomic profiling^26^. Because citrus species such as lemon are phylogenetically distant from model plants such as Arabidopsis, antibody cross-reactivity and the coverage of sequence databases are currently limited; therefore, the available evidence for EV identity remains constrained.
In addition, although our previous work confirmed the EV‑like morphology of citrus‑derived nanovesicles via transmission electron microscopy, we did not perform TEM analysis of the specific preparations used in the present study. Consequently, we cannot determine whether the nanovesicles analyzed here possess the same structural characteristics. Taken together, these limitations indicate that the vesicles characterized in this study should be interpreted as EV‑like nanovesicles rather than fully validated extracellular vesicles, and this point should be considered when interpreting their biological effects.
The length distribution of small RNAs differs between plants and animals: small RNAs often peak at 21 or 24 nt in plants and at 22 nt in animals. Since the distribution of small RNAs in all the libraries peaked at 21 or 24 nt, we conclude that plant-derived small RNAs were successfully recovered. The collation of mapped miRNAs revealed that the known miRNA miR159a was the most abundant in all three libraries. In addition, comparison of the miRNA expression profiles across the three libraries revealed 77 miRNAs that were commonly expressed, while each library also contained a set of uniquely expressed miRNAs. This result suggests a characteristic expression pattern common to the three types of libraries rather than contamination during CDNV extraction or next-generation sequencing. Among the identified microRNAs, we focused on miR159a because it presented the highest abundance in all three libraries. Previous studies have shown that miR159 family members regulate transcription factors involved in Wnt signaling, and miR159a in particular has been reported to directly target TCF7 in mammalian cells. Although we did not perform target prediction or functional validation analyses in the present study, these previously reported interactions provide a contextual background for considering how citrus-derived miRNAs, including miR159a, might be linked to biological processes in mammalian cells. Despite these observations, the overall abundance of miRNAs within the vesicle RNA population was extremely low, whereas rRNA fragments dominated the small RNA profiles. This imbalance introduces uncertainty regarding the extent to which miRNAs contribute to the functional properties of citrus-derived nanovesicles. Accordingly, the potential biological roles of these miRNAs should be interpreted with caution, and their low abundance must be considered a key limitation of the present study.
The miRNA information obtained in the present study provides a basis for discussing which components drive the physiological effects of CDNVs, including those from C. limon L., in mammals. We previously reported the antitumor effects of CDNVs from C. limon L. on colon cancer cells in which the Wnt pathway is known to be activated^25^. miRNAs encapsulated in PDNVs are thought to act on mammalian cells through multiple mechanisms^27^. Encapsulation in membrane vesicles protects small RNAs from degradation in the gastrointestinal tract and facilitates their uptake into cells by endocytosis or membrane fusion, enabling delivery to distant tissues. Furthermore, endocytic activity has been reported to be affected by food components, which may have a significant effect on the correlation between diet and the incidence of diseases such as colorectal cancer.
Considering that previous studies have reported inhibitory activity of miR159a on the Wnt pathway, it remains possible, although not demonstrated in the present study, that miR159a could be one of several factors contributing to these effects. However, we did not assess Wnt pathway activation or downstream targets in our experimental system; therefore, the involvement of miR159a-mediated Wnt inhibition remains unclear. Thus, the present study does not provide evidence of miR159a delivery to SW480 cells or modulation of Wnt pathway components, and no causal relationship can be inferred. We confirmed that CDNVs derived from C. limon L., C. hassaku, and C. sudachi each dose-dependently inhibited human CRC cell proliferation. However, the present study did not include nanovesicle-depleted supernatants or further purified nanovesicle fractions (e.g., density-gradient or size-exclusion chromatography); therefore, we cannot exclude the possibility that copurified soluble metabolites contributed to the observed suppression. Although blank controls consisting of CDNVs plus WST-8 reagent without cells did not show abnormal absorbance, indicating that direct chemical interference with the WST-8 reaction is unlikely, this does not rule out cytotoxic effects of non-vesicular components. Furthermore, our previous work quantified the citric acid concentration in C. limon L.-derived nanovesicle fractions and demonstrated that applying the same concentration of citric acid to cells did not affect proliferation, suggesting that acidic components alone are unlikely to account for the observed effects. Nevertheless, without CDNV-depleted and gradient-purified controls, the dose-dependent suppression cannot be attributed exclusively to extracellular vesicles. These findings are consistent with the possibility that miR159a, among other vesicle components, could contribute to the observed effects, although the present data do not establish a direct link to Wnt pathway regulation.
Wang et al.^28^ analyzed PDNVs from grapefruit, ginger, lemon, and grape and reported that only 15 miRNAs were consistently expressed across these species. They also demonstrated that the top 20 miRNAs accounted for the majority of the total miRNA abundance in each nanovesicle sample. In their lemon-derived EVs, miR3952 was identified as the most highly expressed miRNA, followed by miR159a. In contrast, our study, which focused on three citrus species (C. limon L., C. hassaku, and C. sudachi), revealed 77 commonly expressed miRNAs, with miR159a being the most abundant in all three. This discrepancy is likely attributable primarily to differences in the taxonomic background of the plant species analyzed. However, methodological variations, including nanovesicle isolation protocols, small RNA library preparation conditions, and miRNA annotation standards, may also contribute to the observed differences. Because miRNA abundance was derived from single libraries without biological replicates or spike-in normalization, the apparent predominance of miR159a may reflect technical biases rather than true biological abundance. Thus, miR159a should be interpreted as the most abundant miRNA within the analyzed datasets, not necessarily across different harvests or citrus cultivars.
On the other hand, in plant production, miR159a has been reported to exert an antidrought effect and to be upregulated after drought treatment^29^. Citrus plants are well suited to drought-prone regions, and this may be related to the finding that miR159a was the most highly expressed miRNA in CDNVs from C. limon L., C. hassaku, and C. sudachi.
The dynamics of miRNAs in PDNVs have not yet been clarified in detail. The present results revealed that miR159a was present in the highest amounts in CDNVs from C. limon L., C. hassaku, and C. sudachi; therefore, it may be used as an indicator to clarify the dynamics of miRNAs in PDNVs in the body after their oral ingestion and may also be established as a new nutritional component in the future.
Conclusion
The present results identified miRNAs in PDNVs from three types of citrus fruits. Among the three types of CDNVs, miR159a was the most highly expressed miRNA. This study helps clarify the physiological activities of CDNVs in mammals and identify active substances.
Materials and methods
Preparation of nanovesicles
C. limon L. was purchased from an organic farmer in Ehime, Japan. C. hassaku was a gift from a private farmer in Ehime, Japan. C. sudachi was purchased from an organic farmer in Tokushima, Japan. C. limon L., C. hassaku, and C. sudachi were gently squeezed manually. The obtained juice was centrifuged at 2000×g for 20 min, and the resulting supernatant was centrifuged at 10,000×g for 60 min. The supernatant was filtered through a filter with 0.22-µm pores and centrifuged at 100,000×g for 90 min in a fixed angle rotor P70AT (Eppendorf Himac Technologies Co., Japan). The pellet was suspended in PBS (-) and transferred to a 30% sucrose/D2O cushion. After centrifugation at 100,000×g for 180 min in a swinging bucket rotor SW 32.1 Ti (Beckman Coulter Inc., Brea, CA, USA), the nanovesicle-containing fraction was resuspended in PBS (-) and centrifuged twice at 100,000×g for 90 min. The pellet was collected and resuspended in PBS (-) for subsequent experiments.
NTA
NTA was performed as previously described^25^. The samples were diluted 100-fold with PBS (-) and subjected to NTA. NTA was conducted five times to obtain the average value and standard error (Fig. 1B). The measured value was multiplied by the dilution factor to calculate the concentration of nanoparticles in the sample (Table 1).
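The concentration calculation described above amounts to scaling each measurement by the dilution factor and summarizing the replicate runs. The following is a minimal sketch using the 100-fold dilution and five runs stated in the text; the measured values are hypothetical, and the function name `nta_summary` is not from the source.

```python
from statistics import mean, stdev


def nta_summary(measurements, dilution_factor=100):
    """Mean and standard error of the undiluted particle concentration."""
    scaled = [m * dilution_factor for m in measurements]
    sem = stdev(scaled) / len(scaled) ** 0.5  # sample SD divided by sqrt(n)
    return mean(scaled), sem


# Hypothetical readings from five NTA runs of the diluted sample (particles/mL).
runs = [1.0e9, 1.2e9, 0.9e9, 1.1e9, 1.3e9]
mean_conc, sem_conc = nta_summary(runs)
print(f"{mean_conc:.3e} +/- {sem_conc:.3e} particles/mL")
```

The standard error here uses the sample standard deviation (n - 1 in the denominator), which is the usual choice for a small number of replicate runs.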
Protein quantitation
Nanovesicles were dissolved in RIPA buffer (Nacalai Tesque, Japan), and protein quantitation was performed via the TAKARA BCA Protein Assay Kit (Takara, Japan) according to product information.
RNA quantification and qualification
Filgen (Nagoya, Japan) performed RNA isolation and quantitation. Small RNAs were extracted from nanovesicles using the mirVana™ miRNA Isolation Kit with phenol (Thermo Scientific, Waltham, MA) and Plant RNA Isolation Aid (Thermo Scientific, Waltham, MA). RNA quantitation was performed by NanoDrop™ One (Thermo Fisher Scientific). RNA integrity was assessed using the Agilent RNA 6000 Nano Kit (Agilent Technologies, Inc., cat. # 5067-1151) with the Agilent Bioanalyzer 2100 system (Agilent Technologies, CA, USA).
Library Preparation and small RNA sequencing
Novogene (Beijing, China) performed library preparation and sequencing analyses using 10 ng of RNA per sample. Sequencing libraries were generated using the Small RNA Lib Prep Kit for Illumina (ABclonal, RK20305) following the manufacturer’s recommendations, and index codes were added to attribute sequences to each sample. Briefly, the 3’ SR adaptor, 5’ end adaptor, and M-MuLV Reverse Transcriptase (RNase H–) were used to synthesize first-strand cDNA. PCR amplification was performed using LongAmp Taq 2× Master Mix, the SR Primer for Illumina, and the index (X) primer. The PCR products were purified on an 8% polyacrylamide gel (100 V, 80 min). Recovered DNA fragments were dissolved in 15 µL of elution buffer, and library quality was checked on the Agilent Bioanalyzer 2100 system via DNA high-sensitivity chips.
Small RNA sequencing was performed via the Illumina NovaSeq 6000 platform with a single-end 50 bp (SE50) read configuration. This platform utilizes sequencing-by-synthesis (SBS) chemistry, which combines bridge amplification and reversible terminator-based fluorescence detection to achieve high-throughput and accurate base calling. The SE50 configuration was selected to match the insert size of the small RNA library (18–45 bp), ensuring optimal coverage and resolution for miRNA profiling.
Data analysis
Raw data
The original fluorescence image files from the Illumina platform were converted into short reads (raw data) via base calling, recorded in the FASTQ format^30^, which includes sequence information and corresponding sequence quality information, and processed through fastp software (v0.23.1). Using this software, clean data (clean reads) were obtained by removing reads containing the adaptor, reads containing poly-N, and low-quality reads from the raw data.
Data quality control
Fastp^31^ (v0.23.1) was used to perform basic statistics on the quality of the raw reads. Because the libraries were sequenced in single-end mode, a read was discarded if it contained adapter contamination, if more than 10% of its bases were uncertain, or if more than 50% of its bases were of low quality (Phred quality < 5).
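The two numeric thresholds above can be expressed as a per-read check. This is an illustrative sketch rather than the fastp implementation: it is applied to a single read because the libraries here are single-end (SE50), it assumes Phred+33 quality encoding, and it omits adapter detection. The function name `passes_qc` and the toy reads are hypothetical.

```python
def passes_qc(seq, qual, max_n_frac=0.10, low_q=5, max_low_frac=0.50, offset=33):
    """Return True if a read passes the uncertain-base and low-quality filters.

    seq  -- base calls, with 'N' marking an uncertain base
    qual -- FASTQ quality string (Phred+33 encoding is assumed)
    """
    if not seq or len(seq) != len(qual):
        return False
    n_frac = seq.upper().count("N") / len(seq)
    low_frac = sum(1 for ch in qual if ord(ch) - offset < low_q) / len(qual)
    # Discard when more than 10% of bases are N or more than 50% are low quality.
    return n_frac <= max_n_frac and low_frac <= max_low_frac


# Hypothetical toy reads: 'I' encodes Phred 40 and '#' encodes Phred 2.
good = passes_qc("ACGTACGTAC", "IIIIIIIIII")
too_many_n = passes_qc("ANNTACGTAC", "IIIIIIIIII")   # 20% N
low_quality = passes_qc("ACGTACGTAC", "######IIII")  # 60% of bases below Phred 5
print(good, too_many_n, low_quality)
```

A read with exactly 10% uncertain bases is kept under this sketch, since the stated criterion is "more than 10%".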
Mapping to the reference genome
Small RNA tags were mapped to the reference sequence via Bowtie^32^ software. In the mapping step of our analysis pipeline, we used Bowtie version 1.0.1 to align small RNA reads to the reference genome. The alignment was performed via the following command-line parameters: bowtie -v 1 -k 1.
Analysis of known miRNA
Reads that matched reference sequences were compared with sequences in the specified range in miRBase, and small RNA information, such as the miRNA secondary structure, miRNA sequence, length, and frequency of occurrence, was obtained with miRDeep2^33^. We used miRDeep2 v0.0.5 and performed alignment to reference miRNA sequences with the mapper.pl script using the parameters -e -h -m -s reads.fa -t reads.arf -p genome.fa.
NcRNA analysis & repeat sequences alignment
We annotated small RNAs against the ncRNA sequences of the species, or against selected rRNAs, tRNAs, snRNAs, and snoRNAs from RFAM^34^, to identify and remove potential rRNAs, tRNAs, snRNAs, and snoRNAs.
Exon and intron alignment
To exclude small RNA sequences that were degradation fragments of mRNA, we used Bowtie^32^ to compare the reads with known exon and intron sequences, after which sequence information from the comparisons was counted.
Plant NAT-siRNA
Because these species were not present in the database, we adopted the analysis methods of PlantNATsDB^35^ for de novo prediction and detection of NAT-siRNAs.
Plant ta-siRNA
TAS gene analysis was performed using known TAS genes from rice and Arabidopsis as the database for homology comparisons. In addition, TAS genes were predicted via the ta-siRNA identification software UEA sRNA tools^36^.
Novel miRNA prediction
To predict new miRNAs, we used miREvo^37^ and miRDeep2^33^ to extract small RNA sequences of a specific length and aligned them with the reference genome. The sequences were analyzed for secondary structures, Dicer cleavage site information, energy characteristics, and other features to identify new miRNAs. We used miREvo version 1.1 with the parameters -i -r -M -m -k -p 10 -g 50000 and integrated the miREvo and miRDeep2 results for novel miRNA prediction.
Small RNA annotation
A single small RNA may match multiple different annotation categories. Therefore, to ensure that each unique small RNA was assigned to only one annotation, we followed this priority order: known miRNA > rRNA > tRNA > snRNA > snoRNA > repeats > genes > NAT-siRNA > ta-siRNA > novel miRNA.
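The priority rule can be sketched as a lookup that keeps only the highest-ranked category among those a small RNA matched. The function name and the example inputs are hypothetical. The category order is taken from the text.

```python
# Annotation categories in descending priority, as listed in the text.
PRIORITY = [
    "known miRNA", "rRNA", "tRNA", "snRNA", "snoRNA",
    "repeats", "genes", "NAT-siRNA", "ta-siRNA", "novel miRNA",
]
RANK = {category: rank for rank, category in enumerate(PRIORITY)}


def assign_annotation(matches):
    """Return the single highest-priority category among the matches.

    Returns None when the small RNA matched no known category.
    """
    candidates = [m for m in matches if m in RANK]
    if not candidates:
        return None
    return min(candidates, key=RANK.__getitem__)


if __name__ == "__main__":
    # Hypothetical small RNA that matched three categories.
    print(assign_annotation({"novel miRNA", "tRNA", "repeats"}))
```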
miRNA base edit
To identify potentially mutated miRNAs, small RNA sequences from each sample were aligned with the known and novel mature miRNAs detected, as well as with their precursors. miRNAs may undergo nucleotide editing at specific positions, leading to changes in the seed sequence and therefore in the target genes^38^.
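A minimal sketch of how such edits can be located is given below. It compares an aligned read with a mature miRNA of equal length and reports the 1-based mismatch positions, flagging those in the seed. The seed window (positions 2 to 8) is an assumption for illustration and is not specified in the source. The sequences are hypothetical.

```python
# Assumed seed window: positions 2-8 of the mature miRNA (1-based).
SEED_POSITIONS = range(2, 9)


def mismatch_positions(read, mature):
    """Return 1-based positions where the read differs from the mature miRNA."""
    if len(read) != len(mature):
        raise ValueError("read and mature miRNA must have the same length")
    pairs = zip(read.upper(), mature.upper())
    return [i + 1 for i, (a, b) in enumerate(pairs) if a != b]


def seed_edits(read, mature):
    """Return the mismatch positions that fall inside the assumed seed window."""
    return [p for p in mismatch_positions(read, mature) if p in SEED_POSITIONS]


if __name__ == "__main__":
    mature = "ACGUACGUAC"  # hypothetical mature sequence
    read = "ACCUACGUAG"    # hypothetical read differing at positions 3 and 10
    print(mismatch_positions(read, mature))
    print(seed_edits(read, mature))
```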
miRNA family analysis
The family origin of known miRNAs was examined using miFam.dat. Novel miRNAs were classified using RFAM to identify their RFAM family.
miRNA expression and differential expression
TPM^39^ was used to normalize the expression levels of known and novel miRNAs in each sample: TPM = (read count × 1,000,000) / libsize, where libsize is the total miRNA read count. For samples with biological replicates, we used DESeq2^40^ for differential expression analysis between two comparison groups. DESeq2 detects differential expression in digital gene expression data via a model based on the negative binomial distribution. For samples without biological replicates, we used the edgeR^41^ TMM algorithm to normalize the read count data.
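The TPM normalization can be written directly from the formula above. The miRNA names and read counts in the example are hypothetical and are not data from this study.

```python
def tpm(read_counts):
    """Normalize miRNA read counts to transcripts per million (TPM).

    TPM = (read count x 1,000,000) / libsize, where libsize is the
    total miRNA read count of the sample.
    """
    libsize = sum(read_counts.values())
    if libsize == 0:
        raise ValueError("total miRNA read count is zero")
    return {name: count * 1_000_000 / libsize
            for name, count in read_counts.items()}


if __name__ == "__main__":
    # Hypothetical counts for one sample.
    sample = {"miRNA_A": 750, "miRNA_B": 200, "miRNA_C": 50}
    for name, value in tpm(sample).items():
        print(name, value)
```

Because every value is divided by the same library size, the TPM values of a sample always sum to one million.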
Differential expression was assessed via DESeq2 (v1.24.0) for experiments with biological replicates (padj ≤ 0.05), while experiments without replicates were analyzed via edgeR (v3.24.3), applying padj ≤ 0.05 and |log2 fold change| ≥ 1.
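The two significance rules can be expressed as a single filter applied to a results table exported from DESeq2 or edgeR. This sketch covers only the thresholding step, not the statistical testing, and the function name and arguments are hypothetical.

```python
import math


def is_differentially_expressed(padj, log2_fold_change, has_replicates):
    """Apply the thresholds stated in the text to one miRNA.

    With biological replicates (DESeq2):   padj <= 0.05
    Without biological replicates (edgeR): padj <= 0.05 and |log2FC| >= 1
    """
    if padj is None or math.isnan(padj):
        return False  # no adjusted p-value was reported for this miRNA
    if has_replicates:
        return padj <= 0.05
    return padj <= 0.05 and abs(log2_fold_change) >= 1


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(is_differentially_expressed(0.01, 0.4, has_replicates=True))
    print(is_differentially_expressed(0.01, 0.4, has_replicates=False))
    print(is_differentially_expressed(0.01, -1.6, has_replicates=False))
```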
Target gene predictions for known and novel miRNA
psRobot^42^ was used to predict the target genes of known and novel miRNAs and thereby obtain the relationships between miRNAs and their target genes. We used psRobot version 1.2 and performed plant target prediction with the parameters -s -t -o -p 1.
Cell culture and cell proliferation assay
The human colorectal cancer cell line SW480 was purchased from the American Type Culture Collection. The cells were cultured in Roswell Park Memorial Institute 1640 medium supplemented with 10% exosome-depleted fetal bovine serum (System Bioscience, Palo Alto, CA, USA). All the cells were cultured at 37 °C in an atmosphere containing 5% CO_2_. SW480 cells were plated at a density of 2 × 10^3^ cells per well in 96-well plates. After preincubation for 24 h, the cells were treated with CDEVs for 96 h. After treatment, the cells were incubated with WST-8 reagent (Cell Counting Kit-8; Dojindo Laboratories, Kumamoto, Japan) for 4 h at 37 °C, and the optical density of the culture mixture was measured at 450 nm via an ELISA plate reader.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary Figures and Tables
Supplementary Information 1
Supplementary Information 2
Supplementary Information 3
Supplementary Information 4
Supplementary Information 5
Supplementary Information 6
References
- 1. Lei, C. et al. Lemon Exosome-like nanoparticles-manipulated probiotics protect mice from C. diff Infection. iScience 23, 101571. 10.1016/j.isci.2020.101571 (2020).
- 2. Takakura, H. et al. Citrus limon L.-Derived nanovesicles show an inhibitory effect on cell growth in p53-Inactivated colorectal cancer cells via the macropinocytosis pathway. Biomedicines 10, 10.3390/biomedicines10061352 (2022).
- 3. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, 10.1186/gb-2009-10-3-r25 (2009).
