Distinct Proteomic and Glycosylation Signatures Differentiate A549 Tumor and BEAS-2B Nontumor Cell Line–Derived Small Extracellular Vesicles
Mirjam Balbisi, Tamás Langó, Virág Nikolett Horváth, Domonkos Pál, Gitta Schlosser, Gábor Kecskeméti, Zoltán Szabó, Kinga Ilyés, Nikolett Nagy, Otília Tóth, Jing Zheng, Guinevere S.M. Lageveen-Kammeijer, Tamás Visnovitz, Zoltán Varga, Beáta G. Vértessy, Lilla Turiák

TL;DR
The study compares protein and sugar patterns in small vesicles from cancerous and non-cancerous cells to reveal distinct molecular signatures.
Contribution
The work introduces glycosylation profiling as a novel method to distinguish tumor-derived extracellular vesicles from non-tumor ones.
Findings
A549 sEVs show enrichment in complex N-glycans and proteins related to cell cycle and metabolism.
CS/DS content is 3.4-fold higher in A549 sEVs compared to BEAS-2B sEVs.
Glycan-level analysis provides greater sensitivity than proteomics alone in differentiating sEV origins.
Abstract
Extracellular vesicles (EVs) are central to intercellular communication and have gained attention as rich sources of molecular information in cancer research, but their molecular composition remains incompletely characterized. Protein glycosylation is a frequent post-translational modification; however, most EV studies focus on proteomics, whereas mapping glycosylation changes of proteins is still under-represented. To address this gap, we analyzed the proteomic, N-glycoproteomic, and chondroitin sulfate/dermatan sulfate (CS/DS) glycosaminoglycan (GAG) profiles of small EVs (sEVs) derived from A549 lung adenocarcinoma and BEAS-2B nontumorigenic epithelial cells. Principal component analysis and hierarchical clustering revealed that all three profiles strongly reflect sEV origin. Comparative proteomic analysis showed enrichment of proteins associated with cell cycle regulation, DNA…
Lung cancer remains a major global health burden, with nearly 2.5 million cases and 1.8 million deaths reported worldwide each year (1). Its high mortality is largely attributed to late-stage diagnosis, at which point curative surgery is no longer feasible (2). In addition, while several targeted and immunotherapeutic treatment options are now available for patients with specific genetic (3) or immunological (4) profiles, selecting the most effective therapy is complex and frequently requires invasive tissue biopsies to identify specific/targetable molecular alterations. Therefore, there is an urgent need to develop less invasive methods to detect lung cancer and identify features for therapy selection, for example, based on extracellular vesicles (EVs) in blood and other body fluids (5, 6).
EVs are lipid-bound particles released from cells into the extracellular space that play a significant role in intercellular communication and carry various biomolecules, including proteins, lipids, and nucleic acids (7, 8), influencing the tumor microenvironment, cancer progression, and metastasis (9, 10). Traditionally, EVs were classified according to their biogenesis and size into exosomes (30–150 nm, endosomal origin), microvesicles (100–1000 nm, plasma membrane shedding), and apoptotic bodies (>1 μm, released during apoptosis). However, because of overlapping size ranges and the lack of definitive markers distinguishing vesicle subtypes, recent guidelines recommend the use of operational terms based on size, such as small EVs (sEVs, <200 nm) and large EVs (>200 nm) (11). Biologically, sEVs have been implicated in the modulation of several processes during tumor development, such as angiogenesis, cell transformation, invasion, metastasis, immune escape, and drug resistance (12, 13). Tumor-derived sEVs are detectable in various body fluids, including blood and urine, making them promising targets for minimally invasive cancer biomarker discovery and therapeutic investigation (14). To date, biomarker efforts have primarily focused on nucleic acids and proteins, whereas specific modifications of EV proteins remain largely uncharacterized.
Over the past 3 decades, mass spectrometry (MS)–based bottom–up workflows have revolutionized proteomics by enabling the simultaneous identification and quantification of large numbers of proteins (15, 16). Proteomic profiles of EVs are widely studied in several types of cancer, for example, breast (17), colorectal (18), prostate (19) and lung cancer (20). Studies have shown that the protein cargo of EVs reflects both the molecular state of the cells of origin and ongoing disease processes, and that tumor-derived EV proteomes are enriched in proteins linked to metastasis and tumor progression (21, 22). In lung cancer, proteomic analyses have identified several differentially expressed EV proteins in the plasma of patients compared with healthy controls, with talin-1 and tubulin alpha-4A chain showing the highest diagnostic potential, whereas adenocarcinoma and squamous cell carcinoma histological subtypes also showed different proteomic profiles (20). Another study demonstrated that EV proteomes from non–small cell lung cancer (NSCLC) patients are enriched in proteins associated with extracellular membrane–receptor interaction, focal adhesion, and actin cytoskeleton regulation, further emphasizing the usefulness of EV proteomic signatures in cancer detection and monitoring (23).
Proteins carry several post-translational modifications that generate distinct proteoforms, which can each have a unique structure and function. However, the post-translational modifications occurring on EV proteins, and thus the full proteoform landscape, are still relatively underexplored (24, 25). Protein glycosylation is observed in over 50% of human proteins and is integral to several biological processes, influencing both cellular interactions (cell–cell interaction, signal transduction, and immune response) and protein dynamics (protein folding and molecular recognition) (26). Major protein glycosylation subtypes include N-glycosylation and O-glycosylation. A special class of glycosylated proteins are the proteoglycans (PGs), in which glycosaminoglycans (GAGs) are attached to a core protein. In the current study, N-glycosylation and chondroitin sulfate/dermatan sulfate (CS/DS, a class of GAGs) are investigated.
Human N-glycans have a common pentasaccharide core structure of two N-acetylglucosamine (GlcNAc) and three mannose (Man) units, which can be extended into oligomannose, complex, and hybrid types. The N-glycoproteome can be analyzed using two complementary approaches: enzymatic release of N-glycans, yielding a global N-glycome profile (N-glycomics), or proteolytic digestions followed by enrichment of glycopeptides to obtain site-specific information (N-glycoproteomics) (27). Aberrant N-glycosylation was observed in several cancer types, including lung (28), colorectal (29) and breast (30) cancer. Notably, many tumor biomarkers currently approved for clinical use are also glycoproteins, including α-fetoprotein for liver cancer, prostate-specific antigen for prostate cancer, and carcinoembryonic antigen for colorectal cancer (31). Although clinical assays typically quantify total protein levels, rather than glycosylation features, this prevalence underscores glycoproteins as a central and largely underexploited class of cancer biomarkers.
GAGs are long, linear polysaccharides composed of repeating disaccharide units and are classified based on disaccharide structures (32). CS/DS consists of glucuronic acid/iduronic acid linked to N-acetylgalactosamine (GalNAc) units, with sulfation occurring primarily at the C4 and C6 positions of GalNAc and less frequently at the C2 position of the uronic acid (33). CS/DS is typically analyzed by bottom–up techniques, which involve bacterial lyase enzymatic digestion of the chain, followed by HPLC–MS to identify and quantify the different disaccharides present in the sample (34). Altered GAG abundance and sulfation patterns have been observed across multiple cancer types, including lung (35), prostate (36) and liver (37) cancer.
Together, these glycan layers are integrated at EV proteins, where glycosylation critically regulates vesicle biogenesis, molecular cargo, and cell–cell communication. Accordingly, altered EV glycan profiles are strongly associated with cancer development, and plasma-derived EVs represent a highly relevant clinical source for lung cancer biomarker discovery. However, comprehensive EV glycan analysis in plasma is challenged by the coisolation of non-EV particles, high background levels of abundant plasma proteins, limited sample availability, and the coexistence of tumor- and non–tumor-derived EVs within plasma. As a result, controlled cell culture models, where EV origin and sample homogeneity can be tightly controlled, remain essential for method development, analytical benchmarking, and hypothesis generation. It is important to note that studies based on comparisons between individual cell lines are unable to capture the full biological heterogeneity of cancer and should therefore be considered as representative of defined model systems rather than the disease in general.
In the present study, we performed an integrated proteomic, N-glycoproteomic, and CS/DS GAG analysis of sEVs derived from the A549 lung adenocarcinoma cell line, representing the most common lung cancer subtype (38), and the BEAS-2B epithelial cell line, isolated from noncancerous bronchial epithelium. This systematic benchmarking defines the performance and complementarity of advanced proteomic and glyco(proteo)mic workflows and reveals cell line–specific sEV features, providing a foundational reference for subsequent studies of plasma-derived EVs.
Experimental Procedures
Materials
A549 and BEAS-2B cells, trypsin–EDTA, PBS, poly-l-lysine, ammonium acetate, ammonium bicarbonate (AmBic), ammonium formate, formic acid (FA), TFA, iodoacetamide, DTT, chondroitinase ABC, acetic acid (HAc), 1-hydroxybenzotriazole, 50% NaOH solution, HCl, EDTA, Tris–HCl, NaCl, protease inhibitor cocktail tablets, NP-40, ammonia solution (25%; used for N-glycomics), and guanidine hydrochloride solution (8 M, pH 8.5) were obtained from Merck; LC–MS-grade acetonitrile (ACN), water, and methanol (MeOH) were purchased from VWR. Pierce C_18_ and graphite spin columns, Ham's F-12 Nutrient Mix (supplemented with GlutaMAX), fetal bovine serum, penicillin–streptomycin, and 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide were acquired from Thermo Fisher Scientific. Trypsin/Lys-C and Trypsin Gold were obtained from Promega. Econo-Pac chromatography columns were purchased from Bio-Rad, and Sepharose CL-2B was purchased from Cytiva. Rapigest SF Surfactant was acquired from Waters Corporation, CS disaccharide standards were purchased from Iduron, and bronchial epithelial growth medium (BEGM) BulletKit was purchased from Biocenter. Acetone was purchased from Honeywell, and ammonia (25%) was purchased from Reanal. PNGase F was obtained from Roche Diagnostics GmbH. Girard’s reagent P (GirP) was purchased from TCI Development Co, Ltd. Polyvinylidene fluoride (PVDF) membranes (0.45 μm pore size) were obtained from Merck Millipore. For N-glycomics, LC–MS-grade ACN, LC–MS-grade ethanol (EtOH), and ultrapure water were purchased from BioSolve BV.
Cell Culturing
A549 cells were grown and maintained in F12 medium completed with 10% fetal bovine serum and 1% penicillin–streptomycin inside a humidified incubator with 5% CO_2_ at 37 °C (Galaxy 170R; Eppendorf). BEAS-2B cells were maintained in BEGM in similar conditions after the flasks were precoated with 0.01% poly-l-lysine for 20 min and washed with water twice. Two T75 flasks at 50% confluence were incubated in 10 ml serum-free medium (noncompleted F12 and completed BEGM) for 72 h before the start of sEV isolation. Passage numbers of the cells used for sEV isolation are shown in Supplemental Table S1.
Experimental Design and Statistical Rationale
For proteomic, N-glycoproteomic, and GAG analyses, six sEV full technical replicates were used from both the A549 and BEAS-2B cell lines. Each replicate originated from the same cell line under identical culture conditions and therefore reflects variations in cell culture and analytical workflow. To avoid misidentifications because of differences in cell culture media, F12 and BEGM samples were also prepared as controls and analyzed in three technical replicates. To establish a reliable N-glycoproteomics database of the most prominent glycan compositions, N-glycomics analysis was performed using one replicate from each EV as well as each culture medium type. The number of full technical replicates was selected within the budgeted allowance, and preliminary power analysis was not performed. Samples were injected in random order in all cases to minimize run-order bias. LC–MS system performance and reproducibility were monitored by repeated injections of HeLa peptide digest or glycopeptide-enriched HeLa peptide digest throughout the measurement series. Methodological details and statistical rationale for each analysis are described below in the “N-Glycomics Workflow,” “Proteomics and N-glycoproteomics Workflow,” and “CS/DS Workflow” sections.
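The randomized injection order described above can be illustrated with a minimal Python sketch. The replicate counts come from the text; the sample labels, the seed, and the function name are hypothetical, and the source does not state how the randomization was actually carried out.

```python
import random

def randomized_injection_order(groups, seed=0):
    """Return all sample labels in a reproducible random order.

    groups maps a (hypothetical) group label to its number of replicates.
    """
    samples = [f"{name}_{i}" for name, n in groups.items() for i in range(1, n + 1)]
    rng = random.Random(seed)  # fixed seed so the run list can be regenerated
    rng.shuffle(samples)       # shuffle in place to break any group/run-order link
    return samples

# Six sEV replicates per cell line and three control media replicates per medium.
design = {"A549_sEV": 6, "BEAS2B_sEV": 6, "F12_medium": 3, "BEGM_medium": 3}
order = randomized_injection_order(design, seed=42)
```

Fixing the seed is a design choice that keeps the run list auditable; quality-control injections such as the HeLa digest would be interleaved separately.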
sEV Isolation
After 72 h of incubation, cell culture supernatants were collected, and sEVs were isolated as described previously (39), with slight modifications. Briefly, solutions were centrifuged at 10,000g at 4 °C for 30 min to remove cell debris, apoptotic bodies, and microvesicles, and supernatants were filtered through a 0.22 μm filter. The filtrates were concentrated to 1 ml on 10 kDa Amicon Ultra-15 centrifugal filters at 5000g at 4 °C, and sEVs were isolated from the resulting samples on in-house prepared size-exclusion chromatography (SEC) columns (39) filled with 10 ml Sepharose CL-2B, using 1 ml PBS to elute the fractions. Six fractions were collected, and based on preliminary size distribution and protein concentration measurements, fractions 3, 4, and 5 were combined for further analysis.
sEV Characterization
Microfluidic Resistive Pulse Sensing Measurements
Fractions 3, 4, and 5 containing sEVs were combined, and the mixture was diluted fivefold with prefiltered PBS containing 0.3% (w/v) Tween-20, where the detergent solution was passed through a 100 kDa Vivaspin 500 membrane filter prior to use. Samples with a volume of 5 μl were pipetted into C400 cartridges and measured using an nCS1 instrument (Spectradyne LLC) with a measurement range of 65 to 400 nm.
Transmission Electron Microscopy
Samples for transmission electron microscopy (TEM) were prepared as described previously (40), with slight modifications. Briefly, 3 μl of samples were deposited on formvar-coated grids and dried for 10 min. EVs were fixed with 2% glutaraldehyde in PBS for 10 min and washed three times with water for 5 min. EVs were contrasted with 2% methyl cellulose containing 0.4% UranyLess (Electron Microscopy Sciences) for 10 min on ice. Measurements were performed on a JEM1010 (Jeol) transmission electron microscope, and images were analyzed by ImageJ (National Institutes of Health). Diameters of sEVs (N = 50 per cell line) were determined.
Solvent Exchange and Lysis
Collected SEC fractions were concentrated, and the solvent was exchanged for MS-compatible AmBic buffer. To this end, 10 kDa Amicon Ultra-0.5 centrifugal filters were first washed with 200 μl of water and centrifuged at 13,500g for 10 min. The sample was then pipetted onto the filters in 500 μl portions and centrifuged at 13,500g for 10 min after each addition. Subsequently, 200 μl of 200 mM AmBic solution was added and centrifuged at 13,500g for 10 min, followed by the addition of 200 μl of 50 mM AmBic solution and centrifugation at 13,500g for 15 min. Finally, the filters were turned upside down and centrifuged at 1000g for 1 min to collect the sEV fractions. The sEVs were lysed with seven consecutive freeze–thaw cycles, using 30 s of liquid nitrogen (cycles 1, 3, 5, 6, and 7) or 1 h of freezing (cycles 2 and 4), followed by 10 min of ultrasonication each time. Protein concentrations were measured on a NanoDrop ND-1000 instrument (Thermo Fisher Scientific) at 280 nm using bovine serum albumin calibration solutions from 0.1 μg/μl to 10 μg/μl. In cases where protein amounts were <15 μg (commonly observed for BEAS-2B sEVs), two samples were combined for further analysis. Further sample preparation steps were performed on six sEV samples of both cell types and three control media samples of both media, except for N-glycomics, where one sample from each type (A549 and BEAS-2B sEVs, F12, and BEGM) was analyzed.
N-Glycomics Workflow
N-Glycan Release
N-glycans were enzymatically released from EV proteins using a PVDF membrane–based immobilization approach (41). For each sample, 4 μg of total protein was dried down and resuspended in 25 μl lysis buffer (0.5 mM EDTA, 100 mM NaCl, 50 mM Tris–HCl, and 1x protease inhibitor cocktail), followed by sonication for 30 min. PVDF membranes were preconditioned by sequential washing with 70% EtOH (200 μl) and water (200 μl) by centrifugation (500g, 1 min) and subsequently rewetted with 5 μl of 70% EtOH. Samples were loaded onto the membranes and incubated for 30 min at room temperature with shaking at 300 rpm. Proteins were denatured and reduced on-membrane by the addition of 72.5 μl of 8 M guanidine hydrochloride and 2.5 μl of 200 mM DTT, followed by incubation for 30 min at 60 °C in a humidified chamber. The flow-through was removed by centrifugation (1000g, 1 min), and the membranes were washed twice with 200 μl of water. Membranes were then incubated with 200 μl of 0.01% NP-40 and washed three times with 200 μl of water. During these steps, samples were incubated for 10 min at 300 rpm prior to centrifugation (400g, 1 min). For N-glycan release, 2 U of PNGase F in 13 μl of water was added to each membrane, followed by shaking for 5 min at 300 rpm (room temperature) and additional incubation at 37 °C for 15 min. An additional 20 μl of water was then added, and samples were incubated overnight (>17 h) at 37 °C in a humidified chamber. Released N-glycans were collected by centrifugation (1000g, 2 min), and membranes were washed twice with 40 μl of water (5 min shaking at 300 rpm, followed by centrifugation at 1000g, 2 min). Combined eluates were dried and subjected to linkage-specific sialic acid derivatization.
N-Glycan Derivatization and Purification
Sialic acids were stabilized and neutralized by the linkage-specific ethyl esterification derivatization (42). To the dried samples, 20 μl of 0.25 M 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide, 0.25 M 1-hydroxybenzotriazole solution in EtOH was added, followed by incubation for 30 min at 37 °C. Subsequently, 4 μl of 25% ammonia was added, and samples were incubated for an additional 30 min at 37 °C. Finally, 24 μl ACN was added, and cotton-hydrophilic interaction chromatography (cotton-HILIC) purification (43) was performed on the derivatized N-glycans. Self-packed cotton-HILIC tips were equilibrated with 3 × 20 μl of water, followed by 3 × 20 μl of 85% ACN by pipetting through the tips and discarding the flow-through. Samples were loaded by pipetting up and down 20 times, washed with 3 × 20 μl 1% TFA in 85% ACN and 3 × 20 μl 85% ACN, and N-glycans were eluted with 20 μl water. Eluates were dried and stored at −20 °C until GirP labeling, which introduces a permanent positive charge to the reducing end of the N-glycan (42).
CE–MS/MS Measurements
Prior to analysis, 2 μl of GirP labeling solution (50 mM GirP in 10% HAc and 90% EtOH) was added to each dried sample, followed by mixing and incubation for 1 h at 60 °C. Samples were dried and reconstituted in 5 μl 400 mM ammonium acetate leading electrolyte (pH 2.9). Analyses were performed on a CESI 8000 plus system (Sciex) coupled to a Maxis Plus Q-TOF (Bruker Daltonics) via a sheathless CE–electrospray ionization–MS interface. Separations were carried out using a bare fused-silica capillary (total length 91 cm, 30 μm inner diameter, 150 μm outer diameter; Sciex). Prior to each run, the capillary was sequentially rinsed at 100 psi with 0.1 M NaOH (2.5 min), 0.1 M HCl (3 min), water (4 min), and background electrolyte (BGE; 20% HAc, 3 min). The conductive line was washed with BGE for 3 min at 75 psi. Samples were introduced by hydrodynamic injection at 10 psi for 60 s (13.6% of the capillary volume, 86 nl), followed by injection of a BGE postplug at 0.5 psi for 25 s (0.3% of the capillary volume). Samples were maintained in the sample tray at 10 °C and analyzed sequentially. Separations were performed at 0.5 psi pressure, 20 kV separation voltage, and a capillary temperature of 20 °C.
MS detection was performed in positive ion mode with a capillary voltage of 1300 V and a mass range of m/z 200 to 2200. The drying gas temperature was set to 150 °C with a flow rate of 1.2 l/min. Collision cell energy and quadrupole ion energy were set to 5.0 eV, and the prepulse storage time was 25.0 μs. Ionization efficiency was enhanced using a dopant-enriched nitrogen gas, with ACN applied as dopant at 0.2 bar (44).
N-Glycomics Data Evaluation
MS data were calibrated (linear correction) using the GirP cluster in Data Analysis 6.1 (Bruker Daltonics) and converted to mzML format. Further processing of the N-glycomics data was performed in GlycoGenius 1.3.1 (45). A theoretical library was generated containing glycans composed of 5 to 22 monosaccharides, including 3 to 10 hexoses (Hex), 0 to 2 deoxyhexoses (dHex), 2 to 8 N-acetylhexosamines (HexNAc), 0 to 4 N-acetylneuraminic acids (Neu5Ac), and one phosphorylation/sulfation. GirP reducing-end tagging and amidation/ethyl-esterification modifications were enabled. Proton adducts were considered, allowing up to three charges in positive ion mode. Identifications were filtered using the following criteria: minimum isotopic fitting score of 0.8, minimum curve fitting score of 0.9, minimum signal-to-noise ratio of 3, and mass error within −10 to +15 ppm. Subsequently, the data were further curated based on migration order. Glycan compositions that comigrated or migrated earlier were constrained not to exceed the size of later-migrating species, and sulfated/phosphorylated compositions were not allowed to comigrate with or migrate earlier than neutral species.
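The library bounds and identification filters above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GlycoGenius's implementation: the function names are hypothetical, only the monosaccharide counts are enumerated (the optional phosphorylation/sulfation, the GirP tag, and the derivatization masses are left out), and the scores are taken as already computed by the software.

```python
from itertools import product

# Monosaccharide count ranges from the text (upper bounds inclusive).
HEX, DHEX, HEXNAC, NEU5AC = range(3, 11), range(0, 3), range(2, 9), range(0, 5)

def enumerate_compositions(min_total=5, max_total=22):
    """List (Hex, dHex, HexNAc, Neu5Ac) tuples with 5 to 22 monosaccharides in total."""
    return [c for c in product(HEX, DHEX, HEXNAC, NEU5AC)
            if min_total <= sum(c) <= max_total]

def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def passes_filters(iso_score, curve_score, snr, ppm):
    """Apply the four thresholds quoted in the text."""
    return (iso_score >= 0.8 and curve_score >= 0.9
            and snr >= 3 and -10 <= ppm <= 15)

library = enumerate_compositions()
```

The asymmetric mass window (−10 to +15 ppm) is taken directly from the text; the migration-order curation described afterwards is a manual step and is not modeled here.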
Proteomics and N-Glycoproteomics Workflow
Proteomics Digestion
From each sample, 5 μg of protein was diluted to 35 μl with water; 1.5 μl MeOH, 2 μl 200 mM DTT, and 5 μl 0.5% Rapigest were then added, and the mixture was incubated at 60 °C for 30 min. Next, 2.5 μl 200 mM iodoacetamide and 5 μl 200 mM AmBic solution were added, and the samples were incubated at room temperature in the dark for 30 min. The digestion was performed in two consecutive steps: first, 50 ng of trypsin/Lys-C mixture was added to the samples and incubated for 1 h at 37 °C, followed by incubation with 500 ng of trypsin for another 15 h at 37 °C. Finally, the digestion was stopped by adding 0.5 μl FA, and the peptide samples were dried down.
N-Glycopeptide Enrichment and Peptide Purification
For enrichment of N-glycopeptides, acetone precipitation was used by first dissolving the peptide samples in 15 μl 1% FA, then adding 150 μl ice-cold acetone, and finally storing the samples in the freezer for 18 h (46, 47). The samples were then centrifuged at 12,000g for 10 min at 20 °C, and the supernatants (mostly nonglycosylated peptides) were separated from the pellets (mostly glycosylated peptides). Both fractions were dried down, and the nonglycosylated peptides were purified on a C_18_ spin cartridge. In short, the cartridge was washed with 2 × 200 μl 50% MeOH, 2 × 200 μl 0.5% TFA + 5% ACN, and 2 × 200 μl 0.1% TFA solution. Samples were applied in 50 μl 0.1% TFA solution and reapplied once. Contaminants were washed away with 2 × 100 μl 0.1% TFA, and peptides were eluted with 2 × 50 μl 0.1% TFA + 70% ACN solution. All steps were performed at 2000 rpm for 2 min. Elution solvents were evaporated, and both peptides and N-glycopeptides were stored at −20 °C until further use.
nanoUHPLC–MS/MS Measurements for Proteomics
Proteomic samples were dissolved in 0.1% FA + 2% ACN solution, and 200 ng was injected from each sample. Samples were analyzed on a timsTOF HT (Bruker Daltonics) coupled with a Dionex Ultimate 3000 nanoUHPLC (Thermo Fisher Scientific). Samples were first loaded onto an Acclaim PepMap C_18_ trap column (5.0 μm, 300 μm × 5 mm; Thermo Fisher Scientific) at a flow rate of 10 μl/min, followed by separation on a monolithic capillary MOSAIC C_18_ analytical column (75 μm × 150 mm; Bruker) heated at 50 °C. Eluent A consisted of 0.1% FA in water, whereas eluent B was 0.1% FA in 80% ACN. The gradient started at 5% B and increased to 40% B over 20 min at a flow rate of 0.5 μl/min.
The MS ion source was a CaptiveSpray 1 source with a 10 μm emitter used in positive mode, with a capillary voltage of 1500 V. Data-independent acquisition (DIA) was performed using DIA–parallel accumulation–serial fragmentation (PASEF). The mass range was set to m/z 100 to 1700, and the ion mobility range was 0.7 to 1.4 V·s/cm^2^. The trapped ion mobility spectrometry settings included a ramp time of 180 ms and an accumulation time of 180 ms. Transfer time was 60 μs, and prepulse storage time was 12 μs. The DIA windowing scheme was optimized with py_diAID (48) and consisted of 24 contiguous, nonoverlapping DIA windows arranged as 12 ion mobility slices with 2 m/z ranges per slice, covering the full ion mobility-m/z plane. DIA window schemes are provided in Supplemental Table S2. A total of four DIA–PASEF MS/MS scans were acquired per cycle, and the cycle time was 0.93 s. MS1 data were acquired as part of each DIA–PASEF cycle.
Precursors were selected within a charge range of 0 to 5, with an intensity threshold of 1500 for scheduling and a target intensity of 15,000. Exclusion release time was 0.4 min, the reconsider precursor switch was enabled, and a current-to-previous intensity ratio of 4 was set. Exclusion windows were set to 0.015 m/z for mass width and 0.015 V·s/cm^2^ for ion mobility width. Raw files (.d format) were directly processed by DIA-neural network (NN) without format conversion.
nanoUHPLC–MS/MS Measurements for N-Glycoproteomics
Glycoproteomic samples were dissolved in 10 μl 0.1% FA + 2% ACN solution, of which 2 μl was injected. Measurements were performed on a Waters nanoAcquity nanoUHPLC system coupled to a Thermo Fisher Exploris 240 Orbitrap MS. Chromatographic separation utilized a Symmetry C_18_ trap column (5 μm, 180 μm × 20 mm; Waters) and an Acquity M-Class BEH130 C_18_ analytical column (1.7 μm, 75 μm × 250 mm; Waters).
Eluent A consisted of 0.1% FA in water, whereas eluent B was 0.1% FA in ACN, and the flow rate was 300 nl/min. The gradient program started from 2% B (0–1 min), increased from 2% to 25% B (2–82 min), then from 25% to 40% B (82–85 min) and from 40% to 90% B (85–86 min), kept there for 2 min (86–88 min), and finally, the column was re-equilibrated at 2% B (88–90 min).
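The gradient program can be written as a small piecewise-linear table, which makes it easy to look up the eluent composition at any time point. The sketch below is illustrative only: the function and variable names are hypothetical, the 1 to 2 min interval not specified in the text is assumed to be held at 2% B, and the return to 2% B at 88 min is modeled as an instantaneous step.

```python
# Each segment: (start_min, end_min, percent_B_at_start, percent_B_at_end)
GRADIENT = [
    (0.0, 2.0, 2.0, 2.0),     # initial hold (1-2 min held at 2% B by assumption)
    (2.0, 82.0, 2.0, 25.0),   # main separation ramp
    (82.0, 85.0, 25.0, 40.0),
    (85.0, 86.0, 40.0, 90.0),
    (86.0, 88.0, 90.0, 90.0), # column wash
    (88.0, 90.0, 2.0, 2.0),   # re-equilibration
]

def percent_b(t_min):
    """Return %B at time t_min by linear interpolation within a segment."""
    if not 0.0 <= t_min <= GRADIENT[-1][1]:
        raise ValueError("time outside the 0-90 min program")
    for start, end, b_start, b_end in GRADIENT:
        if start <= t_min < end:
            fraction = (t_min - start) / (end - start)
            return b_start + fraction * (b_end - b_start)
    return GRADIENT[-1][3]  # t_min equals the end of the final segment
```

For example, `percent_b(42)` falls halfway along the main ramp and `percent_b(87)` falls in the wash step.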
The MS was operated in positive ion mode with a capillary temperature of 275 °C and a capillary voltage of 1.8 kV. MS full scans were acquired at a resolution of 120,000 in the mass range of 360 to 2200 Da, with an automatic gain control (AGC) target of 2 × 10^6^ to maintain consistent signal intensity and a maximum injection time of 200 ms. Ions were selected within a 2 Da isolation window for MS/MS, and stepwise higher energy collisional dissociation fragmentation energies of 10, 20, and 30 eV were used. The resolution was maintained at 120,000, with an AGC target of 2 × 10^5^, a maximum injection time of 200 ms, and a mass range of 200 to 2000 Da. The minimum precursor intensity threshold was set to 1.7 × 10^4^ and the minimum AGC target to 10^3^. Vendor raw files were converted to mzML using ProteoWizard MSConvert (version 3.0.21333) with vendor peak picking enabled for both MS1 and MS2 levels and default compression settings.
Proteomics Data Evaluation and Visualization
DIA-NN 1.9 (49) was used to identify and quantify proteins. Peak list generation, demultiplexing, deisotoping, and assignment of retention time, ion mobility, and intensity to peaks were performed internally by DIA-NN’s processing engine. An observed fragment ion was assigned to a maximum of one precursor peak list. A spectral library was generated in silico from the UniProt human database (access date: March 2024, 20,434 sequences) using DIA-NN’s deep learning–based prediction of MS/MS spectra and retention times. Trypsin/P enzyme was used, and carbamidomethylation of cysteine amino acids was set as a fixed modification, whereas methionine oxidation, N-terminal methionine excision, and protein N-terminal acetylation were set as variable modifications. A maximum of one missed cleavage site and one variable modification was allowed. Mass tolerance for precursor and fragment ions was automatically determined by the software based on the first run in the experiment. False discovery rate control was set at 1% at the precursor level using DIA-NN’s built-in target-decoy approach, and decoy sequences were generated by sequence shuffling. For quantification, DIA-NN extracted fragment ion chromatograms and performed interference correction and retention time–dependent cross-run normalization. Match between runs was enabled to improve peptide identification across samples. The QuantUMS (high precision) quantification strategy was used, in which DIA-NN selects up to six fragment ions per precursor for quantification, choosing those with the highest predicted relative intensities and lowest interference from the in silico spectral library. Additional fragment ions were considered during the scoring stage for peptide identification. Protein intensities were calculated using DIA-NN’s MaxLFQ-like algorithm. DIA-NN settings are summarized in Supplemental Table S3. All quantification results and peptide numbers used are provided in Supplemental Table S4.
Statistical evaluation and visualization of the results were performed with custom code in R (50) 4.3.2 using RStudio (51) 2024.12.1+467. Proteins identified with fewer than two unique peptides or detected in at least one of the three control media samples were excluded, and only proteins quantified in at least half of the samples in at least one sample group were considered for further analysis. Missing values were imputed according to the number of detections in each group: if a protein was detected in fewer than two-thirds of the samples in the group, the fifth percentile of the given sample's intensities was imputed; otherwise, missing values were imputed using the k-nearest neighbors algorithm (VIM package (52), k = 15). Normality and equality of variances were tested on log-transformed data using Shapiro–Wilk and Levene tests, respectively, and based on the outcome, Student’s t test, Welch's t test, or Wilcoxon rank-sum test was performed for the given protein. False discovery rates were controlled with the Benjamini–Hochberg method, and adjusted p values less than 0.05 were considered significant.
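The decision logic of this statistical workflow can be sketched as follows. The authors used custom R code that is not shown in the source, so this Python version is only an illustration: all function names are hypothetical, the imputation threshold is treated as two-thirds of the samples in a group (an interpretation of the text), and the Benjamini–Hochberg adjustment is the standard step-up procedure.

```python
def imputation_strategy(n_detected, n_samples):
    """Choose 'percentile' for sparsely detected proteins, else 'knn'.

    Assumption: the cutoff is detection in fewer than two-thirds of the group.
    """
    return "percentile" if n_detected * 3 < n_samples * 2 else "knn"

def choose_test(is_normal, equal_variance):
    """Pick the test from the Shapiro-Wilk and Levene outcomes."""
    if not is_normal:
        return "wilcoxon_rank_sum"
    return "student_t" if equal_variance else "welch_t"

def benjamini_hochberg(p_values):
    """Return BH-adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in adjusted]
```

With six replicates per group, this reading sends proteins detected in three or fewer samples of a group to percentile imputation and those detected in four or five samples to k-nearest neighbors imputation.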
For visualization, the packages ggplot2 (53) and gplots (54) were used, in which principal component analysis (PCA, prcomp function), hierarchical clustering (heatmap.2 function, ward.D2 method), volcano diagram, and boxplots were generated. Gene set enrichment analysis (GSEA) was conducted using clusterProfiler (55) on ranked genes from statistical analysis, identifying enriched Gene Ontology Biological Processes (adjusted p value cutoff = 0.1 was used).
N-Glycoproteomic Data Evaluation and Visualization
Proteins included in proteomics statistical analysis were filtered for known glycosylation sites, and only proteins confirmed to be N-glycosylated were used for further analysis (214 proteins; see Supplemental Table S5). N-glycans identified by N-glycomics were normalized to total ion intensities and averaged across A549 and BEAS-2B sEV samples. Only N-glycans with an average relative abundance of ≥0.25% (equivalent to ≥0.5% in at least one of the samples) were included. In cases of structural ambiguities for N-glycan identifications, the biosynthetically most plausible structure was considered. All remaining structures were manually inspected, and implausible structures were excluded. Identified N-glycans were converted into nonderivatized species, yielding a total of 34 N-glycan compositions (Supplemental Table S5). N-glycopeptides were quantified using GlycReSoft (56) 0.4.22 with the protein and N-glycan lists provided in Supplemental Table S5. Trypsin enzyme was used, carbamidomethylation of cysteine was set as fixed, whereas oxidation of methionine was set as a variable modification. The number of missed cleavages was limited to 2, MS1 mass accuracy was set to 10, and MS2 mass accuracy was set to 20 ppm. Glycopeptide identifications were filtered at q < 0.05 using GlycReSoft’s built-in target-decoy approach with reverse-peptide decoys and permuted decoy glycans, corresponding to an estimated <5% false discovery rate at the glycopeptide-spectrum match level. Detailed settings of GlycReSoft searches are shown in Supplemental Table S3. The complete list of identified and quantified N-glycopeptides is provided in Supplemental Table S6. Annotated spectra for all glycosylated peptides are provided as a supplementary folder. Glycan compositions were assigned based on accurate mass and diagnostic fragment ions. Without determination of linkage positions or anomericity, isomeric compositions were not distinguished beyond composition-level assignment.
Further processing of the results was performed in R. N-glycopeptide assignments with MS1 score >3 and MS2 score >5 were accepted (57). Cell culture media–derived glycopeptides were removed as described for proteomics, glycoforms were screened against known N-glycosylation sites and assignments at unknown sites were removed, and signal intensities were normalized using total area normalization. Statistics and visualization of glycoproteomics data were performed in the same way as described for proteomics. All intensity values (computed or imputed) were used without outlier removal. Glycosylation metrics were used to characterize sialylation (the ratio of sialylated antennae), fucosylation (the average number of fucoses on one glycopeptide), and the ratio of different types of glycopeptides.
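As an illustration of the fucosylation metric, the following Python sketch parses composition codes of the form used in this article (for example, F1H7N6S2: fucose, hexose, N-acetylhexosamine, sialic acid) and computes an average fucose count per glycopeptide. Weighting by normalized intensity is an assumption made here, as are the function names and the example intensities; the authors' R implementation may differ.

```python
import re


def parse_composition(code):
    """Parse a composition code such as 'F1H7N6S2' into residue counts."""
    counts = {"F": 0, "H": 0, "N": 0, "S": 0}
    for letter, number in re.findall(r"([FHNS])(\d+)", code):
        counts[letter] = int(number)
    return counts


def mean_fucose_count(glycoforms):
    """Intensity-weighted mean number of fucoses per glycopeptide.

    `glycoforms` is a list of (composition code, intensity) pairs.
    The intensity weighting is an assumption of this sketch.
    """
    total = sum(intensity for _, intensity in glycoforms)
    weighted = sum(
        parse_composition(code)["F"] * intensity for code, intensity in glycoforms
    )
    return weighted / total


# Hypothetical glycoforms and intensities (not taken from the study).
sample = [("F1H5N4S1", 50), ("H5N4S2", 30), ("F2H6N5S1", 20)]
fucosylation = mean_fucose_count(sample)
```

The sialylation metric is not sketched because it requires the number of antennae per glycan, which cannot be derived from the composition code alone.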
CS/DS Workflow
CS/DS Digestion
Protein samples (10 μg) were adjusted to 70 μl with water, followed by the addition of 20 μl of 500 mM AmBic solution and 5 μl of 5 mU/μl chondroitinase ABC enzyme, and were incubated at 37 °C for 16 h. Digestion was stopped by placing the samples at 90 °C for 3 min, and the samples were dried down and stored at −20 °C until purification.
CS/DS Disaccharide Purification
For the purification of CS/DS disaccharides, we used a two-step cotton-HILIC + graphite solid-phase extraction procedure developed in our group. In each step, samples were centrifuged at 2500 rpm for 1 min. In the first step, self-packed cotton-HILIC pipette tips were used, which were first washed with 50 μl 60% ACN solution, followed by 2 × 50 μl 1% TFA + 95% ACN solution. Samples were applied and reapplied twice in 30 μl 1% TFA + 95% ACN solution, then washed with 50 μl 1% TFA + 95% ACN, and eluted with 2 × 10 μl 1% ammonia solution preheated to 40 °C. Flow-through (from sample application and washing) and elution fractions were dried down, and flow-throughs were further purified on Thermo Pierce graphite cartridges. The cartridges were first washed with 2 × 100 μl 0.1% TFA + 80% ACN solution, followed by 2 × 100 μl water. Samples were applied and reapplied once in 50 μl water, washed with 3 × 100 μl water, and eluted with 3 × 50 μl 0.05% TFA + 40% ACN solution. The elution fractions were combined with the cotton-HILIC elution fractions, dried down, and stored at −20 °C until further use.
UHPLC–MS/MS Measurements
CS/DS disaccharide samples were dissolved in 8 μl 10 mM ammonium formate + 75% ACN (pH 4.4) solution, of which 2 μl was injected. Samples were measured on a Waters Select Series Cyclic IMS coupled to a Waters Acquity I-Class UPLC equipped with a self-packed GlycanPac AXH-1 HILIC-weak anion exchange capillary column (250 μm × 10 cm). A and B solvents were 10 mM and 65 mM ammonium formate + 75% ACN (pH 4.4) solution (58). The flow rate was 10 μl/min, and CS/DS disaccharides were separated with constant 5% B for 7 min, followed by washing with 95% B for 5 min and equilibration with 5% B for 3 min. Extracted ion chromatograms of characteristic ions for CS/DS disaccharides are shown in Supplemental Figure S1.
A low-flow electrospray source was operated in negative mode, with a capillary voltage of 2.5 kV, a cone voltage of 10 V, and an ion source temperature of 120 °C. MS1 spectra were collected with a trap collision energy of 6 eV and transfer collision energy of 4 eV in the m/z 200 to 600 mass range, whereas MS/MS spectra were taken at 20 eV collision energy to differentiate between D0a4 and D0a6.
CS/DS Data Evaluation and Visualization
TargetLynx integrated in MassLynx V4.2 software was used to integrate the chromatographic peaks of GAG disaccharides. CS/DS disaccharides were identified by matching accurate precursor masses (±0.08 Da) and retention times to those of purified CS disaccharide standards. Identifications were made at the disaccharide composition level. Sulfation positions and stereochemistry were not directly analyzed but were assumed according to the known structures of the purified CS/DS disaccharide standards used for calibration, except for 4-O- and 6-O-sulfation, which were differentiated by MS/MS. For each disaccharide standard, major ions were determined, and their summed extracted ion chromatogram peak areas were used for quantification (Supplemental Table S7). In biological samples, the same adducts, retention times, and mass tolerances were applied for peak integration. Signals not meeting the mass and retention time criteria were excluded to remove background or contaminant peaks. Quantitation was performed by fitting calibration curves to samples containing known amounts of CS disaccharides in Microsoft Excel, and results were expressed as fmol amounts. For the D0a4–D0a6 pair, the characteristic fragment ion peaks at m/z 282 and 300 were integrated, and the total fmol amount was apportioned according to their measured ratio in the sample. CS/DS disaccharides could not be detected in one A549 sample, presumably because of sample preparation errors; the sample was therefore excluded from the analysis. Results were plotted and statistically evaluated in R 4.3.2 using RStudio 2024.12.1+467 with custom code, similar to proteomics and N-glycoproteomics. In short, boxplots, PCA, and heatmaps were used for visualization, and Student’s t test, Welch's t test, or Wilcoxon rank-sum test were used for statistics, and Benjamini–Hochberg correction was applied.
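The quantitation steps described above (calibration against standards, then apportioning the combined D0a4/D0a6 amount by fragment ion ratio) can be sketched as follows. This is an illustrative Python version under stated assumptions, not the authors' Excel workflow: an unweighted linear calibration is assumed, all numbers are hypothetical, and the assignment of the m/z 282 and m/z 300 fragments to a particular isomer is left to the caller because it is not specified in this section.

```python
def fit_line(amounts, areas):
    """Ordinary least-squares fit of area = slope * amount + intercept."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(areas) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def area_to_fmol(area, slope, intercept):
    """Invert the calibration line to convert a peak area into fmol."""
    return (area - intercept) / slope


def apportion(total_fmol, area_first, area_second):
    """Split a combined amount according to two fragment ion peak areas."""
    share = area_first / (area_first + area_second)
    return total_fmol * share, total_fmol * (1 - share)


# Hypothetical calibration standards: known fmol amounts and summed peak areas.
standard_fmol = [10, 50, 100, 500]
standard_area = [250, 1050, 2050, 10050]
slope, intercept = fit_line(standard_fmol, standard_area)

# Hypothetical combined monosulfated peak in a sample.
total = area_to_fmol(4050, slope, intercept)
# Hypothetical fragment ion peak areas for the two isomer-specific fragments.
first_isomer, second_isomer = apportion(total, 300, 100)
```

Apportioning by raw fragment area assumes equal fragment response for both isomers; a response factor derived from the standards could be added if that assumption does not hold.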
Results
This study is based on six independent sEV preparations each isolated from the cell culture supernatant of A549 adenocarcinoma and BEAS-2B nontumorigenic epithelial cells, with matched cell culture media serving as controls (n = 3 per medium). All samples were characterized in accordance with the Minimal Information for Studies of EVs 2023 guidelines (11). An overview of the workflow is presented in Figure 1.
Fig. 1. Workflow of sEV isolation and integrated proteomic, N-glycoproteomic, and CS/DS disaccharide analysis of A549 and BEAS-2B sEVs. Created with BioRender.com. CS, chondroitin sulfate; DS, dermatan sulfate; sEV, small extracellular vesicle.
First, proteomic digestion was performed, and the resulting peptide mixture was enriched for N-glycopeptides, whereas the remaining fraction was used for proteomic analysis. In addition to microfluidic resistive pulse sensing (MRPS) and TEM analysis, DIA proteomics was used to verify the purity of sEVs by the presence of EV marker proteins. We then assessed protein expression differences between normal and tumor sample groups, with a special focus on PG core proteins. PCA and heatmap clustering were used to detect system-level differences, and GSEA was performed to identify altered biological processes. N-glycoproteomics results were interpreted using statistics, volcano plot, PCA, and heatmap. In addition, we determined the distribution of different types of glycans and calculated glycosylation metrics to characterize overall N-glycomics changes. Glycoproteomic results were also correlated with proteomics. In parallel, CS/DS GAG chains were digested into CS/DS disaccharides. The GAG disaccharides investigated in this study are shown in Figure 2. The total amount of CS/DS disaccharides, their relative proportion, the D0a6/D0a4 ratio (hereafter: 6S/4S), and the average rate of sulfation were calculated, and PCA and heatmap analysis were performed.
Fig. 2. Lawrence codes and corresponding structures of CS/DS disaccharides generated by enzymatic digestion. CS, chondroitin sulfate; DS, dermatan sulfate.
sEV Characterization
First, we examined the size and shape of the isolated sEVs by TEM and characterized their size distribution by MRPS. Results for A549 and BEAS-2B sEVs are shown in Figure 3, A and B, respectively. TEM analysis confirmed the presence of spherical particles smaller than 200 nm in size. In general, smaller particles were found in A549 samples than in BEAS-2B. In A549, most particles were between 30 and 120 nm in size according to TEM, whereas in BEAS-2B, they were mainly between 50 and 250 nm (Fig. 3C). MRPS analysis indicated slightly lower particle diameters but the same trend in the size of the isolated particles, that is, a higher proportion of larger particles was found in BEAS-2B sEVs than in A549 sEVs. Another difference was observed in the particle concentration, as A549 samples had a considerably higher number of sEVs. To compensate for this difference, enzymatic digestion was performed on equal amounts of protein.
Fig. 3. Transmission electron microscopy (TEM) and microfluidic resistive pulse sensing (MRPS) analysis. A, A549 sEVs. B, BEAS-2B sEVs. For MRPS, average particle concentrations from three independent samples per group are shown. C, size distribution of sEVs determined from TEM images (N = 50 per group). ∗∗∗∗p < 0.0001. sEV, small extracellular vesicle.
Proteomics
In the DIA label-free proteomic experiments, a total of 2334 proteins were identified and quantified from 12 sEV (6 A549 and 6 BEAS-2B) and 6 (3 F12 and 3 BEGM) media samples. Quantified proteins were compared with the 100 most frequently identified exosome proteins listed in ExoCarta (59), and each individual sample contained 77 to 91 of the proteins on the list, demonstrating the good quality of sEVs isolated by SEC. Among the quantified proteins, 1739 were identified with at least two peptides and were considered for further analysis. A total of 1528 proteins were detected in at least three replicates in at least one sEV group, but 583 of these proteins were also identified in at least two corresponding culture media samples, indicating potential bias in the observed expression levels in sEV samples. Therefore, only the remaining 945 proteins were included for statistical analysis, of which 408 were found to have different abundances between the two sample groups. Among them, 313 proteins were upregulated in A549, whereas 95 were downregulated. For example, top upregulated proteins were alpha-fetoprotein and aldehyde dehydrogenase 3 family member A, whereas top downregulated proteins were pentraxin-related protein PTX3 and fibulin-2. The distribution of fold changes (FCs) and adjusted p values is visualized on the volcano plot (Fig. 4A). The full list of statistical results for the proteomic analysis can be found in Supplemental Table S8. Eleven PG core proteins were tested for expression differences, of which seven were differentially expressed, carrying CS, heparan sulfate (HS), and/or keratan sulfate (KS) chains: versican core protein (CS), chondroitin sulfate proteoglycan 4 (CSPG4, CS), aggrecan core protein (CS/KS), testican-1 (HS/CS), syndecan-4 (HS/CS), collagen alpha-1(XVIII) chain (HS), and mimecan (KS). Expression differences of the five CS-bearing PGs are shown in Figure 4B.
Fig. 4. Proteomic differences between A549 and BEAS-2B sEVs. A, volcano plot of all quantified proteins; blue indicates proteins significantly under-represented, and red indicates proteins significantly over-represented in A549 sEVs. B, boxplots of differentially expressed CSPGs. ∗p < 0.05, ∗∗p < 0.01, and ∗∗∗p < 0.001. CSPG, chondroitin sulfate proteoglycan; sEV, small extracellular vesicle.
PCA was performed on the 945 proteins included in statistical analysis (Fig. 5A), whereas hierarchical clustering was performed on the 408 differentially expressed proteins, and a heatmap was generated (Fig. 5B). PCA revealed complete separation of A549 and BEAS-2B sEVs along principal component 1 (PC1, 42.31% variance explained), with samples clustering by cell line, indicating markedly different proteomic profiles between sEVs derived from A549 and BEAS-2B cell lines.
Fig. 5. Global proteomic profiling of sEVs. A, PCA of all quantified proteins. B, heatmap with hierarchical clustering of differentially expressed proteins. C, gene-concept network derived from GSEA of Gene Ontology Biological Process terms; red nodes represent biological processes, and blue nodes represent genes. GSEA, gene set enrichment analysis; PCA, principal component analysis; sEV, small extracellular vesicle.
Dysregulated processes between A549 and BEAS-2B sEVs were identified by GSEA and ranked based on their normalized enrichment scores (NESs) and adjusted p values (Supplemental Table S9). The top enriched process was negative regulation of cell cycle processes (NES = 1.75, p = 0.0321). Other highly ranked processes include cell cycle checkpoint signaling, cellular nitrogen compound metabolism, and regulation of DNA metabolic processes, all with NES values above 1.7. In addition, processes related to nucleic acid and RNA metabolism, translation, and biosynthesis were significantly enriched. In contrast, negatively enriched processes include immune response, immune effector processes, and positive regulation of endocytosis, indicating downregulation in A549 sEVs compared with BEAS-2B sEVs. Figure 5C depicts a network illustrating the relationships between genes and the top Gene Ontology biological process terms.
Glycomics-Guided N-Glycoproteomics
N-glycomics analysis identified a total of 271 N-glycan compositions, with multiple potential structures matching a single measured mass counted as a single composition (Supplemental Table S10). All assignments met stringent curation criteria, including mass tolerance, ppm error, isotopic pattern quality, signal-to-noise thresholds, and consistency with expected migration order. Species bearing phosphorylation or high levels of fucosylation were observed; however, they generally occur at low abundance and are therefore unlikely to be detected in glycoproteomic analyses. In contrast, only a limited number of N-glycans (13 and 5) were observed in the culture media controls, indicating that the sEV-derived N-glycan profiles are not attributable to background contamination.
N-glycoproteomic analysis quantified 1659 N-glycopeptides, corresponding to 793 unique peptides from 162 proteins, across 18 samples (6 A549 sEV, 6 BEAS-2B sEV, 3 F12 media, and 3 BEGM). Following consolidation of glycopeptides sharing the same glycan structure at identical sites, exclusion of glycoforms detected in fewer than three replicates or present in at least two media controls, and removal of assignments with unknown glycosylation sites, 227 glycoforms were retained for statistical analysis. These glycoforms are mapped to 117 N-glycosylation sites across 72 proteins. Of these, 152 glycoforms (belonging to 85 glycosylation sites) were differentially represented between the two groups, with 92 over-represented and 60 under-represented in A549 sEVs. FCs and p values are visualized in Figure 6A. Notably, eight glycoforms of the versican CSPG core protein were differentially abundant. At glycosylation site 1898, three fucosylated complex N-glycans were downregulated in A549 sEVs, whereas at glycosylation site 330, two similar structures were upregulated. The complete list of statistical results is provided in Supplemental Table S11.
Fig. 6. N-glycoproteomics profiling of sEVs. A, volcano plot of all quantified N-glycopeptide glycoforms; blue indicates glycoforms significantly under-represented, and red indicates glycoforms significantly over-represented in A549 sEVs. B, PCA analysis of all quantified N-glycoforms. C, heatmap with hierarchical clustering, generated from differentially represented N-glycoforms. PCA, principal component analysis; sEV, small extracellular vesicle.
PCA performed on all glycoforms subject to statistical analysis revealed complete separation between A549 and BEAS-2B sEVs along PC1 (54.2% variance explained, Fig. 6B). Consistently, heatmap analysis of differentially represented glycopeptides showed clear clustering by cell line (Fig. 6C), confirming distinct N-glycoproteomic profiles between the two sEV populations.
To identify the major glycosylation features underlying these differences, N-glycans were classified according to glycan type (complex, hybrid, or oligomannose) as well as rates of fucosylation and sialylation (Fig. 7, Supplemental Table S12). Considering both sEV types, complex-type N-glycans were the most abundant (96.4% on average), whereas hybrid and oligomannose N-glycans accounted for 1.2% and 2.4%, respectively. In A549 sEVs, the relative abundance of complex N-glycans increased by 2% (p < 0.05), accompanied by a significant decrease in oligomannose (FC = 0.54, p < 0.05) and a slight decrease in hybrid (FC = 0.62, not significant) type N-glycans. In addition, a nonsignificant trend toward higher levels of fucosylation in A549 sEVs (FC = 1.14) was observed, whereas sialylation showed a modest, nonsignificant decrease (FC = 0.97).
Fig. 7. Global N-glycosylation features of sEVs. Boxplots showing (A) the relative abundance of N-glycan types (complex, hybrid, and oligomannose) and (B) the rates of fucosylation and sialylation. ∗p < 0.05. sEV, small extracellular vesicle.
Changes in glycopeptide levels can result from alterations in protein expression, glycosylation patterns, or both. For example, galectin-3-binding protein (LG3BP) N551 F1H7N6S2 (indicating one fucose, seven hexose, six N-acetylhexosamine, and two sialic acid units) had an FC of 0.07, whereas a different glycoform at the same site (F1H5N4S1) exhibited an FC of 5.7. As LG3BP protein abundance itself was strongly reduced (FC = 0.06), the decrease in the former glycoform is consistent with protein-level changes, whereas the increase in the latter indicates glycan-level changes. To disentangle glycosylation effects from protein abundance, glycopeptide FCs were normalized to the corresponding protein-level FC derived from proteomics (Supplemental Table S13). Normalized glycoform FCs are visualized in Figure 8, where small bubbles indicate that the observed glycopeptide change was caused by a change in protein amount, whereas larger bubbles indicate that a change in glycan structure occurred. Across several laminin proteins, including laminin subunit α2 (LAMA2), laminin subunit α5, and laminin subunit γ1, glycosylation changes were predominantly characterized by increased proportions of specific glycan structures. For example, at LAMA2 N1810, four nonfucosylated glycoforms showed highly increased relative abundance. In contrast, prothrombin and tissue factor pathway inhibitor 2 exhibited an overall decrease in the rate of N-glycosylation. Mixed patterns were observed for certain proteins, such as versican (CSPG2), where N-glycan proportions decreased at position N1898, whereas they increased at the other three analyzed sites.
Fig. 8. Bubble plot visualizing protein-normalized fold changes of N-glycoforms. Red bubbles indicate increased relative glycosylation levels, and blue bubbles indicate decreased relative glycosylation levels. Only glycans associated with at least five glycosylation sites and glycosylation sites associated with at least two glycans are shown.
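The protein-level normalization can be made concrete with the LG3BP N551 values quoted above. The Python sketch below assumes that normalization means dividing the glycoform FC by the protein FC (equivalently, subtracting on the log2 scale); the exact procedure is in Supplemental Table S13, and the function name is hypothetical.

```python
import math


def protein_normalized_fc(glycoform_fc, protein_fc):
    """Glycoform fold change relative to the fold change of its protein."""
    return glycoform_fc / protein_fc


# Values quoted in the text for LG3BP (A549 vs. BEAS-2B sEVs).
lg3bp_protein_fc = 0.06
glycoform_fcs = {"F1H7N6S2": 0.07, "F1H5N4S1": 5.7}

normalized = {
    code: protein_normalized_fc(fc, lg3bp_protein_fc)
    for code, fc in glycoform_fcs.items()
}
# On the log2 scale, values near 0 mean the change tracks protein abundance.
log2_normalized = {code: math.log2(value) for code, value in normalized.items()}
```

Under this assumption, F1H7N6S2 has a normalized FC of about 1.2 (its decrease is explained by the protein-level change), whereas F1H5N4S1 has a normalized FC of about 95 (a strong glycan-level change).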
CS/DS Analysis
In CS/DS disaccharide analysis, no disaccharides were detected in either cell culture media, confirming that the CS/DS disaccharides measured in the samples were derived from sEVs. The disaccharide amounts in femtomoles quantified across the 12 EV samples, as well as their group-wise means and standard deviations, are provided in Supplemental Table S14. In terms of the relative amounts of each component, the ratio of nonsulfated D0a0 (FC = 0.65) and monosulfated D0a6 (FC = 0.76) decreased in A549 sEVs compared with BEAS-2B sEVs, whereas that of monosulfated D0a4 (FC = 1.59) and disulfated D0a10 (FC = 1.23) increased (Fig. 9A). Among the relative amounts of the four disaccharides, the change in the two monosulfated components was found to be statistically significant. These changes resulted in a slight increase in the average rate of CS/DS sulfation (FC = 1.08, not significant, Fig. 9B), whereas the 6S/4S ratio was significantly decreased in A549 sEVs (FC = 0.48, Fig. 9C), suggesting altered sulfotransferase activity between the two cell line–derived sEV populations. In addition to compositional differences, A549 sEVs contained substantially higher CS/DS levels, with an average of 3.4-fold more disaccharides detected per sample compared with BEAS-2B sEVs (Fig. 9D). Statistical results for all calculated and derived parameters are summarized in Supplemental Table S15.
Fig. 9. CS/DS disaccharide composition of sEVs. Boxplots showing (A) relative abundances of individual CS/DS disaccharides (%), (B) average CS/DS sulfation rate, (C) 6S/4S ratio, and (D) total CS/DS disaccharide content (fmol). ∗p < 0.05, ∗∗p < 0.01, and ∗∗∗p < 0.001. CS, chondroitin sulfate; DS, dermatan sulfate; sEV, small extracellular vesicle.
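The derived CS/DS parameters reported here can be computed from per-sample fmol amounts as in the following Python sketch. The sulfate counts follow the text (D0a0 nonsulfated, D0a4 and D0a6 monosulfated, D0a10 disulfated). Defining the average sulfation rate as sulfate groups per disaccharide is an assumption, and the fmol values are hypothetical rather than taken from Supplemental Table S14.

```python
# Number of sulfate groups carried by each disaccharide.
SULFATES = {"D0a0": 0, "D0a4": 1, "D0a6": 1, "D0a10": 2}


def csds_metrics(fmol):
    """Compute total amount, relative composition, 6S/4S, and mean sulfation."""
    total = sum(fmol.values())
    relative = {name: 100 * amount / total for name, amount in fmol.items()}
    # Assumed definition: average number of sulfate groups per disaccharide.
    avg_sulfation = sum(SULFATES[name] * amount for name, amount in fmol.items()) / total
    ratio_6s_4s = fmol["D0a6"] / fmol["D0a4"]
    return {
        "total_fmol": total,
        "relative_percent": relative,
        "avg_sulfation": avg_sulfation,
        "ratio_6s_4s": ratio_6s_4s,
    }


# Hypothetical fmol amounts for one sEV sample.
sample = {"D0a0": 10, "D0a4": 60, "D0a6": 20, "D0a10": 10}
metrics = csds_metrics(sample)
```

Group-level fold changes would then be obtained by comparing these per-sample values between A549 and BEAS-2B sEVs.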
CS/DS disaccharide profiles were further evaluated using hierarchical clustering and PCA based on both relative (Fig. 10) and absolute abundances (Supplemental Fig. S2). PCA of relative CS/DS composition revealed clear separation between A549 and BEAS-2B sEVs along PC1 (63.2% variance explained) and PC2 (29.04% variance explained). Heatmap analysis showed clustering of D0a4 with D0a10 and of D0a0 with D0a6 disaccharides, whereas samples clustered according to cell line origin.
Fig. 10. Multivariate analysis of CS/DS disaccharide profiles. A, PCA based on relative CS/DS disaccharide abundances. B, heatmap with hierarchical clustering of relative amounts of CS/DS disaccharides. CS, chondroitin sulfate; DS, dermatan sulfate; PCA, principal component analysis.
Discussion
In the current study, we characterized the proteomic, N-glycoproteomic, and CS/DS disaccharide profiles of sEVs derived from A549 and BEAS-2B cells. TEM and MRPS analysis confirmed the presence of particles <200 nm in size, and proteomic analysis confirmed the presence of common EV proteins, verifying the quality of sEVs.
Proteomics
The final proteomic dataset of 945 proteins allowed robust statistical analysis; however, the presence of an additional 583 proteins in at least two media samples highlights the challenge of background contamination, confirming the need for strict filtering criteria in proteomic analysis of cell culture–derived EVs. A high proportion of proteins (408 of 945) was found to be differentially expressed, suggesting pronounced differences between sEVs derived from the two cell lines.
Previous studies have shown that the protein cargo of EVs largely reflects the proteomic state of their parental cells, including cancer-associated alterations (60). Several proteins dysregulated in A549 sEVs in the present study have also been reported to change in the same direction in proteomic analyses of cancer tissue, plasma, or EV samples. Fibronectin is a major extracellular matrix (ECM) protein involved in cell adhesion and matrix organization, and its upregulation has been described in plasma-derived EVs from breast cancer patients (61). Proliferating cell nuclear antigen is a well-known marker associated with cell proliferation, and its expression was found to be elevated in lung cancer tissue (62). Among the proteins downregulated in A549 sEVs, fibulin-2 is an ECM protein involved in cell adhesion and tissue organization that has been shown to be dysregulated in various cancers (63). Analyzing lung cancer cell lines, fibulin-2 was downregulated in 9 of 11 cell lines compared with normal bronchial epithelial cells (64). The detection and relative enrichment of these proteins in sEVs derived from A549 and BEAS-2B cells indicate the suitability of this cell line–based system for comparative EV proteomic analyses.
CSPG analysis identified alterations in both ECM PGs (versican, aggrecan, and testican-1) and cell surface PGs (CSPG4, syndecan-4), which may reflect differences in ECM-associated and cell membrane–associated PG signaling in A549 and BEAS-2B sEVs (65). CS, HS, and KS containing PGs were all affected, suggesting that all these GAG classes are worth investigating. The largest increase in A549 sEVs (FC = 14.3) was observed for testican-1, which has previously been shown to be upregulated in several cancer types, including lung cancer (66) and has been confirmed to be present in EVs (67). The largest decrease (FC = 0.083) was observed for syndecan-4, a protein that is upregulated or downregulated in a cancer type–dependent manner (68) and plays a key role in cell adhesion, migration, and signal transduction.
GSEA revealed several dysregulated processes, which can be mostly associated with cell cycle regulation (e.g., negative regulation of cell cycle process, cell cycle checkpoint signaling), DNA repair (e.g., signal transduction in response to DNA damage, DNA damage response), metabolism (e.g., RNA metabolic process, nucleic acid metabolic process), protein synthesis (e.g., translation, peptide biosynthetic process), and immune response (e.g., immune response, immune effector process). Mostly upregulated processes were identified, whereas immune-related pathways showed negative enrichment. This pattern may partially reflect the higher number of upregulated proteins compared with downregulated ones in the dataset. Overall, the functional categories identified in this analysis emphasize differences in the functional composition of sEV protein cargo between the two cell lines and overlap with pathways that are frequently reported in cancer-related proteomic studies.
Glycomics-Guided N-Glycoproteomics
In N-glycoproteomics, 227 glycoforms were analyzed, of which 152 were differentially represented between A549 and BEAS-2B sEVs. The observed profiles allowed for complete separation of the two groups in PCA, indicating major differences in the composition of sEV-associated N-glycoforms. Such alterations in glycoform representation may be influenced by changes in the activity or regulation of glycosyltransferases and glycosidases, which are frequently reported to be altered in cancer (31, 69). In line with this, altered expression of specific glycosyltransferases was associated with cancer patient prognosis, highlighting the relevance of glycosylation-related pathways in tumor biology (70).
N-glycoproteomics and N-glycomics studies of EVs are very limited (71), with the scarce literature on EV glycoproteomics focusing mostly on urine (72) and blood plasma (73). In lung cancer, a previous study compared N-glycomic patterns on sEVs from small cell lung cancer and NSCLC cells and found that the N-glycans of small cell lung cancer-sEVs are fairly heterogeneous, whereas NSCLC-sEVs contain primarily core-fucosylated, biantennary, and triantennary N-glycans (74).
In cancer, complex N-glycans, particularly highly branched structures, are often upregulated because of the overexpression of glycosyltransferases (69). We observed an increased ratio of complex N-glycans and a decreased ratio of oligomannose N-glycans in A549 sEVs compared with BEAS-2B sEVs. These differences may reflect cell-line–specific glycan processing, for example, variations in glycosyltransferase expression or activity.
Fucosylation plays an important role in various biological processes, including cell signaling, adhesion, and immune modulation (75). Core fucosyltransferase (FUT8), which mediates core fucosylation, is frequently upregulated in cancer cells (76), including lung cancer (77). In our study, A549 sEVs showed a trend toward increased fucosylation compared with BEAS-2B sEVs, although this difference did not reach statistical significance.
We characterized some glycoproteins that are highly glycosylated, and several of their glycoforms were dysregulated between A549 and BEAS-2B sEVs, for example, CSPG2, LG3BP, and laminins. CSPG2 is an ECM protein that plays a role in cell adhesion and migration and has been implicated in cancer-associated processes in previous studies (78). In our study, we identified several dysregulated CSPG2 glycoforms, including both upregulations and downregulations. At glycosylation site N1898, three fucosylated, multiantennary glycoforms were under-represented, whereas at three different positions, a total of five fucosylated, biantennary glycoforms exhibited over-representation in A549 sEVs.
LG3BP is a highly glycosylated protein known to be enriched in EVs and associated with cancer-related processes, including immune modulation and metastasis (79, 80). A total of 21 differentially represented LG3BP glycoforms (at N69, N398, and N551) were detected in our study, of which 20 were under-represented in A549 sEVs. Compared with the LG3BP protein from proteomics, the site-specific glycoforms showed mixed patterns, with roughly half exhibiting higher relative abundance and half lower, indicating that glycosylation changes are not strictly proportional to protein-level changes.
Laminins are key ECM components involved in cell adhesion and basement membrane organization and have been linked to cancer-associated remodeling processes in previous studies (81, 82). Each of the laminin subunits LAMA2, laminin subunit α5, and laminin subunit γ1 was found to possess dysregulated glycoforms. For example, LAMA2 at glycosylation site N1810 showed the upregulation of several sialylated, biantennary, and triantennary structures.
The site-specific N-glycoform differences observed in CSPG2, LG3BP, and laminin subunits highlight the heterogeneity of sEV glycosylation between the two cell lines. We found that glycoform changes were not strictly proportional to differences in protein levels, emphasizing the importance of site-specific analysis. While the present study focused on profiling site-specific glycoforms, future research may examine the functional consequences of these modifications to determine their potential role in sEV interactions and signaling.
CS/DS Analysis
For the first time, we performed MS-based disaccharide-level analysis of CS/DS chains in sEVs derived from lung epithelial cells. CS/DS disaccharide analysis revealed an increased amount of CS/DS chains in A549 sEVs compared with BEAS-2B sEVs (FC = 3.4). Increased amounts of CS/DS have been reported in tumor tissues from several cancer types, including liver (37), prostate (36), and lung cancer (83). The average rate of CS/DS sulfation slightly increased (FC = 1.1), and we observed a marked difference in the 6S/4S ratio (FC = 0.48).
MS-based research on GAG analysis of EVs remains extremely limited. So far, only a specific GC–MS-based technique has been applied, which breaks all glycopolymers, including GAGs, into their constituent saccharide units. Using this method, it was demonstrated that EVs derived from melanoma cells with or without brain metastasis contain different amounts of hyaluronan (HA) (84), and that EVs derived from plasma have different glycan profiles from whole plasma and are enriched in CS, DS, and KS GAGs (85). However, disaccharide-level characterization of CS/DS in EVs has not been reported previously.
Although EV-level CS/DS disaccharide data are lacking, the CS/DS profile has been extensively characterized in lung cancer tissues, providing a relevant biological framework for interpreting our findings. In a comprehensive study of CS/DS, HS, and HA, the cancer tissue samples contained over twice as much CS/DS as the normal tissue samples, and the 6S/4S ratio increased markedly (FC = 2.0), whereas the amounts of HS and HA were not significantly different (35). In a comparison of tumor and adjacent normal regions from patients with different types of lung cancer, the total amount of CS/DS disaccharides was higher in tumor than in adjacent normal regions (FC = 2.2); the relative amount of D0a0 decreased (FC = 0.75), whereas the amounts of the monosulfated components (D0a4, FC = 4.1 and D0a6, FC = 2.3) increased in tumor (83). This resulted in an increase in the average CS/DS sulfation (FC = 2.4) and a decrease in the 6S/4S ratio (FC = 0.83). In ALK-rearranged lung adenocarcinoma tissues, total CS/DS increased by 2.5- to 4.4-fold depending on the sample type, whereas average sulfation increased by 4.0- to 4.7-fold, and the 6S/4S ratio was variable but increased in most cases (86). In line with these tissue-based studies, sEVs derived from A549 also showed a significant increase in total CS/DS content compared with BEAS-2B sEVs, although differences in sulfation patterns were not entirely consistent with observations in tissues.
Changes in the sulfation pattern have been reported to influence cell signaling pathways and were associated with cancer progression and metastasis in previous studies (87). The decreased 6S/4S ratio observed in A549 sEVs may reflect altered regulation of chondroitin 4-O-sulfotransferase enzymes, including carbohydrate sulfotransferase 11. Overexpression of the gene encoding this enzyme has been associated with unfavorable prognosis in liver (88), pancreatic (89), and lung (90) cancer.
Oncofetal CS (ofCS) is a unique GAG structure that has been associated with both fetal development and cancer (91). Using VAR2CSA-based binding approaches, ofCS has been detected on tumor-derived sEVs from multiple cancer cell lines, including A549, as well as in plasma samples from cancer patients (92, 93). Although ofCS was not directly assessed in the present study, previous reports linking ofCS to increased 4-O-sulfation (94) are consistent with the elevated 4-O-sulfation observed in A549 sEVs.
Conclusions
This study provides an integrated and systematic characterization of the proteome, N-glycoproteome, and CS/DS GAG landscape of sEVs derived from A549 lung adenocarcinoma and BEAS-2B nontumorigenic cells. While proteomic profiling of sEVs is well represented in the literature, glycosylation and GAG-level features have remained largely underexplored. By addressing these complementary molecular layers in parallel, our work expands the current understanding of sEV composition and heterogeneity.
Across all analytical dimensions, sEV profiles robustly reflected cellular origin, as demonstrated by consistent separation of A549 and BEAS-2B sEVs using hierarchical clustering and PCA. Proteomic analysis revealed significant dysregulation of five CSPGs, including testican-1 and syndecan-4, which have been implicated in cell adhesion, migration, and tumor progression. At the N-glycosylation level, A549 sEVs exhibited a higher relative abundance of complex N-glycans, accompanied by reduced abundance of oligomannose structures. This pattern is consistent with glycosylation changes commonly observed at the cellular level. CS/DS analysis revealed a 3.4-fold increase in total CS/DS disaccharide content and a significantly altered 6S/4S ratio in A549 sEVs, indicating shifts in sulfation pattern that may influence sEV interaction with their molecular environment.
Together, these findings demonstrate that proteomic, N-glycoproteomic, and GAG-level analyses provide complementary and nonredundant information, underscoring the value of an integrated multilayered approach for sEV characterization. As glycan structures are determined by the activity of specific enzymes, such as fucosyltransferases, N-acetylglucosaminyltransferases, and sulfotransferases, future studies integrating transcriptomic or proteomic profiling of glycosylation machinery will be critical for elucidating the mechanistic basis of the observed alterations. Building on the analytical framework established here, future work will extend this approach to additional cancer cell models and, ultimately, to plasma-derived sEVs, supporting the development of more informative glyco-focused EV biomarkers.
Data Availability
The proteomic and N-glycoproteomic data are available in the MassIVE repository (https://doi.org/10.25345/C51834F3N) and can be downloaded via FTP (ftp://massive-ftp.ucsd.edu/v09/MSV000097305/). The GAG-omics and N-glycomics data presented in this study have been deposited in the GlycoPOST database (95) under the accession numbers GPST000562 and GPST000667, respectively. The R code associated with the data analyses is available on GitHub (https://github.com/balbisimirjam/A549_BEAS-2B_EV.git).
Supplemental Data
This article contains supplemental data.
Conflict of Interest
The authors declare no competing interests.
References
- 1. Bray F., Laversanne M., Sung H., Ferlay J., Siegel R.L., Soerjomataram I., Jemal A. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 74 (2024) 229–263. doi:10.3322/caac.21834. PMID: 38572751.
- 2. Babar L., Modi P., Anjum F. Lung Cancer Screening, in StatPearls. 2025, StatPearls Publishing, Treasure Island (FL). PMID: 30725968.
- 3. Tan A.C., Tan D.S.W. Targeted therapies for lung cancer patients with oncogenic driver molecular alterations. J. Clin. Oncol. 40 (2022) 611–625. doi:10.1200/JCO.21.01626. PMID: 34985916.
- 4. Pinheiro F.D., Teixeira A.F., de Brito B.B., da Silva F.A.F., Santos M.L.C., de Melo F.F. Immunotherapy - new perspective in lung cancer. World J. Clin. Oncol. 11 (2020) 250–259. doi:10.5306/wjco.v11.i5.250. PMID: 32728528. PMCID: PMC7360520.
- 5. Mullen S., Movia D. The role of extracellular vesicles in non-small-cell lung cancer, the unknowns, and how new approach methodologies can support new knowledge generation in the field. Eur. J. Pharm. Sci. 188 (2023) 106516. doi:10.1016/j.ejps.2023.106516. PMID: 37406971.
- 6. Cambier M. Extracellular vesicles (EVs) as diagnostic tools in the phenotypic determination of lung tumors. Eur. Respir. J. 60 (suppl 66) (2022) 1370.
- 7. Maacha S., Bhat A.A., Jimenez L., Raza A., Haris M., Uddin S., Grivel J.C. Extracellular vesicles-mediated intercellular communication: roles in the tumor microenvironment and anti-cancer drug resistance. Mol. Cancer 18 (2019) 55. doi:10.1186/s12943-019-0965-7. PMID: 30925923. PMCID: PMC6441157.
- 8. Zaborowski M.P., Balaj L., Breakefield X.O., Lai C.P. Extracellular vesicles: composition, biological relevance, and methods of study. Bioscience 65 (2015) 783–797. doi:10.1093/biosci/biv084. PMID: 26955082. PMCID: PMC4776721.
