Transgenic Tobacco as a Bioreactor for the Production of Bioactive and Triple-Helical Recombinant Type III Collagen
Tairu Wu, Weisong Pan, Jiahao Pan, Yahui Wu, Wai Chin Li, Eric Po Keung Tsang, Chuan Wu

TL;DR
Scientists used transgenic tobacco plants to produce bioactive type III collagen, which could be useful for regenerative medicine.
Contribution
This is the first report of triple-helical recombinant type III collagen production in transgenic tobacco plants.
Findings
Transgenic tobacco plants coexpressing COL3A1 and modification enzymes produced thermally stable rhCOL3.
Plant-derived rhCOL3 has a triple-helix structure and biological activity.
Propeptides of rhCOL3 were correctly cleaved by coexpressed enzymes.
Abstract
Collagen is the primary protein in the extracellular matrix of human cells and the body and is essential for cell structure and function. Here, for the first time, we report a method for producing recombinant triple-helical collagen type III (rhCOL3) in transgenic tobacco as a bioreactor. We constructed a pMDV-COL3A1 vector containing the human type III collagen gene COL3A1, as well as a pMDV-COL3A1:5E vector that coexpressed COL3A1 and the enzymes required for its posttranslational modification. These two vectors were used to transform tobacco genetically. The COL3A1 gene was successfully coexpressed in tobacco plants with four enzymes that promote its posttranslational modification. The transcriptional level of COL3A1 in the transgenic lines coexpressing posttranslational modification genes was greater than that in the transgenic lines expressing only COL3A1. The enzyme-modified…
Funding
- Hunan Natural Science Fund for Distinguished Young Scholar, China
- Advanced Interdisciplinary Project in Central South University
- Central South University Graduate School-Enterprise Joint Innovation Project
Taxonomy
Topics: Transgenic Plants and Applications · Collagen: Extraction and Characterization · Polysaccharides and Plant Cell Walls
1. Introduction
Collagen, the primary protein of the human extracellular matrix, makes up one-third of the body’s protein and three-quarters of the skin’s dry weight. Collagens play crucial roles in the connections among skin, bones, and joints, as well as in wound healing, cell signalling, and tissue repair [1]. Type III collagen forms collagen fibrils and provides the stiff, resilient properties of many tissues; most of it is found in the skin and vascular system. Type III collagen is a homotrimer: the three chains that form its triple helix are identical α1 chains, each built from a repeating Gly–Xaa–Yaa triplet. In the endoplasmic reticulum, the nascent α-chain (pro-α chain) is modified by prolyl-4-hydroxylase (P4H) and lysyl hydroxylase (LH), which hydroxylate proline and lysine residues at the Yaa position, respectively, and a portion of the hydroxylated lysine residues is further glycosylated [2]. Proline and lysine hydroxylation are necessary to stabilize the triple helix. The C-terminal propeptides of three pro-α chains form disulfide bonds under the synergistic action of calnexin and protein disulfide isomerase (PDI), which brings the chains together into a trimer; the triple helix then folds from the C-terminus to the N-terminus [3]. With the assistance of the collagen-specific molecular chaperone HSP47, procollagen is transported through the Golgi apparatus to the extracellular matrix. Procollagen C-proteinase (PCP) and procollagen N-proteinase (PNP) then cleave the C-terminal and N-terminal propeptides, yielding the mature collagen molecule [4].
Introns are noncoding sequences that are spliced out of the primary transcript during mRNA maturation and, therefore, are not directly involved in protein coding. Nevertheless, many studies have demonstrated that introns are more than just “junk DNA”. In some cases, they can increase mRNA accumulation through a mechanism, termed intron-mediated enhancement (IME), that is not yet fully understood [5]. In transgenic mice, constructs containing introns produced 10 to 100 times more mRNA than constructs without introns [6]. In Saccharomyces cerevisiae, a construct containing the TDH3 promoter followed by the RPS25A intron showed approximately 50-fold higher expression than the TDH3 promoter alone, indicating that introns can enhance protein expression [7]. Callis et al. reported that introns can increase the transcription of Adh1 in maize [8]. Clancy et al. demonstrated that the splicing of the first intron of the Sh1 gene is essential for enhancing gene expression [9]. Ramona et al. added 12 introns to the Cas9 gene and increased the expression level of this protein in Nicotiana benthamiana and Arabidopsis, thereby significantly enhancing the efficiency of gene editing [10].
With respect to recombinant type III collagen, one study showed that properly hydroxylated human collagen III α1 chains can be produced in Pichia pastoris GS115. However, because the expressed α1(III) chains lacked the N-terminal and C-terminal propeptides, they could not form triple helices [11], and there was no evidence of biological activity. More recently, high-level secretory expression of recombinant human-like type III collagen α1 was achieved in Pichia pastoris via multilevel systematic optimization, raising the yield of rhCOL3 to 10.3 g/L in a 5-L fermenter. However, the lyophilized rhCOL3 samples showed distinct curly, flocculent, and helically entangled structures and exhibited certain spongy features [12]. Given these limitations, a new approach is needed to produce recombinant type III collagen.
Plant molecular farming has received extensive attention because of its potential as a lower-cost, efficient method for the large-scale production of medicines. Compared with microbial and mammalian bioreactors, plant platforms enable scalable, cost-effective production of complex biomolecules without the need for expensive fermentation infrastructure. Tobacco is a promising bioreactor that has been used to produce various natural bioactive compounds and medicinal proteins. Fang et al. engineered tobacco for efficient astaxanthin production using a linker-free monocistronic dual-protein expression system [13]. Forskolin was successfully produced in transgenic Nicotiana tabacum by expression of six biosynthetic genes [14]. The B subunit of cholera toxin, a vaccine antigen, was expressed and assembled as functional oligomers in transgenic tobacco chloroplasts [15]. A high-affinity FLAG antibody was recombinantly expressed in Nicotiana benthamiana and purified [16]. Transgenic tobacco has also been used to produce glucagon-like peptide-1, a small peptide hormone with potent insulinotropic activity [17]. Ruggiero et al. first used Nicotiana tabacum as a bioreactor for the production of human homotrimeric type I collagen in 2000 [18]. By coexpressing human procollagen α1(I) with P4H, recombinant hydroxylated homotrimeric type I collagen was subsequently expressed in tobacco [19]. These results demonstrate that recombinant human type I procollagen α1(I) chains can be expressed in tobacco and form disulfide-bonded, stable triple helices, with expression levels in fresh leaves reaching 20 mg/kg.
In 2009, the Israeli regenerative medicine company CollPlant obtained recombinant human type I collagen at expression levels of up to 200 mg/kg fresh leaves in tobacco by coexpressing genes encoding the human type I collagen α1 and α2 chains, P4H, and PLOD3 [20]. Therefore, the expression of type III collagen in tobacco represents a promising approach.
In this study, we used stably transformed Nicotiana tabacum as a bioreactor to produce type III collagen while coexpressing the enzymes that assist in its posttranslational modification. We inserted multiple introns into the coding sequence of human type III collagen to increase its transcriptional efficiency. We built two constructs, one containing five mammalian enzymes and the other lacking them, and transformed both into tobacco to compare the effects of these enzymes on collagen expression. The five enzymes are P4H, LH, PCP, PNP, and FTO. The first four assist in the posttranslational modification of collagen and maintain the stability of the triple helix, whereas FTO is a human RNA demethylase that can induce more open chromatin and transcriptional activation in plants, which may increase collagen transcript levels [21]. The recombinant type III collagen expressed in tobacco was purified and characterized. Structural analysis and cell experiments demonstrated that rhCOL3 has a complete triple-helix structure and biological activity. It could therefore serve as a source of active, plant-derived type III collagen for skin repair and regenerative medicine. Our work not only establishes a scalable and sustainable plant-based platform for type III collagen production but also provides a model for the molecular farming of high-value exogenous proteins.
2. Materials and Methods
2.1. Construction of Expression Vectors
All coding sequences were codon-optimized for Arabidopsis thaliana using the JCat tool (www.jcat.de) and chemically synthesized. Each gene was fused to a vacuolar-targeting signal (MAHARVLLLALAVLATAAVAVASSSSFADSNPIRPVTDRAASTLES) derived from the thiol protease aleurain (GenBank accession number: XP_044947634.1), then cloned into the pMDV expression vector. All the promoters and terminators used are listed in Table S1. The sequence of the plasmid and related information are presented in Supplementary Materials File S1. The long DNA sequence was divided into eight fragments, which were cloned using specifically designed recognition sites for the BsaI restriction endonuclease.
2.2. Plant Transformation
The rhCOL3 expression vector was transformed into Agrobacterium tumefaciens strain C58C1. Agrobacterium was cultivated in LB broth at 28 °C to an OD600 of 0.8, after which the cells were resuspended in Murashige and Skoog (MS) liquid medium [22]. Aseptic leaf discs from 4-week-old plants of the local tobacco cultivar ‘K326’ were transformed using a previously described Agrobacterium-mediated procedure [23]. The callus and shoot induction medium consisted of MS + 0.5 mg/L 6-BA + 30 mg/L hygromycin + 200 mg/L timentin; the root induction medium consisted of MS + 30 mg/L hygromycin + 200 mg/L timentin. Plants were cultivated at 25 °C under a 16 h light/8 h dark photoperiod.
2.3. Total RNA Extraction and qPCR Experiments
Total RNA was extracted from the leaves of the transgenic tobacco plants using an RNA Extraction Kit (TIANGEN, Beijing, China) according to the manufacturer’s instructions. The extracted total RNA was reverse transcribed into cDNA using the TIANScript II cDNA First Strand Synthesis Kit (TIANGEN, Beijing, China) according to the manufacturer’s instructions. Gene expression levels were analysed by qPCR on an Applied Biosystems QuantStudio 1 (Thermo Fisher Scientific, Waltham, MA, USA) using 2× ChamQ Blue Universal SYBR qPCR Master Mix (Vazyme, Nanjing, China). PCR primers were designed using NCBI Primer BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast/, accessed on 24 February 2026). The housekeeping gene Ntubc2 was used as the reference gene. The sequences of all the primers used in the qPCR are shown in Table S2. Each qPCR reaction was performed in a total volume of 20 μL containing 10 μL of 2× ChamQ Blue Universal SYBR qPCR Master Mix, 0.1 μM specific primers, and 100 ng of template cDNA.
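The text does not state how relative expression was computed from the qPCR data. As an illustration only, the sketch below assumes the widely used 2^-ΔΔCt calculation with Ntubc2 as the reference gene; the function name, argument names, and Ct values are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target gene by the 2^-ddCt calculation (assumed method).

    ct_*_sample     : Ct values in the line of interest
    ct_*_calibrator : Ct values in the line used as baseline
    The reference gene (here assumed to be Ntubc2) normalizes cDNA input.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values, for illustration only.
fold = relative_expression(22.0, 18.0, 25.0, 18.0)
print(fold)
```

The calculation assumes roughly 100% amplification efficiency for both primer pairs; if efficiencies differ, an efficiency-corrected model would be more appropriate.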
2.4. ELISA
Fresh leaves (0.1 g) were rapidly ground on ice using 1 mL of PBS buffer containing 0.1 mM PMSF. The mixture was then centrifuged at 12,000 rpm for 15 min at 4 °C, and the supernatant was used for the detection of the target protein content. A human type III collagen ELISA kit (CSB-E04799h; CUSABIO, Shanghai, China) was used to measure the rhCOL3 content in the supernatant. The measurements were carried out in accordance with the manufacturer’s instructions.
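To relate an ELISA reading (ng/mL of extract) to a content per fresh leaf weight (mg/kg), a conversion along the following lines can be used. This is a minimal sketch: it assumes the extract volume equals the 1 mL of buffer added to 0.1 g of leaf, and the `dilution_factor` parameter is a hypothetical addition for samples diluted before measurement.

```python
def elisa_to_mg_per_kg(conc_ng_per_ml, extract_volume_ml=1.0,
                       leaf_mass_g=0.1, dilution_factor=1.0):
    """Convert an ELISA concentration to mg of protein per kg fresh leaf.

    Assumes the extract volume equals the buffer volume added (1 mL),
    i.e. the water contributed by the leaf tissue itself is ignored.
    """
    total_ng = conc_ng_per_ml * dilution_factor * extract_volume_ml
    ng_per_g = total_ng / leaf_mass_g
    # 1 ng/g equals 1 ug/kg, so dividing by 1000 gives mg/kg.
    return ng_per_g / 1000.0


# Illustrative input: a reading of 850 ng/mL in undiluted extract.
print(elisa_to_mg_per_kg(850.0))
```

Under these assumptions, each 100 ng/mL in the undiluted extract corresponds to 1 mg/kg fresh leaf weight, so samples with high rhCOL3 content would need to be diluted to stay within the working range of a kit.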
2.5. Western Blotting (WB) Analysis
Total soluble proteins were extracted from 0.2 g of tobacco leaves by grinding them in 1 mL of NP-40 lysis buffer supplemented with the protease inhibitor PMSF to prevent protein degradation. After centrifugation, the supernatant was mixed with 5× protein denaturation loading buffer and heated at 100 °C for 7 min. Samples (20 µL) were separated on a polyacrylamide gel and then transferred to a PVDF membrane. The PVDF membrane was blocked in 5% skim milk for 1 h and then incubated overnight with the primary antibody diluted 1:5000 in blocking solution. The rhCOL3 protein was immunodetected using an anti-type III collagen rabbit polyclonal antibody (SAB4500367; Sigma-Aldrich, Darmstadt, Germany), which recognizes the N-terminal propeptide, and an anti-type III procollagen mouse monoclonal antibody (C7805; Sigma-Aldrich, Darmstadt, Germany), which recognizes mature type III collagen. Horseradish peroxidase-conjugated goat anti-rabbit IgG (Sigma-Aldrich, Darmstadt, Germany) was used as the secondary antibody. The human-derived type III collagen standard used in this study was purchased from Sigma-Aldrich (C4407).
2.6. Protein Purification
The protein purification method was based on previous studies, with some modifications. Approximately 1 kg of tobacco leaves was blended in 2 L of chilled extraction buffer (100 mM Tris-HCl, pH 7.5, containing 4.5 mM potassium metabisulfite and 7.5 mM EDTA) supplemented with 15 g of polyvinylpolypyrrolidone (PVPP). The crude extract was passed through a Miracloth filter (Millipore, Darmstadt, Germany). The rhCOL3 in the filtrate was precipitated by gradually adding NaCl to a final concentration of 3.13 M with continuous stirring. The solution was then left at 4 °C for 8 h to allow the precipitate containing the target protein to form. The pellet containing human type III collagen was collected by centrifugation (26,000 × g, 2 h, 4 °C), resuspended in 400 mL of a solution containing 250 mM acetic acid and 2 M NaCl for 5 min, and centrifuged again at 26,000 × g for 40 min at 5 °C. The supernatant was discarded, and the precipitate was resuspended in 200 mL of 0.5 M acetic acid for 1 h at room temperature. Insoluble material was removed by centrifugation at 16,000 × g and 15 °C for 30 min, and the supernatant was filtered through a Miracloth filter. rhCOL3 was then precipitated by gradually adding NaCl to a final concentration of 3 M with continuous stirring. The solution was left at 4 °C for 8 h, after which the pellets containing rhCOL3 were collected by centrifugation at 26,000 × g and 4 °C for 2 h. The above precipitation and pellet dissolution steps were repeated once. The pellets were redissolved in 40 mL of 10 mM HCl. The solution was transferred to a dialysis bag (MWCO 30 kDa) and dialyzed against 4 L of 10 mM HCl (4 h, 4 °C). The rhCOL3 was then concentrated to a final volume of 2 mL using a 15 mL PES ultrafiltration spin column. To further improve the purity of the rhCOL3, we employed size-exclusion chromatography using Geldex® 200 PG Resin (BioLink, Shanghai, China) with PBS (pH 7.2).
2.7. Southern Blot
Approximately 80 μg of tobacco genomic DNA was digested with HindIII or EcoRI for 20 h and electrophoresed on a 1% agarose gel at 50 V for 15 h. After alkaline denaturation, the digested genomic DNA was transferred to a nylon membrane. The prehybridization process was carried out at 60 °C for 2 h. After the biotin-labelled probe was added to the hybridization solution, hybridization was performed at 60 °C overnight. After the membrane was washed, the biotin signal of the probe was detected. HRP-conjugated streptavidin was coupled to the biotin probe, after which an enhanced chemiluminescence (ECL) assay was performed.
2.8. High-Efficiency Thermal Asymmetric Interlaced PCR (TAIL-PCR)
Three nested primers were designed on the transgenic vector close to the left border (LB). In accordance with the method described by Liu and Whittier [24], genomic DNA of tobacco line 7TU-77 was amplified in three rounds using the three nested primers and the arbitrary degenerate primer AD (5′-TG(A/T)GNAG(A/T)ANCA(G/C)AGA-3′), which has 128-fold degeneracy. The products obtained after the third round of PCR amplification were subsequently sequenced.
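The 128-fold degeneracy of the AD primer follows directly from its ambiguous positions: each (A/T) or (G/C) position doubles the number of possible sequences and each N quadruples it. The short check below writes the primer in IUPAC notation (W = A/T, S = G/C); the variable and function names are illustrative.

```python
# Number of bases each IUPAC symbol can stand for (subset needed here).
IUPAC_OPTIONS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT",    # A or T
    "S": "CG",    # G or C
    "N": "ACGT",  # any base
}

# 5'-TG(A/T)GNAG(A/T)ANCA(G/C)AGA-3' rewritten with IUPAC codes.
AD_PRIMER = "TGWGNAGWANCASAGA"


def degeneracy(primer):
    """Count the distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC_OPTIONS[base])
    return count


print(degeneracy(AD_PRIMER))  # 2 * 4 * 2 * 4 * 2 = 128
```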
2.9. Circular Dichroism Analysis
Recombinant collagen was dissolved in deionized water to 100 μg/mL, and CD measurements were carried out using a J-1500 spectrophotometer (JASCO, Tokyo, Japan) and a quartz cuvette with a path length of 10 mm. Spectra were recorded at a continuous scan rate of 10 nm/min over a wavelength range of 180 nm to 260 nm. Protein thermal transition curves were obtained by monitoring the ellipticity of protein solutions (20 mM Tris–HCl, pH 7.0) at 220 nm as the temperature was increased from 4 to 60 °C at 5 °C/h.
2.10. Scanning Electron Microscopy (SEM)
rhCOL3 was dissolved in 0.1 M acetic acid to prepare a 1 mg/mL stock solution. PBS was added to adjust the pH to 7.2, and the mixture was incubated for 1 h at 37 °C. Fibrils were collected by centrifugation and fixed in 2.5% glutaraldehyde at 4 °C overnight to prevent structural deformation. The samples were dehydrated through an ethanol gradient (30, 50, 70, 80, 90, and 100%) for 15 min at each concentration, after which the ethanol was replaced with liquid CO₂ in a critical point dryer. The samples were gold-coated and imaged with a GeminiSEM 360 microscope (ZEISS, Oberkochen, Germany).
2.11. Amino Acid Analysis
The sample was hydrolysed in hydrochloric acid at 110 °C for 24 h. The amino acids were converted into stable derivatives using a Kairos Amino Acid Kit (Waters Corporation, Milford, MA, USA). Chromatographic separation was performed using a Waters ACQUITY UPLC I-CLASS ultrahigh-performance liquid chromatography system with a Waters UPLC HSS T3 column (1.8 μm, 2.1 mm × 150 mm). The mobile phase consisted of phase A (water, 0.1% formic acid) and phase B (acetonitrile). A Waters XEVO TQ-S micro tandem quadrupole mass spectrometry system (Waters, Milford, MA, USA) was used for mass spectrometric analysis. The positive ion source voltage was 1.5 kV, the cone voltage was 20 V, the desolvation temperature was 600 °C, the desolvation gas flow rate was 1000 L/h, and the cone gas flow rate was 10 L/h. The peak areas were calculated using MassLynx quantitative software (V4.0), and the quantitative results were obtained through a standard curve.
2.12. Cytotoxicity Assay
Mouse L929 fibroblasts were purchased from STEM RECELL Ltd., Shanghai, China (STM-CL-6929). Both cell types used in this study had been passaged fewer than 5 times. The cells were seeded into 96-well plates at a density of 8 × 10³ cells per well and incubated at 37 °C with 5% CO₂ for 24 h. The culture medium was removed, and 100 μL of recombinant collagen at different concentrations was added. The same volume of culture medium was added to the control group. After 24 h of incubation, the supernatant was removed. Cell culture medium supplemented with 10% CCK-8 was added, and the cells were incubated for 2 h. The absorbance was then measured at 450 nm. Each treatment was carried out with three biological replicates.
2.13. Cell Adhesion Assay
Human fibroblasts were purchased from Oricell Ltd., Suzhou, China (HXXFB-00001). Recombinant collagen solution (0.5 mg/mL) was added to a 96-well cell culture plate; 10% heat-denatured BSA was used as a control. The plate was incubated at 4 °C for 24 h, after which the supernatant was removed. Then, 100 μL of L929 suspension was added, and the plate was incubated at 37 °C for 6 h. Adherent cells were quantified by total DNA content. The cells were lysed by three freeze-thaw cycles in ultrapure water. Hoechst 33258 (5 μg/mL) was added to the cell lysate, which was then incubated for 1 h in the dark. The fluorescence intensity at an emission wavelength of 465 nm was measured using a microplate reader. Each sample was measured three times in parallel. Each treatment was carried out with three biological replicates.
2.14. Cell Migration Assay
Human fibroblasts were seeded at a density of 2 × 10⁵ cells per well in a 24-well culture plate. After the cells were cultured at 37 °C and 5% CO₂ for 24 h, a scratch was made in each well using a 200-μL pipette tip held perpendicular to the plate. The wells were then washed with PBS three times to remove the detached cells. One hundred microlitres of sample solution was added, and the plate was placed in an incubator (37 °C, 5% CO₂) for another 24 h. The liquid was discarded, and each cell group was photographed using an inverted microscope. The average scratch area was calculated using Image-Pro Plus 6.0 software (Media Cybernetics, Rockville, MD, USA).
The healing rate was calculated as Healing rate (%) = (initial scratch area − current scratch area)/initial scratch area × 100%. Each treatment was carried out with three biological replicates.
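The healing-rate formula can be expressed as a small helper, sketched below. The function name is illustrative and the areas in the example are made-up numbers, not measurements from this study; any consistent area unit (for example, pixels from the image analysis software) works because the units cancel.

```python
def healing_rate(initial_area, current_area):
    """Healing rate (%) = (initial - current) / initial * 100."""
    if initial_area <= 0:
        raise ValueError("initial scratch area must be positive")
    return (initial_area - current_area) / initial_area * 100.0


# Made-up example: a scratch that shrinks from 200 to 50 area units.
print(healing_rate(200.0, 50.0))  # 75.0
```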
3. Results
3.1. Generation of the Two Expression Vectors and the Generation of Transgenic Plants
The expression vectors constructed in this study are shown in Figure 1a. The COL3A1 construct carries the full-length COL3A1 gene encoding human type III collagen, while the COL3A1:5E construct contains the COL3A1 gene, the mouse P4Hα and P4Hβ genes, the pig LH gene, and the human PNP and PCP genes. We introduced signal peptides at the N-termini of COL3A1, P4Hα, P4Hβ, LH, PNP, and PCP to direct their accumulation in vacuoles. Since FTO acts on mRNA, it was targeted to the cytoplasm. We inserted 15 introns from Arabidopsis thaliana into the CDS of COL3A1 (Figure 1b). The insertion sites followed the GU-AG rule (Chambon rule), with one intron placed approximately every 200–400 base pairs of the CDS. Where no suitable site was available, synonymous base substitutions permitted by codon degeneracy were introduced to create one. Transformants of COL3A1 and COL3A1:5E were selected for hygromycin B resistance. Among the 31 regenerated plants transformed with COL3A1:5E, 22 were confirmed to be transgenic positive; among the 52 regenerated plants transformed with COL3A1, 44 were confirmed to be transgenic positive. We predicted splice sites for the inserted introns using NetGene2 (https://services.healthtech.dtu.dk/services/NetGene2-2.42/). The confidence values of all the designed splice sites were above 0.85 (Figure 1c). To verify whether the exons of the COL3A1-encoding sequence in the COL3A1- and COL3A1:5E-positive plants were correctly spliced, RNA was extracted and reverse transcribed into cDNA for PCR verification. We designed two pairs of primers that span introns at the 5′ and 3′ ends of the COL3A1-encoding sequence (Figure S1a) and examined intron splicing in the 22 COL3A1:5E lines and 44 COL3A1 lines.
The results showed that the introns were accurately spliced in most plants, producing bands of the correct size. A few lines showed no bands (Figure S1b–e), possibly because their transcript levels were too low to be amplified. Sequencing of the PCR products confirmed that the introns were completely and accurately spliced (Figure 1c).
3.2. The Expression Level of COL3A1 Increased in the Plants Coexpressing These Enzymes
To understand the effects of these posttranslational modification enzymes on the expression level of COL3A1 in tobacco, we used quantitative real-time PCR (qRT-PCR) to measure COL3A1 expression levels in all the COL3A1:5E and COL3A1 transgenic plants obtained. Compared with that in wild-type (WT) plants, the transcription level of the COL3A1 gene in both COL3A1 (Figure 2a,b) and COL3A1:5E transgenic (Figure 2c) plants was significantly greater. The expression level of the COL3A1 gene in the COL3A1:5E transgenic plants was also significantly greater than that in the COL3A1 transgenic plants (Figure 2d). To obtain lines with relatively high rhCOL3 protein expression and confirm that the genes encoding these enzymes were expressed, we selected 12 lines of COL3A1:5E transgenic plants whose COL3A1 mRNA expression levels were relatively high and whose exons were correctly spliced to determine the transcription levels of the genes encoding these five enzymes. The expression levels of all the enzyme-encoding genes in most of the transgenic lines were higher than those in the WT plants (Figure 2e–j), and their expression levels were approximately 1 to 3 times those in the WT. We also compared COL3A1 expression levels in the roots, stems, and leaves across these 12 lines. Overall, the expression level of COL3A1 in the leaves was approximately 1.5 times that in the stems, while the expression level in the stems was only half that in the roots (Figure 2k).
We used ELISA to determine the expression levels of rhCOL3 in the COL3A1 and COL3A1:5E transgenic lines with correct mRNA splicing. For each vector, we measured the protein expression levels of 12 transgenic lines. The protein expression levels of the COL3A1:5E transgenic lines were significantly greater than those of the COL3A1 transgenic lines (Figure 3a). The fresh leaves of COL3A1 transgenic plants contained approximately 1.5–6.9 mg/kg rhCOL3, whereas in the COL3A1:5E transgenic lines the rhCOL3 content ranged from 8.5 to 44 mg/kg fresh leaf weight. We then conducted WB analysis of rhCOL3 expression in these 12 COL3A1:5E lines. Only the COL3A1:5E transgenic line 17 displayed the target protein band (Figure 3b). This line will be referred to as 7TU-77 hereinafter. Because ELISA detected rhCOL3 in all lines, whereas WB detected a band only in the most highly expressing line, we tested the antibody sensitivity using a standard sample. The results revealed a WB detection limit of 25–50 ng (Figure S2), whereas the detection range of the ELISA kit was 12.5–800 ng/mL. We also performed WB on 7TU-77 using an antibody specific for the N-propeptide of type III procollagen; pro-α chains were not detected (Figure 3c). Because the antibody reacts only with the N-propeptide of pro-α(III) chains, these results indicate that the N-propeptide of procollagen is cleaved by procollagen N-proteinase in 7TU-77.
3.3. The 7TU-77 Line Has a Single Copy of Expression Cassettes
To determine how many copies of the transgenic expression cassette are present in the genome of the line with high rhCOL3 expression, we conducted a Southern blot experiment. We digested genomic DNA of line 7TU-77 with the restriction endonucleases HindIII and EcoRI. Probes were designed on the basis of the expression vector for hybridization. The labelled probes were 968 bp in length, and their positions on the vector are shown in Figure S3. After digestion with either of the two restriction endonucleases and hybridization with the probes, both samples showed a single band, indicating that the COL3A1:5E construct is present as a single copy in 7TU-77 (Figure 4a). We then determined the insertion site of the transgenic construct in the 7TU-77 genome using TAIL-PCR to amplify the unknown flanking sequence. After three rounds of amplification, we obtained a product approximately 750 bp in size (Figure S4). The PCR product was sequenced and aligned to the tobacco reference genome. The insertion site was located in an exon of a DHHC-type zinc finger family gene on chromosome 2 (Figure 4b).
3.4. Purification and Determination of the Triple Helix Structure and Thermal Stability
Among all the transgenic lines, the 7TU-77 line presented the highest protein expression level. Therefore, we propagated this line and extracted and purified the recombinant protein from the leaves of its T2 generation plants using salting-out followed by size-exclusion chromatography. The subsequent functional experiments were all based on the rhCOL3 protein derived from 7TU-77. SDS-PAGE analysis was performed on the purified and concentrated recombinant protein, and the protein was stained with Coomassie blue (Figure 5a). Through quantitative ELISA, the yield of the first salting-out step was determined to be 92%, the yield of the second salting-out step was 95%, and the yield of the size-exclusion chromatography step was 87%. The viscosity of the purified rhCOL3 solution is shown in Figure S5.
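The three step yields can be combined into an approximate cumulative recovery. The sketch below assumes that each reported yield is relative to the material entering that step and that losses in other steps (repeated precipitation, dialysis, concentration) are not counted, so the result is an upper-bound estimate rather than a figure reported by the study.

```python
import math

# Step yields as reported, expressed as fractions.
STEP_YIELDS = {
    "first salting-out": 0.92,
    "second salting-out": 0.95,
    "size-exclusion chromatography": 0.87,
}


def overall_yield(step_yields):
    """Multiply per-step recoveries to get the cumulative recovery."""
    return math.prod(step_yields)


overall = overall_yield(STEP_YIELDS.values())
print(f"Estimated cumulative recovery: {overall:.1%}")
```

Under these assumptions, roughly three-quarters of the rhCOL3 present before the first salting-out step would be recovered after chromatography.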
To determine whether rhCOL3 can assemble into a triple helix in tobacco, we used circular dichroism (CD) to analyse its structure. The CD spectrum showed a negative peak between 190 and 200 nm and a positive peak at approximately 220 nm (Figure 5b), which is the typical CD signature of the collagen triple helix [25]. We then tested the thermal stability of the triple helix by monitoring the CD signal at 220 nm as the temperature was increased. The thermal transition temperature (Tm) of rhCOL3 was 39 °C (Figure 5c).
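The text does not specify how Tm was extracted from the melting curve. One common convention is to take the temperature at which the normalized 220 nm signal has dropped to half of its total change; the sketch below implements that convention with linear interpolation. The function name and the data points are synthetic and purely illustrative, not values from Figure 5c.

```python
def estimate_tm(temps_c, signal):
    """Temperature at which the normalized signal first falls through 0.5.

    Assumes the signal decreases as the helix unfolds and that temps_c is
    sorted in ascending order.
    """
    s_max, s_min = max(signal), min(signal)
    if s_max == s_min:
        raise ValueError("signal is flat; no transition to locate")
    # Rescale so the folded plateau is 1 and the unfolded plateau is 0.
    frac = [(s - s_min) / (s_max - s_min) for s in signal]
    for i in range(len(frac) - 1):
        upper, lower = frac[i], frac[i + 1]
        if upper >= 0.5 > lower:
            # Linear interpolation between the two bracketing points.
            step = temps_c[i + 1] - temps_c[i]
            return temps_c[i] + (upper - 0.5) / (upper - lower) * step
    raise ValueError("no downward crossing of the half-transition point")


# Synthetic curve for illustration only.
print(estimate_tm([20.0, 30.0, 40.0, 50.0], [8.0, 8.0, 0.0, 0.0]))
```

With real data, fitting a sigmoidal two-state model or taking the maximum of the first derivative are alternative conventions that may give slightly different values.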
PNP and PCP are crucial enzymes in collagen biosynthesis and were coexpressed with COL3A1 in tobacco in this study. Their primary function is to remove the propeptides from the ends of procollagen, converting it into mature collagen, which is a prerequisite for fibril assembly. To investigate whether PNP and PCP can process rhCOL3 into its mature form in tobacco, we induced fibril formation in fibrillogenesis buffer and then analysed the morphology using SEM. We observed characteristic long fibrils (Figure 5d), indicating that the rhCOL3 derived from 7TU-77 is mature collagen.
3.5. Detection of the Components and Functions of rhCOL3 Derived from 7TU-77
To determine whether the composition of rhCOL3 derived from 7TU-77 matches that of natural human collagen, we conducted a quantitative amino acid analysis. The chromatograms of the standard sample and the 7TU-77 sample are shown in Figure S6. A total of 18 amino acids, including histidine, arginine, aspartic acid, serine, glycine, glutamic acid, proline, phenylalanine, alanine, lysine, tyrosine, methionine, valine, isoleucine, leucine, threonine, asparagine, and glutamine, were detected in rhCOL3. The three most abundant amino acids were glycine, proline, and alanine, which accounted for 33.2%, 12.4%, and 10.46%, respectively (Table S3). The levels of proline, leucine, phenylalanine, hydroxylysine, and lysine were numerically greater in 7TU-77-derived rhCOL3 than in human-derived type III collagen. Overall, the amino acid content was almost equivalent to that of natural human type III collagen. The ratio of hydroxyproline/hydroxylysine was 11.3%, which was lower than the 15.85% ratio in natural type III collagen.
To verify the functionality of the recombinant tobacco-derived collagen, we conducted toxicity tests on 7TU-77-derived rhCOL3 using human fibroblasts and mouse L929 fibroblasts. We treated L929 cells with rhCOL3 at concentrations of 0.01, 0.05, 0.1, and 0.3 mg/mL for 24 h and used DMEM as the control (Figure S7). The viability of mouse L929 cells treated with 0.01 and 0.05 mg/mL rhCOL3 was significantly greater than that of the control group (Figure 6a). Next, we treated L929 cells with 0.01 mg/mL rhCOL3 and examined their adhesion, using a denatured 10% BSA solution as a control. The adhesion rate of the cells treated with rhCOL3 was greater than that of the control group but lower than that of the human type III collagen group (Figure 6b). We also conducted a cell migration experiment on human fibroblasts. A scratch was made in the wells of the cell culture plate, and the cells were treated with rhCOL3 at concentrations of 0.0625, 0.125, and 0.25 mg/mL. The healing rate in all three treated groups was greater than that in the blank control group treated with serum-free medium. Culture medium containing 10% FBS was used as a positive control (Figure 6c and Figure S8).
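For readers unfamiliar with how a scratch-assay healing rate is typically quantified, the minimal sketch below uses the common area-based definition, in which the closed fraction of the initial scratch area is expressed as a percentage. The text does not state the formula the authors used, so this definition, the function name `wound_closure_percent`, and all of the numbers are assumptions made for illustration only.

```python
def wound_closure_percent(area_t0, area_t):
    """Percentage of the initial scratch area that has closed at a later time."""
    if area_t0 <= 0:
        raise ValueError("initial scratch area must be positive")
    if area_t < 0:
        raise ValueError("remaining scratch area cannot be negative")
    return 100.0 * (area_t0 - area_t) / area_t0


# Hypothetical scratch areas in arbitrary units; not data from this study.
initial_area = 1000.0
remaining_area = {
    "serum-free control": 800.0,
    "rhCOL3 0.125 mg/mL": 550.0,
}

closure = {
    group: wound_closure_percent(initial_area, area)
    for group, area in remaining_area.items()
}

for group, pct in closure.items():
    print(f"{group}: {pct:.1f}% closed")
```

In practice the areas would come from image analysis of the same field at the start and end of the treatment, and each group would be measured in replicate wells before any statistical comparison.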
4. Discussion
Collagen fibrils and their denatured derivative, gelatine, are the main constituents of many medical products, such as implants, haemostats, vascular stent coatings, and cartilage repair materials [26]. The use of collagen derived from animals may carry a risk of transmitting pathogens, including zoonotic agents such as viruses and prions. At present, no pathogenic bacteria that infect both humans and plants have been discovered. Moreover, many people worldwide have dietary or religious restrictions, leading them to seek collagen products from other sources. Recombinant human collagen materials offer a desirable alternative. In this study, we demonstrated the feasibility of synthesizing functional mature human type III collagen in tobacco.
The expression level of the rhCOL3 protein was significantly greater in the COL3A1:5E transgenic lines than in the COL3A1 transgenic lines. The qRT-PCR results revealed that the five enzyme-encoding genes were expressed in the transgenic lines, suggesting that the coexpressed enzymes promoted COL3A1 expression in tobacco. This increase might be due to the coexpression of prolyl hydroxylase or the human RNA demethylase FTO. Human FTO was initially identified as a protein associated with body mass and obesity [27]. FTO-mediated m6A demethylation promotes chromatin openness and induces transcriptional activation in rice and potato, thereby increasing their growth and yield [21].
After procollagen is secreted outside the cell, procollagen N proteinase and procollagen C proteinase cleave the propeptides of procollagen. This cleavage exposes the telopeptides of the collagen molecules, which mediate the interaction between collagen molecules [28]. These specific interactions allow collagen molecules to assemble and become covalently cross-linked, forming the collagen fibrils that constitute connective tissue. Therefore, only collagen whose propeptides have been removed can form fibrils. Purified rhCOL3 formed fibrils when induced in vitro, indicating that the rhCOL3 expressed in tobacco in this study is a mature collagen protein whose propeptides were removed. This approach eliminates the need for in vitro enzymatic cleavage after purification. Before being assembled into a triple-helix structure, each monomer undergoes a series of posttranslational modifications. First, approximately 145 of the 239 proline residues in the triple-helix region are hydroxylated by prolyl-4-hydroxylase to form 4-hydroxyproline. Second, some lysine residues are hydroxylated or glycosylated [2]. Human type III collagen devoid of posttranslational modifications has a theoretical molecular weight of 95 kDa. In this study, the molecular weight of the mature rhCOL3 molecules we obtained exceeded 130 kDa. However, this molecular weight is similar to that of natural type III collagen extracted from the human placenta, which was used in our antibody sensitivity test. The higher apparent molecular weight might be caused by posttranslational modifications of collagen. These results also indirectly indicate that the modifications of the rhCOL3 we obtained may be similar to those of natural type III collagen. They further support the considerable application potential of plant bioreactors for expressing mammalian proteins.
The triple helix of 7TU-77-derived rhCOL3 began to unfold when the temperature exceeded 34 °C and was completely denatured at 39 °C. This thermal transition temperature was almost the same as that of natural type III collagen [29], supporting the conclusion that rhCOL3 derived from 7TU-77 has properties closely resembling those of mammalian collagen. Cell experiments also demonstrated that 7TU-77-derived rhCOL3 enhances cell viability and promotes cell migration. These findings indicate that rhCOL3 derived from plants has the potential to be used as a medical repair material free from viral or bacterial contamination. In this study, amino acid composition analysis showed slight differences between the amino acid percentages of our rhCOL3 and the literature-reported values for native human type III collagen. These differences are likely attributable to interference from trace impurities and systematic methodological errors. The lower Hyp/Hyl ratio observed in rhCOL3 relative to its native counterpart likely resulted from the slightly elevated hydroxylation level of lysine residues. Because Hyl is a structural residue involved in hydrogen bond formation within the collagen triple helix, the increased Hyl content may strengthen intramolecular and intermolecular hydrogen bonding, thereby conferring greater conformational stability on the triple-helical structure of rhCOL3.
Among all the COL3A1:5E transgenic lines obtained, the 7TU-77 line presented the highest expression level and was the only line in which rhCOL3 could be detected by WB. To explore the reasons for its high expression level, we determined the copy number of the transgenic expression cassette in the genome and its insertion site. The Southern blot results indicated that there was only one copy. These findings suggest that copy number may not be a significant factor in the expression of exogenous proteins. The insertion of the single-copy 7TU-77 transgenic expression cassette disrupted the function of a gene. The disrupted gene encodes a DHHC-type zinc finger family protein. The genes in this family are usually involved in S-acylation, also known as palmitoylation, a crucial lipid modification of proteins. Protein S-palmitoylation is a reversible posttranslational modification that can alter the localization, stability, protein-protein interactions, and signal transduction function of hundreds of proteins and increase the hydrophobicity of specific protein substructures in cells [30,31]. Therefore, disruption of this gene may directly or indirectly affect collagen stability. This assumption requires further experimental verification.
For exogenous genes, a single-copy insertion has a simple genetic structure, is inherited more stably in plants, and reduces the risk of the genetic recombination and gene rearrangements that lead to genetic variation. The expression of exogenous genes in single-copy plants is stable, whereas that in multicopy transgenic plants is susceptible to gene silencing, leading to a decrease in gene expression levels [32]. It has been reported that there is no correlation between the number of transgene copies and the expression level in transgenic plants obtained from Agrobacterium-mediated transformation [33]. In this study, the single-copy 7TU-77 line presented higher protein expression than the other lines, which is consistent with the absence of an inherent correlation between the expression level and the number of transgene copies. Whether the high expression is related to the insertion site remains an open question that warrants further study.
This study is the first to express full-length type III collagen in plants, and the results verified its feasibility. This study provides a reference for further in-depth research on plant-derived full-length type III collagen. Previous studies have successfully expressed full-length type I collagen in tobacco, maize, and barley [19,20,34,35]. Skin and soft tissue fillers and 3D bioprinting scaffolds based on plant-derived collagen have already been developed [36]. Further studies on the extensibility of plant-derived collagen will continue.
In conclusion, our work has demonstrated a promising method for producing recombinant collagen in plant bioreactors. The COL3A1 gene, with 15 introns, was successfully expressed as recombinant type III collagen for the first time in the vacuoles of tobacco cells. Moreover, we expressed LH, P4H, PCP, and PNP in vacuoles to modify rhCOL3. We also expressed FTO in the cytoplasm to promote its transcription. The physical properties of mature type III recombinant collagen extracted from the transgenic tobacco line 7TU-77 are highly similar to those of natural human type III collagen. It enhances cell viability and migration and can spontaneously form collagen fibrils in an appropriate buffer system (Figure 7). This work demonstrates an environmentally friendly production method for recombinant collagen and provides a safe option for medical regeneration and skin repair materials.
References
- 1. Pillai N.S., Khan S.A., Mehrotra N., Jadhav K. A comprehensive review on the role of collagen in health and disease. Biosci. Biotechnol. Res. Asia 2024, 21, 1329–1347. doi:10.13005/bbra/3307
- 2. Kuivaniemi H., Tromp G. Type III collagen (COL3A1): Gene and protein structure, tissue distribution, and associated diseases. Gene 2019, 707, 151–171. doi:10.1016/j.gene.2019.05.003. PMID: 31075413. PMCID: PMC6579750
- 3. Bächinger H.P., Bruckner P., Timpl R., Prockop D.J., Engel J. Folding mechanism of the triple helix in type-III collagen and type-III pN-collagen. Role of disulfide bridges and peptide bond isomerization. Eur. J. Biochem. 1980, 106, 619–632. doi:10.1111/j.1432-1033.1980.tb04610.x. PMID: 7398630
- 4. N’Diaye E.N., Cook R., Wang H., Wu P., La Canna R., Wu C., Ye Z., Seshasayee D., Hazen M., Lin W. Extracellular BMP1 is the major proteinase for COOH-terminal proteolysis of type I procollagen in lung fibroblasts. Am. J. Physiol. Cell Physiol. 2021, 320, C162–C174. doi:10.1152/ajpcell.00012.2020. PMID: 33206546
- 5. Gallegos J.E., Rose A.B. The enduring mystery of intron-mediated enhancement. Plant Sci. 2015, 237, 8–15. doi:10.1016/j.plantsci.2015.04.017. PMID: 26089147
- 6. Brinster R.L., Allen J.M., Behringer R.R., Gelinas R.E., Palmiter R.D. Introns increase transcriptional efficiency in transgenic mice. Proc. Natl. Acad. Sci. USA 1988, 85, 836–840. doi:10.1073/pnas.85.3.836. PMID: 3422466. PMCID: PMC279650
- 7. Hoshida H., Kondo M., Kobayashi T., Yarimizu T., Akada R. 5′-UTR introns enhance protein expression in the yeast Saccharomyces cerevisiae. Appl. Microbiol. Biotechnol. 2017, 101, 241–251. doi:10.1007/s00253-016-7891-z. PMID: 27734122
- 8. Callis J., Fromm M., Walbot V. Introns increase gene expression in cultured maize cells. Genes Dev. 1987, 1, 1183–1200. doi:10.1101/gad.1.10.1183. PMID: 2828168
