Functional characterization of a dockerin-containing expansin-like protein from the anaerobic fungus Neocallimastix californiae
Taru Koitto, Anna Pohto, Elizaveta Sidorova, Thu V. Vuong, Merja Penttilä, Emma R. Master

TL;DR
This study identifies and characterizes a fungal protein that enhances cellulose breakdown, offering insights into how anaerobic fungi digest plant material.
Contribution
The first functional characterization of a fungal cellulosomal expansin-like protein with dockerin domains.
Findings
NcaEXLX1 binds more strongly to cellulose than its truncated version.
Both NcaEXLX1 and its truncated form enhance the activity of an endoglucanase.
Enhanced cellulase activity is not directly linked to the protein's binding strength.
Abstract
Anaerobic microbes produce multienzyme complexes known as cellulosomes to enhance the degradation of cellulosic substrates. These complexes localize diverse enzymes onto a protein scaffold, where proteins are anchored by dockerin domains. Although the cellulosomes of anaerobic fungi incorporate a broad array of cellulolytic enzymes, they remain largely unexplored. Notably, some fungal cellulosomes reportedly comprise expansin-like proteins with potential to disrupt cellulose networks. While two bacterial cellulosomal expansin-like proteins have been characterized, no fungal cellulosomal expansin-like proteins have been functionally characterized to date. Sequence analyses of expansin-like proteins from the anaerobic fungus Neocallimastix californiae revealed similar N-terminal domains among proteins with or without appended dockerins. Those without dockerins, however, consistently…
Funding
- Suomen Kulttuurirahasto (https://doi.org/10.13039/501100003125)
- Jenny ja Antti Wihurin Rahasto (https://doi.org/10.13039/501100004022)
- Natural Sciences and Engineering Research Council of Canada (https://doi.org/10.13039/501100000038)
- Horizon 2020 Framework Programme (https://doi.org/10.13039/100010661)
Taxonomy
Topics: Biofuel production and bioconversion · Enzyme Production and Characterization · Fungal and yeast genetics research
Background
Lignocellulose, the most abundant source of renewable biomass, can be enzymatically deconstructed to sugars for fermentation to defossilized fuels and higher value chemicals. However, the tightly packed structure of lignocellulose limits enzyme efficiency and remains a barrier to enzymatic valorization of cellulose-based materials [1, 2]. Non-lytic proteins are hypothesized to increase the accessibility of cellulose for lytic enzymes by loosening the lignocellulose structure, thus facilitating its breakdown [1, 3]. Among these proteins are expansin-like proteins, which are predicted to disrupt non-covalent bonds between cellulose and matrix polysaccharides (e.g., pectin and hemicelluloses) or neighboring cellulose microfibrils at so-called tight junctions within lignocellulose substrates [4, 5].
Expansins were first discovered in plants, where they loosen the plant cell wall during acid-induced growth [6]. These proteins consist of two domains: an N-terminal domain (D1) structurally similar to glycoside hydrolases from family 45 (GH45), and a C-terminal domain (D2) which is distantly related to group-2 grass pollen allergens and belongs to the carbohydrate binding module (CBM) family 63 (http://www.cazy.org) [7, 8]. Homologous two-domain proteins found in bacteria and fungi are referred to as expansin-like proteins [9, 10]. Microbial expansin-related proteins extend the classification to loosenins that comprise only the D1 domain [11, 12] and swollenins that include both D1 and D2 domains along with an additional N-terminal fibronectin III domain and CBM1 [13].
To date, BsEXLX1 from Bacillus subtilis is the most extensively studied expansin-like protein. It has been shown to weaken cellulose networks, and amino acids critical to cellulose weakening and cellulose binding have been identified [10, 14–16]. Among the fungal proteins, several studies have focused on the characterization and application of the swollenin (SWO1) from Trichoderma reesei [13]. Briefly, SWO1 has been reported to weaken filter paper, induce swelling in cotton fibers, improve the action of different lignocellulosic enzymes, and display weak hydrolytic activity on cellulosic substrates [13, 17–19]. Other expansin-related proteins, such as loosenin-like proteins (LOOLs) from Phanerochaete carnosa, have been shown to weaken filter paper [20], increase the interfibrillar distance of cellulose microfibrils [21], and boost the enzymatic deconstruction of complex lignocellulosic substrates [22]. Notably, several microbial expansin-like and expansin-related proteins have shown potential to boost the enzymatic hydrolysis of lignocellulose [3, 22, 23]; however, this potential depends significantly on the choice of substrate, lytic enzyme, and enzyme loading [3, 24]. Although the molecular mechanism of expansins as a whole remains unclear, their impact on lytic enzymes is reportedly higher and more consistent than the non-specific effects of reference proteins, such as bovine serum albumin [22, 25, 26].
The majority of expansin-related proteins studied thus far have originated from aerobic microbes. However, anaerobic bacteria and fungi found in the digestive tract of ruminant and non-ruminant herbivores also contain expansin-related proteins within their cellulosomes [25, 27–29]. Cellulosomes are extracellular complexes that integrate enzymes for lignocellulose degradation [29, 30]. These complexes consist of a scaffoldin, a non-catalytic multidomain protein containing multiple cohesin domains to which enzymes and proteins with dockerin domains can attach. Cellulosomes from the anaerobic bacterium Clostridium thermocellum were first reported in the 1980s [31, 32]; since then, cellulosomes have been identified in other anaerobic bacteria [29]. While the structures and interactions of bacterial cohesins and dockerins are well characterized [29, 33], the understanding of fungal cellulosomes remains limited [34]. To date, only two fungal dockerin structures have been resolved [35, 36]. While scaffoldins with repeating motifs, potentially acting as cohesin domains, have been identified [28, 37], the cohesins themselves have not been fully characterized.
Notably, fungal dockerins and scaffoldins lack similarity to their bacterial counterparts, suggesting an independent evolutionary origin [28, 35, 36]. Whereas bacterial cellulosomal proteins contain a single dockerin domain usually at the C-terminus [38, 39], fungal proteins often have duplicate dockerins, though single and triplicate dockerin domains also exist, and the dockerin domains can occur at either the N- or C-terminus [28, 36, 37]. Moreover, fungal dockerins can attach to cellulosomes from other fungal species [28, 40], whereas bacterial dockerin–cohesin interactions tend to be species specific [29, 30]. Multiple dockerin-containing proteins have been characterized before, including two cellulosomal expansin-like proteins (Clocl_1298 and Clocl_1862) from Clostridium clariflavum [25, 27]. Both expansin-like proteins were shown to improve cellulase performance [25, 27], and the cellulose hydrolysis was highest when the expansin-like proteins were bound to the cellulosome; however, both expansin-like proteins retained their ability to boost the cellulase activity after removing their dockerin domains [25]. Similarly, Artzi et al. [27] reported improved cellulase performance in the presence of Clocl_1862, where the impact was greatest when pretreating the cellulose with the expansin-like protein and comparatively low cellulase doses were used [27].
Whereas earlier studies of cellulosomal expansin-like proteins focused on those derived from anaerobic bacteria, this study investigates a cellulosomal expansin-like protein from the anaerobic fungus Neocallimastix californiae. N. californiae is found in the rumen and encodes 2480 different CAZymes and over 420 dockerin-containing proteins, including three expansin-like proteins [28]. While the biological role of expansin-like proteins remains unclear, their potential to boost cellulolytic activity suggests that some support the action of cellulolytic enzymes; this may be especially true for cellulosomal expansin-like proteins encoded by potent lignocellulose degraders. To test this hypothesis, the cellulosomal expansin-like protein from N. californiae (NcaEXLX1) was studied using quartz crystal microbalance with dissipation (QCM-D) to measure protein binding to cellulose and its effect on cellulolytic activity. NcaEXLX1 was produced with and without the dockerin domains to investigate how the dockerins impact protein activity. This study revealed that while the dockerin domains impacted NcaEXLX1 binding to a cellulosic substrate, their presence did not substantially impact the potential of NcaEXLX1 to boost cellulase activity.
Materials and methods
Materials
Cellulose nanofibrils (CNF) were prepared from never-dried bleached kraft birch pulp using a homogenizer (Voith LR40), as described in Österberg et al. [41]. Polyethylene imine (PEI) was purchased from Sigma-Aldrich (Catalog No. 03880). The Pierce bicinchoninic acid (BCA) protein assay kit was purchased from Thermo Fisher Scientific (Catalog No. 23225). The endo-1,4-β-D-glucanase (Cel7B) from Trichoderma longibrachiatum was purchased from Megazyme (Catalog No. E-CELTR).
Sequence analysis
The amino acid sequence of NcaEXLX1 (JGI protein ID: jgi|Neosp1|458000) was compared to other expansin-like proteins from Neocallimastix californiae. The amino acid sequences of expansin-like proteins from Neocallimastix californiae G1 v1.0, which have evidence for expression at the transcript level, were retrieved from Joint Genome Institute (JGI). The sequence similarity of N. californiae expansin-like proteins was analyzed with Clustal Omega sequence alignment [42]. The signal peptide of the proteins was predicted with SignalP-6.0 [43]. AlphaFold3 was used to model the protein structures [44] and ChimeraX for the visualization of the protein models [45]. N- and O-glycosylation sites were predicted using NetNGlyc 1.0 and NetOGlyc 4.0 servers [46, 47]. The pI was predicted using ProtParam [48]. Disulfide bonds were estimated using Disulfide by Design 2 server (http://cptweb.cpt.wayne.edu/DbD2/) [49]. Sequence alignments were visualized using ESPript3 (http://espript.ibcp.fr) [50]. Intrinsically disordered regions of the proteins were predicted using AIUPred [51].
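The pairwise identities reported from such alignments can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: it counts identical residues over ungapped alignment columns, which may differ from the denominator convention Clustal Omega uses for its identity matrix, and the two toy sequences are hypothetical fragments, not N. californiae proteins.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator. Other tools
    may normalize differently (for example by the shorter sequence length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments used purely for illustration.
toy_a = "MKT-AYDGWS"
toy_b = "MKSQAY-GWT"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")  # 6 of 8 columns match: 75.0%
```

The same function can be applied separately to the D1 and D2 columns of an alignment to obtain per-domain identities.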
Protein production and purification
The gene for NcaEXLX1 was synthesized by the Joint Genome Institute (JGI, USA), and the gene for the truncated NcaEXLX1 (NcaEXLX1tr) was synthesized by ATUM (USA). Both genes were cloned into pPICZα plasmids, which add a C-terminal Myc-His-tag to the protein when heterologously expressed in Komagataella phaffii (previously Pichia pastoris) KM71H. Transformants expressing NcaEXLX1 and NcaEXLX1tr were prepared at Concordia University (Montreal, Canada) and the University of Toronto (Toronto, Canada), respectively, and shipped to Aalto University for expression trials.
NcaEXLX1 was produced in a 7-L Sterilizable-In-Place (SIP) bioreactor (Biostat Cplus, Sartorius) according to Pichia Fermentation Process Guidelines (Invitrogen) with some modifications as previously described in Pohto et al. [52]. Briefly, a 300 mL pre-culture was grown in a shake flask until the optical density at 600 nm (OD_600_) reached approximately 50. The pre-culture medium was then replaced with around 50 mL of basal salt medium (BSM: 2.27% (v/v) phosphoric acid, 0.093% (w/v) calcium sulfate, 1.82% (w/v) potassium sulfate, 1.49% (w/v) magnesium sulfate heptahydrate, 0.413% (w/v) potassium hydroxide, 4% (w/v) glycerol; pH adjusted to 5 with ammonium hydroxide), which was subsequently used to inoculate a bioreactor containing 4 L of BSM supplemented with 0.4% (v/v) PTM_1_ trace salts. Dissolved oxygen was set at 35% and the pH of the cultivation was maintained at 5.5 with 14% ammonium hydroxide. The temperature was kept at 30 °C during the glycerol batch and fed-batch phases and was then lowered to 20 °C for the methanol induction phase, during which 0.4% (v/v) of methanol (supplemented with PTM_1_ trace salts) was fed daily for 88 h. For harvesting, the pH was increased to 7.8 with 4 M NaOH and the culture was centrifuged for 20 min at 18,500 × g. The supernatant was then filtered with a microfiltration membrane with 0.45 µm pore size (Sartocon Hydrosart Slice Cassette) using a crossflow system (Sartoflow Study, Sartorius). The supernatant was concentrated with a 10 kDa cutoff ultrafiltration membrane (Sartocon Hydrosart Slice Cassette, Sartorius), and the concentrated supernatant was purified using nickel nitrilotriacetic acid (Ni–NTA) agarose resin (Qiagen, Catalog No. 30230). The purified protein was concentrated and buffer-exchanged into 10 mM sodium acetate pH 6.0 using Vivaspin Turbo 4 ultrafiltration units with 10 kDa molecular weight cutoff polyethersulfone (PES) membranes (Sartorius, Catalog No. VS2002).
NcaEXLX1tr was produced in shake flasks according to the Pichia Expression Kit manual (Invitrogen, Thermo Fisher Scientific). K. phaffii cells were grown in 2.5-L Tunair flasks with 500 mL of buffered glycerol-complex medium (BMGY; 100 mM potassium phosphate buffer pH 6.0, 2% (w/v) peptone, 1% (w/v) yeast extract, 1.34% (w/v) yeast nitrogen base, 4 × 10^–5^% (w/v) biotin, 1% (v/v) glycerol) at 30 °C and 200 rpm. The cultures were centrifuged for 5 min at 1500 × g when the OD_600_ was around 6, and the cells were resuspended in 100 mL of buffered methanol-complex medium (BMMY; 100 mM potassium phosphate buffer pH 6.0, 2% (w/v) peptone, 1% (w/v) yeast extract, 1.34% (w/v) yeast nitrogen base, 4 × 10^–5^% (w/v) biotin, 0.5% (v/v) methanol). The cultivation was continued in 500-mL Erlenmeyer flasks at 20 °C, and 0.5% (v/v) methanol was added every 24 h. After 96 h of induction, the culture was centrifuged (3214 × g, 30 min, 4 °C), and the supernatant was filtered with a 0.45 µm PES membrane. The protein was purified with an ÄKTA Go system (Cytiva) using a 5-mL HisTrap Fast Flow Crude column (Cytiva). The purified protein was concentrated and buffer-exchanged into 10 mM sodium acetate pH 6.0, as described above.
The purified proteins were stored in aliquots at −80 °C. The protein concentration was measured using the BCA protein assay kit (Pierce, Thermo Fisher Scientific), and the purity of the proteins was assessed by SDS–PAGE. The identities of the purified proteins were confirmed with MALDI–TOF/TOF mass spectrometry (HiLIFE, Meilahti Clinical Proteomics Core Facility).
Circular dichroism and nanodifferential scanning fluorimetry
Circular dichroism (CD) spectroscopy was performed using a Chirascan CD spectrometer (Applied Photophysics, Leatherhead, UK) as previously described in Koitto et al. [26]. In brief, the CD data were collected between 280 and 190 nm at 20 °C using a 0.1 cm path-length quartz cuvette and 0.1 mg/mL of each protein in 10 mM sodium acetate pH 6.0. The CD spectra were collected at 1 nm intervals with an integration time of 0.5 s per point, and each measurement was performed in triplicate with baseline correction. Data analysis was conducted using Chirascan Pro-Data Viewer (Applied Photophysics), and secondary structure estimations were performed using BeStSel (https://bestsel.elte.hu/index.php) [53]. The direct CD measurements (θ; mdeg) were converted into mean residue molar ellipticity ([θ]MR) by Pro-Data Viewer. Thermal denaturation of the proteins was measured by collecting CD spectra in the same setup with temperature ramped from 20 °C to 90 °C at a constant rate of 1 °C/min using a Peltier Temperature Control TC125 (Quantum Northwest, Liberty Lake, WA). CD spectra were acquired at every 1 °C increment, and the melting temperature (T_m_) was determined using Global3 software (Applied Photophysics).
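The conversion from raw ellipticity to mean residue molar ellipticity can be sketched with the textbook relation [θ]MR = θ × MRW / (10 × l × c), where θ is in mdeg, MRW is the mean residue weight (molar mass divided by the number of peptide bonds), l is the path length in cm, and c is the concentration in mg/mL. This is an illustration of the standard formula, not a description of how Pro-Data Viewer implements it; the example signal and residue count below are hypothetical.

```python
def mean_residue_ellipticity(theta_mdeg: float, molar_mass_da: float,
                             n_residues: int, path_cm: float,
                             conc_mg_ml: float) -> float:
    """Return [theta]MR in deg cm^2 dmol^-1 from an observed signal in mdeg."""
    if n_residues < 2:
        raise ValueError("need at least two residues")
    # Mean residue weight: molar mass per peptide bond.
    mrw = molar_mass_da / (n_residues - 1)
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_ml)


# Path length and concentration follow the text (0.1 cm, 0.1 mg/mL).
# The -5 mdeg signal and the residue count of 285 are hypothetical values.
value = mean_residue_ellipticity(theta_mdeg=-5.0, molar_mass_da=31100.0,
                                 n_residues=285, path_cm=0.1, conc_mg_ml=0.1)
print(f"[theta]MR = {value:.0f} deg cm^2 dmol^-1")
```

Normalizing by residue count is what allows spectra of the full-length and truncated proteins to be compared on the same scale despite their different sizes.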
The thermal stability and possible protein aggregation during the thermal stability test were also analyzed by nano differential scanning fluorimetry (nanoDSF). NanoDSF was done using the Prometheus NT.48 instrument (NanoTemper Technologies) with capillaries containing 10 μL of protein sample at a concentration of 1 mg/mL in 10 mM sodium acetate pH 6.0. The measurement was conducted as previously reported in Koitto et al. [26]; the temperature was gradually increased from 20 °C to 90 °C at a rate of 1 °C per minute. The excitation wavelength was 280 nm and the ratio of emission intensities (Em 350 nm/Em 330 nm) was recorded. The fluorescence intensity ratio and its first derivative were calculated using PR.ThermControl (NanoTemper Technologies) and further analyzed with the PR.Stability Analysis tool (NanoTemper Technologies). The protein’s melting temperature (T_m_) was determined using MoltenProt (https://spc.embl-hamburg.de/app/moltenprot) [54, 55].
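The role of the first derivative in locating an unfolding transition can be illustrated with a simplified calculation. The sketch below takes the apparent T_m_ as the temperature at which the derivative of the 350/330 nm ratio is largest in magnitude. It is a simplification, not the model fitting performed by MoltenProt or the NanoTemper software, and the melting curve is synthetic with an assumed midpoint of 48 °C.

```python
import math


def first_derivative(x, y):
    """Central differences at interior points, one-sided differences at the ends."""
    n = len(x)
    deriv = [0.0] * n
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        deriv[i] = (y[hi] - y[lo]) / (x[hi] - x[lo])
    return deriv


def apparent_tm(temps, ratio):
    """Temperature at which |d(ratio)/dT| is largest."""
    deriv = first_derivative(temps, ratio)
    i_max = max(range(len(deriv)), key=lambda i: abs(deriv[i]))
    return temps[i_max]


# Synthetic two-state-like curve from 20 to 90 degrees C in 0.5 degree steps.
# The midpoint (48.0), baseline (0.8) and amplitude (0.3) are made-up values.
temps = [20.0 + 0.5 * i for i in range(141)]
ratio = [0.8 + 0.3 / (1.0 + math.exp(-(t - 48.0) / 2.0)) for t in temps]
print(apparent_tm(temps, ratio))  # 48.0
```

Real traces are noisy, so smoothing or a thermodynamic model fit is normally needed before the derivative maximum is meaningful.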
Thin film preparation
CNF thin films were prepared on gold-coated quartz crystal sensors (Advanced Wave Sensors S.L., 5 MHz) using PEI as an anchoring layer, as previously reported [26]. Prior to film preparation, the sensors were cleaned by UV–ozone treatment for 10 min, followed by 10 min washing at 75 °C in a solution containing 25% ammonia and 30% hydrogen peroxide in Milli-Q water (1:1:5 v/v/v), after which the sensors were rinsed with Milli-Q water, dried with nitrogen, and kept for 10 min in the UV–ozone cleaner. The CNF dispersion for the film was prepared by sonicating a 0.2% (w/v) CNF–water dispersion at 25% amplitude for 1 min (Branson Digital Sonifier S-450D), after which the dispersion was centrifuged (8000 × g, 30 min), and the supernatant was used for the thin film preparation. Before adding the CNF dispersion to the sensors, PEI was drop-coated by keeping a 2.5 mg/mL PEI solution on the sensor for 10 min, followed by a rinse with Milli-Q water and drying with nitrogen. CNF dispersion was then added onto the sensors and spin-coated at 4000 rpm for 1 min (Laurell Technologies WS-650SX-6NPP/LITE).
Adsorption studies using quartz crystal microbalance with dissipation (QCM-D)
The adsorption experiments were performed similarly to Koitto et al. [26] with minor modifications. In brief, the adsorption of NcaEXLX1 and NcaEXLX1tr onto CNF thin films was studied by QCM-D (Biolin Scientific, Q-Sense E4). Before the adsorption of the proteins, the films were stabilized by running 50 mM sodium acetate buffer (pH 5.0) over the sensor until stable frequency and dissipation were obtained. Once the baseline was stable, 1 μM NcaEXLX1 or NcaEXLX1tr in 50 mM sodium acetate buffer (pH 5.0) was pumped to the chamber for 2 h with a flow rate of 100 μL/min. The protein solution was then replaced with 50 mM sodium acetate buffer (pH 5.0) to remove the reversibly bound protein. All experiments were carried out at 25 °C and repeated at least twice.
The mass of adsorbed protein on the film was estimated using the Sauerbrey equation (Eq. 1) [56], in which the change in frequency is directly proportional to the change in mass. The Sauerbrey equation applies when the film is rigid and evenly distributed, and the adsorbed masses are relatively small compared to the mass of the sensor.
$$\Delta m = -C\,\Delta f/n \qquad (1)$$
where Δm is the change in mass per unit area in g cm^−2^, C is the mass sensitivity constant, Δf is the change in resonance frequency, and n is the overtone number.
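As a worked illustration of the Sauerbrey relation, the sketch below converts a frequency shift into an areal mass. The value C = 17.7 ng cm^−2^ Hz^−1^ is the constant commonly quoted for 5 MHz AT-cut quartz crystals and is an assumption here, since the text does not state the value used; with C in these units the result is in ng cm^−2^ rather than g cm^−2^. The example frequency shift is hypothetical.

```python
# Assumed mass sensitivity constant for a 5 MHz AT-cut crystal (not from the text).
SAUERBREY_C_NG_CM2_HZ = 17.7


def sauerbrey_mass(delta_f_hz: float, overtone: int,
                   c: float = SAUERBREY_C_NG_CM2_HZ) -> float:
    """Areal mass change (ng/cm^2) from a frequency shift at a given overtone."""
    if overtone <= 0 or overtone % 2 == 0:
        raise ValueError("overtone number must be a positive odd integer")
    # A negative frequency shift corresponds to mass gained on the sensor.
    return -c * delta_f_hz / overtone


# Hypothetical shift of -45 Hz measured at the third overtone.
print(f"{sauerbrey_mass(-45.0, 3):.1f} ng/cm^2")
```

Because the relation assumes a rigid film, masses obtained this way for hydrated protein layers include coupled water and are best read as estimates.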
QCM-D studies of endoglucanase action on cellulose nanofibrils
The impacts of NcaEXLX1 and NcaEXLX1tr on endo-1,4-β-D-glucanase (Cel7B) activity were studied with QCM-D. The films were first stabilized with 50 mM sodium acetate buffer (pH 5.0), after which 1 μM NcaEXLX1 or NcaEXLX1tr in 50 mM sodium acetate buffer (pH 5.0) was pumped to the chamber for 30 min with a flow rate of 100 μL/min. The reversibly bound protein was then washed away with the buffer for 30 min, followed by a 30-min feeding of Cel7B solution (25 μg/mL) in 50 mM sodium acetate buffer (pH 5.0) at a flow rate of 100 μL/min. The flow was stopped after 30 min, and the hydrolysis of the CNF film was monitored for up to 4 h. The initial adsorption rates were calculated from the first 3 min of protein adsorption using mass values obtained from the Sauerbrey equation (Eq. 1). The initial rates of enzymatic hydrolysis were calculated from the slopes obtained in the pseudo-steady state [57], where the hydrolysis rate was linear. The initial hydrolysis rates for Cel7B on CNF were obtained with and without CNF pretreatment with an expansin-like protein, and the statistical significance of the observed differences was analyzed using Welch’s t test. After each experiment, the sensors were rinsed with Milli-Q water and dried with nitrogen before imaging with AFM.
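The rate calculations and the statistical comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: the initial rate is taken as the least-squares slope over the first 3 min, Welch's statistic and the Welch-Satterthwaite degrees of freedom are computed by hand, and all numbers are hypothetical. The function names are illustrative and do not come from the study.

```python
import math
from statistics import mean, variance


def linear_slope(t, y):
    """Least-squares slope of y against t."""
    t_bar, y_bar = mean(t), mean(y)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    return sxy / sxx


def initial_rate(t_min, mass_ng_cm2, window_min=3.0):
    """Slope (ng/cm^2 per min) fitted to the points within the first window_min minutes."""
    pts = [(t, m) for t, m in zip(t_min, mass_ng_cm2) if t <= window_min]
    if len(pts) < 2:
        raise ValueError("need at least two points inside the window")
    ts, ms = zip(*pts)
    return linear_slope(ts, ms)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t_stat = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, df


# Hypothetical adsorption trace: time in min, Sauerbrey mass in ng/cm^2.
time_min = [0.0, 1.0, 2.0, 3.0, 10.0, 30.0]
mass = [0.0, 30.0, 62.0, 88.0, 150.0, 200.0]
print(f"initial rate: {initial_rate(time_min, mass):.1f} ng/cm^2 per min")

# Hypothetical replicate hydrolysis rates with and without pretreatment.
t_stat, df = welch_t([1.9, 2.3, 2.1], [1.2, 1.5, 1.1])
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

Turning the statistic into a p-value requires the Student t distribution; a statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) performs the whole test in one call.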
Atomic force microscopy (AFM)
The AFM measurements were done as previously described [26]. In brief, the CNF thin films were imaged from at least two different positions before and after the QCM-D experiments using a MultiMode 8 atomic force microscope equipped with a NanoScope V controller (Bruker, Santa Barbara, CA). The images were taken with HQ:NSC15 probes (MikroMasch) in air using tapping mode. The images were processed with NanoScope Analysis 1.5, and flattening was the only image correction used.
Results and discussion
Sequence analysis of NcaEXLX1 and comparison to other expansin-like proteins
Expansin-like proteins comprise an N-terminal D1 domain that is structurally similar to GH45 glycoside hydrolases, and a C-terminal D2 domain that has been classified as a CBM63. Transcriptomic data indicate that N. californiae expresses seven proteins containing both of these domains [28], with sequence identity ranging from 34% to 93% (Additional File 1: Fig. S2). Among the seven expansin-like proteins in N. californiae, three were found to contain dockerins and so likely associate with fungal cellulosomes (jgi|Neosp1|458000; jgi|Neosp1|517403; jgi|Neosp1|697018). The sequences of the cellulosomal expansin-like proteins were then compared to prioritize a selection for functional characterization.
All three dockerin-containing expansin-like proteins retained identical residues at key positions within the D1 active center (Additional File 1: Fig. S1). This includes a key aspartic acid in the active center of D1 (corresponding to Asp82 in BsEXLX1), which is essential for cell wall creep activity, as well as another aspartic acid (Asp71 in BsEXLX1) that similarly contributes to the activity [14]. In addition, the sequences retain two amino acids that are predicted to stabilize the active center: a threonine (Thr12 in BsEXLX1) which is conserved in all expansins, and a tyrosine residue, conserved in fungal expansin-like proteins [52, 58–60]. Furthermore, these sequences contained a widely conserved disulfide bridge in a loop close to the catalytic aspartic acid [52, 58, 59]. This disulfide bridge is predicted to be part of an extended putative binding site (PBS) in expansins [7, 52]. In addition to the disulfide bridge near the catalytic region, the sequences contained two other disulfide bridges within the D1 domain, a feature conserved among fungal expansin-like proteins [52, 58].
Although no significant differences were detected in the D1 domain of cellulosomal expansin-like proteins of N. californiae, variation was observed in the PBS of the D2 (Additional File 1: Fig. S1). Three aromatic residues as well as a lysine residue (Lys119 in BsEXLX1) have been shown to contribute to cellulose binding [14, 15]. The lysine residue was present in all the cellulosomal N. californiae expansin-like sequences. However, one protein (jgi|Neosp1|517403) lacked one of the aromatic residues (Tyr157 in BsEXLX1), suggesting that this protein may bind weaker to the targeted substrate compared to the other expansin-like proteins or target a different substrate. As one of the objectives of this study was to assess cellulose adsorption and the impacts of expansin-like proteins on endoglucanase activity, proteins with a potentially higher affinity for cellulose were prioritized. Two of the three dockerin-containing proteins shared high sequence identity (82%) and contained all three conserved aromatic residues in the PBS. Therefore, one of these sequences (jgi|Neosp1|458000) was selected for further characterization and is hereafter referred to as NcaEXLX1.
The D1 and D2 of NcaEXLX1 were further compared to the predicted expansin-like proteins in N. californiae that lack dockerins (jgi|Neosp1|458440; jgi|Neosp1|667364; jgi|Neosp1|674183; jgi|Neosp1|392871) to assess whether cellulosomal and non-cellulosomal expansin-like proteins exhibit sequence differences that might reflect distinct functional roles. The D1 domain of NcaEXLX1 shared relatively high sequence identity (66–71%) with these expansin-like proteins (Additional File 1: Fig. S2). By contrast, the D2 domain of NcaEXLX1 shared relatively low sequence identity (49–56%) with the non-cellulosomal expansin-like proteins. As previously noted, NcaEXLX1 retains all three aromatic residues conserved at the PBS. However, while the first conserved aromatic residue (corresponding to Trp125 in BsEXLX1) is typically a tyrosine [58], it is replaced by a phenylalanine in NcaEXLX1 (Phe169). Notably, this position lacks an aromatic residue altogether in the non-cellulosomal expansin-like proteins, where it is instead occupied by a polar amino acid, such as serine or threonine. The aromatic residue at this position has been shown to be crucial for cellulose binding [14]; therefore, its replacement may indicate that non-cellulosomal expansin-like proteins in N. californiae have evolved to target different substrates.
Compared to BsEXLX1, NcaEXLX1 along with the other expansin-like proteins encoded by N. californiae retain an N-terminal extension of 32–37 amino acids. In NcaEXLX1, this extension comprises approximately 34 residues, is predicted to form a helical structure, and includes a putative N-glycosylation site. Although shorter N-terminal extensions (< 20 amino acids) have been reported in some expansin-like proteins, their functional role remains unclear [7, 26]. In addition, the linker between the D1 and D2 domains of NcaEXLX1, and other expansin-like proteins from N. californiae, is proline-rich and spans approximately 19 amino acids, which is significantly longer than the typical 8–10 residue linker observed in most other expansin-like proteins [9, 10, 52].
In addition to the D1 and D2 domains, NcaEXLX1 has two C-terminal dockerin domains, which are linked to the CBM63 domain via a long linker (37 aa) rich in serine and threonine residues (Fig. 1). This linker is predicted to be glycosylated, with one potential N-glycosylation site and 17 predicted O-glycosylation sites. Structural predictions based on the AlphaFold3 model indicate that both dockerins adopt the characteristic fungal dockerin fold, consisting of a three-stranded β-sheet and a short α-helix [35]. Each dockerin is predicted to contain two disulfide bonds in positions conserved across previously described fungal dockerins [35, 61, 62]. In addition, both NcaEXLX1 dockerins possess a third disulfide bond that links the middle of the domain to the start of the linker, an arrangement observed in some other fungal dockerins [35]. Raghothama et al. [36] have previously identified three key residues in the putative ligand binding site conserved in fungal dockerins (Tyr8, Asp23, and Trp35 in Piromyces equi Cel45A). All of these residues are conserved in both dockerins of NcaEXLX1 (Additional File 1: Fig. S3). The two dockerins are separated by a short linker of approximately 10 amino acids, bringing them into close proximity, consistent with previous observations of fungal proteins containing tandem dockerins [35, 62].
Fig. 1. AlphaFold3 model of NcaEXLX1. **A** AlphaFold3 model showing the different domains of NcaEXLX1. The GH45-like domain (D1) is shown in blue, the CBM63 domain (D2) in orange, the linker between D1 and D2 in red, and the dockerins with the connecting linker in green. **B** AlphaFold3 model showing D1 and D2 with amino acids important for polysaccharide binding and cell wall modification. The key aspartic acid Asp115 (corresponding to Asp82 in BsEXLX1) is shown in bold. Thr44 and Asp104 correspond to Thr12 and Asp71 in BsEXLX1, respectively. Tyr46 corresponds to Thr14 in BsEXLX1, which in fungal expansin-like proteins is usually replaced by a Tyr residue. Lys163, Phe169, Trp170, and Tyr201 are part of the binding surface in D2 and correspond to Lys119, Trp125, Trp126, and Tyr157, respectively, in BsEXLX1.
To investigate the impact of the dockerins on NcaEXLX1 action, the protein was produced in both full-length and truncated forms. The truncated version, lacking the linker following the CBM63 domain and the two C-terminal dockerins, is hereafter referred to as NcaEXLX1tr, while the full-length protein is referred to as NcaEXLX1. NcaEXLX1 and NcaEXLX1tr have predicted pI values of 4.41 and 4.62, respectively. The predicted molecular weights of NcaEXLX1 and NcaEXLX1tr based on their amino acid sequences and including the affinity tags are 45.0 kDa and 31.1 kDa, respectively. However, both proteins exhibited higher electrophoretic molecular weights, consistent with predicted glycosylation (Additional File 1: Fig. S4).
Circular dichroism analysis
The secondary structures of NcaEXLX1 and NcaEXLX1tr were assessed by CD analysis to confirm proper folding. Both proteins adopted a β-sheet rich fold typical for expansin-like proteins with a double-psi-β-barrel (DPBB) structure in D1 and an Ig-like β-sandwich in D2 [10] (Fig. 2). The CD spectra indicated that the full-length NcaEXLX1 has a disordered helix. Since this was not observed for NcaEXLX1tr, the disordered helix likely represents the dockerins, which typically adopt small helical structures [35]. Consistent with this interpretation, AIUPred [51] predicted the alpha-helix of the first dockerin domain (residues 289–296 in NcaEXLX1) to have a disordered binding region that would fold correctly upon substrate binding.
Fig. 2. Mean residue molar ellipticity ([θ]MR) of NcaEXLX1tr (**A**) and NcaEXLX1 (**B**). CD data were collected between 190 and 280 nm at 20 °C using a 0.1 cm path-length quartz cuvette. Data were processed with Chirascan Pro-Data Viewer (Applied Photophysics), and the secondary structure was calculated with BeStSel (https://bestsel.elte.hu/index.php).
In addition, the thermal stability of the proteins was tested to see if the removal of the dockerin domains would impact the thermal stability. T_m_ values of NcaEXLX1tr and NcaEXLX1 were estimated by CD to be 47.7 °C and 50.3 °C, respectively. Melting temperature studies using nanoDSF resulted in similar values, 47.7 °C and 48.8 °C for NcaEXLX1tr and NcaEXLX1, respectively. No protein aggregation was detected from the light backscattering during the nanoDSF measurements (Additional File 1: Fig. S5). The impact of fungal dockerin domains on the thermal stability of cellulosomal proteins has been explored in previous studies. Andrade et al. [63] investigated an endoglucanase from Piromyces finnis containing dual dockerin domains located between the catalytic domain and CBM1. Deletion of the dockerin domains, along with CBM1, resulted in a 9 °C decrease in T_m_ and led to temperature-dependent protein precipitation not observed in the full-length protein [63]. Conversely, Huang et al. [61] found that removing dockerins from Neocallimastix frontalis xylanases, Xyn11A and Xyn11B, increased the T_m_ by approximately 6 °C and 4 °C, respectively [61]. These previous studies, together with the slight change (1–2 °C) in T_m_ observed here, suggest that the impact of dockerin domains on protein thermal stability will vary depending on the specific protein involved. The subsequent experiments were conducted below 40 °C to ensure the stability of the proteins.
Adsorption to cellulose nanofibrils
The adsorption of NcaEXLX1 and NcaEXLX1tr onto CNF was studied using QCM-D to investigate the impact of dockerin domains on the adsorption. The wild-type NcaEXLX1 revealed high affinity towards CNF, with an initial adsorption rate of 35 ng/(cm^2^·min); after 2 h, approximately 270 ng/cm^2^ of NcaEXLX1 had adsorbed onto the CNF layer (Fig. 3; Additional File 1: Fig. S6). By comparison, NcaEXLX1tr revealed low affinity towards CNF, with an initial adsorption rate of 6 ng/(cm^2^·min) and an overall adsorption of about 80 ng/cm^2^ of protein on the CNF layer. The low affinity towards cellulose was expected, as expansin-like proteins have been previously reported to have lower binding capacity to highly crystalline cellulose than other type-A CBMs [15], higher affinity to whole plant cell walls than pure cellulose [14], and are reported to bind specific cellulose regions enriched with xyloglucan [64] or other hemicelluloses [65]. Similar to previously reported fungal expansin-like proteins [26], both NcaEXLX1 and NcaEXLX1tr remained bound to the CNF layer during the washing step.
Fig. 3. Adsorption of NcaEXLX1 (blue) and NcaEXLX1tr (green) on CNF film. A Frequency from third overtone. B Dissipation from third overtone. Experiments were performed at 25 °C in 50 mM sodium acetate (pH 5.0) buffer with a flow rate of 100 μL/min. The film was first equilibrated with the buffer, followed by the injection of the protein solution (1 µM) for 2 h, after which the film was washed with the buffer alone. All experiments were repeated 3 times. Data shown here present the average adsorption behavior of the expansin-like proteins. Other replicates are presented in Additional File 1: Fig. S6.
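Areal masses such as those reported above are commonly derived from QCM-D frequency shifts. This section does not state which model was applied, so the following is only a minimal sketch assuming the Sauerbrey relation, Δm = −C·Δf/n, with the widely used constant C ≈ 17.7 ng cm^−2^ Hz^−1^ for a 5 MHz AT-cut crystal and the third overtone (n = 3) shown in Fig. 3. The function names are hypothetical, and the frequency shifts are back-calculated from the reported masses rather than read from the data.

```python
# Sketch only: assumes the Sauerbrey relation (thin, rigid adsorbed layer).
# C = 17.7 ng cm^-2 Hz^-1 is the usual value for a 5 MHz AT-cut crystal;
# the model and constant actually used in the study are not stated here.
SAUERBREY_C = 17.7  # ng cm^-2 Hz^-1 (assumed)


def sauerbrey_mass(delta_f_hz: float, overtone: int = 3,
                   c: float = SAUERBREY_C) -> float:
    """Areal mass (ng/cm^2) from the raw frequency shift at one overtone."""
    if overtone <= 0 or overtone % 2 == 0:
        raise ValueError("QCM overtones are positive odd integers")
    return -c * delta_f_hz / overtone


def frequency_shift_for_mass(mass_ng_cm2: float, overtone: int = 3,
                             c: float = SAUERBREY_C) -> float:
    """Inverse relation: raw frequency shift (Hz) for a given areal mass."""
    if overtone <= 0 or overtone % 2 == 0:
        raise ValueError("QCM overtones are positive odd integers")
    return -overtone * mass_ng_cm2 / c


if __name__ == "__main__":
    # Adsorbed masses after 2 h, as reported in the text (ng/cm^2).
    reported = {"NcaEXLX1": 270.0, "NcaEXLX1tr": 80.0}
    for name, mass in reported.items():
        shift = frequency_shift_for_mass(mass)
        print(f"{name}: {mass:.0f} ng/cm^2 <-> {shift:.1f} Hz at n = 3")
```

The Sauerbrey relation treats the adsorbed layer as rigid and includes any water coupled to it, so it is best read as an approximation for layers that show a measurable dissipation change.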
Since the primary function of dockerin domains is to bind to scaffoldin and attach the protein to the cellulosome, the impact of the dockerins on NcaEXLX1 binding to CNF was unexpected. Previous studies have explored the impact of fungal and bacterial dockerins on cellulose adsorption. Two cellulosomal xylanases from N. frontalis exhibited no affinity towards cellulose (Avicel) despite having dockerins that are structurally homologous and share 47–67% identity with those of NcaEXLX1 [55]. Similarly, no adsorption to cellulose was reported for a dockerin-containing xylanase and mannanase from Piromyces sp. [14]. Despite the considerable differences between fungal and bacterial dockerins, it is worth noting that a bacterial dockerin from a Ruminococcus flavefaciens endoglucanase [59] and Clostridium thermocellum scaffoldin [60] also lacked affinity towards cellulose. The higher molecular weight of NcaEXLX1 compared to the truncated protein likely contributed to the greater change in frequency observed during its adsorption. However, this alone does not fully explain the greater frequency change, since the molecular weight of NcaEXLX1 is only about twice that of NcaEXLX1tr. It is conceivable that the impact of dockerins on NcaEXLX1 binding to CNF is also mediated through the likely glycosylation of the linker that connects the dockerins to the D2 domain. Payne et al. [61] demonstrated through molecular dynamics simulations that glycosylated linkers can directly bind to cellulose. They then experimentally confirmed the simulations, showing higher binding to crystalline cellulose by the CBM and glycosylated linker of Trichoderma reesei exoglucanase (TrCel7A) compared to the CBM alone [61].
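The molecular-weight argument above can be made concrete with the adsorption values reported for Fig. 3. The sketch below is illustrative only: it takes the molecular-weight ratio as exactly 2 (the text says "about twice"), and the function name is hypothetical.

```python
# Adsorbed mass after 2 h (ng/cm^2), as reported in the text.
MASS_FULL = 270.0       # NcaEXLX1
MASS_TRUNCATED = 80.0   # NcaEXLX1tr
MW_RATIO = 2.0          # assumption: "about twice" taken as exactly 2


def molar_excess(mass_full: float, mass_truncated: float,
                 mw_ratio: float) -> float:
    """Ratio of adsorbed molecules (full-length / truncated).

    Dividing the mass ratio by the molecular-weight ratio converts a
    comparison by mass into a comparison by number of molecules.
    """
    if mass_truncated <= 0 or mw_ratio <= 0:
        raise ValueError("masses and molecular-weight ratio must be positive")
    return (mass_full / mass_truncated) / mw_ratio


if __name__ == "__main__":
    mass_ratio = MASS_FULL / MASS_TRUNCATED
    molar_ratio = molar_excess(MASS_FULL, MASS_TRUNCATED, MW_RATIO)
    print(f"mass ratio:  {mass_ratio:.3f}")
    print(f"molar ratio: {molar_ratio:.3f}")
```

Under these assumptions the mass ratio is about 3.4, while the ratio on a molar basis is still about 1.7, which is the sense in which molecular weight alone does not account for the difference. QCM-D mass also includes water coupled to the adsorbed layer, so this remains a rough comparison.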
In addition to the frequency, dissipation was measured to estimate the impacts of the proteins on the viscoelastic properties of the film. No significant change in dissipation was observed during or after the adsorption of NcaEXLX1tr; the ΔD was below 1·10^−6^. On the other hand, after the adsorption of NcaEXLX1, the ΔD was 2.6·10^−6^, suggesting the film became slightly softer and more viscous.
Impact on endoglucanase activity
Earlier studies show the potential of non-cellulosomal expansin-like proteins to boost the activity of the Cel7B endoglucanase from T. longibrachiatum [3, 26]. Accordingly, Cel7B was used herein to investigate whether the cellulosomal expansin-like protein NcaEXLX1 could similarly enhance cellulolytic activity, as might be expected given its presence in the cellulosome. Both wild-type and truncated forms of NcaEXLX1 were included to evaluate the impact of substrate binding on the protein’s performance. During the pretreatment, approximately 216 ng/cm^2^ of NcaEXLX1 and 54 ng/cm^2^ of NcaEXLX1tr bound to the CNF film (Fig. 4, Additional File 1: Fig. S7). The initial adsorption rate of Cel7B was then calculated to determine how the CNF pretreatment impacted cellulase adsorption. Although Zhang et al. [66] reported an increase in Cel7B adsorption on cellulose (Avicel) following pretreatment with BsEXLX1, the average initial adsorption rate of Cel7B measured here was 31 ng/(cm^2^·min) and was not significantly impacted by either NcaEXLX1 or NcaEXLX1tr pretreatment. After stopping the flow of Cel7B, the frequency increased as the enzymatic hydrolysis of the CNF film exceeded the counteracting frequency decrease from protein adsorption. The frequency increase began approximately 10 min earlier for CNF films pretreated with NcaEXLX1 and NcaEXLX1tr compared to the film without pretreatment. Pretreatment of the CNF films also showed a statistically significant increase in the initial hydrolysis rate of Cel7B by 172% and 190% for NcaEXLX1 and NcaEXLX1tr, respectively (Additional File 1: Table S1). AFM images of the sensors were taken after the hydrolysis to evaluate the final extent of CNF deconstruction. Consistent with the QCM-D measurements, the AFM images revealed fewer fibers on sensors that were pretreated with NcaEXLX1 or NcaEXLX1tr (Fig. 5; Additional File 1: Figs. S8, S9).
These analyses revealed that while the dockerin domains impacted NcaEXLX1 adsorption to CNF, they apparently did not impact the potential of the proteins to boost cellulase performance. This observation is consistent with earlier reports showing low correlation between cellulose binding and the potential of an expansin-like protein to boost cellulase activity [26].
Fig. 4. QCM-D measurements of the hydrolysis of a CNF film by Cel7B with and without expansin pretreatment. A Frequency from third overtone. B Dissipation from third overtone. Experiments were performed at 25 °C in 50 mM sodium acetate (pH 5.0) buffer with a flow rate of 100 μL/min. Pretreatment was done using 1 µM of NcaEXLX1 (blue), NcaEXLX1tr (green), or buffer alone (black). After the pretreatment, the films were rinsed with buffer for 30 min, followed by a 30 min flow of 0.025 mg/mL Cel7B, after which the flow was stopped, and the hydrolysis was monitored for up to 4 h. All experiments were repeated 6–7 times. Data shown here are representative of the data set. Other replicates are presented in Additional File 1: Fig. S7.
Fig. 5. AFM images of sensors before and after QCM-D. Scan size is 10 × 10 µm^2^. CNF-coated sensor before QCM-D A, after treatment with Cel7B alone B, after pretreatment with NcaEXLX1tr and following hydrolysis with Cel7B C, and after pretreatment with NcaEXLX1 and following hydrolysis with Cel7B D. Images presented here are the average representation of six AFM images taken from each sensor. The rest of the images are presented in Additional File 1: Figs. S8, S9.
Although the sensor treated with NcaEXLX1 and Cel7B showed fewer fibers than the sensor treated only with Cel7B, both sensors had similar dissipation values of around 3·10^−6^ at the end of the hydrolysis (Fig. 4B). As the CNF erodes during hydrolysis, dissipation typically decreases. Since the CNF film treated with NcaEXLX1 and Cel7B had less fiber compared to the one treated with Cel7B only, the comparatively high dissipation values observed with the NcaEXLX1 pretreated CNF may result from a protein layer adsorbed on the CNF, since the NcaEXLX1 pretreated CNF film had the highest amount of protein adsorbed. CNF pretreated with NcaEXLX1tr had lower dissipation values (ΔD ~ − 2.6·10^−6^) as would be expected based on the lower number of fibers on the film.
The improvements seen with NcaEXLX1 and NcaEXLX1tr are consistent with previous QCM-D studies that show enhancement of Cel7B activity after cellulose pretreatment with expansin-like proteins [26, 66]. For example, previous QCM-D studies that investigated the action of the fungal expansin-like proteins AmaEXLX1 from Allomyces macrogynus and ApuEXLX1 from Aureobasidium pullulans show their potential to enhance Cel7B hydrolysis of CNF [26]. Following the same method used herein to calculate the percent increase in initial rates of Cel7B activity, it is estimated that AmaEXLX1 and ApuEXLX1 increased Cel7B activity by 172% and 112%, respectively [26]. Such improvements are similar to those observed following CNF pretreatment with NcaEXLX1, indicating that the cellulosomal expansin-like protein did not enhance cellulase activity above what has been observed for expansin-like proteins derived from aerobic fungi.
The bacterial expansin-like protein BsEXLX1 has also been reported to improve Cel7B activity, achieving up to a fivefold increase in the initial hydrolysis rate compared with Cel7B alone [66]. This enhancement exceeds that observed here; however, direct comparison is difficult given that Avicel rather than CNF was used in the corresponding QCM-D studies. Notably, the bacterial cellulosomal expansin-like proteins Clocl_1298 and Clocl_1862 from C. clariflavum improve the cellulolytic activity of wild-type cellulosomes from C. thermocellum and C. clariflavum, a designer cellulosome containing endoglucanase (Cel9F) and exoglucanase (Cel9K), as well as a mixture of GH9 endoglucanase and GH48 exoglucanase on various cellulosic substrates, including Avicel, filter paper, and microcrystalline cellulose [25, 27]. The impact of bacterial cellulosomal expansin-like proteins on endoglucanase alone, however, remains unexplored.
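The enhancements discussed above are quoted on two scales: percent increases (172%, 190%, 112%) and a fold change (fivefold). The sketch below puts them on a common footing and shows one plausible way to obtain an initial rate, as a least-squares slope over an early time window. The window length, the synthetic traces, and all function names are assumptions for illustration; the procedure actually used for Table S1 is not described in this section, and "fivefold increase" is read here as a rate five times that of the control.

```python
def initial_rate(times_min, signal, window_min):
    """Least-squares slope of signal vs. time over points with t <= window_min."""
    pts = [(t, y) for t, y in zip(times_min, signal) if t <= window_min]
    if len(pts) < 2:
        raise ValueError("need at least two points in the window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    if sxx == 0:
        raise ValueError("time points in the window must differ")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    return sxy / sxx


def percent_increase(rate_treated, rate_control):
    """Percent increase of the treated rate relative to the control rate."""
    if rate_control == 0:
        raise ValueError("control rate must be non-zero")
    return 100.0 * (rate_treated - rate_control) / rate_control


def fold_to_percent(fold):
    """A rate `fold` times the control, expressed as a percent increase."""
    return 100.0 * (fold - 1.0)


def percent_to_fold(percent):
    """A percent increase, expressed as a multiple of the control rate."""
    return 1.0 + percent / 100.0


if __name__ == "__main__":
    # Synthetic, noise-free traces (hypothetical values, not measured data).
    times = list(range(0, 11))            # minutes
    control = [0.50 * t for t in times]   # arbitrary signal units
    treated = [1.36 * t for t in times]
    r_c = initial_rate(times, control, window_min=10)
    r_t = initial_rate(times, treated, window_min=10)
    print(f"percent increase: {percent_increase(r_t, r_c):.0f}%")
    print(f"172% as fold: {percent_to_fold(172.0):.2f}")
    print(f"fivefold as percent: {fold_to_percent(5.0):.0f}%")
```

On this reading, a 172% increase corresponds to a rate about 2.7 times that of the control, whereas a fivefold rate corresponds to a 400% increase.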
Conclusions
Sequence analysis of the expansin-like proteins from N. californiae revealed no major differences in the D1 domain between cellulosomal and non-cellulosomal expansin-like proteins. A notable distinction in the D2 domains, however, was the absence of the first conserved aromatic residue of the PBS in non-cellulosomal expansin-like proteins. The full-length NcaEXLX1 with its dockerins was shown to have higher affinity towards CNF than the truncated version of the protein without its dockerins, NcaEXLX1tr. Despite the differences in adsorption, both NcaEXLX1 and NcaEXLX1tr were shown to enhance the activity of Cel7B. The presence of expansin-like proteins in cellulosomes and their ability to improve cellulase activity, as demonstrated here, support a biological role for at least some expansin-like proteins in boosting the enzymatic deconstruction of cellulosic substrates. The improvements in hydrolysis observed with the dockerin-containing expansin prompt the question of whether NcaEXLX1 would have an even greater impact on dockerin-containing endoglucanases from N. californiae than on free enzymes from the same family, which would suggest co-evolution of NcaEXLX1 with other cellulosomal enzymes.
Supplementary Information
Supplementary Material 1.
References
- 1. Gupta R, Brunak S (2002) Prediction of glycosylation across the human proteome and the correlation to protein function. Pac Symp Biocomput, pp. 310–22. PMID: 11928486
