Transcriptome-guided discovery and active-site gatekeeper engineering of a Viola arcuata asparaginyl ligase with superior catalytic performance
Qiongyan Zou, Yujiao Yan, Xinglei Zou, Peng Jiang, Jiongwen Qin, Xue Tang, Fawei He, Dongting Zhangsun, Sulan Luo, Yong Wu

TL;DR
Researchers discovered and improved a new enzyme from Viola arcuata that efficiently creates peptide macrocycles, especially under mild conditions.
Contribution
A novel and highly efficient asparaginyl ligase, VaPAL2(I244A), was engineered with superior catalytic performance and operational robustness.
Findings
VaPAL2(I244A) shows rapid macrocyclization of GN-type substrates and strong ligation bias on branched sequences.
The enzyme exhibits improved tolerance to organic cosolvents like 20% DMSO and retains activity at near-neutral pH.
Structural modeling reveals that the I244A substitution widens the substrate corridor while preserving catalytic architecture.
Abstract
Peptide macrocyclization by ligase-type asparaginyl endopeptidases (AEPs) underpins many emerging applications in peptide and protein engineering, yet only a few recombinant ligases currently offer suitable catalytic performance and operational robustness. Here we characterize VaPAL2, a previously unreported AEP from Viola arcuata, and show that its activity can be substantially enhanced through a single gatekeeper substitution. The engineered variant, VaPAL2(I244A), functions as a highly efficient peptide ligase, displaying rapid macrocyclization of GN-type substrates, strong ligation bias on branched sequences, and markedly improved tolerance to organic cosolvents such as 20% DMSO. Notably, VaPAL2(I244A) retains its high activity at near-neutral pH, making it compatible with preparative and protein-compatible conditions. Structural modeling indicates that the I244A substitution…
Taxonomy
Topics: Biochemical and Structural Characterization · Antimicrobial Peptides and Activities · Peptidase Inhibition and Analysis
Ribosomally synthesized and posttranslationally modified peptides comprise diverse families of natural products that rely on enzyme-mediated modifications to achieve structural stability and biological activity (1, 2, 3, 4, 5, 6). Among them, plant cyclotides possess a characteristic head-to-tail cyclic cystine-knot topology that confers exceptional protease resistance and has enabled their increasing use as scaffolds in chemical biology and peptide engineering (7, 8, 9, 10). However, efficient in vitro macrocyclization of synthetic or engineered cyclotide precursors remains technically challenging, highlighting the need for enzymes capable of catalyzing peptide backbone cyclization with high efficiency.
Cyclotides are expressed as precursor proteins containing an N-terminal leader peptide, a cyclotide domain, and a C-terminal follower peptide. Their maturation requires two proteolytic processing events, the second of which—cleavage at a conserved Asn/Asp residue followed by transpeptidation—is carried out by members of the asparaginyl endopeptidase (AEP) family (11, 12, 13). When transpeptidation predominates over hydrolysis, these enzymes function as peptide asparaginyl ligases (PALs). A small subset of PALs—including butelase-1 (14), OaAEP1b (15), VyPALs (16), and HaAEP1 (17)—display high catalytic efficiency and have become important tools for peptide macrocyclization (18, 19, 20, 21, 22). Nevertheless, ligase-competent AEPs represent only a minor fraction of the large AEP superfamily, and robust ligase activity is not uniformly present even in cyclotide-producing species. For example, endogenous tobacco AEPs predominantly yield linear products and require co-expression of an exogenous ligase to restore cyclization (23).
Although recent structural studies have elucidated key features underlying AEP–PAL divergence—such as transpeptidation mechanisms, prime-side stabilization, and stabilization of the S-acyl intermediate—the molecular determinants governing ligase-biased catalysis across species remain incompletely defined (15, 16, 24, 25, 26, 27). Early models emphasized the role of a single “gatekeeper” residue at the S2 position (15), whereas more recent work has highlighted composite contributions from surrounding structural elements, including motifs flanking the S1′–S2 region such as LAD1 and LAD2 (28, 29, 30). However, most existing insights are derived from a limited number of ligases examined under heterogeneous experimental conditions, making it difficult to generalize how subtle structural perturbations translate into coordinated changes in reaction partitioning, catalytic efficiency, and operational robustness. A broader and systematically benchmarked exploration of ligase-competent AEPs is therefore needed.
In this study, we systematically identified 11 AEP homologs from a Viola arcuata transcriptome previously generated in our laboratory (31). Among four cloned candidates, only VaPAL2 could be acid-activated to its mature form and exhibited clear ligase activity. Guided by structural considerations, we introduced a single amino-acid substitution at the S2 gatekeeper position (I244A), yielding a variant with substantially enhanced catalytic efficiency and a pronounced shift toward ligation over hydrolysis. Using the widely studied ligase OaAEP1b(C247A) (15) as a recombinant benchmark, we show that VaPAL2(I244A) achieves a catalytic turnover number (kcat) of up to 16.9 s^-1^ on GN-type substrates, displays strong ligation-biased processing of GN14-SL, and retains near-complete activity in 20% dimethyl sulfoxide (DMSO) while maintaining an optimal pH of 7.0 to 7.5. Together, these results establish VaPAL2(I244A) as the first ligase characterized from V. arcuata and provide mechanistic and functional insight into how minimal structural perturbations within the S1′–S2 region can coordinately enhance ligase performance.
Results
Gatekeeper-guided mining of V. arcuata AEPs and initial functional assignment
Building on our earlier observation that V. arcuata produces a rich repertoire of plant cyclotides (31), we next asked whether its transcriptome also encodes the AEPs that support such ligation chemistry. To this end, we searched a de novo fruit-tissue transcriptome using previously reported Viola AEPs, including VbAEP1–4 from Viola betonicifolia and VpAEP2–4 from Viola philippica, as queries. This search yielded eleven AEP-like transcripts (Data S1). Each showed ∼90 to 98% amino-acid identity to its closest Viola homolog, indicating that V. arcuata carries an AEP complement comparable to other Violaceae and providing a suitable pool for ligase-oriented screening. We then asked which of these 11 candidates actually bear sequence features that have been experimentally linked to ligase bias.
Previous mutational analyses of OaAEP1b, informed by the published crystal structure (PDB 5H0I), demonstrated that replacement of the S2 gatekeeper residue Cys247 with bulkier amino acids (Thr, Met, Val, Leu, Ile) reduces ligation efficiency, whereas substitution with a smaller residue such as Ala markedly enhances ligation and mutation to Gly increases hydrolysis (15). These studies identified the S2 gatekeeper residue as an important determinant of the transpeptidation–hydrolysis balance, although the underlying structural mechanism remains incompletely understood. In parallel, comparative analyses of ligase-type plant AEPs reported two ligase-associated determinants (LAD1 and LAD2) flanking the S1′/S2 region, together with cap-region PPL/MLA motifs that are proposed to contribute to a ligase-permissive active-site entrance. We therefore aligned the 11 V. arcuata sequences with a reference panel of ligase-type AEPs (butelase-1, OaAEP1b, VyPALs) and hydrolytic plant AEPs and inspected these positions (Fig. 1A). Five sequences carried an Ile at the S2 gatekeeper and simultaneously retained ligase-favoring LAD1/LAD2 combinations and cap support; we reannotated these as VaPAL1–5 and considered them the highest-priority ligase candidates. The remaining six sequences possessed smaller or more permissive gatekeepers (Gly or Val) and exhibited one or more protease-leaning changes in the LAD/cap region. We therefore retained the VaAEP designation for these sequences; those that retained partial ligase-like features were annotated as “uncertain” rather than strictly hydrolytic.
Figure 1. Gatekeeper-guided mining of Viola arcuata AEPs and ligase-activity signatures. A, domain organization (SP, NTD, Core, Linker, Cap) and multiple-sequence alignment of representative PALs/AEPs with the V. arcuata hits identified here. The catalytic triad (Asn–His–Cys) is shaded dark gray. S1-pocket residues are shaded blue. LAD residues are boxed red, with LAD1 (S2)/LAD2 (S1′) positions indicated; the gatekeeper column is marked above LAD1. The conserved disulfide near LAD1 is highlighted orange; the polyproline loop (PPL) and MLA regions are boxed green and purple, respectively. Residues associated with ligase-favoring patterns are colored in blue tones and protease-favoring residues in red tones. The complete sequence alignment can be found in Figure S1. B, maximum-likelihood phylogeny inferred from amino-acid sequences (circular layout). Branch lengths represent expected substitutions per site (scale bar shown). The outer color strip encodes genus-level categories (legend at left); n indicates the number of tips per genus. Background shading highlights key sequences: red, macrocyclase references (butelase-1, OaAEP1b); light green, VaPAL clade(s); yellow, VaAEP clade(s).
To test whether our sequence-based fingerprint (gatekeeper + LAD1/LAD2 + cap elements) reliably predicts ligation competence, we performed maximum-likelihood phylogenetic analysis of all 11 V. arcuata AEPs together with functionally characterized ligases (butelase-1, OaAEP1b, VyPALs) and representative hydrolytic legumains from angiosperms, rooted on human legumain (Fig. 1B).
The resulting tree fully corroborated the sequence classification. VaPAL1–4, all possessing an Ile-type S2 gatekeeper and ligase-like LAD1/LAD2 signatures, formed a coherent subclade tightly nested within the known ligase clade containing butelase-1 and OaAEP1b. A second V. arcuata cluster (VaAEP4, VaAEP5, VaAEP6) branched immediately basal to this ligase group, consistent with their partial retention of ligase-associated motifs. In contrast, VaAEP2 and VaAEP3 grouped with typical hydrolytic plant legumains characterized by restrictive cap regions and hydrolase-biased LAD signatures.
Thus, the phylogeny independently validates our sequence fingerprint and reveals that V. arcuata encodes a discrete, evolutionarily coherent ligase-enriched subset within a predominantly hydrolytic AEP repertoire.
Expression, purification, and acid activation of VaPAL2, VaPAL2(I244A), OaAEP1b(C247A) and related V. arcuata AEPs
Of the 11 AEP homologs identified in the previously generated V. arcuata transcriptome, only four yielded full-length coding sequences that could be reliably amplified and successfully cloned: VaAEP2, VaAEP3, VaAEP4, and VaPAL2. The ORFs were PCR-amplified with primers introducing NdeI and XhoI sites and ligated into the pET-28a(+) vector to produce N-terminal His_6_-tagged constructs (Fig. S2). Because reducing the size of the S2 gatekeeper in OaAEP1b was previously shown to bias catalysis toward ligation, we applied the same rationale to the V. arcuata ligase candidate: starting from VaPAL2, we introduced the corresponding Ile→Ala substitution at position 244 to obtain VaPAL2(I244A). In parallel, we expressed the benchmark ligase OaAEP1b(C247A) in the same host so that all enzymes could be processed under identical conditions, and we included VaAEP2/3/4 to assess whether the more hydrolytic V. arcuata members would undergo activation in our workflow.
We expressed and purified six enzymes in Escherichia coli (VaPAL2, VaPAL2(I244A), OaAEP1b(C247A), and VaAEP2/3/4) as N-terminal His_6_-tagged zymogens using Ni^2+^ affinity chromatography followed by size-exclusion chromatography (SEC) (Fig. 2A). Under these shared conditions, the two Viola ligase candidates, VaPAL2 and VaPAL2(I244A), eluted as single, symmetric SEC peaks and appeared as single ∼52-kDa bands on SDS–PAGE, consistent with the reported zymogen size of Viola AEPs (Fig. 2A). OaAEP1b(C247A) likewise yielded a single band at ∼59 kDa, consistent with its reported proenzyme mass (Fig. 2A). Culture-normalized yields were ∼0.37 mg L^-1^ for VaPAL2, ∼0.74 mg L^-1^ for VaPAL2(I244A), and ∼1.4 mg L^-1^ for OaAEP1b(C247A). VaAEP2, VaAEP3, and VaAEP4 were likewise obtained as single bands, albeit in lower overall amounts (Fig. S3A).
Figure 2. Production and acid activation of VaPAL2 and related AEPs. A, SEC purification profiles of His-tagged VaPAL2, the gatekeeper variant VaPAL2(I244A), and the benchmark OaAEP1b(C247A). Each enzyme eluted as a single major peak under the same workflow; insets show SDS–PAGE of the peak fractions, giving a single band at the expected proenzyme size (∼52 kDa for VaPAL2 and VaPAL2(I244A); ∼59 kDa for OaAEP1b(C247A)). B, domain schematic and acid-activation model for VaPAL2. The zymogen consists of an N-terminal tag, core domain, linker, and cap domain; acid treatment (pH 2) cleaves the linker and releases the active core. Right, AlphaFold model colored by domains. C, SDS–PAGE of VaAEP2/3/4 and VaPAL2 after applying the same acid-activation protocol. VaPAL2 shifts from ∼52 kDa (proenzyme; red arrowhead) to ∼33 kDa (active core; blue), whereas VaAEP2, VaAEP3, and VaAEP4 remain at the proenzyme size and show no detectable activation. D, pH-dependent processing of VaPAL2 (pH 2.0–7.4). Robust accumulation of the ∼33 kDa active core is observed at pH ≤ 3.0, while the sample kept at pH 7.4 remains as zymogen. Acid-activated fractions were used in subsequent ligation and macrocyclization assays.
Plant AEPs are synthesized as inactive zymogens targeted to the vacuole and require acidic pH to autocatalytically remove the C-terminal propeptide (cap/linker), thereby exposing the active site (30, 31). Consistent with this mechanism, both VaPAL2 and VaPAL2(I244A) were efficiently converted from the ∼52-kDa zymogen to the ∼33-kDa active core when incubated at low pH (Fig. 2C). Activation was essentially complete at pH ≤ 4.0 and negligible at pH ≥ 6.0; samples kept at pH 7.4 remained entirely as zymogen. The benchmark enzyme OaAEP1b(C247A) exhibited an almost identical pH-dependent activation profile, with full conversion to the active ∼33-kDa form at pH ≤ 4.0 and no detectable processing at neutral or slightly acidic pH (Fig. S3C). Unless otherwise stated, all subsequent ligation and macrocyclization assays were performed using these acid-activated core enzymes.
In striking contrast, the three protease-like isoforms VaAEP2, VaAEP3, and VaAEP4 showed no detectable shift from their ∼52-kDa proenzyme form across the entire pH range tested (Fig. 2C).
VaPAL2(I244A) surpasses VaPAL2 and OaAEP1b(C247A) in GN10-SL macrocyclization
To confirm that the recombinant enzymes prepared above were catalytically competent under our standardized workflow, we first assayed them on GN10-SL. GN10-SL is a model peptide derived from MCoTI-I/II (–GYSGSDAL) that has been reformatted into a ligase-friendly reporter (32). The substrate contains a minimally structured GN-type N terminus and the canonical –NSL exit motif; after cleavage at the P1 Asn, the liberated Gly rapidly attacks the thioacyl intermediate, yielding a single cyclic product (cGN10) (Fig. 3A). This single-pathway design minimizes sequence- or folding-derived effects and enables direct comparison of intrinsic catalytic performance among the enzymes.
Figure 3. GN10-SL macrocyclization by VaPAL2, VaPAL2(I244A), and OaAEP1b(C247A). A, schematic of GN10-SL undergoing backbone cyclization to form cGN10 catalyzed by ligase-type AEPs. Cleavage at the P1 Asn generates a thioacyl intermediate that is resolved by intramolecular attack of the N-terminal Gly, yielding a single cyclic product. B, UPLC (214 nm) time-course analysis of GN10-SL macrocyclization catalyzed by VaPAL2, VaPAL2(I244A), and OaAEP1b(C247A). All enzymes were expressed, acid-activated, and assayed side-by-side under identical conditions. Traces collected at 0, 10, 30, and 60 min are shown. Green and blue shaded regions indicate the retention windows of GN10-SL and cGN10, respectively. VaPAL2(I244A) achieves near-complete conversion within 10 min, whereas WT VaPAL2 and OaAEP1b(C247A) retain detectable substrate at the same time point. C, Michaelis–Menten analysis of GN10-SL cyclization by the three enzymes. Initial velocities were determined over a substrate concentration series and fitted by nonlinear regression. Assays were performed using enzyme concentrations of 50 nM for VaPAL2(I244A), 100 nM for VaPAL2, and 200 nM for OaAEP1b(C247A). Fitted kinetic parameters (kcat and Km) are reported as mean ± fitting error and were normalized to the corresponding active enzyme concentrations. D, pH dependence of GN10-SL macrocyclization determined from endpoint conversions (mean ± SD, n = 3). VaPAL2(I244A) exhibits the highest activity across the examined pH range and retains robust activity under near-neutral to mildly alkaline conditions.
VaPAL2, its gatekeeper variant VaPAL2(I244A), and the benchmark ligase OaAEP1b(C247A) were expressed, acid-activated, and assayed side-by-side so that any observed differences could be attributed solely to enzyme-intrinsic properties. Ultra-performance liquid chromatography (UPLC) time-course analysis revealed a single cGN10 product peak for all three enzymes with no detectable side products (Fig. 3B). However, the rates differed substantially: after 10 min, VaPAL2(I244A) achieved >90% conversion of GN10-SL, whereas WT VaPAL2 and OaAEP1b(C247A) retained appreciable amounts of precursor. Product identity was confirmed by electrospray ionization mass spectrometry (ESI–MS) (Fig. S4, A–D).
To distinguish genuine catalytic enhancement from differences in reaction endpoint, we next determined initial velocities across a substrate concentration series and fitted the data to the Michaelis–Menten model using enzyme concentrations of 50 nM for VaPAL2(I244A), 100 nM for VaPAL2, and 200 nM for OaAEP1b(C247A) (Fig. 3C). Relative to WT VaPAL2, VaPAL2(I244A) exhibited an approximately 5-fold higher turnover number (kcat = 16.9 ± 0.8 s^-1^ versus 3.5 ± 0.2 s^-1^) together with a modestly reduced Michaelis constant (Km) (87 ± 20 μM vs 140 ± 30 μM), resulting in an approximately 8-fold increase in catalytic efficiency (kcat/Km = 1.9 × 10^5^ vs 2.5 × 10^4^ M^-1^ s^-1^). When compared under identical assay conditions, VaPAL2(I244A) also displayed substantially higher turnover and catalytic efficiency than OaAEP1b(C247A) (kcat = 2.3 ± 0.3 s^-1^, kcat/Km = 1.8 × 10^4^ M^-1^ s^-1^), establishing VaPAL2(I244A) as the most efficient of the three recombinant ligases tested on this substrate.
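As a worked check of the kinetic comparison above, the following minimal Python sketch recomputes kcat/Km and the fold changes from the fitted values quoted in the text. The dictionary and function names are illustrative assumptions, not part of the original analysis; the Km of OaAEP1b(C247A) is not stated in the text, so that enzyme is omitted.

```python
# Fitted values quoted in the text: kcat in s^-1, Km in micromolar.
REPORTED = {
    "VaPAL2(I244A)": {"kcat": 16.9, "km_uM": 87.0},
    "VaPAL2": {"kcat": 3.5, "km_uM": 140.0},
}


def catalytic_efficiency(kcat, km_uM):
    """Return kcat/Km in M^-1 s^-1 (Km is converted from micromolar to molar)."""
    return kcat / (km_uM * 1e-6)


def mm_initial_velocity(kcat, km_uM, enzyme_nM, substrate_uM):
    """Michaelis-Menten initial velocity in micromolar per second."""
    enzyme_uM = enzyme_nM / 1000.0
    return kcat * enzyme_uM * substrate_uM / (km_uM + substrate_uM)


eff_variant = catalytic_efficiency(**REPORTED["VaPAL2(I244A)"])
eff_wt = catalytic_efficiency(**REPORTED["VaPAL2"])
fold_kcat = REPORTED["VaPAL2(I244A)"]["kcat"] / REPORTED["VaPAL2"]["kcat"]
fold_efficiency = eff_variant / eff_wt

print(f"kcat/Km, VaPAL2(I244A): {eff_variant:.2e} M^-1 s^-1")
print(f"kcat/Km, VaPAL2:        {eff_wt:.2e} M^-1 s^-1")
print(f"fold change in kcat:    {fold_kcat:.1f}")
print(f"fold change in kcat/Km: {fold_efficiency:.1f}")
```

The recomputed ratios (about 4.8-fold in kcat and about 7.8-fold in kcat/Km) agree with the "approximately 5-fold" and "approximately 8-fold" figures given above. `mm_initial_velocity` is included only to show how the fitted parameters map back onto a rate at a given substrate concentration.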
We next examined the pH dependence of GN10-SL macrocyclization under otherwise identical conditions (Fig. 3D). VaPAL2(I244A) exhibited the highest activity across the entire pH range tested, reaching maximal conversion at pH 7.5 and maintaining 81 to 86% conversion between pH 7.0 and 8.0. In contrast, WT VaPAL2 reached a maximum of 68.3% at pH 7.0, whereas OaAEP1b(C247A) showed a broader but lower activity plateau centered around pH 6.5 to 7.0. All three enzymes showed reduced activity below pH 6.0 and above pH 8.0. Notably, VaPAL2(I244A) retained more than half of its maximal activity at pH 8.5, whereas WT VaPAL2 and OaAEP1b(C247A) retained only 20.1% and 11.4%, respectively.
Taken together, the I244A substitution in VaPAL2 accelerates productive macrocyclization, modestly improves apparent substrate affinity, and expands the effective operating pH range toward near-neutral and mildly alkaline conditions.
VaPAL2(I244A) redirects the GN14-SL reaction from hydrolysis to ligation
To evaluate whether the I244A substitution affects not only reaction rate but also product partitioning, we next used GN14-SL, a branching reporter that yields two separable products. GN14-SL carries the tripeptidic AEP-recognition motif –NSL at its C terminus, which is found in the precursors of Viola cyclotides and in sunflower trypsin inhibitor-1 analogs (33). With this design, the substrate can undergo either backbone cyclization or hydrolysis (Fig. 4A), producing two chromatographically separable products that allow direct quantification of ligation-versus-hydrolysis partitioning. Representative UPLC traces and ESI–MS confirmation of both products for all three enzymes are shown in Fig. 4B and Fig. S4, E–F.
Figure 4. VaPAL2(I244A) redirects GN14-SL from hydrolysis to ligation. A, reaction scheme: GN14-SL forms either cG14 (ligation) or GN14 (hydrolysis). B, UPLC (214 nm) time courses (0, 10, 30, 60 min) for VaPAL2, VaPAL2(I244A), and OaAEP1b(C247A) at matched enzyme loading. Shaded windows: GN14-SL (blue), cG14 (green), GN14 (red). C, endpoint fractions across pH 5.0 to 7.5 (mean ± SD, n = 3): blue bars = cyclization fraction; red bars = hydrolysis fraction. Quantification from UPLC peak areas. Full definitions and statistics are in Experimental procedures.
Product-distribution analysis showed that VaPAL2(I244A) favored backbone cyclization over hydrolysis more strongly than WT VaPAL2 at every pH value examined (Fig. 4C). For VaPAL2, cyclization yields were 31.9% at pH 5.0, 28.6% at pH 5.5, and 29.0% at pH 6.0, peaking at 39.7% at pH 6.5; the corresponding hydrolysis yields were 16.0%, 15.5%, 7.3%, and 1.7%, respectively (Fig. 4C). By contrast, VaPAL2(I244A) delivered higher cyclization at each of these pH values (34.7% at pH 5.0, 37.8% at pH 5.5, 37.9% at pH 6.0, and 44.0% at pH 6.5) while simultaneously suppressing hydrolysis to 4.6%, 3.4%, 0.5%, and 0%, respectively. The effect was most pronounced at pH 6.0 and 6.5, where the I244A variant achieved 38 to 44% cyclization with ≤0.5% hydrolysis, whereas WT VaPAL2 reached only 29 to 40% cyclization and still produced 1.7 to 7.3% hydrolysis. Above pH 7.0, activity declined for both enzymes, but VaPAL2(I244A) still produced more cyclic product (28.8% at pH 7.5; 12.1% at pH 8.0) than WT VaPAL2 (21.3% and 9.1%, respectively), and hydrolysis remained undetectable for the variant in this range. OaAEP1b(C247A) produced the same two products under these conditions but did not match the ligation bias achieved by VaPAL2(I244A).
Overall, the I244A substitution not only accelerates turnover but also significantly shifts partitioning toward ligation across the entire practical pH range, with VaPAL2(I244A) achieving the highest ligation bias among the three enzymes tested.
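The partitioning described in this section can be condensed into a single number per condition. The sketch below, a minimal illustration rather than the authors' own analysis, computes the ligation fraction, defined here (as an assumption) as cyclization / (cyclization + hydrolysis), from the endpoint yields quoted above for pH 5.0 to 6.5. All names are hypothetical.

```python
# Endpoint yields (%) quoted in the text for pH 5.0, 5.5, 6.0 and 6.5.
PH_VALUES = [5.0, 5.5, 6.0, 6.5]
YIELDS = {
    "VaPAL2": {
        "cyclization": [31.9, 28.6, 29.0, 39.7],
        "hydrolysis": [16.0, 15.5, 7.3, 1.7],
    },
    "VaPAL2(I244A)": {
        "cyclization": [34.7, 37.8, 37.9, 44.0],
        "hydrolysis": [4.6, 3.4, 0.5, 0.0],
    },
}


def ligation_fraction(cyclization, hydrolysis):
    """Fraction of processed substrate that ends up as cyclic product."""
    total = cyclization + hydrolysis
    if total <= 0:
        raise ValueError("no product formed; ligation fraction is undefined")
    return cyclization / total


def selectivity_table(yields):
    """Map each enzyme to its ligation fraction at every pH in PH_VALUES."""
    return {
        enzyme: [
            ligation_fraction(c, h)
            for c, h in zip(data["cyclization"], data["hydrolysis"])
        ]
        for enzyme, data in yields.items()
    }


table = selectivity_table(YIELDS)
for enzyme, fractions in table.items():
    row = ", ".join(f"pH {p}: {f:.2f}" for p, f in zip(PH_VALUES, fractions))
    print(f"{enzyme}: {row}")
```

On these numbers the variant's ligation fraction exceeds that of the wild type at every pH listed, with the largest gap at acidic pH where the wild type hydrolyzes most.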
Substrate scope and operational tolerance of VaPAL2 and VaPAL2(I244A)
To define the practical substrate scope of VaPAL2 and its engineered variant VaPAL2(I244A) more precisely, we constructed a substrate panel that systematically probed four recognition features relevant to native plant cyclotide biosynthesis: (i) dependence on a canonical C-terminal exit tag, (ii) P1 residue identity (Asn versus Asp), (iii) tolerated cyclotide ring size, and (iv) ability to process folded, native-like precursors (Fig. 5A). Unless otherwise noted, reactions were allowed to proceed until product formation reached a plateau, as determined by UPLC analysis, and yields are reported at this fixed reaction endpoint.
Figure 5. Substrate scope of VaPAL2, VaPAL2(I244A), and OaAEP1b(C247A). A, summary table of the peptide panel used to probe tag dependence, P1 specificity, minimal ring size, and compatibility with native cyclotide precursors. The table lists substrate names, sequences, nominal ring size, and fixed reaction endpoints (defined as the time point at which product formation reached a plateau as assessed by UPLC analysis). Notes: (a) GN4-SL predominantly forms a cyclic dimer under assay conditions; (b) GN10-AL and Kalata B1-GL (lacking the canonical –NSL motif) showed no detectable cyclization by UPLC; (c) MCoTI-II precursor is a model linear sequence derived from MCoTI-I/II (–GYSGSDAL). B, endpoint macrocyclization yields (%) catalyzed by VaPAL2(I244A), VaPAL2, and OaAEP1b(C247A). Bars represent mean ± SD (n = 3).
The enzymes exhibited a strict requirement for an SL-type C-terminal dipeptide. GN10-SL, which carries the consensus –NSL motif found in most Viola and Oldenlandia cyclotide precursors, was cyclized efficiently by all three enzymes (74–92% yield). Replacing the terminal Ser–Leu with Ala–Leu (GN10-AL) or Gly–Leu (as in kalata B1-GL) completely abolished detectable cyclization (<2% after 4 h), confirming that an SL dipeptide is essential for productive recognition.
P1 specificity was examined by comparing GN10-SL (P1 = Asn) with its Asp analog GD10-SL. Under identical conditions, GN10-SL reached ∼84% conversion at the reaction plateau, whereas GD10-SL yielded only ∼11% cyclized product (Fig. 5B). This result establishes VaPAL2 as a strongly Asn-preferring ligase, in agreement with the predominance of Asn at the P1 position in natural Viola cyclotide precursors.
Ring-size tolerance was evaluated using linear precursors ranging from 4 to 34 residues (GN4-SL, GN5-SL, GN10-SL, GN14-SL, cyO14, Viar-A, Mra-30, and MCoTI-II). Both VaPAL2 and VaPAL2(I244A) efficiently cyclized substrates containing 5 to 34 residues in the mature cyclotide domain. The shortest substrate, GN4-SL, was converted primarily into a cyclic dimer, showing that when monomeric cyclization becomes energetically unfavorable, the enzyme can productively engage a second peptide molecule—a useful alternative pathway previously observed with butelase-1 (14).
The ability to process structurally complex, native-like precursors was tested using the linear, fully reduced forms of four authentic cyclotides (cyO14, Viar-A, Mra-30, and MCoTI-II; 30–34 residues, six cysteines each). VaPAL2(I244A) consistently gave the highest yields (37–62%), outperforming WT VaPAL2 (24–44%) and the benchmark OaAEP1b(C247A) (25–46%) by factors of 1.4 to 2.6 (Fig. 5B). Thus, the I244A variant shows markedly superior activity on structurally complex cyclotide substrates.
All substrates shown in Figure 5A and their cyclic products generated by VaPAL2, VaPAL2(I244A), and OaAEP1b(C247A) were analyzed by UPLC and ESI–MS; observed masses agreed with the expected cyclic structures within ±0.5 Da (Figs. S5–S11).
In conclusion, VaPAL2(I244A) outperforms both the parent VaPAL2 and the recombinant reference enzyme OaAEP1b(C247A) across all parameters examined here, with the most pronounced advantages (up to 2.6-fold higher yields) observed with native cyclotide sequences. These results highlight its practical utility as a biocatalyst for peptide macrocyclization.
VaPAL2(I244A) displays superior tolerance to organic cosolvents
To assess whether the I244A substitution improves operational robustness for poorly soluble substrates, GN10-SL macrocyclization was evaluated in the presence of representative organic modifiers under identical reaction initiation, incubation time, and quenching conditions (Fig. 6A). WT VaPAL2 showed a pronounced loss of the cGN10 product peak in alcohols and was almost completely inhibited in 20% DMSO. In contrast, VaPAL2(I244A) retained a chromatographic profile closely resembling the buffer control, indicating preserved catalytic competence across all tested cosolvents. To verify that the apparent DMSO tolerance of VaPAL2(I244A) does not arise from differential oxidative artifacts, we performed a time-controlled pre-incubation assay in 20% DMSO (Fig. S12). VaPAL2(I244A) efficiently cyclized GN10-SL irrespective of DMSO pre-incubation duration, whereas VaPAL2 and OaAEP1b(C247A) both exhibited very weak activity under all conditions. Product identities were confirmed by ESI–MS (Fig. S13), demonstrating that the enhanced activity of VaPAL2(I244A) in DMSO reflects intrinsic solvent tolerance.
Figure 6. Cosolvent tolerance and preparative window. A, representative UPLC chromatograms (214 nm) for VaPAL2, VaPAL2(I244A), and OaAEP1b(C247A) in buffer only and in the indicated cosolvents (for example, 20% MeOH, 25% EtOH, 25% isopropanol, 20% DMSO, v/v). Shaded windows indicate cGN10 and GN10-SL. B, residual activity (%) for each enzyme in the same cosolvent panel, expressed relative to the buffer-only control (mean ± SD, n = 3). Residual activity was derived from UPLC endpoint yields of cGN10. MeOH, methanol; EtOH, ethanol.
Quantitative analysis (Fig. 6B) confirmed this trend: VaPAL2(I244A) retained full activity in 20% methanol, >70% activity in 25% ethanol and isopropanol, and nearly complete activity in 20% DMSO—a condition under which OaAEP1b(C247A) was moderately inhibited and VaPAL2 was almost inactive under the same exposure duration.
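Residual activity as used here is an endpoint yield in cosolvent expressed relative to the buffer-only control. The short sketch below shows one plausible way to compute it from triplicate UPLC yields. The replicate values are hypothetical placeholders (the source reports only the summarized percentages), and normalizing each replicate to the mean control is an assumption, since the text does not state how the SD was propagated.

```python
from statistics import mean, stdev


def residual_activity(cosolvent_yields, buffer_yields):
    """Return (mean, SD) of residual activity in percent.

    Each cosolvent replicate is normalized to the mean buffer-only yield.
    """
    control = mean(buffer_yields)
    if control <= 0:
        raise ValueError("buffer-only control yield must be positive")
    percentages = [100.0 * y / control for y in cosolvent_yields]
    return mean(percentages), stdev(percentages)


# Hypothetical triplicate cGN10 endpoint yields (%), for illustration only.
buffer_only = [80.0, 82.0, 84.0]
with_dmso = [78.0, 80.0, 82.0]

activity, sd = residual_activity(with_dmso, buffer_only)
print(f"residual activity: {activity:.1f} +/- {sd:.1f} %")
```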
Thus, the single I244A gatekeeper substitution not only enhances catalytic efficiency and ligation bias but also markedly broadens the effective solvent window, enabling robust macrocyclization under controlled cosolvent conditions commonly required for hydrophobic or poorly soluble peptide substrates.
Structural basis for enhanced ligation efficiency of VaPAL2(I244A)
To determine whether the functional enhancement of VaPAL2(I244A) arises from local structural adjustments at the engineered S2 gatekeeper site, we generated homology models of VaPAL2 and VaPAL2(I244A) and compared them with the high-activity reference OaAEP1b(C247A) (Fig. 7A). The catalytic triad and overall core fold were nearly identical in all models, indicating that the I244A substitution does not perturb catalytic geometry.
Figure 7. Gatekeeper-dependent widening of the S1′→S2 corridor in VaPAL2. Numbering note: Panels B–D use the numbering of the activated VaPAL2 core (post-processing); for example, gatekeeper I/A193 corresponds to zymogen I244, and the S1′ wall is residues 121 to 123. A, structural superposition of OaAEP1b(C247A) (cap, pink; core, tv_green; catalytic triad, magenta; gatekeeper, green), VaPAL2 (cap, pink; core, gray; catalytic triad, red; gatekeeper, yellow) and VaPAL2(I244A) (cap, orange; core, light green; catalytic triad, blue; gatekeeper, cyan). The red box marks the catalytic-triad region that is enlarged/analyzed in B–D. B, catalytic approach geometry in VaPAL2. In a pre-organized VaPAL2–GIPNSL complex, Cys195-SG is 3.5 Å from the backbone carbonyl carbon of the P1 Asn (black dashed line). C, S2-only cavity volumes near the gatekeeper (193) rendered with identical PyVOL settings (probe 1.4 Å; minimum pocket volume 70 Å^3^); values are annotated near the gatekeeper Cα. D, definition of W_min: the shortest distance between gatekeeper side-chain heavy atoms (I/A193) and the S1′ wall (Cα of residues 121–123). The nearest-pair distance is shown as a black dashed line; a semi-transparent surface provides corridor context.
Structural differences were localized to the S1′–S2 corridor. In WT VaPAL2, the bulkier Ile244 side chain partially narrows the passage between the LAD1 and LAD2 motifs. Substitution with Ala retracts this steric obstruction, yielding a smoother and wider channel for N-terminal nucleophile ingress (Fig. 7A). In the modeled VaPAL2–GIPNSL complex, the catalytic Cys Sγ was positioned ∼3.5 Å from the P1 Asn carbonyl carbon, consistent with a thioacyl intermediate competent for ligation (Fig. 7B).
Quantitative measurements supported this localized expansion. The minimal corridor width (W_min) between the gatekeeper site and the opposing S1′ wall increased from 9.09 Å in VaPAL2 to 10.22 Å in VaPAL2(I244A), representing an ∼1.1 Å (∼12%) widening at the position most relevant to nucleophile entry (Fig. 7D). PyVOL-derived cavity volumes also increased modestly (94.0→97.0 Å^3^), indicating a small but meaningful relaxation of the S2-proximal pocket (Fig. 7C).
Although modest in magnitude, these spatially concentrated changes occur precisely at the steric bottleneck governing substrate approach to the reactive thioacyl center. Such targeted relief of corridor constraints provides a plausible and coherent structural explanation for the enhanced ligation efficiency of VaPAL2(I244A), illustrating how minimal gatekeeper engineering can tune the S1′–S2 landscape to promote efficient macrocyclization.
Significance
This study identifies VaPAL2(I244A) as the first ligase-type AEP from V. arcuata to be characterized in depth and establishes it as a high-performance macrocyclase generated through a single, rational gatekeeper substitution. The enhanced kinetics, ligation bias, and operational robustness of VaPAL2(I244A) highlight how minimal structural refinement of a naturally occurring AEP can yield an enzyme with properties that match or surpass those of widely used ligases.
By demonstrating that VaPAL2(I244A) achieves rapid macrocyclization, strong selectivity, and compatibility with practical reaction conditions, this work expands the catalog of functionally validated AEP ligases and provides a new, mechanistically informed benchmark for peptide macrocyclization and bioconjugation applications.
Discussion
Ligase-type AEPs have emerged as powerful biocatalysts for peptide macrocyclization, protein labeling, and bioconjugation, owing to their chemoselectivity and relaxed sequence requirements at the ligation site (14, 34). However, characterized AEPs exhibit inherent trade-offs in turnover rate, ligation specificity, pH tolerance, and stability in organic solvents (35, 36). To date, most recombinant ligases excel in only a subset of these desirable properties, rather than combining them simultaneously (37). Butelase-1, for instance, demonstrates exceptional catalytic efficiency but is difficult to produce recombinantly (25, 38, 39), while engineered OaAEP1b variants are readily expressed yet display moderate turnover (15, 40, 41). Additionally, many cyclotide-processing ligases operate optimally under mildly acidic conditions and tolerate organic cosolvents poorly (42). These constraints underscore the need for ligase-type AEPs with improved functional characteristics.
In this study, we identify VaPAL2(I244A) as the first ligase-type AEP from V. arcuata to be characterized in detail and show that a single, rationally introduced substitution can substantially enhance multiple functional parameters simultaneously. VaPAL2(I244A) displays rapid turnover on GN-type substrates, a pronounced preference for ligation over hydrolysis, robust activity near neutral pH, and exceptional tolerance to DMSO. Together, these properties establish VaPAL2(I244A) as a highly competitive recombinant AEP ligase relative to widely used engineered benchmarks such as OaAEP1b(C247A) and highlight its practical utility for challenging macrocyclization and bioconjugation reactions.
Beyond functional performance, our data provide mechanistic insight into how subtle structural features modulate the balance between hydrolysis and transpeptidation in plant AEPs. Using the branched reporter substrate GN14-SL, we directly quantified product partitioning and demonstrated that the I244A substitution consistently suppresses hydrolysis while enhancing backbone cyclization across the entire practical pH range. Importantly, this shift in product distribution is observed without alteration of the catalytic triad or the core protease architecture, indicating that the mutation acts by redirecting reaction outcome rather than changing the fundamental catalytic chemistry.
Homology modeling suggests a structural basis for this effect. The I244A substitution enlarges the region that accommodates the incoming N-terminal nucleophile following formation of the thioacyl enzyme intermediate, thereby reducing steric constraints that compete with productive nucleophilic attack. As a consequence, the probability that the peptide N terminus, rather than water, resolves the acyl–enzyme intermediate is increased. This model is fully consistent with our experimental observations, in which VaPAL2(I244A) exhibits both accelerated turnover and a pronounced reduction in hydrolytic by-product formation. These results support the view that reaction partitioning in AEPs can be tuned by modulating the geometric accessibility of the nucleophile entry pathway, rather than by altering catalytic residues directly.
Notably, the effects of the I244A substitution extend beyond product partitioning. VaPAL2(I244A) retains high activity under near-neutral conditions and in the presence of up to 20% DMSO, conditions under which many plant AEP ligases show reduced performance. This suggests that improved accessibility of the nucleophile to the acyl intermediate may also mitigate sensitivity to solvent composition and protonation state, thereby broadening the operational window of the enzyme. Together, these findings indicate that a single gatekeeper-adjacent residue can exert coordinated control over reaction rate, ligation bias, and environmental tolerance.
VaPAL2(I244A) therefore integrates several properties seldom observed together in recombinant AEP ligases: (i) high catalytic efficiency, (ii) strong ligation bias across multiple substrates, (iii) compatibility with near-neutral pH, and (iv) exceptional tolerance to organic cosolvents. This combination makes VaPAL2(I244A) particularly well suited for peptide macrocyclization, segmental ligation of folded proteins, and bioconjugation reactions in mixed aqueous–organic media.
More broadly, our results illustrate that minimal, targeted perturbations within naturally occurring AEP scaffolds can produce quantitative shifts in reaction outcome without compromising overall enzyme stability or expression. This provides a mechanistic framework for understanding how ligase activity can evolve within the AEP superfamily and offers a rational strategy for engineering ligation-biased enzymes from diverse plant lineages. Future studies exploring additional VaPAL paralogs and combining gatekeeper substitutions with modifications in other regions influencing substrate positioning may further refine substrate scope and selectivity, enabling predictive design of next-generation peptide ligases.
Experimental procedures
Transcriptome source and in silico mining of V. arcuata AEP homologs
RNA-seq data were not newly generated for this study. We used the published V. arcuata fruit transcriptome “Integrative transcriptome and mass spectrometry analysis reveals novel cyclotides with antimicrobial and cytotoxic activities from V. arcuata” (NCBI SRA accession PRJNA494974). This dataset comprises ∼14.7 Gb of clean bases, and its Trinity assembly yielded 86,674 unigenes, which served as our reference transcriptome.
To identify AEP homologs with ligase-like features, amino-acid sequences of reported Viola AEPs (VbAEP1–4 from V. betonicifolia, QVD38651–QVD38654; VpAEP2–4 from V. philippica, QCW05331–QCW05333) were used as BLASTp queries against the six-frame translated V. arcuata transcriptome. Ligase-type references (butelase-1, OaAEP1b) were then used as supplementary queries to recover more divergent candidates. Hits with E-values ≤ 1 × 10^-5^ and a clear legumain-like domain architecture (signal/pro-region, core, cap) were retained. In total, 11 V. arcuata AEP-like sequences were retrieved and provisionally designated VaAEP1–VaAEP11. Multiple-sequence alignments were generated with MAFFT v7 (L-INS-i) and manually inspected; positions previously implicated in ligase bias (the S2 gatekeeper, LAD1/LAD2 motifs flanking S1′/S2, and cap-region PPL/MLA elements) were annotated for each sequence. Sequences carrying an Ile at the gatekeeper position together with ligase-favoring LAD1/LAD2 patterns were provisionally labeled VaPAL1–VaPAL5; the remaining sequences were designated VaAEP1–VaAEP6 and considered more protease-leaning candidates.
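The E-value filter described above can be sketched as follows. This is a minimal illustration, assuming the BLASTp hits were exported in the standard 12-column tabular format (`-outfmt 6`); the hit rows and unigene identifiers are hypothetical placeholders rather than data from this study, and the domain-architecture check is not modeled.

```python
# Hypothetical BLASTp tabular output (-outfmt 6): qseqid, sseqid, pident,
# length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
# These rows are placeholders, not hits from the V. arcuata transcriptome.
EXAMPLE_HITS = (
    "VbAEP1\tunigene_0001\t78.2\t450\t98\t2\t1\t450\t1\t448\t1e-120\t410\n"
    "VbAEP1\tunigene_0002\t31.0\t120\t80\t3\t5\t124\t10\t129\t2e-4\t45.1\n"
    "OaAEP1b\tunigene_0003\t62.5\t440\t160\t4\t3\t442\t2\t439\t5e-90\t320\n"
    "OaAEP1b\tunigene_0001\t60.1\t445\t170\t5\t1\t445\t1\t446\t3e-85\t305\n"
)

E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return {subject_id: best E-value} for hits with E-value <= cutoff."""
    best = {}
    for line in tabular_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        subject, evalue = fields[1], float(fields[10])
        # Keep each subject once, remembering its lowest E-value across queries.
        if evalue <= cutoff and evalue < best.get(subject, float("inf")):
            best[subject] = evalue
    return best


retained = filter_hits(EXAMPLE_HITS)
print(sorted(retained))
```

In this toy input, `unigene_0002` is dropped because its E-value (2 × 10^-4^) exceeds the cutoff, while `unigene_0001` is kept once even though two queries hit it.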
Phylogenetic analysis and annotation of ligase signatures
To evaluate whether the active-site–based grouping was reflected evolutionarily, we constructed a maximum-likelihood phylogeny of plant legumain (Peptidase_C13) proteins. Angiosperm C13 sequences were retrieved from UniProt and merged with functionally characterized ligase-type enzymes (butelase-1, OaAEP1b, VyPALs) and human legumain (LGMN) as an outgroup. The 11 V. arcuata candidates were translated to protein sequences and added to this set. Redundant entries were collapsed with CD-HIT at 95% identity. Family membership was confirmed with HMMER3 (hmmsearch) against PF01650; sequences lacking the conserved Asn–His–Cys catalytic triad or an evident signal–pro–core–cap architecture were excluded.
Protein sequences were aligned with MAFFT v7 (L-INS-i), and gappy/low-information sites were trimmed with ClipKit (smart-gap mode). Maximum-likelihood trees were inferred in IQ-TREE 2 with ModelFinder for best-fit model selection (43). Node support was evaluated with 1000 ultrafast bootstrap (UFBoot2) replicates and 1000 SH-aLRT replicates; only nodes with UFBoot ≥ 95 and SH-aLRT ≥ 80 were interpreted. Trees were rooted at LGMN. For each tip, the S2 gatekeeper residue and LAD1/LAD2 pattern were taken from the alignment and displayed as metadata in iTOL. V. arcuata sequences were labeled as VaPAL1–5 or VaAEP1–6 according to the fingerprint-based assignment used in the Results section. Patristic distances (sum of branch lengths between tips) were calculated in R/ape (cophenetic.phylo) for descriptive comparisons.
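To make the patristic-distance definition concrete (the sum of branch lengths along the path joining two tips), the following pure-Python sketch computes it on a toy rooted tree. The analysis itself used R/ape (cophenetic.phylo); the tree topology, tip names, and branch lengths here are hypothetical.

```python
# Toy rooted tree stored as child -> (parent, branch_length).
# Names and lengths are hypothetical, not taken from the phylogeny in this study.
TREE = {
    "n1": ("root", 0.10),
    "LGMN": ("root", 0.90),
    "n2": ("n1", 0.05),
    "tipA": ("n1", 0.30),
    "tipB": ("n2", 0.12),
    "tipC": ("n2", 0.08),
}


def path_to_root(node):
    """Map `node` and each of its ancestors to the distance from `node`."""
    distances = {node: 0.0}
    total = 0.0
    while node in TREE:
        parent, length = TREE[node]
        total += length
        distances[parent] = total
        node = parent
    return distances


def patristic_distance(tip_a, tip_b):
    """Sum of branch lengths along the path joining two tips."""
    up_a = path_to_root(tip_a)
    up_b = path_to_root(tip_b)
    # Among shared ancestors, the most recent one gives the smallest sum.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


print(round(patristic_distance("tipB", "tipC"), 2))
print(round(patristic_distance("tipB", "tipA"), 2))
```

Sister tips (`tipB`, `tipC`) are separated only by their two terminal branches, whereas more distant pairs accumulate the internal branches leading to their common ancestor.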
Plant material, RNA isolation, and cDNA synthesis
Frozen V. arcuata tissues stored at −80 °C were ground under liquid nitrogen in a prechilled mortar. Total RNA was extracted using the FastPure Universal Plant Total RNA Isolation Kit (Vazyme) according to the manufacturer’s instructions, including on-column DNase digestion. RNA quantity and purity were assessed by NanoDrop spectrophotometry and agarose gel electrophoresis. First-strand complementary DNA (cDNA) was synthesized from 1 to 2 μg of total leaf RNA using the PrimeScript II 1st Strand cDNA Synthesis Kit (Takara) with oligo(dT) primers following the supplier’s protocol. Gene-specific primers were designed from the in silico V. arcuata AEP/PAL sequences. PCR amplification was performed with Premix Taq (Takara); amplicons were analyzed by agarose gel electrophoresis, gel-purified (FastPure Gel DNA Extraction Mini Kit, Vazyme), and Sanger sequenced (Sangon) to confirm ORFs.
Cloning, site-directed mutagenesis, and plasmid construction
Coding sequences for VaPAL2 (lacking the predicted N-terminal signal peptide), its gatekeeper mutant VaPAL2(I244A), the positive ligase control OaAEP1b(C247A), and selected VaAEP2/3/4 isoforms were cloned into pET-28a(+) using NdeI/XhoI sites to produce N-terminal His_6_-tagged constructs. The I244A substitution in VaPAL2 was introduced by Q5 site-directed mutagenesis (New England Biolabs) using overlapping primers according to the manufacturer’s instructions. All constructs were verified by bidirectional Sanger sequencing.
Recombinant expression and purification in E. coli
Plasmids were transformed into E. coli SHuffle T7 cells. Overnight starter cultures were diluted 1:100 into fresh LB medium containing 50 μg mL^-1^ kanamycin and grown at 37 °C to an OD_600_ of ∼0.6. Protein expression was induced with 0.1 mM IPTG at 16 °C for 18 to 24 h. Cells from 1 L cultures were harvested by centrifugation (6000×g, 15 min, 4 °C) and resuspended in lysis buffer (20 mM Hepes, pH 7.4, 300 mM NaCl, 10 mM imidazole, 1 mM tris(2-carboxyethyl)phosphine (TCEP)). Cells were lysed by sonication on ice and the lysate clarified by centrifugation (20,000×g, 30 min, 4 °C).
Clarified supernatants were applied to Ni^2+^-NTA resin, washed with lysis buffer containing 20 to 30 mM imidazole, and eluted with 250 to 300 mM imidazole. Eluates were desalted and further purified by size-exclusion chromatography (Superdex 200 Increase 10/300 GL; Cytiva) into storage buffer (20 mM Hepes, pH 7.4, 150 mM NaCl, 0.5–1 mM TCEP). Purified zymogens were analyzed by SDS–PAGE. Typical culture-normalized yields reported here were ∼0.37 mg L^-1^ for VaPAL2, ∼0.74 mg L^-1^ for VaPAL2(I244A), and ∼1.4 mg L^-1^ for OaAEP1b(C247A); VaAEP2/3/4 were obtained at lower yields but as single bands.
Acid activation of VaPAL2 and related constructs
Purified His_6_-tagged zymogens (0.5–2.0 mg mL^-1^) were diluted 10-fold into activation buffers consisting of 50 mM sodium citrate (pH 2.0, 3.0, 4.0, 5.0, or 6.0) or 50 mM sodium phosphate (pH 7.0), each containing 150 mM NaCl and 1 mM TCEP. Samples were incubated for 12 to 16 h at 4 °C. To determine the optimal activation pH, VaPAL2, VaPAL2(I244A), and the benchmark OaAEP1b(C247A) were processed in parallel across the full pH 2.0 to 7.0 gradient. The three protease-leaning isoforms VaAEP2, VaAEP3, and VaAEP4 were tested under identical conditions. Activation was monitored by reducing SDS–PAGE (10% gel) with Coomassie staining.
Unless otherwise indicated, VaPAL2, VaPAL2(I244A), and OaAEP1b(C247A) used in subsequent assays were activated at pH 3.0 to 3.5 for 12 to 16 h at 4 °C. The activated enzymes were then buffer-exchanged and concentrated into 50 mM Hepes–NaOH pH 7.0, 150 mM NaCl, 1 mM TCEP using Amicon Ultra-0.5 centrifugal filter units with a 10 kDa molecular weight cutoff (Millipore).
Peptide synthesis
All peptide substrates (GN4-SL, GN5-SL, GN10-SL, GD10-SL, GN14-SL, and cyclotide-derived sequences such as cyO14, Viar-A, Mra-30, MCoTI-II) were prepared by standard Fmoc/tBu solid-phase peptide synthesis on Rink amide resin (GL Biochem) using a Liberty Blue automated microwave synthesizer (CEM). Fmoc deprotection was carried out with 20% piperidine in dimethylformamide. Couplings were performed with DIC/Oxyma in dimethylformamide; difficult couplings were repeated. Global cleavage and side-chain deprotection were achieved with TFA/TIS/H_2_O/DODT (92.5:2.5:2.5:2.5, v/v/v/v) or 95% TFA/2.5% H_2_O/2.5% TIS at 40 °C. Crude peptides were precipitated with cold diethyl ether, dried, purified by preparative RP-HPLC, and verified by ESI–MS or MALDI-TOF. Purified peptides were lyophilized and stored at −20 °C.
Standard ligation/macrocyclization assays and UPLC analysis
Routine reactions were carried out in 0.1 M sodium acetate, 50 mM NaCl, 1 mM EDTA, pH 6.5, in a total volume of 50 to 200 μl. Unless otherwise indicated, peptide substrate was 500 μM and activated enzyme (VaPAL2 or VaPAL2(I244A)) was 250 nM. Reactions were incubated at 37 °C for the specified time and quenched with 0.1 to 0.2 vol of 2% TFA to pH < 2. Quenched samples were centrifuged briefly and analyzed by UPLC/RP-HPLC on a C18 column with a water/acetonitrile gradient containing 0.05 to 0.1% TFA and 214 nm detection. Substrate, cyclic product, and, when present, hydrolyzed product were assigned by parallel ESI–MS.
Because 214 nm absorbance is roughly proportional to the number of peptide bonds, integrated peak areas (A_i_) were normalized to the number of peptide bonds in each species (n_i_) before yield calculations. For single-product substrates such as GN10-SL:

$$\text{Cyclization yield (\%)} = \frac{A_C/n_C}{A_C/n_C + A_S/n_S} \times 100$$

where A_i_ is the integrated 214 nm peak area and n_i_ is the number of peptide bonds in species i (C, cyclic product; S, unreacted substrate).

For substrates like GN14-SL that give three observable species at the endpoint (cyclic product C, hydrolyzed product H, and unreacted substrate S), we define the normalized total as

$$T = \frac{A_C}{n_C} + \frac{A_H}{n_H} + \frac{A_S}{n_S}$$

Then:

$$\text{Cyclization (\%)} = \frac{A_C/n_C}{T} \times 100, \qquad \text{Hydrolysis (\%)} = \frac{A_H/n_H}{T} \times 100$$

This allowed direct comparison of VaPAL2 and VaPAL2(I244A) at the same reaction time even when total conversion differed.
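The peptide-bond normalization can be illustrated with a short sketch. It assumes each species' share is its bond-normalized area (A_i_/n_i_) divided by the sum of bond-normalized areas over all observed species; the peak areas and bond counts below are hypothetical and chosen only to give round numbers.

```python
def normalized_yields(areas, bonds):
    """Return the percent share of each species after peptide-bond normalization.

    areas: {species: integrated 214 nm peak area A_i}
    bonds: {species: number of peptide bonds n_i}
    """
    normalized = {s: areas[s] / bonds[s] for s in areas}
    total = sum(normalized.values())
    if total <= 0:
        raise ValueError("no signal to normalize")
    return {s: 100.0 * value / total for s, value in normalized.items()}


# Hypothetical peak areas and bond counts (not data from this study).
# C = cyclic product, H = hydrolyzed product, S = unreacted substrate.
areas = {"C": 600.0, "H": 100.0, "S": 300.0}
bonds = {"C": 12, "H": 10, "S": 15}

yields = normalized_yields(areas, bonds)
print({s: round(v, 1) for s, v in yields.items()})
```

With these placeholder values the shares are 62.5% cyclic, 12.5% hydrolyzed, and 25.0% unreacted. For a single-product substrate, passing only the C and S entries gives the two-species case.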
pH profiling and co-solvent tolerance
For pH profiling, reactions were set up as above in buffers spanning pH 5.0 to 8.0 (acetate, MES, Hepes, or Tris as appropriate) and run to the fixed endpoint (reaction plateau) used in Figure 4. Yields were calculated from normalized UPLC peak areas.
For cosolvent tolerance, GN10-SL (500 μM) was ligated by VaPAL2, VaPAL2(I244A) or OaAEP1b(C247A) in buffer containing 20% (v/v) methanol, 25% (v/v) ethanol, 25% (v/v) isopropanol, or 20% (v/v) DMSO, with reactions initiated by enzyme addition. Reactions were run under the same temperature and time as the buffer-only control, with no pre-incubation of enzyme or substrate in organic solvents prior to reaction initiation. Residual activity was expressed as endpoint cyclization yield in cosolvent divided by the yield in buffer (%) and plotted as in Figure 7.
To assess whether prolonged exposure to DMSO affects enzyme activity through oxidative modification, a controlled pre-incubation experiment was performed. GN10-SL (500 μM) was incubated in reaction buffer containing 20% (v/v) DMSO for 0, 30, or 60 min at the assay temperature prior to reaction initiation. Macrocyclization reactions were initiated simultaneously for all conditions by the addition of VaPAL2, VaPAL2(I244A), or OaAEP1b(C247A) and allowed to proceed for an identical incubation time. Reactions were quenched and analyzed by UPLC, and product identities were confirmed by ESI–MS as described above.
Kinetic analysis
For Michaelis–Menten kinetics with GN10-SL, reactions were performed at pH 6.5 to 7.0 as above, with substrate concentrations typically between 50 and 1000 μM. Small aliquots were withdrawn at short time intervals (20–30 s), quenched with TFA, and analyzed by UPLC. Initial rates (v_0_) were obtained from the linear portion of product formation and fitted to the Michaelis–Menten equation in GraphPad Prism to obtain Km, Vmax, and kcat (kcat = Vmax/[E]). Enzyme concentrations used in the assay are reported in Figure 4B.
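The fitting step can be sketched in pure Python. This is a stand-in for the GraphPad Prism fit, not the authors' procedure: for each trial Km the least-squares Vmax has a closed form, so only Km is searched (golden-section search on log Km, which assumes the residual has a single minimum in the search range). The substrate concentrations, rates, and enzyme concentration are synthetic, noise-free placeholders.

```python
import math


def fit_michaelis_menten(substrate, rates, km_lo=1e-3, km_hi=1e5, iters=200):
    """Least-squares fit of v = Vmax * S / (Km + S); returns (Km, Vmax)."""

    def best_vmax(km):
        # For fixed Km the model is linear in Vmax, so Vmax has a closed form.
        x = [s / (km + s) for s in substrate]
        return sum(v * xi for v, xi in zip(rates, x)) / sum(xi * xi for xi in x)

    def rss(km):
        vmax = best_vmax(km)
        return sum((v - vmax * s / (km + s)) ** 2 for s, v in zip(substrate, rates))

    # Golden-section search over log10(Km).
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log10(km_lo), math.log10(km_hi)
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if rss(10 ** c) < rss(10 ** d):
            b = d
        else:
            a = c
    km = 10 ** ((a + b) / 2)
    return km, best_vmax(km)


# Synthetic, noise-free example (hypothetical values, not data from this study).
TRUE_KM, TRUE_VMAX = 200.0, 2.0                           # uM, uM/s
substrate = [50.0, 100.0, 200.0, 400.0, 700.0, 1000.0]    # uM
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in substrate]

km, vmax = fit_michaelis_menten(substrate, rates)
enzyme_conc = 0.05                                        # uM, hypothetical
kcat = vmax / enzyme_conc                                 # s^-1, kcat = Vmax/[E]
print(round(km, 1), round(vmax, 3), round(kcat, 1))
```

Fitting the untransformed rates avoids the error distortion of double-reciprocal plots; with real, noisy data a dedicated nonlinear regression tool that also reports confidence intervals would be preferable.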
Structural modeling, superposition, and corridor measurements
Structural models of VaPAL2, VaPAL2(I244A), and OaAEP1b(C247A) were generated using AlphaFold3 with default parameters. Models were inspected and processed in PyMOL 2.x. Enzyme–substrate docking for the VaPAL2–GIPNSL and VaPAL2(I244A)–GIPNSL complexes was performed using HADDOCK 2.4 with active residues defined around the catalytic triad and the P1 Asn; top-scoring poses were selected based on HADDOCK score and geometric compatibility with a ligation-competent pose (44, 45). Detailed HADDOCK cluster statistics and energy terms are provided in Table S5.
Core domains were superposed using PyMOL’s align command (typical r.m.s.d. < 0.5 Å) and refined with pair_fit to verify catalytic-triad conservation. The distance between the catalytic Cys Sγ and the P1 Asn carbonyl carbon was measured in the best substrate-docked pose. Local S2 cavities were extracted with PyVOL (v1.0) using consistent parameters (probe radius 1.4 Å; minimum pocket volume 70 Å^3^). To isolate the S1′→S2 corridor, structures were locally cropped around residues 121 to 123 and residue 244 (gatekeeper), and the cavity centroid nearest the gatekeeper was retained. The minimum corridor width (W_min) was defined as the shortest distance between gatekeeper side-chain heavy atoms (Ile/Ala244) and the S1′ wall represented by the Cα atoms of residues 121 to 123; measurements were performed consistently across models.
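The W_min definition reduces to a minimum over all pairwise distances between two atom sets, which the sketch below illustrates. The coordinates are hypothetical placeholders rather than atoms from the models; only the two W_min values in the final lines (9.09 Å and 10.22 Å) are taken from the Results.

```python
import math


def min_distance(set_a, set_b):
    """Shortest Euclidean distance between any atom in set_a and any in set_b."""
    return min(math.dist(p, q) for p in set_a for q in set_b)


# Hypothetical coordinates in angstroms (not taken from the structural models).
gatekeeper_side_chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]       # heavy atoms
s1_prime_wall_ca = [(4.0, 0.0, 0.0), (0.0, 5.0, 0.0), (1.0, 3.0, 4.0)]

w_min_toy = min_distance(gatekeeper_side_chain, s1_prime_wall_ca)
print(w_min_toy)

# Relative widening from the W_min values reported in the Results.
w_wt, w_mut = 9.09, 10.22                                        # angstroms
widening = w_mut - w_wt
percent = 100.0 * widening / w_wt
print(round(widening, 2), round(percent, 1))
```

The last line reproduces the reported ∼1.1 Å (∼12%) widening from the two model-derived values.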
Data availability
All data are available in the article and Supporting Information. The original V. arcuata RNA-seq dataset is accessible at NCBI SRA under accession PRJNA494974.
Supporting information
This article contains supporting information.
Conflict of interest
The authors declare that they have no conflicts of interest with the contents of this article.
References
- 1. Arnison P.G., Bibb M.J., Bierbaum G., Bowers A.A., Bugni T.S., Bulaj G. Ribosomally synthesized and post-translationally modified peptide natural products: overview and recommendations for a universal nomenclature. Nat. Prod. Rep. 30 (2013) 108–160. doi:10.1039/c2np20085f. PMID 23165928. PMCID PMC3954855.
- 2. Harris K.S., Durek T., Kaas Q., Poth A.G., Gilding E.K., Conlan B.F. Efficient backbone cyclization of linear peptides by a recombinant asparaginyl endopeptidase. Nat. Commun. 6 (2015) 10199. doi:10.1038/ncomms10199. PMID 26680698. PMCID PMC4703859.
- 3. Gruber C.W., Elliott A.G., Ireland D.C., Delprete P.G., Dessein S., Göransson U. Distribution and evolution of circular miniproteins in flowering plants. Plant Cell 20 (2008) 2471–2483. doi:10.1105/tpc.108.062331. PMID 18827180. PMCID PMC2570719.
- 4. Poth A.G., Colgrave M.L., Philip R., Kerenga B., Daly N.L., Anderson M.A. Discovery of cyclotides in the Fabaceae plant family provides new insights into the cyclization, evolution, and distribution of circular proteins. ACS Chem. Biol. 6 (2011) 345–355. doi:10.1021/cb100388j. PMID 21194241.
- 5. de Veer S.J., Kan M.W., Craik D.J. Cyclotides: from structure to function. Chem. Rev. 119 (2019) 12375–12421. doi:10.1021/acs.chemrev.9b00402. PMID 31829013.
- 6. Montalbán-López M., Scott T.A., Ramesh S., Rahman I.R., van Heel A.J., Viel J.H. New developments in RiPP discovery, enzymology and engineering. Nat. Prod. Rep. 38 (2021) 130–239. doi:10.1039/d0np00027b. PMID 32935693. PMCID PMC7864896.
- 7. Daly N.L., Wilson D.T. Plant derived cyclic peptides. Biochem. Soc. Trans. 49 (2021) 1279–1285. doi:10.1042/BST20200881. PMID 34156400. PMCID PMC8286818.
- 8. Craik D.J., Mylne J.S., Daly N.L. Cyclotides: macrocyclic peptides with applications in drug design and agriculture. Cell. Mol. Life Sci. 67 (2010) 9–16. doi:10.1007/s00018-009-0159-3. PMID 19795188. PMCID PMC11115554.
