# Machine learning-driving optimization and spatial assembly of a cell-free system for high-yield liquiritigenin production

**Authors:** Fei Liu, Si-Bo Zhao, Yan-Hua Liu, Jun-Feng Li, Nuo-Qiao Lin, Meihereayi Mutailifu, Pei Xu, Jian-Zhong Liu

PMC · DOI: 10.1007/s44307-026-00103-0 · 2026-03-27

## TL;DR

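Researchers built a cell-free multi-enzyme system that converts tyrosine into the medicinal flavonoid liquiritigenin. Machine learning-guided optimization and scaffold-based spatial enzyme assembly raised the final titer to 439.42 ± 19.53 mg/L.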

## Contribution

A modular cell-free multi-enzyme system that combines machine learning-guided optimization with spatial enzyme assembly for high-yield biosynthesis of liquiritigenin from tyrosine.

## Key findings

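- Systematic screening and ratio optimization of five pathway enzymes identified the best enzyme combination for a one-pot reaction.
- After Plackett–Burman and steepest-ascent experiments, three rounds of iterative machine learning fine-tuned enzyme ratios and cofactor concentrations, yielding 155.32 ± 14.39 mg/L liquiritigenin.
- Combining CFPS-ME with scaffold-assisted co-immobilization (γPFD-SpyCatcher) raised the final titer to 439.42 ± 19.53 mg/L.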
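
The two reported titers can be compared directly. The sketch below computes the fold improvement of the scaffold-assisted titer over the machine learning-optimized titer, which comes out at about 2.8-fold. This ratio is derived here and is not a figure stated in the summary. The uncertainty is propagated on the assumption that the two ± values are independent standard deviations; the summary does not say what the ± values represent, so treat that part as illustrative. The function name `fold_change` is hypothetical.

```python
import math


def fold_change(final, final_sd, base, base_sd):
    """Return (ratio, sd) for final/base.

    Assumes the two uncertainties are independent standard deviations
    (an assumption; the summary does not define the +/- values).
    """
    ratio = final / base
    # First-order propagation: relative errors add in quadrature.
    relative_sd = math.sqrt((final_sd / final) ** 2 + (base_sd / base) ** 2)
    return ratio, ratio * relative_sd


# Titers in mg/L, taken from the key findings above.
ratio, sd = fold_change(439.42, 19.53, 155.32, 14.39)
print(f"fold improvement: {ratio:.2f} +/- {sd:.2f}")
```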

## Abstract

Liquiritigenin is a medicinal flavonoid whose production is constrained by inefficient plant extraction and complex chemical synthesis. To overcome this, we developed a modular cell-free multi-enzyme system for its efficient biosynthesis from tyrosine, integrating spatial enzyme assembly with machine learning-guided optimization. Using a combined cell-free metabolic engineering (CFME) and cell-free protein synthesis-driven metabolic engineering (CFPS-ME) approach, we screened and optimized five key pathway enzymes to establish a one-pot reaction. The optimal enzyme combination (phenylalanine ammonia-lyase from Zea mays, 4-coumarate-coenzyme A ligase 4 from Arabidopsis thaliana, chalcone synthase from Glycine max, chalcone reductase from Medicago sativa, chalcone flavanone isomerase from Zea mays) was identified through systematic screening and ratio optimization. After Plackett–Burman and steepest-ascent experiments, three rounds of iterative machine learning fine-tuned key parameters, including enzyme ratios and cofactor concentrations, yielding 155.32 ± 14.39 mg/L. Spatial enzyme assembly was further enhanced via covalent peptide tags and scaffold proteins (γPFD-SpyCatcher) under CFME. Combining CFPS-ME with scaffold-assisted co-immobilization significantly boosted production, reaching a final titer of 439.42 ± 19.53 mg/L. This study demonstrates that machine learning-driven optimization and spatial assembly of multi-enzyme complexes is a powerful approach for cell-free biosynthesis.

The online version contains supplementary material available at 10.1007/s44307-026-00103-0.
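
The abstract mentions Plackett-Burman screening followed by steepest-ascent experiments. As a rough illustration of how those two steps fit together, the sketch below builds the standard 12-run Plackett-Burman design, estimates main effects, and lays out a steepest-ascent path in coded units. Everything specific is an assumption made for illustration: the summary does not give the paper's design size, factors, or responses, so the response values here are synthetic (generated from a made-up linear model), and the function names are hypothetical.

```python
# Standard generator row of the 12-run Plackett-Burman design (up to 11 factors).
GENERATOR_12 = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]


def plackett_burman_12():
    """Return the 12 x 11 design: 11 cyclic shifts of the generator plus a row of -1."""
    n = len(GENERATOR_12)
    rows = [[GENERATOR_12[(j - shift) % n] for j in range(n)] for shift in range(n)]
    rows.append([-1] * n)
    return rows


def main_effects(design, responses):
    """Main effect per factor: mean response at +1 minus mean response at -1."""
    runs = len(design)
    factors = len(design[0])
    return [
        2.0 * sum(design[i][j] * responses[i] for i in range(runs)) / runs
        for j in range(factors)
    ]


def steepest_ascent_path(effects, n_steps=3, step=0.5):
    """Points along the direction of the effects, in coded units from the center.

    The largest absolute effect moves by `step` per point; the others scale with it.
    """
    largest = max(abs(e) for e in effects)
    direction = [e / largest for e in effects]
    return [[k * step * d for d in direction] for k in range(1, n_steps + 1)]


design = plackett_burman_12()

# SYNTHETIC responses (not data from the paper): factor 0 helps, factor 2 hurts.
responses = [100.0 + 10.0 * run[0] - 5.0 * run[2] for run in design]

effects = main_effects(design, responses)
path = steepest_ascent_path(effects)

for j, effect in enumerate(effects):
    print(f"factor {j}: effect = {effect:+.1f}")
print("first steepest-ascent point (coded units):", path[0])
```

Because the design columns are balanced and mutually orthogonal, each main effect is estimated independently of the others, which is why a small number of runs can screen many factors. In a workflow like the one the abstract describes, the region reached by steepest ascent would then be the starting point for the iterative machine learning rounds.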

## Linked entities

- **Chemicals:** liquiritigenin (PubChem CID 1889), tyrosine (PubChem CID 1153)
- **Species:** Zea mays (taxon 4577), Arabidopsis thaliana (taxon 3702), Glycine max (taxon 3847), Medicago sativa (taxon 3879)

## Full-text entities

- **Genes:** phenylalanine ammonia-lyase [NCBI Gene 100285115], chalcone flavanone isomerase [NCBI Gene 100284018], chalcone synthase [NCBI Gene 100283134]
- **Diseases:** inflammatory (MESH:D007249)
- **Chemicals:** CuCl2 (MESH:C029892), ice (MESH:D007053), potassium glutamate (MESH:D018698), D-ribose (MESH:D012266), GTP (MESH:D006160), naringenin (MESH:C005273), amino acids (MESH:D000596), NiCl2 (MESH:C022838), ethanolamine (MESH:D019856), isoliquiritigenin (MESH:C040920), Liquiritigenin (MESH:C083152), acetonitrile (MESH:C032159), limonene (MESH:D000077222), CoA (MESH:D003065), Na2SeO3 (MESH:D018038), K2HPO4 (MESH:C013216), styrene (MESH:D020058), NAD (MESH:D009243), p-coumaric acid (MESH:C495469), phosphoenolpyruvate (MESH:D010728), NH4Cl (MESH:D000643), FeCl3 (MESH:C024555), L-tyrosine (MESH:D014443), PEG8000 (MESH:C000595216), N- (MESH:D009584), glucose (MESH:D005947), ammonium formate (MESH:C030544), n-butanol (MESH:D020001), acetate (MESH:D000085), CoCl2 (MESH:C018021), HCl (MESH:D006851), flavonoid (MESH:D005419), putrescine (MESH:D011700), UTP (MESH:D014544), ATP (MESH:D000255), polyphenol (MESH:D059808), metal (MESH:D008670), Naringenin Chalcone (MESH:C027329), -C (MESH:D002244), Na2MoO4 (MESH:C024687), IPTG (MESH:D007544), maltodextrin (MESH:C008315), magnesium acetate (MESH:C000656591), CTP (MESH:D003570), cinnamyl alcohol (MESH:C020722), CFPS (-), MgSO4 (MESH:D008278), glycerol (MESH:D005990), CaCl2 (MESH:D002122), ZnSO4 (MESH:D019287), MnCl2 (MESH:C025340), Na2SO4 (MESH:C012036), galactose (MESH:D005690), HEPES (MESH:D006531), isoflavone (MESH:D007529), NADPH (MESH:D009249), folinic acid (MESH:D002955), spermidine (MESH:D013095), SDS (MESH:D012967), CH3COOK (MESH:D019347)
- **Species:** Escherichia phage MS2 (no rank) [taxon 12022], Lederbergvirus P22 (species) [taxon 10754], Salmonella enterica (species) [taxon 28901], Escherichia coli (E. coli, species) [taxon 562], Hepatitis B virus (no rank) [taxon 10407], Saccharomyces cerevisiae (baker's yeast, species) [taxon 4932], Arabidopsis thaliana (mouse-ear cress, species) [taxon 3702], Glycine max (soybean, species) [taxon 3847], Medicago sativa (alfalfa, species) [taxon 3879], Yarrowia lipolytica (species) [taxon 4952], Zea mays (maize, species) [taxon 4577]
- **Cell lines:** S2 — Drosophila melanogaster (Fruit fly), Spontaneously immortalized cell line (CVCL_Z232)

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13031608/full.md

---
Source: https://tomesphere.com/paper/PMC13031608