Structural genomics of bacterial drug targets: Application of a high-throughput pipeline to solve 58 protein structures from pathogenic and related bacteria
Nicole L. Inniss, George Minasov, Changsoo Chang, Kemin Tan, Youngchang Kim, Natalia Maltseva, Peter Stogios, Ekaterina Filippova, Karolina Michalska, Jerzy Osipiuk, Lukasz Jaroszewski, Adam Godzik, Alexei Savchenko, Andrzej Joachimiak, Wayne F. Anderson, Karla J. F. Satchell

TL;DR
This paper describes solving 58 protein structures from bacteria to help develop better antibiotics.
Contribution
The contribution is the application of a high-throughput pipeline to solve 58 bacterial protein structures, 55 of which are reported for the first time.
Findings
58 X-ray crystal structures of bacterial proteins were deposited.
The proteins are known antibiotic targets, and their structures expand knowledge of structural variation across bacterial species.
The work supports future antibiotic discovery and modifications.
Abstract
Antibiotic resistance remains a leading cause of severe infections worldwide. Small changes in protein sequence can impact antibiotic efficacy. Here, we report deposition of 58 X-ray crystal structures of bacterial proteins that are known targets for antibiotics, which expands knowledge of structural variation to support future antibiotic discovery or modifications.
| PDB code | CSBID | Protein name | Organism | Resolution | Ligand |
|---|---|---|---|---|---|
| 6nbk | IDP07367 | Arginase | | 1.91 Å | – |
| 6nfp | IDP07164 | Arginase | | 1.70 Å | – |
| 5us8 | IDP07200 | Argininosuccinate synthase | | 2.15 Å | Adenosine |
| 6e5y | IDP07200 | Argininosuccinate synthase | | 1.50 Å | AMP |
| 6w2z | IDP07475 | Beta-lactamase class A | | 1.50 Å | Avibactam |
| 9bzn | IDP07519 | Beta-lactamase class A | | 1.05 Å | – |
| 9bzq | IDP07519 | Beta-lactamase class A | | 1.47 Å | Avibactam |
| 9bzr | IDP07519 | Beta-lactamase class A | | 1.40 Å | Clavulanate |
| 6pua | IDP07511 | Chloramphenicol acetyltransferase | | 2.00 Å | – |
| 5ux9 | IDP07301 | Chloramphenicol acetyltransferase | | 2.70 Å | – |
| 6pxa | IDP07301 | Chloramphenicol acetyltransferase | | 1.82 Å | Taurocholic acid |
| 6pu9 | IDP07511 | Chloramphenicol acetyltransferase | | 1.70 Å | – |
| 6b5f | IDP07570 | CobT | | 1.95 Å | – |
| 6azi | IDP07508 | D-ala-D-ala-endopeptidase | | 1.75 Å | – |
| 6bz0 | IDP07418 | Dihydrolipoamide dehydrogenase | | 1.83 Å | FAD |
| 6aon | IDP07182 | Dihydrolipoamide dehydrogenase | | 1.72 Å | FAD |
| 6cmz | IDP07673 | Dihydrolipoamide dehydrogenase | | 2.30 Å | FAD, NAD |
| 6awa | IDP07540 | Dihydrolipoamide dehydrogenase | | 1.83 Å | FAD, AMP |
| 5tr3 | IDP07540 | Dihydrolipoamide dehydrogenase | | 2.50 Å | FAD |
| 5umg | IDP07170 | Dihydropteroate synthase | | 2.60 Å | – |
| 5usw | IDP07359 | Dihydropteroate synthase | | 1.64 Å | – |
| 6bq9 | IDP07285 | DNA Topoisomerase IV Subunit A | | 2.55 Å | – |
| 5vh6 | IDP07716 | Elongation factor G | | 2.61 Å | – |
| 6bk7 | IDP07555 | Elongation factor G | | 1.83 Å | – |
| 6b8d | IDP07537 | Elongation factor G | | 1.78 Å | – |
| 5ty0 | IDP07381 | Elongation factor G | | 2.22 Å | – |
| 6n0i | IDP07336 | Elongation factor G | | 2.60 Å | – |
| 5tv2 | IDP07581 | Elongation factor G | | 1.60 Å | – |
| 6b4o | IDP07317 | Glutathione reductase | | 1.73 Å | FAD |
| 5v36 | IDP07311 | Glutathione reductase | | 1.88 Å | FAD |
| 6n7f | IDP07597 | Glutathione reductase | | 1.90 Å | – |
| 5u1o | IDP07224 | Glutathione reductase | | 2.31 Å | FAD |
| 5vdn | IDP07394 | Glutathione reductase | | 1.55 Å | FAD |
| 6aoo | IDP07201 | Malate dehydrogenase | | 2.15 Å | – |
| 6bal | IDP07201 | Malate dehydrogenase | | 2.10 Å | L-malate |
| 5vfb | IDP07567 | Malate synthase G | | 1.36 Å | Glycolic acid |
| 5ume | IDP07318 | MetF | | 2.70 Å | FAD |
| 6po4 | IDP07178 | Methylthioadenosine/SAH nucleosidase | | 2.10 Å | – |
| 5ue1 | IDP07462 | Methylthioadenosine/SAH nucleosidase | | 1.14 Å | Adenine |
| 6muq | IDP07205 | Murein-DD-endopeptidase | | 1.67 Å | – |
| 6c8q | IDP07348 | NAD synthetase | | 2.58 Å | NAD |
| 5wp0 | IDP07110 | NAD synthetase | | 2.60 Å | – |
| 5uu6 | IDP07628 | Nitroreductase A | | 1.95 Å | FMN |
| 6czp | IDP07377 | Nitroreductase A | | 2.24 Å | FMN |
| 6dll | IDP07306 | p-Hydroxybenzoate Hydroxylase | | 2.20 Å | FAD |
| 5u2g | IDP07344 | Penicillin-binding protein 1A | | 2.61 Å | – |
| 5u47 | IDP07211 | Penicillin-binding protein 2X | | 1.95 Å | – |
| 6blb | IDP07228 | RuvB | | 1.88 Å | ADP |
| 5u63 | IDP07488 | Thioredoxin reductase | | 1.99 Å | – |
| 5uwy | IDP07356 | Thioredoxin reductase | | 2.72 Å | FAD |
| 5utx | IDP07222 | Thioredoxin reductase | | 2.46 Å | – |
| 5usx | IDP07222 | Thioredoxin reductase | | 2.60 Å | NADP, FAD |
| 5vt3 | IDP07222 | Thioredoxin reductase | | 1.98 Å | NADP, FAD |
| 5v0i | IDP07325 | Tryptophanyl-tRNA synthetase | | 1.90 Å | Tryptophan, AMP |
| 6dfu | IDP07216 | Tryptophanyl-tRNA synthetase | | 2.05 Å | – |
| 6cn1 | IDP07215 | UDP-GlcNAc 1-carboxyvinyltransferase | | 2.75 Å | UDP-GlcNAc |
| 6nkj | IDP07236 | UDP-GlcNAc 1-carboxyvinyltransferase | | 1.30 Å | – |
| 5wi5 | IDP07236 | UDP-GlcNAc 1-carboxyvinyltransferase | | 2.00 Å | UDP-GlcNAc |
- National Institute of Allergy and Infectious Diseases (http://dx.doi.org/10.13039/100000060)
- U.S. Department of Energy (http://dx.doi.org/10.13039/100000015)
- Michigan Economic Development Corporation (http://dx.doi.org/10.13039/100004948)
- Michigan Technology Tri-Corridor (http://dx.doi.org/10.13039/100016937)
Taxonomy
Topics: Enzyme Structure and Function · Genomics and Phylogenetic Studies · RNA and protein synthesis mechanisms
ANNOUNCEMENT
Antibiotic-resistant bacteria remain a global threat, with millions of deaths attributed to decreased drug efficacy (1, 2). Amino acid variation across different bacterial species can impact antimicrobials targeting essential biochemical pathways. To support antimicrobial discovery or chemical modification of current antibiotics, the Center for Structural Genomics of Infectious Diseases (now the Center for Structural Biology of Infectious Diseases [CSBID]) established a high-throughput (HTP) structural genomics pipeline to expand the diversity of structures available for proteins that are known drug targets. A list of proteins representing known antibiotic targets was curated using DrugBank (http://www.drugbank.ca/). The protein sequences were used as queries to identify homologs in bacterial species with genomic DNA available in the center repository. Proteins sharing at least 50% sequence identity across 75% of the protein sequence were selected. In total, 630 targets from 47 bacterial species entered the pipeline.
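The homolog selection rule described above (at least 50% sequence identity across 75% of the protein sequence) can be sketched as a simple filter. This is a minimal illustration, not the center's actual selection code: the `Hit` record, its field names, and the example identifiers are hypothetical, and measuring coverage against the query length (rather than the homolog length) is an assumption, as is treating 75% as a minimum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment between a drug-target query and a candidate homolog (hypothetical record)."""
    query_id: str
    subject_id: str
    percent_identity: float  # identity over the aligned region, 0-100
    aligned_length: int      # number of query residues covered by the alignment
    query_length: int        # full length of the query protein


MIN_IDENTITY = 50.0  # "at least 50% sequence identity"
MIN_COVERAGE = 0.75  # "across 75% of the protein sequence" (assumed to be a minimum)


def coverage(hit: Hit) -> float:
    """Fraction of the query sequence covered by the alignment (assumed definition)."""
    return hit.aligned_length / hit.query_length


def passes_selection(hit: Hit) -> bool:
    """Apply both thresholds from the text to a single hit."""
    return hit.percent_identity >= MIN_IDENTITY and coverage(hit) >= MIN_COVERAGE


def select_targets(hits):
    """Keep only the hits that satisfy the identity and coverage thresholds."""
    return [h for h in hits if passes_selection(h)]


# Made-up example hits, for illustration only.
hits = [
    Hit("query_A", "homolog_1", 62.0, 300, 320),  # high identity, ~94% coverage
    Hit("query_A", "homolog_2", 48.5, 310, 320),  # identity below 50%
    Hit("query_B", "homolog_3", 71.0, 150, 400),  # coverage of only 37.5%
]
selected = select_targets(hits)
print([h.subject_id for h in selected])
```

Only `homolog_1` satisfies both thresholds in this made-up input; the other two fail on identity and coverage, respectively.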
All targets were subjected to automated analyses supporting protein expression construct design. The genes encoding the selected proteins or protein domains were amplified by PCR using genomic DNA as a template. The PCR products were cloned into pMCSG53 (PSI:Biology-Materials Repository, http://psimr.asu.edu) according to published ligation-independent cloning procedures (3, 4). This vector introduced a protease-cleavable, N-terminal hexa-histidine purification tag. The clones were transformed into T7-polymerase expressing Escherichia coli strains and tested for expression and solubility. Soluble proteins were purified by nickel affinity chromatography according to published protocols (5, 6), and concentrated proteins were set up as 2-µL crystallization drops in 96-well plates using multiple screens. Resulting crystals were cryoprotected, cooled, and then screened for data collection at the Advanced Photon Source (APS) at Argonne National Laboratory.
In total, 24% of targets were purified, and 19% yielded protein preparations that entered HTP crystallization screens. Pipeline success rate from selection through structure determination was 7.6%. Forty-eight targets from 24 bacterial species produced high-quality crystals, yielding 58 structures (Fig. 1). The RCSB Protein Data Bank (PDB) deposition code, protein name, source DNA, and refinement statistics are listed in Table 1. Of the 58 structures determined, 55 are reported here for the first time, with three structures published previously (7, 8). Structures were derived from proteins involved in antibiotic modification, cell wall maintenance, oxidative stress, and metabolism.
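The reported figures can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 7.6% success rate corresponds to the 48 structure-yielding targets out of the 630 that entered the pipeline; the per-stage counts are back-calculated from rounded percentages and are therefore approximate.

```python
# Counts stated in the text.
TOTAL_TARGETS = 630            # targets that entered the pipeline
TARGETS_WITH_STRUCTURES = 48   # targets that produced high-quality crystals
STRUCTURES_SOLVED = 58         # structures determined in total
PREVIOUSLY_PUBLISHED = 3       # structures already published elsewhere

# Stage fractions are reported only as rounded percentages, so the
# target counts derived from them are approximate.
STAGE_PERCENT = {"purified": 24, "entered crystallization screens": 19}


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


success_rate = percent(TARGETS_WITH_STRUCTURES, TOTAL_TARGETS)
new_structures = STRUCTURES_SOLVED - PREVIOUSLY_PUBLISHED
structures_per_target = STRUCTURES_SOLVED / TARGETS_WITH_STRUCTURES
approx_counts = {
    stage: round(TOTAL_TARGETS * pct / 100) for stage, pct in STAGE_PERCENT.items()
}

print(f"Overall success rate: {success_rate:.1f}%")
print(f"Structures reported for the first time: {new_structures}")
print(f"Structures per successful target: {structures_per_target:.2f}")
for stage, count in approx_counts.items():
    print(f"Approximate targets that {stage}: {count}")
```

Under that assumption, 48/630 rounds to the stated 7.6%, and subtracting the three previously published structures leaves the 55 new ones.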
Fig. 1. Percentage of approved bacterial drug targets at each stage in the structure determination pipeline and representative X-ray structures. The pie chart shows the overall success rate of proteins in the structure determination pipeline from a total of 630 targets. Work was completed between 2016 and 2024. Twenty-five representative structures are depicted as cartoons: β-sheets are colored yellow, α-helices are teal, and loops are gray. Associated crystal variants, complexes with ligands, and homologous structures are annotated below each image, totaling 58 structures solved. The proteins were sorted according to their known function in bacteria. Associated ligands and crystallographic details are described in Table 1.
Data collection and data quality information are available on the PDB. Structures of proteins grown in selenomethionine medium were solved by the single-wavelength anomalous diffraction method, using the Automatic Structure Solution from HKL-3000 (9) and the Auto-build package from PHENIX (10). Structures of native proteins were solved by molecular replacement in the CCP4 suite (11), using either the structure of the closest sequence homolog in the PDB as the search model in PHASER or the target protein sequence as input to MORDA and MRBUMP. Structures were refined using REFMAC5 (12) or PHENIX and visually corrected in Coot (13). Water molecules were generated using ARP/wARP (14), and ligands were fit into electron density maps in Coot. Translation–Libration–Screw groups were generated by the TLSMD server (15), and corrections were applied during refinement finalization. Models were validated using MolProbity (16).
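Because the deposited data are available from the PDB, the four-character codes in Table 1 can be turned into coordinate download links. The following is a minimal sketch, assuming the RCSB file-download URL layout (`https://files.rcsb.org/download/<ID>.<format>`); the helper name `pdb_download_url` is hypothetical, and the example codes are taken from Table 1. It only builds the URLs and does not fetch anything.

```python
import re

# Assumed RCSB file-download endpoint; check the RCSB documentation before relying on it.
RCSB_DOWNLOAD_BASE = "https://files.rcsb.org/download"

# Classic PDB identifiers: one digit followed by three alphanumeric characters.
_PDB_ID = re.compile(r"[0-9][A-Za-z0-9]{3}")


def pdb_download_url(pdb_id: str, fmt: str = "cif") -> str:
    """Build a coordinate-file URL for a four-character PDB code (hypothetical helper)."""
    pdb_id = pdb_id.strip()
    if not _PDB_ID.fullmatch(pdb_id):
        raise ValueError(f"not a four-character PDB code: {pdb_id!r}")
    if fmt not in ("cif", "pdb"):
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{RCSB_DOWNLOAD_BASE}/{pdb_id.upper()}.{fmt}"


# A few codes from Table 1.
for code in ["6nbk", "9bzn", "5vfb"]:
    print(pdb_download_url(code))
```

Validating the code before building the URL catches transcription errors, such as a stray leading character, before any request is made.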
REFERENCES
1. Antimicrobial Resistance Collaborators. 2022. Global burden of bacterial antimicrobial resistance in 2019: a systematic analysis. Lancet 399:629–655. doi:10.1016/S0140-6736(21)02724-0. PMID 35065702. PMCID PMC8841637.
2. CDC. 2019. Antibiotic resistance threats in the United States. Department of Health and Human Services, CDC, Atlanta, GA.
3. Eschenfeldt WH, Lucy S, Millard CS, Joachimiak A, Mark ID. 2009. A family of LIC vectors for high-throughput cloning and purification of proteins. Methods Mol Biol 498:105–115. doi:10.1007/978-1-59745-196-3_7. PMID 18988021. PMCID PMC2771622.
4. Stols L, Gu M, Dieckman L, Raffen R, Collart FR, Donnelly MI. 2002. A new vector for high-throughput, ligation-independent cloning encoding a tobacco etch virus protease cleavage site. Protein Expr Purif 25:8–15. doi:10.1006/prep.2001.1603. PMID 12071693.
5. Shuvalova L. 2014. Parallel protein purification. Methods Mol Biol 1140:137–143. doi:10.1007/978-1-4939-0354-2_10. PMID 24590714.
6. Makowska-Grzyska M, Kim Y, Maltseva N, Li H, Zhou M, Joachimiak G, Babnigg G, Joachimiak A. 2014. Protein production for structural genomics using E. coli expression. Methods Mol Biol 1140:89–105. doi:10.1007/978-1-4939-0354-2_7. PMID 24590711. PMCID PMC4108990.
7. Lazar JT, Shuvalova L, Rosas-Lemus M, Kiryukhina O, Satchell KJF, Minasov G. 2019. Structural comparison of p-hydroxybenzoate hydroxylase (PobA) from Pseudomonas putida with PobA from other Pseudomonas spp. and other monooxygenases. Acta Crystallogr F Struct Biol Commun 75:507–514. doi:10.1107/S2053230X19008653. PMID 31282871. PMCID PMC6613441.
8. Alcala A, Ramirez G, Solis A, Kim Y, Tan K, Luna O, Nguyen K, Vazquez D, Ward M, Zhou M, Mulligan R, Maltseva N, Kuhn ML. 2020. Structural and functional characterization of three Type B and C chloramphenicol acetyltransferases from Vibrio species. Protein Sci 29:695–710. doi:10.1002/pro.3793. PMID 31762145. PMCID PMC7020993.
