# Combined substituent number utilized machine learning for the development of antimicrobial agent

**Authors:** Keitaro Yamauchi, Hirotaka Nakatsuji, Takaaki Kamishima, Yoshitaka Koseki, Masaki Kubo, Hitoshi Kasai

PMC · DOI: 10.1038/s41598-024-53888-2 · Scientific Reports · 2024-02-19

## TL;DR

This study trains a machine learning model on a simplified, substituent-only molecular descriptor, the combined substituent number (CSN), to predict and screen antimicrobial candidates built on a 4-quinolone structure.

## Contribution

The paper introduces the combined substituent number (CSN), a descriptor that encodes only the type and position of substituents on a 4-quinolone scaffold, to make antimicrobial agent development more efficient.

## Key findings

- The CSN-based model achieved a coefficient of determination of 0.719 for training data and 0.519 for validation data.
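- Recombining substituents in CSN form produced a screening library of 32,079,318 previously unknown molecules with a relatively consistent structure.
- Growth inhibition experiments on E. coli, using model-suggested molecules and commercially available antimicrobial agents, supported the validity of the prediction model.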
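
For readers unfamiliar with the metric reported above, the coefficient of determination can be computed as follows. This is a minimal sketch of the standard definition, R² = 1 − SS_res / SS_tot. The function name `r_squared` is chosen here for illustration, and this summary does not state how the authors computed the value.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Illustrative helper only; not taken from the paper.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)

    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the predictions match the observations exactly, and a value of 0 means the model does no better than always predicting the mean. The gap between 0.719 (training) and 0.519 (validation) therefore indicates that the model explains less of the variance on data it has not seen.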

## Abstract

Machine learning has the potential to improve the development of antimicrobial agents. For practical use, it is important to convert molecular information into an appropriate descriptor: an overly informative descriptor requires enormous computation time and many experiments to gather data, whereas a less informative descriptor raises problems of validity. In this study, we used a descriptor focused only on substituents. The type and position of the substituents on molecules with a 4-quinolone structure (11,879 compounds) were converted to the combined substituent number (CSN). Although the CSN does not include information on the detailed structure, physical properties, or quantum chemistry of the molecules, the prediction model constructed by machine learning on the CSN showed a sufficient coefficient of determination (0.719 for the training dataset and 0.519 for the validation dataset). In addition, the CSN makes it easy to construct a library of unknown molecules with a relatively consistent structure by recombining substituents (32,079,318 compounds) and to screen them. The validity of the prediction model was also confirmed by growth inhibition experiments on E. coli using the model-suggested molecules and commercially available antimicrobial agents.
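
The idea of a substituent-only descriptor and of building a library by recombination can be sketched in a few lines. This summary does not give the actual CSN construction, so everything below is an assumption made for illustration: the position labels, the substituent vocabularies, and the mixed-radix integer encoding are hypothetical and should not be read as the authors' method.

```python
from itertools import product

# Hypothetical substituent vocabularies per scaffold position.
# Neither the positions nor the substituents are taken from the paper.
SUBSTITUENTS = {
    "R1": ["H", "ethyl", "cyclopropyl"],
    "R6": ["H", "F"],
    "R7": ["H", "methyl", "piperazinyl"],
}
POSITIONS = list(SUBSTITUENTS)


def encode(molecule):
    """Map a {position: substituent} dict to one integer (mixed-radix code)."""
    code = 0
    for pos in POSITIONS:
        options = SUBSTITUENTS[pos]
        code = code * len(options) + options.index(molecule[pos])
    return code


def decode(code):
    """Invert encode(): recover the {position: substituent} dict."""
    picked = {}
    for pos in reversed(POSITIONS):
        options = SUBSTITUENTS[pos]
        code, idx = divmod(code, len(options))
        picked[pos] = options[idx]
    return {pos: picked[pos] for pos in POSITIONS}


def enumerate_library():
    """Yield every recombination of the known substituents."""
    for combo in product(*(SUBSTITUENTS[p] for p in POSITIONS)):
        yield dict(zip(POSITIONS, combo))


if __name__ == "__main__":
    library = list(enumerate_library())
    print(len(library), "candidate molecules")
    print(encode({"R1": "cyclopropyl", "R6": "F", "R7": "piperazinyl"}))
```

Because the library size is the product of the vocabulary sizes at each position, even modest per-position vocabularies multiply quickly, which is how a recombination approach can reach tens of millions of candidates from a much smaller set of known compounds.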

## Linked entities

- **Chemicals:** 4-quinolone (PubChem CID 69141)

## Full-text entities

- **Chemicals:** 4-quinolone (MESH:D042462)
- **Species:** Escherichia coli (E. coli, species) [taxon 562]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC10876936/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC10876936/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/PMC10876936/full.md

---
Source: https://tomesphere.com/paper/PMC10876936