# ChemReco: automated recognition of hand-drawn carbon–hydrogen–oxygen structures using deep learning

**Authors:** Hengjie Ouyang, Wei Liu, Jiajun Tao, Yanghong Luo, Wanjia Zhang, Jiayu Zhou, Shuqi Geng, Chengpeng Zhang

PMC · DOI: 10.1038/s41598-024-67496-7 · Scientific Reports · 2024-07-25

## TL;DR

ChemReco is a deep learning tool that converts hand-drawn chemical structures containing C, H, and O into machine-readable formats such as SMILES, with the aim of supporting chemical research and automated grading in education.

## Contribution

A novel synthetic image method for dataset generation and an EfficientNet+Transformer model achieving 96.90% accuracy in structure recognition.

## Key findings

- A synthetic image method was developed to efficiently generate hand-drawn chemical structure datasets.
- The proposed model achieved 96.90% recognition accuracy using an EfficientNet+Transformer architecture.
- The model outperformed other encoder-decoder combinations in structure recognition tasks.
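
The summary does not say how the 96.90% recognition accuracy is computed. As a minimal sketch, assuming accuracy means the fraction of images whose predicted SMILES string exactly matches the reference string, it could be computed as below. The function name `exact_match_accuracy` and the sample strings are hypothetical and not taken from the paper.

```python
def exact_match_accuracy(predicted, reference):
    """Fraction of predictions that equal their reference string exactly.

    Assumption: both lists hold SMILES strings in the same order, and two
    structures count as equal only if their strings are identical after
    trimming surrounding whitespace.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predicted, reference))
    return hits / len(reference)


if __name__ == "__main__":
    # Illustrative strings only; not data from the paper.
    preds = ["CCO", "C=O", "CC(=O)O", "c1ccccc1O"]
    refs = ["CCO", "CO", "CC(=O)O", "c1ccccc1O"]
    print(f"{exact_match_accuracy(preds, refs):.2%}")
```

Plain string comparison is strict: the same molecule can be written as several different SMILES strings, so an evaluation would likely canonicalize both sides with a cheminformatics toolkit before comparing.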

## Abstract

Chemical molecular structures are a direct and convenient means of expressing chemical knowledge and play a vital role in academic communication. In chemistry, drawing structures by hand is a common task for students and researchers. If hand-drawn chemical molecular structures can be converted into machine-readable formats, such as SMILES encoding, computers can efficiently process and analyze them, significantly enhancing the efficiency of chemical research. Furthermore, with the progress of educational technology, automated grading is gaining popularity. Machines that automatically recognize chemical molecular structures and assess the correctness of the drawings offer great convenience to teachers. We created ChemReco, a tool designed to identify chemical molecular structures involving three elements (C, H, and O), providing convenience for chemical researchers. Studies on hand-drawn chemical molecular structures are currently limited, so the primary focus of this paper is constructing datasets. We propose a synthetic image method to rapidly generate images resembling hand-drawn chemical molecular structures, enhancing dataset acquisition efficiency. Regarding model selection, the hand-drawn chemical molecular structure recognition model developed in this article achieves a final recognition accuracy of 96.90%. The model employs an encoder-decoder architecture of EfficientNet + Transformer, which demonstrates superior performance compared to other encoder-decoder combinations.
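
To make the C, H, O restriction concrete, the sketch below checks whether a SMILES string uses only those three elements. It is a simplified illustration, not the authors' pipeline: the regular expressions cover bracket atoms and the SMILES organic subset only, and the names `elements_in_smiles` and `is_cho_only` are hypothetical. A full parser from a cheminformatics toolkit such as RDKit would be the more robust choice in practice.

```python
import re

ALLOWED = {"C", "H", "O"}

# Bracket atom: optional isotope digits, then the element symbol.
_BRACKET = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})[^\]]*\]")
# Organic-subset atoms written outside brackets (aromatic forms are lowercase).
_ORGANIC = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def elements_in_smiles(smiles):
    """Return the set of element symbols that appear in a SMILES string.

    Simplification: implicit hydrogens and hydrogen counts inside brackets
    (for example the H3 in [CH3]) are not reported; this does not matter
    here because H is an allowed element anyway.
    """
    elements = {m.group(1).capitalize() for m in _BRACKET.finditer(smiles)}
    outside = _BRACKET.sub("", smiles)  # drop bracket atoms already handled
    elements.update(m.group(0).capitalize() for m in _ORGANIC.finditer(outside))
    return elements


def is_cho_only(smiles):
    """True if the string contains at least one atom and only C, H, or O."""
    found = elements_in_smiles(smiles)
    return bool(found) and found <= ALLOWED


if __name__ == "__main__":
    for s in ["CCO", "CC(=O)O", "c1ccccc1O", "CCN", "[Na+].[OH-]"]:
        print(s, is_cho_only(s))
```

A filter of this kind could be used to restrict a candidate molecule list to the C, H, O scope before rendering synthetic images, although the summary does not describe how the authors selected their molecules.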

## Full-text entities

- **Chemicals:** H (MESH:D006859), O (MESH:D010100), C (MESH:D002244)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11272916/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11272916/full.md

## References

12 references — full list in the complete paper: https://tomesphere.com/paper/PMC11272916/full.md

---
Source: https://tomesphere.com/paper/PMC11272916