# RGBChem: Image-Like Representation of Chemical Compounds for Property Prediction

**Authors:** Rafał Stottko, Radosław Michalski, Bartłomiej M. Szyja

PMC · DOI: 10.1021/acs.jctc.5c00291 · Journal of Chemical Theory and Computation · 2025-05-12

## TL;DR

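RGBChem converts chemical compounds into image representations that are used to train a convolutional neural network (CNN) to predict the HOMO–LUMO gap of QM9 compounds. Reordering the atoms in the source .xyz files yields multiple unique images per molecule, which enlarges small training sets and gives a statistically significant improvement in model accuracy.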

## Contribution

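RGBChem encodes a molecule as an image and, by modifying the arbitrary atom order in its .xyz file, generates multiple unique images (data points) from a single molecule, expanding training sets that would otherwise be too small for ML.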

## Key findings

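- A CNN trained on RGBChem images predicts the HOMO–LUMO gap for compounds from the QM9 database.
- Creating multiple unique images from a single molecule by reordering its atoms expands the training set and leads to a statistically significant improvement in model accuracy.
- The approach targets scenarios where the available data set is too small to apply ML methods effectively.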

## Abstract

In this work, we introduce RGBChem, a novel approach for converting chemical compounds into image representations, which are subsequently used to train a convolutional neural network (CNN) to predict the HOMO–LUMO gap for compounds from the QM9 database. By modifying the arbitrary order of atoms present in .xyz files used to generate these images, it has been demonstrated that expanding the initial training set size can be achieved by creating multiple unique images (data points) from a single molecule. This study shows that the presented approach leads to a statistically significant improvement in model accuracy, highlighting RGBChem as a powerful approach for leveraging machine learning (ML) in scenarios where the available data set is too small to apply ML methods effectively.
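
The atom-reordering idea in the abstract can be illustrated with a minimal sketch. This summary does not describe the actual RGBChem pixel encoding, so the `toy_image` function below is a hypothetical stand-in: it assumes each pixel (i, j) holds the two atomic numbers and the interatomic distance. The names `parse_xyz`, `toy_image`, `augmented_images`, and the water example are likewise illustrative assumptions, not the authors' code. The point is only that shuffling the atom order of one molecule changes the resulting image, so one .xyz file can yield several distinct data points.

```python
import random

# Atomic numbers for the elements assumed to appear in the input (H, C, N, O, F).
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "F": 9}

# Water in .xyz format: atom count, comment line, then one atom per line.
WATER_XYZ = """3
water
O  0.0000  0.0000  0.1173
H  0.0000  0.7572 -0.4692
H  0.0000 -0.7572 -0.4692
"""


def parse_xyz(text):
    """Parse a minimal .xyz string into a list of (symbol, x, y, z) tuples."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    atoms = []
    for line in lines[2:2 + n_atoms]:  # skip the count and comment lines
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    return atoms


def toy_image(atoms):
    """Hypothetical encoding (NOT the paper's): pixel (i, j) is the triple
    (Z_i, Z_j, distance between atoms i and j), in the given atom order."""
    rows = []
    for si, xi, yi, zi in atoms:
        row = []
        for sj, xj, yj, zj in atoms:
            dist = ((xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2) ** 0.5
            row.append((ATOMIC_NUMBER[si], ATOMIC_NUMBER[sj], round(dist, 3)))
        rows.append(tuple(row))
    return tuple(rows)


def augmented_images(atoms, n_images, seed=0):
    """Shuffle the atom order repeatedly and keep only images not seen before."""
    rng = random.Random(seed)
    seen = set()
    images = []
    attempts = 0
    # Cap the attempts: small or symmetric molecules have few unique orderings.
    while len(images) < n_images and attempts < 100 * n_images:
        attempts += 1
        order = list(atoms)
        rng.shuffle(order)
        image = toy_image(order)
        if image not in seen:
            seen.add(image)
            images.append(image)
    return images


atoms = parse_xyz(WATER_XYZ)
images = augmented_images(atoms, n_images=3)
print(f"{len(images)} unique images from one molecule")
```

Under this toy encoding, orderings that only swap symmetry-equivalent atoms (the two hydrogens of water here) produce identical images, which is why the sketch deduplicates and caps the number of shuffle attempts. All images derived from one molecule share the same label, so in a real pipeline the train/test split would presumably need to be made per molecule before augmentation to avoid leakage.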

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12120917/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12120917/full.md

## References

56 references — full list in the complete paper: https://tomesphere.com/paper/PMC12120917/full.md

---
Source: https://tomesphere.com/paper/PMC12120917