# Machine-Learning Ice Spectra: From 1 to 256 Features

**Authors:** Shokirbek Shermukhamedov, Jolla Kullgren, Daniel Sethio, Kersti Hermansson

PMC · DOI: 10.1021/acs.jctc.5c01413 · Journal of Chemical Theory and Computation · 2026-02-04

## TL;DR

This paper tests how well machine-learning models, built on descriptors ranging from 1 to 256 features, predict two spectroscopic properties of ice (OH vibrational frequencies and ¹H chemical shifts), and finds that the MACE model is the most accurate.

## Contribution

The study benchmarks machine-learning models and structural descriptors for predicting ice spectra on DFT data for 55 ice polymorphs, and quantifies the trade-off between accuracy on the one hand and simplicity and transparency on the other.

## Key findings

- The MACE model achieved the highest accuracy, with an RMSD of 0.06 ppm for chemical shifts and ~10 cm⁻¹ for vibrational frequencies.
- Simpler descriptors like ACSF and SOAP, when paired with suitable regressors, nearly matched MACE's performance.
- Using a single H-bond distance as the descriptor gave RMSD values about three times (vibrational frequencies) and four times (¹H chemical shift) as large as those of MACE.
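
To make the last finding concrete, here is a minimal sketch of a one-feature model: a straight-line fit of a target property against a single H-bond distance, scored by RMSD. The data are synthetic, and the linear form, the slope, the intercept, the noise level, and the helper names `rmsd` and `fit_line` are all assumptions made for illustration; none of them is taken from the paper.

```python
import numpy as np


def rmsd(predicted, reference):
    """Root-mean-square deviation between two equally long sequences."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))


def fit_line(x, y):
    """Least-squares straight line y ~ slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return float(slope), float(intercept)


# Synthetic stand-in data: NOT taken from the paper.
rng = np.random.default_rng(0)
r_hbond = rng.uniform(1.6, 2.0, size=200)  # hypothetical H-bond distances (angstrom)

# Made-up linear relation plus Gaussian noise, standing in for DFT frequencies (cm^-1).
TRUE_SLOPE, TRUE_INTERCEPT, NOISE = 1500.0, 500.0, 30.0
freq = TRUE_SLOPE * r_hbond + TRUE_INTERCEPT + rng.normal(0.0, NOISE, size=200)

# Fit the one-feature model and score it against the synthetic targets.
slope, intercept = fit_line(r_hbond, freq)
fit_rmsd = rmsd(slope * r_hbond + intercept, freq)
print(f"slope={slope:.0f}, intercept={intercept:.0f}, RMSD={fit_rmsd:.1f}")
```

A model of this kind has two fitted parameters and can be read off a plot, which is the simplicity and transparency the authors weigh against the lower accuracy.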

## Abstract

The study explores how well machine learning and structural fingerprints can predict spectroscopic properties of ice (OH vibrational frequencies and ¹H chemical shifts). A large theoretical data set (55 ice polymorphs, 1010 DFT data points both for the vibrations and for the NMR shifts) and a smaller cross-validation set are employed. The Message Passing Atomic Cluster Expansion (MACE) model performs the best, with high accuracy (root-mean-square deviation, RMSD, of 0.06 ppm for chemical shifts and ∼10 cm⁻¹ for vibrational frequencies). Simpler descriptors like ACSF and SOAP, when paired with suitable regressors, nearly match MACE’s performance. At the other end of the complexity scale, it is found that using the simplest possible physics-based descriptor of the environment (a single H-bond distance) yields RMSD values three times as large for the vibrations and four times as large for the proton chemical shift compared to the MACE model. Depending on the context, those RMSD values may still be considered modest and useful, considering the gain in simplicity and transparency.
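
For reference, the RMSD quoted above is presumably the standard root-mean-square deviation between model predictions and the DFT reference values. The symbols $N$, $y_i^{\mathrm{ML}}$ and $y_i^{\mathrm{DFT}}$ below are notation introduced here, not taken from the paper:

$$
\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i^{\mathrm{ML}} - y_i^{\mathrm{DFT}}\right)^{2}}
$$

Applying the stated ratios to the MACE values gives a back-of-the-envelope estimate for the single-descriptor model. These numbers are implied by the abstract, not quoted from it:

$$
\mathrm{RMSD}_{\nu} \approx 3 \times 10\ \mathrm{cm^{-1}} = 30\ \mathrm{cm^{-1}}, \qquad
\mathrm{RMSD}_{\delta} \approx 4 \times 0.06\ \mathrm{ppm} = 0.24\ \mathrm{ppm}
$$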

## Full-text entities

- **Chemicals:** Ice (MESH:D007053), 1H (-)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12937092/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12937092/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/PMC12937092/full.md

---
Source: https://tomesphere.com/paper/PMC12937092