# Validation of simulated training sets using a convolutional neural network for isotope identification in urban environments

**Authors:** Luke Lee-Brewin, Ryan Holden, Caroline Shenton-Taylor, Cebastien Joel Guembou Shouop

PMC · DOI: 10.1371/journal.pone.0323053 · 2025-06-09

## TL;DR

This paper validates a method for simulating gamma spectra to train a convolutional neural network for isotope identification in urban settings. The lowest per-class prediction accuracy was 96% on simulated data and 89.8% on real spectra from the SIGMA dataset.

## Contribution

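The paper validates a method for generating gamma spectra training sets without radioactive sources: simulated spectra are added to real background radiation taken from the SIGMA dataset, and the resulting classifier is tested on labelled real spectra from the same dataset.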

## Key findings

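- The lowest per-class prediction accuracy of the convolutional neural network was 96% on simulated data and 89.8% on real SIGMA dataset spectra.
- A testing set of 12748 spectra was extracted from the SIGMA dataset by k-means clustering of the 10% of spectra with the highest gross counts. Five clusters contained single isotopes and were kept; two contained multiple isotopes and were discarded.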
- The method is validated as effective for generating training data for isotope identification.
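
The accuracies above are reported as the lowest value over the classes, not as an overall average. A minimal sketch of that metric, assuming integer class labels and using toy data (the function name `per_class_accuracy` and the labels are illustrative, not taken from the paper):

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, n_classes):
    """Fraction of samples of each true class that were predicted correctly."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    acc = np.full(n_classes, np.nan)  # NaN marks classes absent from y_true
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():
            acc[c] = np.mean(y_pred[mask] == c)
    return acc

# Toy labels for three hypothetical isotope classes.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 2, 0]

acc = per_class_accuracy(y_true, y_pred, n_classes=3)
worst = np.nanmin(acc)  # the "lowest prediction accuracy for a given class"
```

Reporting the minimum rather than the mean keeps a poorly identified isotope from being hidden by well-identified ones.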

## Abstract

Real-time isotope identification in urban environments can aid law enforcement by providing additional information about the nature of a potential threat. Neural networks have shown promise in isotope identification but the large range of potential isotopes, activities and shielding in uncontrolled urban environments makes creating a representative training set challenging. In this work, a method of generating gamma spectra datasets without requiring radioactive sources is validated with representative data. Simulated spectra are added to background radiation taken from a large dataset of unlabelled gamma spectra (the SIGMA dataset) collected in London by AWE Nuclear Security Technologies. A testing set of 12748 spectra was extracted from the SIGMA dataset by applying k-means clustering to the 10% of spectra with the highest gross counts. Manual inspection and labelling of a subset of each cluster showed that five clusters contained single isotopes and two contained multiple isotopes which were discarded. A convolutional neural network classifier was trained and tested using these two datasets. The model was able to identify isotopes from real SIGMA dataset spectra. The lowest prediction accuracy for a given class was 96% when presented with simulated data, and 89.8% on SIGMA dataset spectra. The high prediction accuracy validates the method for generating spectra and facilitates future work increasing the range of isotopes present in the training set and developing more complex models.
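
The data-preparation steps in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the backgrounds and source templates are synthetic, the source is added with Poisson sampling, spectra are normalised to unit sum before clustering, and k-means is a plain NumPy Lloyd's loop. All function names, channel counts, and count levels are hypothetical.

```python
import numpy as np

def inject_source(background, template, source_counts, rng):
    """Add Poisson-sampled source counts, shaped by `template`, to a background spectrum."""
    template = np.asarray(template, dtype=float)
    expected = source_counts * template / template.sum()
    return np.asarray(background) + rng.poisson(expected)

def top_fraction_by_gross_counts(spectra, fraction=0.10):
    """Indices of spectra whose gross (summed) counts fall in the top `fraction`."""
    gross = spectra.sum(axis=1)
    threshold = np.quantile(gross, 1.0 - fraction)
    return np.flatnonzero(gross >= threshold)

def assign(x, centroids):
    """Label each row of x with the index of its nearest centroid."""
    dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

def kmeans(x, k, rng, n_iter=50):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(x, centroids)
        # Keep the old centroid if a cluster ends up empty.
        new = np.array([x[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return assign(x, centroids), centroids

rng = np.random.default_rng(0)
channels = np.arange(64)

# Synthetic stand-in for unlabelled background spectra (assumption).
bg_mean = 20.0 * np.exp(-channels / 20.0) + 1.0
spectra = rng.poisson(bg_mean, size=(500, 64))

# Two hypothetical photopeak templates (Gaussian peaks at channels 20 and 45).
template_a = np.exp(-0.5 * ((channels - 20) / 2.0) ** 2)
template_b = np.exp(-0.5 * ((channels - 45) / 2.0) ** 2)

# Inject a source into the first 60 spectra: 30 of each template.
for i in range(60):
    template = template_a if i < 30 else template_b
    spectra[i] = inject_source(spectra[i], template, source_counts=2000, rng=rng)

# Keep the 10% of spectra with the highest gross counts, then cluster by shape.
idx = top_fraction_by_gross_counts(spectra, fraction=0.10)
selected = spectra[idx].astype(float)
shapes = selected / selected.sum(axis=1, keepdims=True)  # unit-sum normalisation (assumption)
labels, centroids = kmeans(shapes, k=2, rng=rng)
```

In the paper the clusters were then inspected and labelled manually; the sketch stops at the cluster assignment, and the number of clusters used here (`k=2`) is chosen only to match the two synthetic templates.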

## Full-text entities

- **Chemicals:** 238U (MESH:D014501), 137Cs (MESH:C000614989), 133Ba (MESH:C000614994), aluminium (MESH:D000535), 226Ra (MESH:C000615152), 18F (MESH:C000615276), 232Th (MESH:C000615164), Tl (MESH:D013793), NaI (MESH:D012974), cesium iodide (MESH:C040050), K (MESH:D011188), 60Co (MESH:C000615395), 99mTc (MESH:D013667), 40K (MESH:C000615339)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

19 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12148181/full.md

---
Source: https://tomesphere.com/paper/PMC12148181