# Training Recurrent Neural Networks for BrdU Detection with Oxford Nanopore Sequencing: Guidance and Lessons Learned

**Authors:** Haibo Liu, William Flavahan, Lihua Julie Zhu

PMC · DOI: 10.3390/genes16111356 · Genes · 2025-11-10

## TL;DR

This paper is a detailed tutorial on training BiGRU-based RNNs to detect BrdU in Oxford Nanopore DNA sequencing data, using training data from legacy R9 flow cells.

## Contribution

A step-by-step guide to preparing training data and implementing deep learning models for BrdU detection. The models are trained on legacy R9 flow-cell data, and the authors expect the protocol to extend to newer R10 flow cells.

## Key findings

- BiGRU-based RNNs achieved high specificity (>94%) for BrdU detection.
- The model's sensitivity was moderate due to limited BrdU-positive training data.
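- The authors expect the protocol to extend readily to newer R10 flow cells and to basecallers for other base modifications, although the models themselves were trained only on legacy R9 data.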
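The specificity and sensitivity figures above are standard confusion-matrix ratios. The sketch below shows how they could be computed from per-thymidine calls; the label encoding (1 for BrdU, 0 for unmodified thymidine), the function names, and the toy data are illustrative assumptions, not values or code from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels.

    Assumed encoding (hypothetical): 1 = BrdU, 0 = unmodified thymidine.
    """
    counts = {"tp": 0, "fn": 0, "tn": 0, "fp": 0}
    for truth, call in zip(y_true, y_pred):
        if truth == 1:
            counts["tp" if call == 1 else "fn"] += 1
        else:
            counts["tn" if call == 0 else "fp"] += 1
    return counts


def sensitivity(counts):
    """Fraction of true BrdU sites that were called BrdU: TP / (TP + FN)."""
    positives = counts["tp"] + counts["fn"]
    return counts["tp"] / positives if positives else float("nan")


def specificity(counts):
    """Fraction of true thymidine sites called thymidine: TN / (TN + FP)."""
    negatives = counts["tn"] + counts["fp"]
    return counts["tn"] / negatives if negatives else float("nan")


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print("sensitivity:", sensitivity(counts))
print("specificity:", specificity(counts))
```

A model can score high on specificity while missing many true positives, which is the pattern the key findings describe. That is why both ratios are reported instead of a single accuracy value.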

## Abstract

**Background/Objectives:** BrdU (5′-bromo-2′-deoxyuridine), a synthetic thymidine (T) analog, is widely used to study cell proliferation and DNA synthesis. To precisely identify where and when DNA replication starts and terminates, it is essential to determine the BrdU incorporation rate and sites at a single-nucleotide resolution. Although several deep learning-based methods have been developed for detecting BrdU using Oxford Nanopore sequencing data, there is a lack of accessible, easy-to-follow tutorials to guide researchers in preparing training data and implementing deep learning approaches as the nanopore sequencing technologies continue to evolve.

**Methods:** Due to the lack of ground truth BrdU-positive data generated on the latest R10 flow cells, we prepared model training data from legacy R9 flow cells, consistent with existing tools. We processed publicly available synthetic and real nanopore DNA sequencing datasets, with and without BrdU incorporation, using a combination of open-source and custom software tools. Subsequently, we trained bidirectional gated recurrent unit (BiGRU)-based recurrent neural networks (RNNs) for BrdU detection using the TensorFlow library on the Google Colab platform.

**Results:** We trained BiGRU-based RNNs for BrdU detection with a high specificity (>94%) but a moderate sensitivity due to limited BrdU-positive data. We detail the setup, training, testing, and fine-tuning of the model using both synthetic and real DNA sequencing data.

**Conclusions:** Though the models were trained with data generated on legacy flow cells, we believe that this detailed protocol, covering both data preparation and model development, can be readily extended to R10 flow cells and basecallers for other base modifications. This work will facilitate the broader adoption of deep learning neural networks in biological research, particularly RNNs, which are well suited for modeling sequential and time-series data.
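To make the "bidirectional gated recurrent unit" idea concrete, here is a minimal forward-pass sketch in plain Python. It is not the authors' TensorFlow model. The weights are random and untrained, the dimensions and the toy input are hypothetical, and the update rule uses one common GRU convention (references differ on which gate value keeps the old state). It only shows that each position's output combines a left-to-right pass and a right-to-left pass over the sequence.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def init_gru(input_dim, hidden_dim, rng):
    """Random, untrained parameters for one GRU direction (illustrative)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                for _ in range(rows)]
    params = {}
    for gate in ("z", "r", "h"):
        params["W" + gate] = mat(hidden_dim, input_dim)   # input weights
        params["U" + gate] = mat(hidden_dim, hidden_dim)  # recurrent weights
        params["b" + gate] = [0.0] * hidden_dim           # biases
    return params


def gate(p, name, x, h, activation):
    """Compute activation(W x + U h + b) for the named gate."""
    wx = matvec(p["W" + name], x)
    uh = matvec(p["U" + name], h)
    return [activation(a + b + c) for a, b, c in zip(wx, uh, p["b" + name])]


def gru_step(p, x, h):
    """One GRU update: blend the old state with a candidate state."""
    z = gate(p, "z", x, h, sigmoid)                  # update gate
    r = gate(p, "r", x, h, sigmoid)                  # reset gate
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    cand = gate(p, "h", x, reset_h, math.tanh)       # candidate state
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def run_gru(p, sequence, hidden_dim):
    """Run a GRU over a sequence; return the hidden state at every step."""
    h = [0.0] * hidden_dim
    states = []
    for x in sequence:
        h = gru_step(p, x, h)
        states.append(h)
    return states


def bigru(forward_p, backward_p, sequence, hidden_dim):
    """Concatenate forward and backward hidden states at each position."""
    fwd = run_gru(forward_p, sequence, hidden_dim)
    bwd = run_gru(backward_p, sequence[::-1], hidden_dim)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


rng = random.Random(0)
INPUT_DIM, HIDDEN_DIM, SEQ_LEN = 4, 3, 6  # hypothetical sizes
fwd_params = init_gru(INPUT_DIM, HIDDEN_DIM, rng)
bwd_params = init_gru(INPUT_DIM, HIDDEN_DIM, rng)
# Toy stand-in for per-position signal features; not real nanopore data.
toy_sequence = [[rng.random() for _ in range(INPUT_DIM)]
                for _ in range(SEQ_LEN)]
outputs = bigru(fwd_params, bwd_params, toy_sequence, HIDDEN_DIM)
print(len(outputs), "positions,", len(outputs[0]), "features each")
```

Because the backward pass starts at the end of the read, the output at the first position already reflects signal from the last position. This is the property that makes bidirectional models attractive for per-base calls that depend on context on both sides. In TensorFlow, the corresponding building block would typically be `tf.keras.layers.Bidirectional` wrapping `tf.keras.layers.GRU`; the exact layer configuration the authors used is not given in this summary.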

## Linked entities

- **Chemicals:** BrdU (PubChem CID 6035), 5′-bromo-2′-deoxyuridine (PubChem CID 6035), thymidine (PubChem CID 5789)

## Full-text entities

- **Chemicals:** T (MESH:D014316), thymidine (MESH:D013936), 5'-bromo-2'-deoxyuridine (MESH:D001973)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12652529/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12652529/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/PMC12652529/full.md

---
Source: https://tomesphere.com/paper/PMC12652529