Simulating realistic short tandem repeat capillary electrophoretic signal using a generative adversarial network
Duncan Taylor, Melissa Humphries

TL;DR
This paper introduces a GAN-based method to generate realistic electrophoretic DNA signal data, facilitating improved training of neural networks for DNA profile analysis by reducing the need for extensive labeled datasets.
Contribution
The study develops a modified pix2pix GAN to simulate realistic DNA electrophoretic signals, enabling efficient training of classifiers without large labeled datasets.
Findings
The GAN successfully simulates DNA electrophoretic signals with realistic noise and artefacts.
The generated data can be used to train neural networks for DNA profile analysis.
The approach reduces the need for extensive labeled training data.
Abstract
DNA profiles are made up from multiple series of electrophoretic signal measuring fluorescence over time. Typically, human DNA analysts 'read' DNA profiles using their experience to distinguish instrument noise, artefactual signal, and signal corresponding to DNA fragments of interest. Recent work has developed an artificial neural network, ANN, to carry out the task of classifying fluorescence types into categories in DNA profile electrophoretic signal. But the creation of the necessarily large amount of labelled training data for the ANN is time consuming and expensive, and a limiting factor in the ability to robustly train the ANN. If realistic, prelabelled, training data could be simulated then this would remove the barrier to training an ANN with high efficacy. Here we develop a generative adversarial network, GAN, modified from the pix2pix GAN to achieve this task. With 1078 DNA…
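For orientation, the objective of the original pix2pix GAN, which the abstract names as the starting point, can be written as below. This is the standard published pix2pix formulation, not necessarily the modified loss used in this paper. The mapping of symbols is an assumption: $x$ is taken to be the conditioning input, $y$ the corresponding real electrophoretic signal, $z$ the noise source, $G$ the generator, $D$ the discriminator, and $\lambda$ a weighting hyperparameter whose value is not given here.

```latex
\begin{align}
\mathcal{L}_{\mathrm{cGAN}}(G, D) &= \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big] \\
\mathcal{L}_{L1}(G) &= \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_{1}\big] \\
G^{*} &= \arg\min_{G}\,\max_{D}\; \mathcal{L}_{\mathrm{cGAN}}(G, D) + \lambda\,\mathcal{L}_{L1}(G)
\end{align}
```

The adversarial term pushes generated signal to be indistinguishable from real signal given the same conditioning input, while the L1 term keeps the output close to the paired target.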
Taxonomy
Topics: Fractal and DNA sequence analysis
Methods: Concatenated Skip Connection · Convolution · Dropout · Batch Normalization · Sigmoid Activation · PatchGAN · Pix2Pix
