# Development of Publicly Available Forensic DNA Sequence Mixture Data

**Authors:** Erica L. Romsos, Kevin M. Kiesler, Carolyn R. Steffen, Lisa A. Borsuk, Sarah Riman, Lauren E. Mullen, Jodi A. Irwin, Peter M. Vallone, Katherine B. Gettings

PMC · DOI: 10.3390/genes16030333 · Genes · 2025-03-12

## TL;DR

Researchers created 74 forensic DNA mixture samples from 11 single-source samples and publicly released the resulting sequence data to support the development of probabilistic genotyping software.

## Contribution

A publicly available dataset of sequenced DNA mixtures of varying complexity was developed to support the development of sequence-based probabilistic genotyping software.

## Key findings

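- A 96-well plate design includes three-person mixtures with 1% to 5% minor components, prepared in triplicate at varying levels of input DNA, for sensitivity and reproducibility testing.
- The plate also holds three-person mixtures containing degraded DNA (either the major contributor only or all three contributors), four- and five-person mixtures with varying ratios and donors, and a single-source dilution series.
- FASTQ data files and metadata from three commercially available kits targeting forensic STR and SNP markers are publicly available at doi.org/10.18434/M32157.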
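Because the dataset is distributed as FASTQ files, a reader may want a quick way to inspect them. The following is a minimal sketch that assumes standard four-line FASTQ records with Phred+33 quality encoding; the function names (`read_fastq`, `mean_phred`) and the two demo reads are illustrative and are not taken from the paper or the dataset.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 assumed by default)."""
    if not quality:
        raise ValueError("empty quality string")
    return sum(ord(char) - offset for char in quality) / len(quality)


# Made-up reads for demonstration only; not data from the published set.
demo = io.StringIO(
    "@read1\nTCTATCTATCTA\n+\nIIIIIIIIIIII\n"
    "@read2\nACGT\n+\n!!!!\n"
)
records = list(read_fastq(demo))
for record in records:
    print(record.name, len(record.sequence), mean_phred(record.quality))
```

For real files, the same generator can be fed a handle from `open(path)` or, for compressed files, `gzip.open(path, "rt")`.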

## Abstract

Background: In 2018, the Next-Generation Sequencing Committee of SWGDAM queried bioinformatic and statistical interpretation method developers regarding data needs for the development of sequence-based probabilistic genotyping software. Methods: Based on this engagement, a set of 74 mixture samples was conceived and created using 11 single-source samples. The allelic overlap among these samples was evaluated and sample combinations of varying complexity were selected, aiming to represent the variability observed in forensic casework. Results: The samples were distributed into a 96-well plate design containing several features: (1) three-person mixtures of 1% to 5% minor components in triplicate with varying levels of input DNA to provide information on sensitivity and reproducibility, (2) three-person mixtures containing degraded DNA of either only the major contributor or all three contributors, (3) four- and five-person mixtures with varying ratios and donors, (4) a single-source dilution series. Conclusions: Mixture samples were prepared and have been sequenced thus far with three commercially available kits targeting forensic short tandem repeat (STR) and single nucleotide polymorphism (SNP) markers, with FASTQ data files and metadata publicly available at doi.org/10.18434/M32157.
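The abstract notes that allelic overlap among the single-source samples was evaluated before mixtures were selected. The sketch below shows one simple way such an overlap could be tallied; it is not the authors' procedure. The genotypes, locus names, and function names are hypothetical, and the contributor estimate uses the common maximum allele count heuristic (assuming at most two alleles per contributor per locus).

```python
import math
from collections import Counter
from typing import Dict, List, Set

# A genotype maps a locus name to the set of alleles a donor carries there.
Genotype = Dict[str, Set[str]]


def overlap_summary(genotypes: List[Genotype]) -> Dict[str, Dict[str, int]]:
    """Per locus, count distinct alleles and alleles carried by more than one donor."""
    loci = sorted({locus for genotype in genotypes for locus in genotype})
    summary = {}
    for locus in loci:
        counts = Counter(
            allele
            for genotype in genotypes
            for allele in genotype.get(locus, set())
        )
        summary[locus] = {
            "distinct": len(counts),
            "shared": sum(1 for n in counts.values() if n > 1),
        }
    return summary


def min_contributors(summary: Dict[str, Dict[str, int]]) -> int:
    """Maximum allele count heuristic: ceil(max distinct alleles / 2)."""
    most_alleles = max(entry["distinct"] for entry in summary.values())
    return math.ceil(most_alleles / 2)


# Hypothetical three-donor example; alleles and loci are invented.
donor_a: Genotype = {"locus_1": {"14", "15"}, "locus_2": {"9"}}
donor_b: Genotype = {"locus_1": {"15", "16"}, "locus_2": {"9", "10"}}
donor_c: Genotype = {"locus_1": {"14", "16"}, "locus_2": {"11", "12"}}

summary = overlap_summary([donor_a, donor_b, donor_c])
estimate = min_contributors(summary)
print(summary)
print("minimum contributors by allele count:", estimate)
```

In this invented example, three donors yield at most four distinct alleles at any locus, so allele counting alone suggests only two contributors. High allelic overlap of this kind is one way a mixture becomes harder to interpret.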

## Full-text entities

- **Diseases:** injury to (MESH:D014947)
- **Chemicals:** PFA (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Mutations:** A through F

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11941798/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11941798/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/PMC11941798/full.md

---
Source: https://tomesphere.com/paper/PMC11941798