# TEvarSim: A genome simulator for transposable element (TE) variants

**Authors:** Jian Miao, Dawei Li

PMC12875575 · DOI: 10.1371/journal.pcbi.1013933 · PLOS Computational Biology · 2026-01-30

## TL;DR

TEvarSim is a new genome simulator that creates transposable element (TE) variants in synthetic genomes and sequencing data, enabling better benchmarking of TE detection methods.

## Contribution

TEvarSim is the first all-in-one toolkit to simulate TE insertions and deletions, compare simulated and predicted TE variants, and model natural variation at the haplotype, individual, and population levels.

## Key findings

- TEvarSim can rapidly simulate thousands of synthetic genomes with TE variants.
- It supports both random and real-world TE insertions and deletions, including variants derived from pangenome graphs.
- The tool streamlines benchmarking of TE detection and genotyping methods.
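
To make the idea of simulating TE insertions and deletions with a known ground truth concrete, here is a minimal Python sketch. It is not TEvarSim's code or interface: the function names `insert_tes` and `delete_te`, the record fields, and the toy sequences are all hypothetical and chosen only for illustration.

```python
import random


def insert_tes(reference, te_sequence, n_insertions, seed=0):
    """Insert copies of te_sequence at random positions in reference.

    Returns the modified sequence and a truth list with one record per
    insertion. Positions are 0-based coordinates on the ORIGINAL reference.
    (Hypothetical helper, not part of TEvarSim.)
    """
    rng = random.Random(seed)  # fixed seed keeps the simulation reproducible
    # Sample distinct breakpoints strictly inside the sequence.
    positions = sorted(rng.sample(range(1, len(reference)), n_insertions))
    pieces, truth, prev = [], [], 0
    for pos in positions:
        pieces.append(reference[prev:pos])  # unchanged reference segment
        pieces.append(te_sequence)          # inserted TE copy
        truth.append({"pos": pos, "svtype": "INS", "svlen": len(te_sequence)})
        prev = pos
    pieces.append(reference[prev:])
    return "".join(pieces), truth


def delete_te(reference, start, end):
    """Remove reference[start:end] (0-based, half-open) and record the deletion.

    (Hypothetical helper, not part of TEvarSim.)
    """
    if not 0 <= start < end <= len(reference):
        raise ValueError("deletion interval must lie inside the reference")
    record = {"pos": start, "svtype": "DEL", "svlen": end - start}
    return reference[:start] + reference[end:], record


# Toy inputs: a 100 bp reference and a 10 bp placeholder, not a real TE sequence.
reference = "ACGT" * 25
te = "GGCCGGGCGC"
genome, truth = insert_tes(reference, te, n_insertions=3, seed=1)
shorter, deletion = delete_te(reference, 20, 30)
```

Because every simulated event is written to a truth record as it is created, the same records can later serve as the ground truth when scoring a detection tool.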

## Abstract

Transposable element (TE) variants, the presence or absence of TE sequences such as LINE-1, Alu, SVA, and endogenous retroviruses, are a major source of genomic diversity and play critical roles in human health, evolution, and disease. As interest in TE variants grows, developing related methods and tools for detection has become increasingly important. However, rigorous benchmarking of TE variant detection methods remains limited due to the lack of accurate and scalable TE variant simulation platforms and the absence of reliable ground truth data. Here, we developed TEvarSim, a novel TE variant simulator that generates TE-containing genomic data in multiple formats, including genomes, short- and long-read sequencing data, and VCF files. TEvarSim supports both random and real-world TE insertions and deletions, including variants derived from pangenome graphs. It can rapidly simulate hundreds to thousands of synthetic chromosomes or genomes and model natural variation at the haplotype, individual, and population levels, making it well suited for large-scale studies. In addition, TEvarSim can directly compare simulated VCF files with TEs reported by TE detection tools, streamlining the benchmarking of TE genotyping methods. TEvarSim provides an all-in-one toolkit for simulating, evaluating, and improving TE variant detection, advancing our ability to accurately study TEs in health and disease in various species.

TEvarSim is the first all-in-one toolkit that generates transposable element (TE) variant-carrying genomes, sequencing reads, and VCF files, and directly compares simulated and predicted TE variants, providing a comprehensive solution for benchmarking TE detection and genotyping. It is the first simulator to incorporate real TE sequences, extract TEs from pangenome graphs, and introduce natural sequence variation across genomes, supporting biologically relevant TE research. TEvarSim is also the first to directly and automatically simulate both TE insertions and deletions, providing a better understanding of the full spectrum of TE polymorphisms. TEvarSim can rapidly simulate thousands of individual synthetic genomes containing TE variants.
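
The comparison of simulated and predicted TE variants described above can be sketched as a simple matching step followed by precision, recall, and F1. This is an illustrative sketch only, not TEvarSim's comparison logic: the function name `benchmark_calls`, the record fields, and the 50 bp position tolerance are assumptions made for the example.

```python
def benchmark_calls(truth, predicted, tolerance=50):
    """Match predicted TE calls to simulated truth records.

    A prediction counts as a true positive when an unmatched truth record
    has the same chromosome and variant type and lies within `tolerance`
    base pairs. Each truth record can be matched at most once.
    (Hypothetical helper; the tolerance value is an assumption.)
    """
    unmatched = list(truth)
    true_pos = 0
    for call in predicted:
        for record in unmatched:
            same_site = (
                record["chrom"] == call["chrom"]
                and record["svtype"] == call["svtype"]
                and abs(record["pos"] - call["pos"]) <= tolerance
            )
            if same_site:
                unmatched.remove(record)  # safe: we stop iterating right away
                true_pos += 1
                break
    false_pos = len(predicted) - true_pos
    false_neg = len(unmatched)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "tp": true_pos,
        "fp": false_pos,
        "fn": false_neg,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy records standing in for parsed VCF entries (values are made up).
simulated = [
    {"chrom": "chr21", "pos": 1000, "svtype": "INS"},
    {"chrom": "chr21", "pos": 5000, "svtype": "DEL"},
]
detected = [
    {"chrom": "chr21", "pos": 1012, "svtype": "INS"},
    {"chrom": "chr21", "pos": 7000, "svtype": "INS"},
]
summary = benchmark_calls(simulated, detected)
```

Greedy first-match is the simplest choice here; a real benchmark might also compare TE family, insertion length, or genotype before counting a call as correct.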

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Genomic regions:** chr21:8,100,087-8,100,744

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12875575/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12875575/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/PMC12875575/full.md

---
Source: https://tomesphere.com/paper/PMC12875575