# MiPRIME: an integrated and intelligent platform for mining primer and probe sequences of microbial species

**Authors:** Zhiming Zhang, Jing Ren, Lili Ren, Lanying Zhang, Qubo Ai, Haixin Long, Yi Ren, Kun Yang, Huiying Feng, Sabrina Li, Xu Li

PMC · DOI: 10.1093/bioinformatics/btae429 · Bioinformatics · 2024-07-02

## TL;DR

MiPRIME is a tool that automatically extracts primers and probes for detecting microorganisms from the scientific literature and recommends the most suitable ones, saving time and effort.

## Contribution

MiPRIME introduces a comprehensive, intelligent platform for mining microbial primer and probe sequences using a high-accuracy text mining model.

## Key findings

- MiPRIME integrates over 40 million articles and 548,942 organisms for microbial gene discovery.
- The platform uses a BioBERT-based model with 98.02% accuracy for efficient primer mining.
- A PRscore system enables intelligent, species-specific primer recommendations.

## Abstract

Accurately detecting pathogenic microorganisms requires effective primer and probe designs. Literature-derived primers are a valuable resource because they have been tested and proven effective in previous research. However, manually mining primers from published texts is time-consuming and limited in species scope.

To address these challenges, we have developed MiPRIME, a real-time Microbial Primer Mining platform for extracting primer/probe sequences of pathogenic microorganisms, with three highlights: (i) Comprehensive integration: covering >40 million articles and 548,942 organisms, the platform enables high-frequency microbial gene discovery from a global perspective, facilitating user-defined primer design and advancing microbial research. (ii) Efficient mining: a BioBERT-based text mining model with 98.02% accuracy greatly reduces information processing time. (iii) Intelligent recommendation: a primer ranking score, PRscore, is used to recommend species-specific primers. Overall, MiPRIME is a practical tool for primer mining in the pan-microbial field, saving the time and cost of trial-and-error experiments.

The web server is available at https://www.ai-bt.com.
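To make the idea of mining primers from text concrete, the following is a minimal sketch of a naive regular-expression baseline. It is not MiPRIME's BioBERT-based model and not its PRscore; it only illustrates the kind of candidate sequence such a pipeline has to locate in a sentence. The names `PRIMER_RE`, `find_primer_candidates`, and `gc_content`, the 18 to 30 nucleotide length window, and the example sentence are all assumptions made for illustration.

```python
import re

# Assumption: a "primer-like" token is a run of 18-30 IUPAC nucleotide codes
# (A, C, G, T plus degenerate bases) that is not part of a longer word.
# This is a hypothetical heuristic, not MiPRIME's definition.
PRIMER_RE = re.compile(r"(?<![A-Za-z])[ACGTRYKMSWBDHVN]{18,30}(?![A-Za-z])")


def find_primer_candidates(text):
    """Return primer-like sequences found in text, in order of appearance."""
    return PRIMER_RE.findall(text)


def gc_content(seq):
    """Return the fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    # Made-up sentence with made-up sequences, for demonstration only.
    sentence = (
        "The forward primer 5'-ATGCGTACGTTAGCCTAGGA-3' and the reverse "
        "primer 5'-TTGACCGTAGGCTAACGTCA-3' were used for PCR."
    )
    for candidate in find_primer_candidates(sentence):
        print(candidate, round(gc_content(candidate), 2))
```

A pattern like this cannot tell which species or gene a sequence targets, or whether it is a primer or a probe, which is why a context-aware model is needed for real literature mining.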

## Full-text entities

- **Genes:** scpA/B [NCBI Gene 46807553]
- **Diseases:** Covid-19 (MESH:D000086382), influenza A (MESH:D007251), antibiotic (MESH:D004761), respiratory pathogen (MESH:D012131)
- **Chemicals:** agarose (MESH:D012685)
- **Species:** Homo sapiens (human, species) [taxon 9606], Human coronavirus 229E (no rank) [taxon 11137], Candida albicans (species) [taxon 5476], Chlamydia trachomatis (species) [taxon 813], Chlamydia pneumoniae (species) [taxon 83558], Mus musculus (house mouse, species) [taxon 10090], Severe acute respiratory syndrome coronavirus 2 (no rank) [taxon 2697049], Streptococcus pyogenes (species) [taxon 1314], Candida tropicalis (species) [taxon 5482], Streptococcus sp. 'group A' (species) [taxon 36470]
- **Cell lines:** S2 — Drosophila melanogaster (Fruit fly), Spontaneously immortalized cell line (CVCL_Z232)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11246166/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11246166/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/PMC11246166/full.md

---
Source: https://tomesphere.com/paper/PMC11246166