# Fingerprint Finder: Identifying Genomic Fingerprint Sites in Cotton Cohorts for Genetic Analysis and Breeding Advancement

**Authors:** Shang Liu, Hailiang Cheng, Youping Zhang, Man He, Dongyun Zuo, Qiaolian Wang, Limin Lv, Zhongxv Lin, Guoli Song

PMC · DOI: 10.3390/genes15030378 · 2024-03-19

## TL;DR

FPFinder is a new software tool that identifies the genomic sites characteristic of a cotton subset cohort, supporting genetic analysis and breeding; the authors apply it to pedigree fingerprints, fiber length, and environmental adaptation.

## Contribution

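FPFinder is a novel tool that uses the term frequency–inverse document frequency (TF-IDF) algorithm to detect the fingerprint genomic sites of a subset cohort in cotton, revealing the roles of these sites in development and environmental adaptation.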

## Key findings

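- Using short-read sequencing of an elite cotton pedigree, FPFinder identified 453 pedigree fingerprint genomic sites, which had a role in cotton development.
- FPFinder identified 12,536 region-specific genomic sites; combined with transcriptome data, several of them were found to contribute to environmental adaptation.
- In a modern cohort of 410 accessions, elite fiber-length-related sites were enriched in cultivars from the Yangtze River region, which accounts for the longer fibers of those accessions.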

## Abstract

Genomic data in Gossypium provide numerous data resources for the cotton genomics community. However, to fill the gap between genomic analysis and breeding field work, detecting the featured genomic items of a subset cohort is essential for geneticists. We developed FPFinder v1.0 software to identify a subset of the cohort’s fingerprint genomic sites. The FPFinder was developed based on the term frequency–inverse document frequency algorithm. With the short-read sequencing of an elite cotton pedigree, we identified 453 pedigree fingerprint genomic sites and found that these pedigree-featured sites had a role in cotton development. In addition, we applied FPFinder to evaluate the geographical bias of fiber-length-related genomic sites from a modern cotton cohort consisting of 410 accessions. Enriching elite sites in cultivars from the Yangtze River region resulted in the longer fiber length of Yangtze River-sourced accessions. Apart from characterizing functional sites, we also identified 12,536 region-specific genomic sites. Combining the transcriptome data of multiple tissues and samples under various abiotic stresses, we found that several region-specific sites contributed to environmental adaptation. In this research, FPFinder revealed the role of the cotton pedigree fingerprint and region-specific sites in cotton development and environmental adaptation, respectively. The FPFinder can be applied broadly in other crops and contribute to genetic breeding in the future.
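
The abstract states only that FPFinder is based on the term frequency–inverse document frequency algorithm; it does not give the exact formulation. As a rough illustration of how TF-IDF can be transferred to genomic sites, the sketch below treats each cohort as a "document" and each variant site as a "term". The function name `tfidf_site_scores`, the input layout, and the site labels are hypothetical and are not taken from FPFinder itself.

```python
import math
from collections import Counter


def tfidf_site_scores(cohorts):
    """Score variant sites per cohort with a generic TF-IDF analogue.

    cohorts: dict mapping a cohort name to a list of samples, where each
    sample is a set of site IDs at which it carries the variant allele.
    This layout is an assumption made for illustration, not FPFinder's format.
    """
    n_cohorts = len(cohorts)

    # "Term frequency": fraction of samples in a cohort carrying each site.
    tf = {}
    for name, samples in cohorts.items():
        counts = Counter(site for sample in samples for site in sample)
        tf[name] = {site: c / len(samples) for site, c in counts.items()}

    # "Document frequency": number of cohorts in which a site occurs at all.
    df = Counter(site for freqs in tf.values() for site in freqs)

    # Sites common within a cohort but rare across cohorts score highest;
    # a site present in every cohort scores zero.
    return {
        name: {site: f * math.log(n_cohorts / df[site]) for site, f in freqs.items()}
        for name, freqs in tf.items()
    }


# Toy data with made-up site labels.
cohorts = {
    "pedigree": [{"A01:100", "A01:200"}, {"A01:100"}],
    "other_1": [{"A01:200"}],
    "other_2": [{"A01:300"}],
}
scores = tfidf_site_scores(cohorts)
ranked = sorted(scores["pedigree"].items(), key=lambda kv: kv[1], reverse=True)
print(ranked)
```

In this toy input, the site carried by every pedigree sample and by no other cohort ranks first, which is the intuition behind calling such a site a cohort fingerprint. A real tool would also need to handle genotype calling, missing data, and score thresholds, none of which are described in this summary.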

## Linked entities

- **Species:** Gossypium (taxon 3633), Mus musculus (taxon 10090)

## Full-text entities

- **Genes:** ubiquitin [NCBI Gene 107951714], MYB [NCBI Gene 107909337], MYB transcription factor [NCBI Gene 107961901]
- **Diseases:** injury to people or property (MESH:C000719191)
- **Chemicals:** S (MESH:D013455), salt (MESH:D012492), N (MESH:D009584)
- **Species:** Oryza sativa (Asian cultivated rice, species) [taxon 4530]
- **Cell lines:** S2 — Drosophila melanogaster (Fruit fly), Spontaneously immortalized cell line (CVCL_Z232)

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC10970022/full.md

---
Source: https://tomesphere.com/paper/PMC10970022