# Toward a Species Search Engine: KISSE Offers a Rigorous Statistical Framework for Bone Collagen Tandem Mass Spectrometry Data

**Authors:** Hassan Gharibi, Amir Ata Saei, Alexey L. Chernobrovkin, Susanna L. Lundstrom, Hezheng Lyu, Zhaowei Meng, Akos Vegvari, Massimiliano Gaetani, Roman A. Zubarev

PMC · DOI: 10.1002/advs.202503963 · Advanced Science · 2025-08-11

## TL;DR

KISSE is a new statistical method for identifying species from ancient or degraded samples using collagen peptides and a curated database.

## Contribution

KISSE introduces a rigorous statistical framework for species identification from bone collagen tandem mass spectrometry data, adapting established protein identification approaches from shotgun proteomics.

## Key findings

- KISSE uses a species-specific library of collagen peptides and their abundances for species identification.
- The method provides a probability of correct identification and can detect when a species is not in the library.
- The authors discuss the possibility of extending the approach to tissues beyond bone collagen.
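
The ideas in the list above (a per-species library of peptides with relative abundances, a ranked identification, and a check for samples whose species is missing from the library) can be illustrated with a minimal sketch. This is not the KISSE statistical model, which this summary does not describe. It swaps in a plain cosine similarity between abundance profiles and a fixed score cutoff. The names `LIBRARY`, `cosine_score`, `rank_species`, and `min_score`, the peptide ids, and all abundance values are hypothetical placeholders.

```python
from math import sqrt

# Hypothetical toy library: species -> {peptide id: relative abundance}.
# Peptide ids and abundance values are invented placeholders, not real data.
LIBRARY = {
    "Ursus arctos": {"pep_1": 0.5, "pep_2": 0.3, "pep_3": 0.2},
    "Ursus maritimus": {"pep_1": 0.4, "pep_2": 0.1, "pep_4": 0.5},
    "Phoca vitulina": {"pep_5": 0.6, "pep_6": 0.4},
}


def cosine_score(observed, reference):
    """Cosine similarity between two sparse peptide abundance profiles."""
    shared = set(observed) & set(reference)
    dot = sum(observed[p] * reference[p] for p in shared)
    norm_obs = sqrt(sum(v * v for v in observed.values()))
    norm_ref = sqrt(sum(v * v for v in reference.values()))
    if norm_obs == 0 or norm_ref == 0:
        return 0.0
    return dot / (norm_obs * norm_ref)


def rank_species(observed, library, min_score=0.6):
    """Return (best species or None, its share of the total score, all scores).

    The share is a crude stand-in for a confidence value, not a calibrated
    probability. A best score below min_score (an assumed cutoff) is treated
    as "species probably not in the library" and reported as None.
    """
    scores = {sp: cosine_score(observed, ref) for sp, ref in library.items()}
    best = max(scores, key=scores.get)
    total = sum(scores.values())
    share = scores[best] / total if total > 0 else 0.0
    if scores[best] < min_score:
        return None, share, scores
    return best, share, scores


# A sample resembling one library entry, and one with no library peptides.
sample = {"pep_1": 0.45, "pep_2": 0.35, "pep_3": 0.20}
unknown = {"pep_9": 1.0}
print(rank_species(sample, LIBRARY)[0])
print(rank_species(unknown, LIBRARY)[0])
```

Using both peptide identity and relative abundance means that two species sharing some peptides, as the two toy bear entries do, can still be separated by how much of each peptide is present. A real framework would replace the fixed cutoff and score share with a proper statistical estimate.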

## Abstract

DNA and bone collagen are two key sources of resilient molecular markers used to identify species from their remains. Collagen is more stable than DNA, and thus it is preferred for ancient and degraded samples. Current mass spectrometry‐based collagen sequencing approaches are empirical and lack a rigorous statistical framework. Based on the well‐developed approaches to protein identification in shotgun proteomics, a first approximation of the species search engine (SSE) is introduced. The SSE, named KISSE, is based on a species‐specific library of collagenous peptides that uses both peptide sequences and their relative abundances. The developed statistical model can identify the species and the probability of correct identification, as well as determine the likelihood of the analyzed species not being in the library. The advantages and limitations of the proposed approach, and the possibility of extending it to other tissues, are discussed.

KISSE, a species search engine (SSE), is a statistical approach for identifying species from collagen peptides. It relies on a curated, species-specific library of peptide sequences and their relative abundances, and it builds on protein identification approaches from shotgun proteomics.

## Full-text entities

- **Diseases:** ID (MESH:C537985), HCD (MESH:D004213), ETD (MESH:D054069)
- **Chemicals:** HCl (MESH:D006851), FA (MESH:C030544), Bicinchoninic acid (MESH:C047117), amino acid (MESH:D000596), calcium (MESH:D002118), ammonium bicarbonate (MESH:C027043), Peptides (MESH:D010455), glutamine (MESH:D005973), methionine (MESH:D008715), BCA (-), acetonitrile (MESH:C032159), lysine (MESH:D008239), C- (MESH:D002244), proline (MESH:D011392), asparagine (MESH:D001216), water (MESH:D014867)
- **Species:** Balaena mysticetus (bowhead, species) [taxon 27602], Halichoerus grypus (gray seal, species) [taxon 9711], Phocidae (crawling seals, family) [taxon 9709], Mirounga leonina (Southern elephant seal, species) [taxon 9715], Ursidae (bears, family) [taxon 9632], Gallus gallus (bantam, species) [taxon 9031], Phoca vitulina (harbor seal, species) [taxon 9720], Cetacea (cetaceans, infraorder) [taxon 9721], Ziphius cavirostris (Cuvier's beaked whale, species) [taxon 9760], Eschrichtius robustus (California gray whale, species) [taxon 9764], Ursus arctos (brown bear, species) [taxon 9644], Saccharomyces cerevisiae (baker's yeast, species) [taxon 4932], Physeter macrocephalus (sperm whale, species) [taxon 9755], Ursus maritimus (polar bear, species) [taxon 29073], Rangifer tarandus (caribou, species) [taxon 9870], Hydrodamalis gigas (Steller's sea cow, species) [taxon 63631], Mammuthus primigenius (mammoth, species) [taxon 37349], Cygnus cygnus (common whooper, species) [taxon 219595], Falco peregrinus (peregrine, species) [taxon 8954], Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12561455/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12561455/full.md

## References

54 references — full list in the complete paper: https://tomesphere.com/paper/PMC12561455/full.md

---
Source: https://tomesphere.com/paper/PMC12561455