Information-theoretic signatures of biodiversity in the barcoding gene
Valmir C. Barbosa

TL;DR
This paper applies a generalized form of total correlation to publicly available COI gene data, yielding descriptors of the phyla represented whose first principal component correlates strongly with the natural logarithm of the number of known living species.
Contribution
It computes a generalized form of total correlation on DNA barcoding (COI) data to obtain information-theoretic descriptors of phyla, and links those descriptors to species richness through principal component analysis.
Findings
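Generalized total-correlation descriptors computed from COI data are distinctive for each phylum represented in the data.
The first principal component of the standardized descriptors correlates strongly with the natural logarithm of the number of known living species.
The descriptors act as information-theoretic signatures of the evolutionary processes that gave rise to current biodiversity.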
Abstract
The COI mitochondrial gene is present in all animal phyla and in a few others, and is the leading candidate for species identification through DNA barcoding. Calculating a generalized form of total correlation on publicly available data on the gene yields distinctive information-theoretic descriptors of the phyla represented in the data. Moreover, performing principal component analysis on standardized versions of these descriptors reveals a strong correlation between the first principal component and the natural logarithm of the number of known living species. The descriptors thus constitute clear information-theoretic signatures of the processes whereby evolution has given rise to current biodiversity.
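The pipeline described in the abstract can be illustrated with a minimal sketch. It makes several assumptions that do not come from the paper: it uses the standard total correlation (sum of marginal entropies minus joint entropy) rather than the paper's generalized form, it treats columns of aligned sequences as the random variables, and the function names (`entropy`, `total_correlation`, `first_pc_scores`) and toy sequences are hypothetical.

```python
from collections import Counter
from math import log

import numpy as np


def entropy(symbols):
    """Shannon entropy (in nats) of the empirical distribution of hashable items."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum((c / n) * log(c / n) for c in counts.values())


def total_correlation(seqs, sites):
    """Standard total correlation over the chosen alignment columns.

    Assumption: this is the textbook definition, not the paper's
    generalized form. It equals the sum of per-site entropies minus
    the joint entropy of the selected sites.
    """
    marginals = sum(entropy([s[i] for s in seqs]) for i in sites)
    joint = entropy([tuple(s[i] for i in sites) for s in seqs])
    return marginals - joint


def first_pc_scores(descriptors):
    """Standardize each descriptor column, then project onto the first PC.

    `descriptors` is a (groups x descriptors) array; every column must
    have nonzero variance, otherwise the standardization divides by zero.
    """
    X = np.asarray(descriptors, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ vt[0]


# Toy sequences (illustrative only, not real COI data).
dependent = ["AA", "CC", "GG", "TT"]    # site 1 is fully determined by site 0
independent = ["AA", "AC", "CA", "CC"]  # the two sites vary independently
tc_dependent = total_correlation(dependent, (0, 1))
tc_independent = total_correlation(independent, (0, 1))
```

With one row of descriptors per phylum, the scores returned by `first_pc_scores` could then be compared against the natural logarithm of species counts using `np.corrcoef`. The sign of a principal component is arbitrary, so only the magnitude of that correlation is meaningful.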
