The Dawn of Open Access to Phylogenetic Data
Andrew F. Magee, Michael R. May, Brian R. Moore

TL;DR
This paper highlights the importance of open access to phylogenetic data, analyzes the current state of data sharing, and identifies factors influencing data availability, emphasizing the need for improved policies and practices.
Contribution
It provides an empirical assessment of phylogenetic data sharing practices over 13 years and identifies key factors affecting data availability.
Findings
Approximately 40% of phylogenetic data are effectively lost.
Data sharing is more successful for journals with strong data-sharing policies and higher impact factors.
Data requests from faculty are more successful than from students.
Abstract
The scientific enterprise depends critically on the preservation of and open access to published data. This basic tenet applies acutely to phylogenies (estimates of evolutionary relationships among species). Increasingly, phylogenies are estimated from increasingly large, genome-scale datasets using increasingly complex statistical methods that require increasing levels of expertise and computational investment. Moreover, the resulting phylogenetic data provide an explicit historical perspective that critically informs research in a vast and growing number of scientific disciplines. One such use is the study of changes in rates of lineage diversification (speciation - extinction) through time. As part of a meta-analysis in this area, we sought to collect phylogenetic data (comprising nucleotide sequence alignment and tree files) from 217 studies published in 46 journals over a 13-year…
