Measuring the diversity of data and metadata in digital libraries
Rafael C. Carrasco, Gustavo Candela, Manuel Marco-Such

TL;DR
This paper explores the application of diversity indices to digital libraries for analyzing data and metadata variability, offering a robust way to identify trends and compare content across collections.
Contribution
It introduces the use of biodiversity-inspired diversity indices to measure and analyze the variability of data and metadata in digital libraries.
Findings
Diversity indices effectively capture variability in digital library content.
These indices can identify trends and differences in topics and metadata coverage.
The approach provides a robust alternative to abundance-based measures.
Abstract
Diversity indices have traditionally been used to capture the biodiversity of ecosystems by measuring the effective number of species or groups of species. In contrast to abundance, which is correlated with the amount of data available, diversity indices provide a more robust indicator of the variability of individuals. These indices can be employed in the context of digital libraries to identify trends in the distribution of topics, compare the lexica employed by different authors, or analyze the coverage of semantic metadata.
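As an illustration of the "effective number of species" idea, the sketch below computes Hill numbers, a standard family of diversity indices from ecology, over the topic labels of a collection. This summary does not say which indices the paper uses, so the choice of Hill numbers is an assumption, and the names `hill_number` and `topic_diversity` and the sample labels are hypothetical.

```python
import math
from collections import Counter


def hill_number(counts, q=1.0):
    """Effective number of groups (Hill number of order q).

    q = 0 gives richness (number of non-empty groups),
    q = 1 gives the exponential of Shannon entropy,
    q = 2 gives the inverse Simpson index.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    # Relative frequencies of the non-empty groups.
    props = [c / total for c in counts if c > 0]
    if math.isclose(q, 1.0):
        # The general formula is undefined at q = 1; use its limit.
        entropy = -sum(p * math.log(p) for p in props)
        return math.exp(entropy)
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


def topic_diversity(labels, q=1.0):
    """Diversity of a collection given one topic label per record."""
    return hill_number(list(Counter(labels).values()), q)


# Hypothetical collections: same four topics, very different balance.
balanced = ["history"] * 25 + ["poetry"] * 25 + ["law"] * 25 + ["maps"] * 25
skewed = ["history"] * 97 + ["poetry", "law", "maps"]

print(topic_diversity(balanced, q=2))  # close to 4 effective topics
print(topic_diversity(skewed, q=2))    # close to 1 effective topic
```

Because the indices depend only on relative frequencies, doubling every count leaves the result unchanged, which is the sense in which they are less sensitive than raw abundance to the amount of data available.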
Taxonomy
Topics: Species Distribution and Climate Change · Semantic Web and Ontologies · Environmental DNA in Biodiversity Studies
