FAIRification of MLC data
Ana Kostovska, Jasmin Bogatinovski, Andrej Treven, Sašo Džeroski, Dragi Kocev, Panče Panov

TL;DR
This paper promotes the FAIR and TRUST data principles for multi-label classification datasets by creating an ontology-based online catalogue that enhances dataset description, accessibility, and benchmarking transparency.
Contribution
It introduces an ontology-based online catalogue for MLC datasets and benchmark data, improving data management and reproducibility in the field.
Findings
Extensive description of MLC datasets with meta-features and semantic information
An ontology-based system for querying benchmark performance data
Enhanced transparency and accessibility of MLC datasets and benchmarks
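To make the notion of a dataset meta-feature concrete, here is a minimal sketch of two descriptors commonly used for MLC data: label cardinality (average number of active labels per example) and label density (cardinality divided by the number of labels). Whether the catalogue stores exactly these two is an assumption; the function names and the toy label matrix are hypothetical and not taken from the paper.

```python
from typing import List


def label_cardinality(Y: List[List[int]]) -> float:
    """Average number of active (1-valued) labels per example."""
    return sum(sum(row) for row in Y) / len(Y)


def label_density(Y: List[List[int]]) -> float:
    """Label cardinality normalised by the total number of labels."""
    return label_cardinality(Y) / len(Y[0])


# Toy binary label matrix (hypothetical): 4 examples, 4 labels.
Y = [
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]

card = label_cardinality(Y)  # (2 + 1 + 3 + 1) / 4
dens = label_density(Y)      # card / 4
print(card, dens)
```

Descriptors like these let two datasets be compared before any model is trained, which is the kind of description a catalogue entry can carry alongside provenance data.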
Abstract
The multi-label classification (MLC) task has increasingly been receiving interest from the machine learning (ML) community, as evidenced by the growing number of papers and methods that appear in the literature. Hence, ensuring proper, correct, robust, and trustworthy benchmarking is of utmost importance for the further development of the field. We believe that this can be achieved by adhering to the recently emerged data management standards, such as the FAIR (Findable, Accessible, Interoperable, and Reusable) and TRUST (Transparency, Responsibility, User focus, Sustainability, and Technology) principles. To FAIRify the MLC datasets, we introduce an ontology-based online catalogue of MLC datasets that follow these principles. The catalogue extensively describes many MLC datasets with comprehensible meta-features, MLC-specific semantic descriptions, and different data provenance…
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Research Data Management Practices · Data Quality and Management
