Metadata Enrichment of Multi-Disciplinary Digital Library: A Semantic-based Approach
Hussein T. Al-Natsheh, Lucie Martinet, Fabrice Muhlenbach, Fabien Rico, Djamel A. Zighed

TL;DR
This paper presents a scalable semantic-based pipeline for enriching metadata in digital libraries, which improves cross-disciplinary article retrieval, and introduces a new benchmark dataset.
Contribution
It introduces a novel multi-label semantic tagging pipeline for digital libraries, enhancing metadata quality and retrieval across disciplines, with a new benchmark dataset and open-source code.
Findings
Higher accuracy than query expansion methods
Improved scalability over comparable techniques
Created a new benchmark dataset for future research
Abstract
In scientific digital libraries, papers from different research communities can be described by community-dependent keywords even if they share a semantically similar topic. Articles that are not tagged with enough keyword variations are poorly indexed in any information retrieval system, which limits potentially fruitful exchanges between scientific disciplines. In this paper, we introduce a novel experimentally designed pipeline for multi-label semantic-based tagging developed for open-access metadata digital libraries. The approach starts by learning from a standard scientific categorization and a sample of topic-tagged articles to find semantically relevant articles and enrich their metadata accordingly. Our proposed pipeline aims to enable researchers to reach articles from various disciplines that tend to use different terminologies. It allows retrieving semantically…
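The tagging idea in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline: it swaps in plain bag-of-words cosine similarity for the semantic representation the paper relies on, and the function names (`vectorize`, `cosine`, `topic_centroid`, `enrich`), the threshold value, and the sample data are all assumptions made for illustration. Each topic is summarized by the summed word counts of its seed articles, and an untagged article receives every topic tag whose similarity clears the threshold, so one article can carry several tags.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a text into a bag-of-words count vector (lowercased, whitespace-split)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def topic_centroid(tagged_texts):
    """Sum the word counts of all seed articles already tagged with one topic."""
    centroid = Counter()
    for text in tagged_texts:
        centroid.update(vectorize(text))
    return centroid


def enrich(articles, seeds, threshold=0.2):
    """Assign to each article every topic whose centroid is similar enough.

    articles:  {article_id: text}          (hypothetical untagged records)
    seeds:     {topic: [tagged texts]}     (hypothetical topic-tagged sample)
    threshold: assumed cut-off, not a value taken from the paper
    """
    centroids = {topic: topic_centroid(texts) for topic, texts in seeds.items()}
    enriched = {}
    for article_id, text in articles.items():
        vec = vectorize(text)
        enriched[article_id] = sorted(
            topic for topic, c in centroids.items() if cosine(vec, c) >= threshold
        )
    return enriched


# Made-up sample data, for illustration only.
seeds = {
    "machine learning": [
        "neural network training data model",
        "classification model learning algorithm",
    ],
    "ecology": [
        "species habitat biodiversity forest",
        "ecosystem species population dynamics",
    ],
}
articles = {
    "a1": "deep neural network model for species classification",
    "a2": "medieval poetry manuscripts",
}
result = enrich(articles, seeds)
print(result)
```

A purely lexical measure like this one still misses articles that discuss the same topic in different vocabulary, which is the gap the abstract says semantic-based tagging is meant to close.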
Taxonomy
Topics: Advanced Text Analysis Techniques · Biomedical Text Mining and Ontologies · Information Retrieval and Search Behavior
