TL;DR
This paper presents a pipeline for creating publication knowledge graphs in the arts, humanities, and social sciences, aiming to improve discoverability and analytics in these underrepresented research areas.
Contribution
It introduces an open-source pipeline that extracts, disambiguates, normalizes, and exports structured bibliographic data as linked data for AHSS publications.
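The four stages named above (extract, disambiguate, normalize, export) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Reference` class, the function names, the "Author (Year). Title." input format, and the `http://example.org/work/` base URI are all assumptions made for illustration. The only external vocabulary used is Dublin Core Terms. In this sketch, normalization is used to build the matching key for disambiguation, which is one possible ordering of those two steps.

```python
import re
from dataclasses import dataclass

DCT = "http://purl.org/dc/terms/"  # Dublin Core Terms namespace


@dataclass(frozen=True)
class Reference:
    """Hypothetical record for one parsed bibliography entry."""
    author: str
    title: str
    year: str


# Assumed input format: "Author (Year). Title."
_PATTERN = re.compile(
    r"^(?P<author>[^()]+?)\s*\((?P<year>\d{4})\)\.\s*(?P<title>.+?)\.?$"
)


def extract(raw_lines):
    """Parse raw reference strings, skipping lines that do not match."""
    refs = []
    for line in raw_lines:
        m = _PATTERN.match(line.strip())
        if m:
            refs.append(Reference(m["author"], m["title"], m["year"]))
    return refs


def normalize(text):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", text.casefold())
    return " ".join(cleaned.split())


def disambiguate(refs):
    """Merge entries whose normalized author, title, and year coincide."""
    unique = {}
    for ref in refs:
        key = (normalize(ref.author), normalize(ref.title), ref.year)
        unique.setdefault(key, ref)  # keep the first spelling seen
    return list(unique.values())


def _escape(literal):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return literal.replace("\\", "\\\\").replace('"', '\\"')


def export_ntriples(refs, base="http://example.org/work/"):
    """Serialize references as N-Triples, three triples per work."""
    lines = []
    for i, ref in enumerate(refs, start=1):
        subject = f"<{base}{i}>"
        lines.append(f'{subject} <{DCT}title> "{_escape(ref.title)}" .')
        lines.append(f'{subject} <{DCT}creator> "{_escape(ref.author)}" .')
        lines.append(f'{subject} <{DCT}date> "{_escape(ref.year)}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    raw = [
        "Smith, J. (1999). Greek Tragedy.",
        "SMITH, J (1999). Greek  tragedy",  # variant spelling of the same work
        "not a reference",
    ]
    print(export_ntriples(disambiguate(extract(raw))))
```

Exact-match keys are the simplest possible disambiguation strategy. A real pipeline would likely need fuzzy matching and lookups against authority files, which this sketch does not attempt.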
Findings
Successfully tested on Brill's Classics collection
Pipeline disambiguates and normalizes bibliographic data extracted from bibliographies and indexes
Open source implementation available for community use
Abstract
The digital transformation of the scientific publishing industry has led to dramatic improvements in content discoverability and information analytics. Unfortunately, these improvements have not been uniform across research areas. The scientific literature in the arts, humanities and social sciences (AHSS) still lags behind, in part due to the scale of analog backlogs, the persisting importance of national languages, and a publisher ecosystem made up of many small and medium-sized enterprises. We propose a bottom-up approach to support publishers in creating and maintaining their own publication knowledge graphs in the open domain. We do so by releasing a pipeline that extracts structured information from the bibliographies and indexes of AHSS publications, then disambiguates, normalizes and exports it as linked data. We test the proposed pipeline on Brill's Classics collection, and release an…
