Publisher References in Bibliographic Entity Descriptions
Jim Hahn

TL;DR
This paper presents a data mining approach using MARC21 library records to improve publisher reference access in linked data RDF editors, enabling auto-suggestion features for publisher entity descriptions.
Contribution
It introduces a novel method for discovering publisher entity sets and creating association rules from large-scale library metadata to enhance linked data editing tools.
Findings
Developed a prediction database for publisher entities.
Created publisher location and name association rules.
Implemented a prototype autosuggestion feature for RDF editors.
Abstract
This paper describes a method for improving access to publisher references in linked data RDF editors, using data mining techniques and a large set of library metadata encoded in the MARC21 standard. The corpus consists of clustered sets of publishers and publisher locations drawn from the library MARC21 records in the POD Data Lake, an Ivy+ Library Consortium metadata sharing initiative. The POD Data Lake contains seventy million MARC21 records, forty million of which are unique. The publisher entity sets discovered in this way form the basis for the streamlined description of BIBFRAME Instance entities. The study produced two major outputs: 1) a prediction database and 2) sets of publisher location and name association rules. The association rules are the basis of a prototype autosuggestion feature of BIBFRAME Instance entity description properties designed specifically to…
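To make the idea of publisher location and name association rules concrete, here is a minimal sketch in Python. It is not the paper's implementation: the abstract does not say which mining algorithm, thresholds, or MARC21 fields were used. The sketch assumes that place and publisher strings have already been extracted and normalized (for example from MARC 260/264 subfields $a and $b), and it uses plain pairwise support and confidence counting. The names `RECORDS`, `mine_rules`, and `suggest`, the sample data, and the threshold values are all hypothetical.

```python
from collections import Counter

# Hypothetical (place, publisher) pairs standing in for values extracted and
# normalized from MARC21 records; this is not data from the POD Data Lake.
RECORDS = [
    ("New York", "Knopf"),
    ("New York", "Knopf"),
    ("New York", "Norton"),
    ("Cambridge", "MIT Press"),
    ("Cambridge", "MIT Press"),
    ("Cambridge", "Harvard University Press"),
    ("Cambridge", "MIT Press"),
    ("Chicago", "University of Chicago Press"),
]


def mine_rules(records, min_support=0.1, min_confidence=0.5):
    """Return place -> publisher rules as (place, publisher, support, confidence).

    support    = share of all records containing this (place, publisher) pair
    confidence = share of records with this place that also name this publisher
    """
    total = len(records)
    pair_counts = Counter(records)
    place_counts = Counter(place for place, _ in records)
    rules = []
    for (place, publisher), count in pair_counts.items():
        support = count / total
        confidence = count / place_counts[place]
        if support >= min_support and confidence >= min_confidence:
            rules.append((place, publisher, support, confidence))
    # Strongest rules first; ties are broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


def suggest(rules, place):
    """Return publisher names to offer once an editor user has entered a place."""
    return [publisher for rule_place, publisher, _, _ in rules if rule_place == place]


if __name__ == "__main__":
    rules = mine_rules(RECORDS)
    for place, publisher, support, confidence in rules:
        print(f"{place} -> {publisher} (support={support:.3f}, confidence={confidence:.3f})")
    print(suggest(rules, "Cambridge"))
```

In an editor, a lookup like `suggest(rules, "Cambridge")` could feed an autosuggest dropdown for the publisher name once the place field is filled in. A production system would also need to handle variant spellings and ambiguous place names, which this sketch ignores.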
