Prioritizing and Scheduling Conferences for Metadata Harvesting in dblp

Mandy Neumann; Christopher Michels; Philipp Schaer; Ralf Schenkel

arXiv:1804.06169·cs.DL·April 18, 2018

TL;DR

This paper proposes a method for prioritizing conferences for metadata harvesting in digital libraries by evaluating various ranking features to optimize data source selection.

Contribution

It introduces a novel approach to conference prioritization using a broad definition of information quality and evaluates it with pseudo-relevance assessments.

Findings

01. Certain ranking features significantly improve conference selection accuracy.

02. The proposed approach outperforms baseline methods in identifying promising data sources.

03. Component-based evaluation confirms the effectiveness of the feature set.

Abstract

Maintaining literature databases and online bibliographies is a core responsibility of metadata aggregators such as digital libraries. In monitoring all available data sources, the question arises of which data source should be prioritized. Based on a broad definition of information quality, we look for ways to find the best-fitting and most promising conference candidates to harvest next. We evaluate different conference ranking features using a pseudo-relevance assessment and a component-based evaluation of our approach.
