Efficient Knowledge Graph Accuracy Evaluation

Junyang Gao; Xian Li; Yifan Ethan Xu; Bunyamin Sisman; Xin Luna Dong; Jun Yang

arXiv:1907.09657 · cs.DB · July 24, 2019 · 5 citations

TL;DR

This paper introduces an efficient sampling framework for large-scale knowledge graph accuracy evaluation, significantly reducing human annotation costs while maintaining statistical reliability.

Contribution

It proposes novel sampling strategies, including cluster, stratified, and weighted sampling, for cost-effective and incremental accuracy evaluation of evolving knowledge graphs.

Findings

1. Up to 60% cost reduction in static KG evaluation
2. Up to 80% cost reduction in evolving KG evaluation
3. Maintains evaluation quality with reduced human effort

Abstract

Estimation of the accuracy of a large-scale knowledge graph (KG) often requires humans to annotate samples from the graph. How to obtain statistically meaningful estimates for accuracy evaluation while keeping human annotation costs low is a problem critical to the development cycle of a KG and its practical applications. Surprisingly, this challenging problem has largely been ignored in prior research. To address the problem, this paper proposes an efficient sampling and evaluation framework, which aims to provide quality accuracy evaluation with strong statistical guarantee while minimizing human efforts. Motivated by the properties of the annotation cost function observed in practice, we propose the use of cluster sampling to reduce the overall cost. We further apply weighted and two-stage sampling as well as stratification for better sampling designs. We also extend our framework to…
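The sampling idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that a KG is given as a list of non-empty entity clusters (each a list of triples), that clusters are drawn with probability proportional to their size, and that at most a fixed number of triples is annotated per drawn cluster. Under those assumptions, the plain mean of the per-cluster sample accuracies estimates the overall triple accuracy, and a normal approximation gives a margin of error. The function name `estimate_accuracy`, its parameters, and the `label` callback (a stand-in for a human annotator) are all hypothetical.

```python
import math
import random
from statistics import mean, stdev


def estimate_accuracy(clusters, n_clusters, m_per_cluster, label, rng, z=1.96):
    """Two-stage, size-weighted cluster sampling sketch.

    clusters      : list of non-empty lists of triples (one list per entity)
    n_clusters    : number of first-stage draws (must be >= 2)
    m_per_cluster : max triples annotated per drawn cluster
    label         : callable(triple) -> bool, stand-in for human annotation
    rng           : random.Random instance, for reproducibility
    z             : normal quantile (1.96 for an approximate 95% interval)
    """
    sizes = [len(c) for c in clusters]
    # Stage 1: draw clusters with replacement, probability proportional to size.
    drawn = rng.choices(range(len(clusters)), weights=sizes, k=n_clusters)

    cluster_acc = []
    for idx in drawn:
        triples = clusters[idx]
        # Stage 2: simple random sample of triples inside the cluster.
        sample = rng.sample(triples, min(m_per_cluster, len(triples)))
        cluster_acc.append(sum(bool(label(t)) for t in sample) / len(sample))

    # With size-proportional draws, the unweighted mean of per-cluster
    # accuracies estimates the overall triple-level accuracy.
    estimate = mean(cluster_acc)
    margin = z * stdev(cluster_acc) / math.sqrt(n_clusters)
    return estimate, margin


if __name__ == "__main__":
    rng = random.Random(7)
    # Synthetic KG (assumption): triple = (entity_id, triple_id, is_correct).
    kg = [[(e, i, rng.random() < 0.9) for i in range(1 + e % 8)]
          for e in range(1000)]
    est, moe = estimate_accuracy(kg, n_clusters=60, m_per_cluster=3,
                                 label=lambda t: t[2], rng=rng)
    print(f"estimated accuracy: {est:.3f} +/- {moe:.3f}")
```

Annotating several triples of the same entity is what keeps the cost low in this design: the annotator identifies the entity once and then checks a handful of related facts, rather than paying the identification cost for every triple.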

Taxonomy

Topics: Advanced Graph Neural Networks · Data Stream Mining Techniques · Data Quality and Management