Ranking Triples using Entity Links in a Large Web Crawl - The Chicory Triple Scorer at WSDM Cup 2017

Frank Dorssers (1); Arjen P. de Vries (1); Wouter Alink (2); Roberto Cornacchia (2) ((1) Radboud University; (2) Spinque)

arXiv:1712.08355·cs.IR·December 25, 2017·2 cites

Open Access

TL;DR

This paper presents a method for ranking triples that combines relevance estimates from entity-linked web data (ClueWeb12 with FACC1 annotations) with a baseline built on Wikipedia abstracts, developed for the Triple Ranking Challenge at WSDM Cup 2017.

Contribution

It introduces a novel approach combining large-scale entity-linked web data with a declarative search strategy for triple relevance ranking.

Findings

01

Used ClueWeb12 with the FACC1 entity annotations for relevance estimation.

02

Generated the implementation automatically from a declarative search strategy that specifies how the input data are combined.

03

Participated in the WSDM Cup 2017 Triple Ranking Challenge as team Chicory.

Abstract

This paper describes the participation of team Chicory in the Triple Ranking Challenge of the WSDM Cup 2017. Our approach deploys a large collection of entity-tagged web data to estimate the correctness of the relevance relation expressed by the triples, in combination with a baseline approach using Wikipedia abstracts following [1]. Relevance estimates are drawn from ClueWeb12 annotated by Google's entity linker, publicly available as the FACC1 dataset. Our implementation is automatically generated from a so-called 'search strategy' that declaratively specifies how the input data are combined into a final ranking of triples.
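
To make the idea of estimating relevance from entity-tagged documents more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each document has already been reduced to the set of entity identifiers linked in it, and it scores a (subject, value) pair by the fraction of the subject's documents that also mention the value. The function names, the entity identifiers, and the scoring formula are all hypothetical choices for illustration; the abstract does not specify how the estimate is computed.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# (subject entity, candidate value); both are hypothetical identifiers
Triple = Tuple[str, str]


def cooccurrence_scores(
    docs: Iterable[Set[str]], triples: List[Triple]
) -> Dict[Triple, float]:
    """Score each triple by how often its value co-occurs with its subject.

    Assumption: each element of `docs` is the set of entities linked in
    one document. The score is
    (#docs with subject and value) / (#docs with subject).
    """
    wanted = set(triples)
    subjects = {subject for subject, _ in triples}
    subject_count: Dict[str, int] = defaultdict(int)
    pair_count: Dict[Triple, int] = defaultdict(int)

    for entities in docs:
        for subject in subjects & entities:
            subject_count[subject] += 1
            for other in entities:
                if (subject, other) in wanted:
                    pair_count[(subject, other)] += 1

    return {
        t: (pair_count[t] / subject_count[t[0]]) if subject_count[t[0]] else 0.0
        for t in triples
    }


def rank_triples(scores: Dict[Triple, float]) -> List[Triple]:
    """Order triples from highest to lowest estimated relevance."""
    return sorted(scores, key=lambda t: scores[t], reverse=True)


# Toy data, invented for illustration only.
docs = [
    {"einstein", "physicist"},
    {"einstein", "physicist", "violinist"},
    {"einstein"},
    {"curie", "physicist"},
]
triples = [("einstein", "violinist"), ("einstein", "physicist")]
scores = cooccurrence_scores(docs, triples)
ranking = rank_triples(scores)
```

In this toy input, "physicist" appears in two of the three documents that mention "einstein" and "violinist" in one, so the first pair ranks higher. A real system would also need to combine such estimates with the Wikipedia-abstract baseline, which this sketch leaves out.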

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management