Ranking Triples using Entity Links in a Large Web Crawl - The Chicory Triple Scorer at WSDM Cup 2017
Frank Dorssers (1), Arjen P. de Vries (1), Wouter Alink (2), Roberto Cornacchia (2) ((1) Radboud University, (2) Spinque)

TL;DR
This paper presents team Chicory's method for ranking triples in the Triple Ranking Challenge at WSDM Cup 2017. It estimates relevance from entity-linked web data, combined with a baseline that uses Wikipedia abstracts.
Contribution
It introduces a novel approach combining large-scale entity-linked web data with a declarative search strategy for triple relevance ranking.
Findings
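Used ClueWeb12, annotated by Google's entity linker and released as the FACC1 dataset, to estimate triple relevance.
Generated the implementation automatically from a declarative 'search strategy' that specifies how the input data are combined into a final ranking.
Participated in the WSDM Cup 2017 Triple Ranking Challenge as team Chicory.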
Abstract
This paper describes the participation of team Chicory in the Triple Ranking Challenge of the WSDM Cup 2017. Our approach deploys a large collection of entity tagged web data to estimate the correctness of the relevance relation expressed by the triples, in combination with a baseline approach using Wikipedia abstracts following [1]. Relevance estimations are drawn from ClueWeb12 annotated by Google's entity linker, available publicly as the FACC1 dataset. Our implementation is automatically generated from a so-called 'search strategy' that specifies declaratively how the input data are combined into a final ranking of triples.
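As a rough illustration of how entity-linked documents could yield a relevance estimate, the sketch below counts how often a subject entity co-occurs with each candidate object entity in the same document and ranks candidates by that share. This is a minimal sketch under stated assumptions, not the authors' method: the document representation (one set of entity IDs per page), the function names, the entity IDs, and the normalisation are all hypothetical, and the combination with the Wikipedia-abstract baseline is not modelled.

```python
from collections import Counter


def cooccurrence_scores(docs, subject, candidates):
    """Score each candidate by its share of co-occurrences with the subject.

    docs: iterable of sets, each holding the entity IDs linked in one
          document (a hypothetical simplification of annotated web pages).
    subject: entity ID of the triple's subject.
    candidates: list of entity IDs for the candidate objects.
    """
    counts = Counter()
    for entities in docs:
        if subject not in entities:
            continue  # only documents mentioning the subject contribute
        for candidate in candidates:
            if candidate in entities:
                counts[candidate] += 1
    total = sum(counts.values())
    # Normalise to a share of all co-occurrences; 0.0 when nothing co-occurs.
    return {c: (counts[c] / total if total else 0.0) for c in candidates}


def rank_triples(docs, subject, candidates):
    """Return candidates ordered from most to least co-occurring."""
    scores = cooccurrence_scores(docs, subject, candidates)
    return sorted(candidates, key=lambda c: scores[c], reverse=True)


# Toy data with made-up entity IDs.
docs = [
    {"e_person", "e_actor"},
    {"e_person", "e_actor", "e_singer"},
    {"e_person"},
    {"e_other", "e_singer"},
]
candidates = ["e_singer", "e_actor", "e_director"]
scores = cooccurrence_scores(docs, "e_person", candidates)
ranking = rank_triples(docs, "e_person", candidates)
```

Plain co-occurrence counting favours frequently mentioned entities, so a practical system would likely need some form of smoothing or weighting; that choice is left open here.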
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
