Ranking Archived Documents for Structured Queries on Semantic Layers
Pavlos Fafalios, Vaibhav Kasturia, Wolfgang Nejdl

TL;DR
This paper addresses the challenge of ranking archived documents retrieved via structured semantic queries by proposing models that consider relevance, timeliness, and entity relations, improving retrieval effectiveness.
Contribution
It formalizes the task of ranking documents for structured semantic queries and introduces two models that incorporate relevance, timeliness, and entity relations.
Findings
Proposed ranking models outperform baseline methods.
Models effectively incorporate temporal and semantic information.
Experimental results highlight model limitations and areas for improvement.
Abstract
Archived collections of documents (like newspaper and web archives) serve as important information sources in a variety of disciplines, including Digital Humanities, Historical Science, and Journalism. However, the absence of efficient and meaningful exploration methods remains a major hurdle to turning them into usable sources of information. A semantic layer is an RDF graph that describes metadata and semantic information about a collection of archived documents, and it can be queried through a semantic query language (SPARQL). This allows running advanced queries that combine metadata of the documents (like publication date) with content-based semantic information (like entities mentioned in the documents). However, the results returned by such structured queries can be numerous, and they all match the query equally. In this paper, we deal with this…
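The ranking problem described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: an in-memory list stands in for the RDF semantic layer, a plain filter stands in for a SPARQL query, and the documents, the entity name `entity:A`, and the functions `structured_query` and `rank_by_mentions` are all hypothetical. Mention frequency is used only as a simple relevance proxy.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    doc_id: str
    published: date          # metadata: publication date
    entity_mentions: dict    # semantic info: entity -> number of mentions


# Toy stand-in for a semantic layer (hypothetical documents).
DOCS = [
    Doc("d1", date(1990, 3, 1), {"entity:A": 1}),
    Doc("d2", date(1990, 7, 15), {"entity:A": 5, "entity:B": 2}),
    Doc("d3", date(1995, 1, 10), {"entity:A": 4}),
    Doc("d4", date(1990, 2, 12), {"entity:B": 3}),
]


def structured_query(docs, entity, start, end):
    """Return every document that mentions `entity` and was published in [start, end].

    Like a structured query, this is a boolean match: a document either
    satisfies the constraints or it does not, so the result has no ranking.
    """
    return [
        d for d in docs
        if entity in d.entity_mentions and start <= d.published <= end
    ]


def rank_by_mentions(docs, entity):
    """Order matching documents by how often they mention the query entity."""
    return sorted(docs, key=lambda d: d.entity_mentions[entity], reverse=True)


matches = structured_query(DOCS, "entity:A", date(1990, 1, 1), date(1990, 12, 31))
ranked = rank_by_mentions(matches, "entity:A")
```

Here `matches` holds `d1` and `d2` in collection order, because both satisfy the query equally. Only the added scoring step puts `d2` first. The models summarized above are said to also weigh timeliness and entity relations, which this sketch leaves out.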
