Mining User Queries with Information Extraction Methods and Linked Data

Anne Chardonnens; Ettore Rizza; Mathias Coeckelbergs; Seth van Hooland

arXiv:1709.07782·cs.IR·September 25, 2017

TL;DR

This study explores using information extraction and Linked Data to analyze large volumes of user queries from web analytics, aiming to identify user interests in persons and places automatically.

Contribution

It demonstrates the potential and limitations of applying information extraction and knowledge bases to understand user queries in an automated, large-scale setting.

Findings

01

Successfully identified most person and place names in queries.

02

A limited number of queries remained too ambiguous for automated treatment, owing to the specific character of user queries and the limitations of the knowledge bases used.

03

Methods are generalisable and applicable to other collections.

Abstract

Purpose: Advanced use of Web Analytics tools makes it possible to capture the content of user queries. Although these queries are highly relevant, manually analysing large volumes of them is problematic. This paper demonstrates the potential of using information extraction techniques and Linked Data to gain a better understanding of the nature of user queries in an automated manner. Design/methodology/approach: The paper presents a large-scale case study conducted at the Royal Library of Belgium, based on a data set of 83 854 queries resulting from 29 812 visits to the historical newspapers platform BelgicaPress over a 12-month period. By making use of information extraction methods, knowledge bases and various authority files, this paper presents the possibilities and limits of identifying what percentage of end users are looking for person and place names. Findings: Based on a…
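To make the approach in the abstract more concrete, here is a minimal sketch of how queries might be matched against name lists and tallied by category. It is not the authors' pipeline. The `PERSONS` and `PLACES` sets are small hypothetical stand-ins for the knowledge bases and authority files the paper mentions, and the function names `normalize`, `classify` and `category_shares` are illustrative assumptions.

```python
import unicodedata
from collections import Counter

# Hypothetical toy gazetteers (assumption): a real setup would load names
# from knowledge bases and authority files rather than hard-code them.
# Entries are stored in normalized form (lowercase, no accents).
PERSONS = {"victor horta", "emile verhaeren", "florence"}
PLACES = {"bruxelles", "gent", "namur", "florence"}


def normalize(query: str) -> str:
    """Lowercase, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", query)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def classify(query: str) -> str:
    """Label a query as 'person', 'place', 'ambiguous' or 'unknown'."""
    q = normalize(query)
    is_person = q in PERSONS
    is_place = q in PLACES
    if is_person and is_place:
        # Matches both lists, so it cannot be resolved automatically here.
        return "ambiguous"
    if is_person:
        return "person"
    if is_place:
        return "place"
    return "unknown"


def category_shares(queries):
    """Return the fraction of queries that falls into each label."""
    counts = Counter(classify(q) for q in queries)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    sample = ["Victor Horta", "Gent", "Namur", "meteo"]
    print(category_shares(sample))
```

Exact matching after normalization is the simplest possible strategy. It illustrates why some queries stay ambiguous: a string such as "Florence" can appear in both lists, and nothing in the query itself tells the two readings apart.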

Code & Models

Repositories

ulbstic/BelgicaPress
none · Official
