# No-But-Semantic-Match: Computing Semantically Matched XML Keyword Search Results

**Authors:** Mehdi Naseriparsa, Md. Saiful Islam, Chengfei Liu, Irene Moser

arXiv: 1703.02212 · 2017-03-08

## TL;DR

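This paper addresses the no-but-semantic-match problem in XML keyword search, where a query returns an empty result because some of its keywords do not occur in the data source even though semantically related content does. It generates semantically related candidate queries from an ontological knowledge base and proposes pruning-based algorithms that retrieve the top-k semantically related results efficiently.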

## Contribution

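The paper introduces a two-step approach to retrieving semantically related XML keyword search results. Candidate queries are first generated by replacing non-mapped keywords with candidate keywords from an ontological knowledge base. The candidate queries are then processed with pruning techniques, and optionally in batch, so that the top-k results can be found without analyzing every candidate result.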

## Key findings

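- Candidate queries are generated by replacing non-mapped keywords with candidate keywords drawn from an ontological knowledge base.
- Candidate results are scored by their cohesiveness and their similarity to the original query.
- Two query processing algorithms built on the proposed pruning techniques retrieve the top-k results efficiently, and batch processing of multiple candidate queries improves performance substantially.
- Experiments on two real datasets verify the effectiveness and efficiency of the proposed approaches.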
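
The abstract does not spell out the pruning techniques, so the following is only a generic illustration of how early termination can make top-k retrieval cheaper, not the paper's algorithm. It assumes, hypothetically, that a result's score is the product of the candidate query's similarity to the original query and the result's cohesiveness, with cohesiveness in [0, 1]. Under that assumption a query's similarity is an upper bound on the score of any of its results, so processing can stop once the k-th best score found so far reaches the similarity of the next candidate query. The names `top_k_with_pruning` and `evaluate` are made up for this sketch.

```python
import heapq


def top_k_with_pruning(candidates, evaluate, k):
    """Return the k best (score, result_id) pairs and the number of queries processed.

    candidates: list of (query, similarity) pairs, similarity in [0, 1].
    evaluate:   hypothetical callback, query -> list of (result_id, cohesiveness),
                with cohesiveness in [0, 1].
    Assumption (not from the paper): score = similarity * cohesiveness.
    """
    heap = []  # min-heap holding the current top-k (score, result_id) pairs
    processed = 0
    # Visit candidate queries from most to least similar to the original query.
    for query, similarity in sorted(candidates, key=lambda c: c[1], reverse=True):
        # Upper-bound check: no result of this or any later query can beat
        # the current k-th best score, so the remaining queries are skipped.
        if len(heap) == k and heap[0][0] >= similarity:
            break
        processed += 1
        for result_id, cohesiveness in evaluate(query):
            score = similarity * cohesiveness
            if len(heap) < k:
                heapq.heappush(heap, (score, result_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, result_id))
    return sorted(heap, reverse=True), processed
```

In this sketch the saving comes entirely from skipped calls to `evaluate`, which stands in for the expensive step of computing and analyzing the results of one candidate query.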

## Abstract

Users are rarely familiar with the content of a data source they are querying, and therefore cannot avoid using keywords that do not exist in the data source. Traditional systems may respond with an empty result, causing dissatisfaction, while the data source in effect holds semantically related content. In this paper we study this no-but-semantic-match problem on XML keyword search and propose a solution which enables us to present the top-k semantically related results to the user. Our solution involves two steps: (a) extracting semantically related candidate queries from the original query and (b) processing candidate queries and retrieving the top-k semantically related results. Candidate queries are generated by replacement of non-mapped keywords with candidate keywords obtained from an ontological knowledge base. Candidate results are scored using their cohesiveness and their similarity to the original query. Since the number of queries to process can be large, with each result having to be analyzed, we propose pruning techniques to retrieve the top-k results efficiently. We develop two query processing algorithms based on our pruning techniques. Further, we exploit a property of the candidate queries to propose a technique for processing multiple queries in batch, which improves the performance substantially. Extensive experiments on two real datasets verify the effectiveness and efficiency of the proposed approaches.
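
Step (a) of the abstract can be illustrated with a minimal sketch. It assumes the ontological knowledge base is available as a plain dictionary from a keyword to its related keywords, and that the set of keywords occurring in the data source is known. The function name and both data structures are hypothetical simplifications and are not taken from the paper.

```python
from itertools import product


def generate_candidate_queries(query, data_vocabulary, related_keywords):
    """Build candidate queries by replacing non-mapped keywords.

    query:            list of keywords typed by the user.
    data_vocabulary:  set of keywords that occur in the data source (assumed known).
    related_keywords: dict standing in for the ontological knowledge base,
                      mapping a keyword to semantically related keywords.
    """
    slots = []
    for keyword in query:
        if keyword in data_vocabulary:
            # Mapped keyword: it occurs in the data, so it is kept unchanged.
            slots.append([keyword])
        else:
            # Non-mapped keyword: keep only related keywords that occur in the data.
            slots.append(
                [c for c in related_keywords.get(keyword, []) if c in data_vocabulary]
            )
    # One candidate query per combination of replacements.
    return [list(combo) for combo in product(*slots)]


vocabulary = {"database", "professor", "instructor"}
ontology = {"lecturer": ["professor", "instructor", "tutor"]}
print(generate_candidate_queries(["lecturer", "database"], vocabulary, ontology))
```

Because every non-mapped keyword multiplies the number of combinations, the candidate set grows quickly with query length, which is consistent with the abstract's remark that the number of queries to process can be large.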

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.02212/full.md

## Figures

50 figures with captions in the complete paper: https://tomesphere.com/paper/1703.02212/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/1703.02212/full.md

---
Source: https://tomesphere.com/paper/1703.02212