# Concept Discovery through Information Extraction in Restaurant Domain

**Authors:** Nadeesha Pathirana, Sandaru Seneviratne, Rangika Samarawickrama, Shane Wolff, Charith Chitraranjan, Uthayasanker Thayasivam, Tharindu Ranasinghe

arXiv: 1906.05039 · 2019-06-13

## TL;DR

This paper presents an automated method for concept discovery in the restaurant domain using word embeddings, clustering, and classification to build a hierarchical knowledge base and facilitate semi-automatic ontology creation.

## Contribution

The paper introduces a largely automated approach that combines word embeddings, hierarchical clustering, and classification to identify concepts in large domains such as restaurants.

## Key findings

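- Word embeddings combined with hierarchical clustering yield a concept hierarchy for the restaurant domain
- Classification algorithms assign unseen words to the identified concepts, automating most of the manual sorting
- The approach can be extended toward semi-automatic ontology creation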
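
To make the idea of assigning unseen words to existing concepts concrete, here is a minimal sketch. It is not the paper's classifier: the summary does not name the classification algorithms used, so a nearest-centroid rule over cosine similarity stands in. The concept names, the 2-D vectors, and the `classify` helper are all invented for illustration.

```python
import math

# Hypothetical concepts, each with a few invented 2-D "embeddings".
# Real word embeddings would have hundreds of dimensions.
CONCEPTS = {
    "food": [(0.9, 0.1), (0.8, 0.2)],
    "staff": [(0.1, 0.9), (0.2, 0.8)],
}


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    dims = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / n for i in range(dims))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(vector, concepts):
    """Return the concept whose centroid is most similar to `vector`."""
    return max(
        concepts,
        key=lambda name: cosine_similarity(vector, centroid(concepts[name])),
    )


# An unseen word is represented only by its (assumed) embedding.
label = classify((0.7, 0.3), CONCEPTS)
```

A nearest-centroid rule is only one simple option; any classifier trained on the clustered embeddings could fill the same role.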

## Abstract

Concept identification is a crucial step in understanding and building a knowledge base for any particular domain. However, it is not a simple task in very large domains such as restaurants and hotels. In this paper, a novel approach to identifying a concept hierarchy and classifying unseen words into the identified concepts for the restaurant domain is presented. Manually sorting, identifying, and classifying domain-related words is tedious; therefore, the proposed process is automated to a great extent. Word embedding, hierarchical clustering, and classification algorithms are used to obtain concepts related to the restaurant domain. Further, this approach can be extended to create a semi-automatic ontology for the restaurant domain.
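
The clustering step named in the abstract can be illustrated with a small sketch of agglomerative (bottom-up) hierarchical clustering. This is an assumption-laden toy, not the authors' implementation: the abstract does not specify the linkage criterion or distance measure, so average linkage over cosine distance is assumed here, and the words and 2-D vectors are invented.

```python
import math
from itertools import combinations

# Invented 2-D "embeddings" for illustration only.
TOY_VECTORS = {
    "pizza": (0.9, 0.1),
    "pasta": (0.8, 0.2),
    "burger": (0.85, 0.05),
    "waiter": (0.1, 0.9),
    "chef": (0.2, 0.8),
}


def cosine_distance(a, b):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def average_linkage(c1, c2, vectors):
    """Mean pairwise distance between the members of two clusters."""
    dists = [cosine_distance(vectors[a], vectors[b]) for a in c1 for b in c2]
    return sum(dists) / len(dists)


def agglomerate(vectors, n_clusters):
    """Merge the two closest clusters until `n_clusters` remain.

    Returns the final clusters and the ordered list of merges,
    which together describe the hierarchy bottom-up.
    """
    clusters = [frozenset([word]) for word in sorted(vectors)]
    merges = []
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], vectors),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((a, b))
    return clusters, merges


clusters, merges = agglomerate(TOY_VECTORS, n_clusters=2)
```

The recorded merge order is what gives the hierarchy: early merges join near-synonyms, and later merges form broader concepts that could serve as candidate ontology classes.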

---
Source: https://tomesphere.com/paper/1906.05039