# SMILK, linking natural language and data from the web

**Authors:** Cédric Lopez (TEXTE), Molka Dhouib (I3S, WIMMICS), Elena Cabrio (WIMMICS), Catherine Faron Zucker (I3S, WIMMICS), Fabien Gandon (UCA, WIMMICS), Frédérique Segond

arXiv: 1901.02055 · 2019-01-09

## TL;DR

This paper presents SMILK, a system that links natural language data with web data to enrich knowledge bases and improve text analysis, demonstrated through a cosmetics brand information retrieval application.

## Contribution

It introduces ProVoc, an ontology for products and brands, and a method for automatically populating a knowledge base from diverse textual sources.

## Key findings

- Brand-related information retrieval evaluated in the cosmetics domain
- ProVoc ontology created and used to automatically populate a knowledge base from heterogeneous textual resources
- Browser plugin provides additional knowledge to users as they browse the web

## Abstract

As part of the SMILK Joint Lab, we studied the use of Natural Language Processing to: (1) enrich knowledge bases and link data on the web, and conversely (2) use this linked data to contribute to the improvement of text analysis and the annotation of textual content, and to support knowledge extraction. The evaluation focused on brand-related information retrieval in the field of cosmetics. This article describes each step of our approach: the creation of ProVoc, an ontology to describe products and brands; the automatic population of a knowledge base mainly based on ProVoc from heterogeneous textual resources; and the evaluation of an application that takes the form of a browser plugin providing additional knowledge to users browsing the web.

---
Source: https://tomesphere.com/paper/1901.02055