# Interactive Concept Mining on Personal Data -- Bootstrapping Semantic Services

**Authors:** Markus Schröder, Christian Jilek, Andreas Dengel

arXiv: 1903.05872 · 2019-03-15

## TL;DR

This paper introduces an interactive method for extracting and refining high-level concepts from personal data to improve semantic services, using user feedback and schema-based analysis.

## Contribution

It presents a novel interactive concept mining approach that leverages schemata and user feedback to bootstrap semantic understanding of personal data.

## Key findings

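- A prototypical implementation demonstrates the major steps of the approach
- Feedback given through a graphical user interface keeps only the concept candidates the user considers relevant
- Higher-level concepts found in file names, mail subjects, or item content extend the initial 1:1 representations of information items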

## Abstract

Semantic services (e.g. Semantic Desktops) are still afflicted by a cold start problem: in the beginning, the user's personal information sphere, i.e. files, mails, bookmarks, etc., is not represented by the system. Information extraction tools used to kick-start the system typically create 1:1 representations of the different information items. Higher-level concepts, for example those found in file names, mail subjects, or the content body of these items, are not extracted. Leaving these concepts out may lead to underperformance, while having too many of them (e.g. by making every found term a concept) will clutter the resulting knowledge graph with unhelpful relations. In this paper, we present an interactive concept mining approach that proposes concept candidates gathered by exploiting the given schemata of common personal information management applications and by analysing the personal information sphere using various metrics. To heed the subjective view of the user, a graphical user interface allows the user to easily rank and give feedback on proposed concept candidates, thus keeping only those actually considered relevant. A prototypical implementation demonstrates major steps of our approach.
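
The propose-then-filter loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the metrics used, so document frequency stands in for them here, and the names `ITEMS`, `tokenize`, `propose_candidates`, and `keep_relevant` as well as the sample data are hypothetical. User feedback, which the paper collects through a graphical interface, is reduced to a set of rejected terms.

```python
import re
from collections import Counter

# Hypothetical labels from a personal information sphere (file names, mail subjects).
ITEMS = [
    "alpha_report_final.pdf",
    "alpha budget.xlsx",
    "Meeting notes alpha",
    "Re: budget meeting",
]


def tokenize(label):
    """Split a label into lowercase alphabetic tokens."""
    return [token.lower() for token in re.findall(r"[A-Za-z]+", label)]


def propose_candidates(items, min_count=2):
    """Rank terms by the number of items they occur in.

    Document frequency is an assumed stand-in for the paper's metrics.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    counts = Counter()
    for label in items:
        counts.update(set(tokenize(label)))  # count each term once per item
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [(term, count) for term, count in ranked if count >= min_count]


def keep_relevant(candidates, rejected):
    """Apply user feedback: drop every candidate the user rejected."""
    return [term for term, _ in candidates if term not in rejected]


candidates = propose_candidates(ITEMS)
concepts = keep_relevant(candidates, rejected={"meeting"})
print(candidates)  # [('alpha', 3), ('budget', 2), ('meeting', 2)]
print(concepts)    # ['alpha', 'budget']
```

The `min_count` threshold reflects the trade-off the abstract raises: turning every term into a concept would clutter the knowledge graph, so only terms shared across several items are proposed, and the user decides which of those are kept.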

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.05872/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1903.05872/full.md

## References

10 references — full list in the complete paper: https://tomesphere.com/paper/1903.05872/full.md

---
Source: https://tomesphere.com/paper/1903.05872