# TiFi: Taxonomy Induction for Fictional Domains [Extended version]

**Authors:** Cuong Xuan Chu, Simon Razniewski, Gerhard Weikum

arXiv: 1901.10263 · 2019-01-30

## TL;DR

TiFi constructs high-precision taxonomies for fictional domains from noisy category systems, taken from fan wikis or text extraction, and outperforms state-of-the-art taxonomy induction baselines. It targets entity universes that Wikipedia covers poorly.

## Contribution

TiFi introduces a three-phase process for taxonomy induction tailored to fictional domains: category cleaning, edge cleaning, and top-level construction, which maps classes onto a subset of high-level WordNet categories. The evaluation reports very high precision.

## Key findings

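- Constructs taxonomies for a diverse range of fictional domains, such as Lord of the Rings, The Simpsons, and Greek Mythology.
- Outperforms state-of-the-art taxonomy induction baselines by a substantial margin.
- Achieves very high precision on noisy input, namely category systems from fan wikis or text extraction.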

## Abstract

Taxonomies are important building blocks of structured knowledge bases, and their construction from text sources and Wikipedia has received much attention. In this paper we focus on the construction of taxonomies for fictional domains, using noisy category systems from fan wikis or text extraction as input. Such fictional domains are archetypes of entity universes that are poorly covered by Wikipedia, as are enterprise-specific knowledge bases or highly specialized verticals. Our fiction-targeted approach, called TiFi, consists of three phases: (i) category cleaning, by identifying candidate categories that truly represent classes in the domain of interest, (ii) edge cleaning, by selecting subcategory relationships that correspond to class subsumption, and (iii) top-level construction, by mapping classes onto a subset of high-level WordNet categories. A comprehensive evaluation shows that TiFi is able to construct taxonomies for a diverse range of fictional domains, such as Lord of the Rings, The Simpsons, or Greek Mythology, with very high precision, and that it outperforms state-of-the-art baselines for taxonomy induction by a substantial margin.
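
To make the three phases concrete, here is a minimal Python sketch of how such a pipeline could be wired together. It is not the authors' implementation: the function names, the keyword filter, the hand-labeled edge judgments, the top-level lookup table, and the example categories are all hypothetical stand-ins for whatever models and data the paper actually uses. Only the phase structure (category cleaning, edge cleaning, top-level construction) comes from the abstract.

```python
# Toy three-phase taxonomy induction pipeline.
# All predicates and data below are hypothetical placeholders, not TiFi's models.

def clean_categories(categories, is_class):
    """Phase (i): keep only categories judged to represent real classes."""
    return {c for c in categories if is_class(c)}

def clean_edges(edges, classes, is_subsumption):
    """Phase (ii): keep (child, parent) edges between kept classes
    that are judged to express class subsumption."""
    return {
        (child, parent)
        for child, parent in edges
        if child in classes and parent in classes and is_subsumption(child, parent)
    }

def attach_top_level(classes, edges, top_level_of):
    """Phase (iii): link every class without a parent to a top-level class."""
    has_parent = {child for child, _ in edges}
    result = set(edges)
    for cls in classes - has_parent:
        top = top_level_of(cls)
        if top is not None:  # leave the class as a root if no mapping is known
            result.add((cls, top))
    return result

def induce_taxonomy(categories, edges, is_class, is_subsumption, top_level_of):
    """Run the three phases in order and return (classes, taxonomy edges)."""
    classes = clean_categories(categories, is_class)
    kept = clean_edges(edges, classes, is_subsumption)
    return classes, attach_top_level(classes, kept, top_level_of)

# Hypothetical noisy input, in the style of a fan-wiki category system.
CATEGORIES = {"Characters", "Hobbits", "Locations", "Regions", "Character images"}
EDGES = {
    ("Hobbits", "Characters"),
    ("Hobbits", "Regions"),            # topical link, not subsumption
    ("Regions", "Locations"),
    ("Character images", "Characters"),
}

# Toy stand-ins for the three decisions (assumptions, not from the paper).
META_WORDS = ("images", "stubs", "templates")
REJECTED_EDGES = {("Hobbits", "Regions")}
TOP_LEVEL = {"Characters": "person", "Locations": "location"}

classes, taxonomy = induce_taxonomy(
    CATEGORIES,
    EDGES,
    is_class=lambda c: not any(w in c.lower() for w in META_WORDS),
    is_subsumption=lambda child, parent: (child, parent) not in REJECTED_EDGES,
    top_level_of=TOP_LEVEL.get,
)
```

Passing the three decisions in as callables keeps the pipeline skeleton independent of how each decision is made, so a learned classifier could replace any of the toy predicates without changing the control flow.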

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.10263/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1901.10263/full.md

## References

51 references — full list in the complete paper: https://tomesphere.com/paper/1901.10263/full.md

---
Source: https://tomesphere.com/paper/1901.10263