# Weakly Supervised Domain Detection

**Authors:** Yumo Xu, Mirella Lapata

arXiv: 1907.11499 · 2019-07-29

## TL;DR

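This paper introduces domain detection as a new NLP task: identifying domain-heavy text segments, i.e., sentences or phrases that are representative of and provide evidence for a given domain. The authors argue that this ability could improve the robustness and portability of text classification applications.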

## Contribution

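The paper proposes an encoder-detector framework for domain detection whose classifiers are bootstrapped with multiple instance learning (MIL). The model is weakly supervised, hierarchically organized, suited to multilabel classification, and applicable to text spans of different granularities.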
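To make the multiple instance learning idea concrete, here is a minimal sketch under stated assumptions, not the authors' implementation. It treats a document as a bag of sentences, pools hypothetical sentence-level domain logits into document-level multilabel probabilities with attention weights, and then reads domain-heavy sentences off the sentence-level scores. The domain names, the logits, the attention-weighted pooling, the independent sigmoids, and the 0.5 threshold are all illustrative assumptions; in a real system the logits would come from a trained encoder.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def document_probs(sentence_logits, attention_logits):
    """Pool sentence-level domain logits into document-level probabilities.

    sentence_logits: one list per sentence, holding one logit per domain.
    attention_logits: one unnormalized importance score per sentence.
    Each domain gets its own sigmoid, so a document can carry several labels.
    """
    weights = softmax(attention_logits)
    n_domains = len(sentence_logits[0])
    probs = []
    for d in range(n_domains):
        pooled = sum(w * s[d] for w, s in zip(weights, sentence_logits))
        probs.append(sigmoid(pooled))
    return probs


def domain_heavy_sentences(sentence_logits, domain, threshold=0.5):
    """Indices of sentences whose own score for `domain` exceeds the threshold."""
    return [
        i for i, s in enumerate(sentence_logits)
        if sigmoid(s[domain]) > threshold
    ]


# Hypothetical domains and scores, for illustration only.
DOMAINS = ["sports", "health"]
sentence_logits = [
    [3.0, -2.0],   # sentence 0: strongly sports
    [-1.0, 2.5],   # sentence 1: strongly health
    [-2.0, -2.0],  # sentence 2: neither domain
]
attention_logits = [1.0, 1.0, -1.0]  # sentence 2 gets little weight

doc_probs = document_probs(sentence_logits, attention_logits)
heavy_sports = domain_heavy_sentences(sentence_logits, DOMAINS.index("sports"))
heavy_health = domain_heavy_sentences(sentence_logits, DOMAINS.index("health"))
```

In this toy setup only the document-level probabilities would be compared against labels during training, which is what makes the supervision weak; the sentence-level scores that flag domain-heavy spans are a by-product of the pooling.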

## Key findings

- Despite learning with minimal supervision, the model can be applied to text spans of different granularities
- The model is demonstrated across multiple languages and genres
- Domain detection shows potential for text summarization

## Abstract

In this paper we introduce domain detection as a new natural language processing task. We argue that the ability to detect textual segments which are domain-heavy, i.e., sentences or phrases which are representative of and provide evidence for a given domain, could enhance the robustness and portability of various text classification applications. We propose an encoder-detector framework for domain detection and bootstrap classifiers with multiple instance learning (MIL). The model is hierarchically organized and suited to multilabel classification. We demonstrate that despite learning with minimal supervision, our model can be applied to text spans of different granularities, languages, and genres. We also showcase the potential of domain detection for text summarization.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.11499/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1907.11499/full.md

## References

51 references — full list in the complete paper: https://tomesphere.com/paper/1907.11499/full.md

---
Source: https://tomesphere.com/paper/1907.11499