# Fine-Grained Argument Unit Recognition and Classification

**Authors:** Dietrich Trautmann, Johannes Daxenberger, Christian Stab, Hinrich Schütze, Iryna Gurevych

arXiv: 1904.09688 · 2019-11-22

## TL;DR

This paper recasts argument retrieval as token-level sequence labeling (AURC) rather than sentence-level classification, which recovers more arguments and is more robust to sentence segmentation errors. It is supported by a new crowdsourced dataset, AURC-8, and by models that come close to human performance on known domains.

## Contribution

The paper defines the Argument Unit Recognition and Classification (AURC) task, releases the crowdsourced AURC-8 dataset of token-level argument spans with stance labels, and identifies sequence labeling methods that come close to human performance on known domains.

## Key findings

- AURC-8 contains up to 15% more arguments per topic than sentence-level annotations.
- Methods achieve near-human performance on known domains.
- The methods are more robust against sentence segmentation errors than previous sentence-level approaches.

## Abstract

Prior work has commonly defined argument retrieval from heterogeneous document collections as a sentence-level classification task. Consequently, argument retrieval suffers both from low recall and from sentence segmentation errors making it difficult for humans and machines to consume the arguments. In this work, we argue that the task should be performed on a more fine-grained level of sequence labeling. For this, we define the task as Argument Unit Recognition and Classification (AURC). We present a dataset of arguments from heterogeneous sources annotated as spans of tokens within a sentence, as well as with a corresponding stance. We show that and how such difficult argument annotations can be effectively collected through crowdsourcing with high interannotator agreement. The new benchmark, AURC-8, contains up to 15% more arguments per topic as compared to annotations on the sentence level. We identify a number of methods targeted at AURC sequence labeling, achieving close to human performance on known domains. Further analysis also reveals that, contrary to previous approaches, our methods are more robust against sentence segmentation errors. We publicly release our code and the AURC-8 dataset.
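
To make the span-level formulation concrete, here is a minimal sketch of how per-token stance labels could be grouped into argument units. It is an illustration only, not the authors' released code: the label names `PRO`, `CON`, and `NON`, the helper `extract_units`, and the example sentence are all assumptions introduced here.

```python
from typing import List, Tuple

# (start token index, end token index exclusive, stance label)
Span = Tuple[int, int, str]


def extract_units(labels: List[str], outside: str = "NON") -> List[Span]:
    """Group maximal runs of identical non-outside labels into spans.

    The label inventory is hypothetical; any per-token tag set with a
    single "not an argument" label would work the same way.
    """
    units: List[Span] = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the sequence or on a label change.
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] != outside:
                units.append((start, i, labels[start]))
            start = i
    return units


# Hypothetical sentence containing two argument units with opposite stance.
tokens = "Nuclear power is cheap but the waste stays dangerous for centuries .".split()
labels = ["PRO"] * 4 + ["NON"] + ["CON"] * 6 + ["NON"]

for begin, end, stance in extract_units(labels):
    print(stance, " ".join(tokens[begin:end]))
```

A single sentence-level label cannot represent both units in this sentence, which illustrates why labeling spans of tokens can surface more arguments per topic than labeling whole sentences.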

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.09688/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1904.09688/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/1904.09688/full.md

---
Source: https://tomesphere.com/paper/1904.09688