Categorizing Comparative Sentences

Alexander Panchenko; Alexander Bondarenko; Mirco Franzek; Matthias Hagen; Chris Biemann

arXiv:1809.06152·cs.CL·July 9, 2019

TL;DR

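This paper presents a method for automatically identifying comparative sentences and categorizing the preference they express. A gradient boosting model based on pre-trained sentence embeddings, evaluated on 7,199 manually annotated sentences, reaches an F1 score of 85%. The model can supply comparative sentences for pro/con argumentation in comparative and argument search engines or debating technologies.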

Contribution

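The paper introduces a manually annotated dataset of 7,199 sentences covering 217 distinct target item pairs from several domains, together with a gradient boosting model based on pre-trained sentence embeddings for classifying comparative sentences.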

Findings

01

F1 score of 85% on comparative sentence detection

02

Manually annotated dataset of 7,199 sentences for 217 distinct target item pairs

03

Model suitable for argumentation and search applications

Abstract

We tackle the tasks of automatically identifying comparative sentences and categorizing the intended preference (e.g., "Python has better NLP libraries than MATLAB" => (Python, better, MATLAB)). To this end, we manually annotate 7,199 sentences for 217 distinct target item pairs from several domains (27% of the sentences contain an oriented comparison in the sense of "better" or "worse"). A gradient boosting model based on pre-trained sentence embeddings reaches an F1 score of 85% in our experimental evaluation. The model can be used to extract comparative sentences for pro/con argumentation in comparative / argument search engines or debating technologies.
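
To make the output format and the evaluation metric concrete, the following is a minimal Python sketch and not the authors' code. It represents the abstract's (Python, better, MATLAB) triple as a small record and computes a per-class F1 score from gold and predicted labels. The label names `BETTER`, `WORSE`, and `NONE`, the `Comparison` class, and the `f1_for_label` helper are assumptions introduced here for illustration. The abstract does not say how the reported 85% F1 is averaged across classes, so the sketch scores one label at a time, and the toy label lists are made up.

```python
from dataclasses import dataclass

# Label names are assumptions for illustration; the abstract only speaks of
# oriented comparisons "in the sense of 'better' or 'worse'".
BETTER, WORSE, NONE = "better", "worse", "none"


@dataclass(frozen=True)
class Comparison:
    first: str   # first target item, e.g. "Python"
    label: str   # BETTER, WORSE, or NONE
    second: str  # second target item, e.g. "MATLAB"

    def flipped(self) -> "Comparison":
        """Express the same preference with the two items swapped."""
        swap = {BETTER: WORSE, WORSE: BETTER, NONE: NONE}
        return Comparison(self.second, swap[self.label], self.first)


def f1_for_label(gold, predicted, label):
    """Per-class F1: harmonic mean of precision and recall for one label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero/undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# The abstract's example sentence as a triple.
example = Comparison("Python", BETTER, "MATLAB")

# Toy gold and predicted labels, invented purely to exercise the metric.
gold = [BETTER, NONE, WORSE, NONE]
predicted = [BETTER, NONE, NONE, NONE]
```

Calling `example.flipped()` yields the equivalent triple with MATLAB first and the direction reversed, which is one way to normalize a pair regardless of the order in which a sentence mentions its items.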
