Sentence Pair Scoring: Towards Unified Framework for Text Comprehension
Petr Baudiš, Jan Pichl, Tomáš Vyskočil, Jan Šedivý

TL;DR
This paper unifies various sentence pair scoring tasks under a common framework, introduces new baselines, datasets, and evaluation methods, and achieves state-of-the-art results on the Ubuntu Dialogue dataset.
Contribution
It proposes a unified framework for sentence pair scoring, introduces new challenging datasets, and provides open-source tools for multi-task learning and evaluation.
Findings
The new baselines outperform previous models on multiple tasks.
The proposed evaluation methodology, designed for randomized models, makes model comparisons more reliable.
The models achieve state-of-the-art performance on the Ubuntu Dialogue dataset.
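One common way to compare randomized models is to train each one several times with different seeds and report the mean score with a confidence interval. The sketch below illustrates that idea with a Student's t interval; it is not confirmed here that this matches the paper's exact procedure. The function names, the small table of critical values, and the example scores are all illustrative assumptions.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
# Only a few entries are listed; this table is an illustrative shortcut.
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
         6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228, 15: 2.131}


def confidence_interval(scores):
    """Return (mean, half_width) of a 95% t-interval over repeated runs."""
    n = len(scores)
    df = n - 1
    if df not in T_975:
        raise ValueError(f"no critical value tabulated for {n} runs")
    half_width = T_975[df] * stdev(scores) / sqrt(n)
    return mean(scores), half_width


def intervals_overlap(runs_a, runs_b):
    """Conservative heuristic: do the two 95% intervals overlap?"""
    (mean_a, half_a) = confidence_interval(runs_a)
    (mean_b, half_b) = confidence_interval(runs_b)
    return abs(mean_a - mean_b) <= half_a + half_b


# Hypothetical scores from five training runs of two models (made-up numbers).
model_a = [0.70, 0.72, 0.71, 0.69, 0.73]
model_b = [0.80, 0.82, 0.81, 0.79, 0.83]
m, h = confidence_interval(model_a)
print(f"model A: {m:.3f} +/- {h:.3f}")
print("intervals overlap:", intervals_overlap(model_a, model_b))
```

Non-overlapping intervals are only a rough signal that two models differ; a formal significance test would be the stricter choice.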
Abstract
We review the task of Sentence Pair Scoring, popular in the literature in various forms - viewed as Answer Sentence Selection, Semantic Text Scoring, Next Utterance Ranking, Recognizing Textual Entailment, Paraphrasing or e.g. a component of Memory Networks. We argue that all such tasks are similar from the model perspective and propose new baselines by comparing the performance of common IR metrics and popular convolutional, recurrent and attention-based neural models across many Sentence Pair Scoring tasks and datasets. We discuss the problem of evaluating randomized models, propose a statistically grounded methodology, and attempt to improve comparisons by releasing new datasets that are much harder than some of the currently used well explored benchmarks. We introduce a unified open source software framework with easily pluggable models and tasks, which enables us to experiment…
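The shared shape of these tasks can be sketched as a function that takes two sentences and returns a score, evaluated with a ranking metric. The minimal example below uses a bag-of-words cosine similarity as a stand-in for the simple IR baselines the abstract mentions, and mean reciprocal rank (MRR) as the metric. It is not the paper's framework or its API: every name and the toy data are assumptions made for illustration.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(s0, s1):
    """Cosine similarity between bag-of-words counts of two sentences."""
    a, b = Counter(tokenize(s0)), Counter(tokenize(s1))
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def mean_reciprocal_rank(queries, score=overlap_score):
    """queries: list of (s0, [(candidate, is_correct), ...]) tuples."""
    total = 0.0
    for s0, candidates in queries:
        # Rank candidates by descending score against s0.
        ranked = sorted(candidates, key=lambda c: score(s0, c[0]), reverse=True)
        for rank, (_, is_correct) in enumerate(ranked, start=1):
            if is_correct:
                total += 1.0 / rank  # reciprocal rank of the first correct candidate
                break
    return total / len(queries) if queries else 0.0


# Toy data (made up): one question ranked correctly, one not.
toy_queries = [
    ("who wrote hamlet", [("hamlet was written by shakespeare", True),
                          ("the weather is nice today", False)]),
    ("where is paris", [("paris is where i live", False),
                        ("the capital of france", True)]),
]
print(mean_reciprocal_rank(toy_queries))
```

Because the scorer is passed in as an argument, a neural model exposing the same two-sentence signature could be swapped in without changing the evaluation code.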
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
