What's the best place for an AI conference, Vancouver or ______: Why completing comparative questions is difficult

Avishai Zagoury; Einat Minkov; Idan Szpektor; William W. Cohen

arXiv:2104.01940 · cs.CL · April 6, 2021

TL;DR

This paper investigates how well neural language models complete comparative questions. The trained models turn out to be domain-specific and to rely on co-occurrence patterns rather than on an understanding of semantic comparability, which highlights how hard it is to assess the world knowledge such models hold.

Contribution

The study provides a detailed analysis of LMs' performance on comparative question completion, showing their limitations in modeling semantic similarity and broad reasoning.

Findings

01

Trained models achieve nearly human-level performance in completing comparative questions in three subdomains.

02

Performance correlates with entity co-occurrence in training data.

03

Models lack a general understanding of semantic comparability.
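
Finding 02 refers to how often entities appear together in training text. A minimal sketch of one way to count this, assuming a toy corpus and simple case-insensitive substring matching; the function name, the sentences, and the entity list are all hypothetical, and the paper's own co-occurrence measure is not given on this page:

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences, entities):
    """Count, for each unordered entity pair, how many sentences mention both.

    Matching is a naive case-insensitive substring test (an assumption made
    for illustration; a real pipeline would use entity linking).
    """
    counts = Counter()
    for sentence in sentences:
        lowered = sentence.lower()
        # Sort so each unordered pair always maps to the same tuple key.
        present = sorted(e for e in entities if e.lower() in lowered)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data, not taken from the paper.
toy_sentences = [
    "Which country is older, India or China?",
    "Is Vancouver or Montreal better for a conference?",
    "India and China share a long border.",
]
toy_entities = ["India", "China", "Vancouver", "Montreal"]
counts = cooccurrence_counts(toy_sentences, toy_entities)
```

Pairs that never share a sentence simply have a count of zero, so the resulting table can be compared directly against per-pair completion accuracy.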

Abstract

Although large neural language models (LMs) like BERT can be finetuned to yield state-of-the-art results on many NLP tasks, it is often unclear what these models actually learn. Here we study using such LMs to fill in entities in human-authored comparative questions, like "Which country is older, India or ______?" -- i.e., we study the ability of neural LMs to ask (not answer) reasonable questions. We show that accuracy in this fill-in-the-blank task is well-correlated with human judgements of whether a question is reasonable, and that these models can be trained to achieve nearly human-level performance in completing comparative questions in three different subdomains. However, analysis shows that what they learn fails to model any sort of broad notion of which entities are semantically comparable or similar -- instead the trained models are very domain-specific, and performance is…
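
The fill-in-the-blank setup described in the abstract can be illustrated with a small helper that blanks out one entity in a comparative question. This is a sketch only: the function name `mask_entity`, the example entity, and the use of BERT's `[MASK]` token as the placeholder are assumptions made for illustration, not the authors' preprocessing.

```python
from typing import Tuple


def mask_entity(question: str, entity: str, mask_token: str = "[MASK]") -> Tuple[str, str]:
    """Replace the last occurrence of `entity` with `mask_token`.

    Returns the masked question and the entity that was removed, which
    serves as the target a masked LM would be asked to predict.
    """
    idx = question.rfind(entity)
    if idx == -1:
        raise ValueError(f"entity {entity!r} not found in question")
    masked = question[:idx] + mask_token + question[idx + len(entity):]
    return masked, entity


# Hypothetical example; the second entity is invented for illustration.
masked_question, target = mask_entity("Which country is older, India or China?", "China")
```

Masking the last occurrence keeps the first entity visible as context, which mirrors the "India or ______" shape of the example in the abstract.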

Videos

What's the Best Place for an AI Conference, Vancouver or _______: Why Completing Comparative Questions Is Difficult (Underline)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Linear Layer · Linear Warmup With Linear Decay · Residual Connection · Layer Normalization · Adam · Multi-Head Attention · Attention Dropout · Dense Connections · Softmax · Dropout