SimpleQuestions Nearly Solved: A New Upperbound and Baseline Approach
Michael Petrochuk, Luke Zettlemoyer

TL;DR
This paper demonstrates that the SimpleQuestions benchmark is nearly solved by standard methods, with an upper bound of 83.4% accuracy due to inherent ambiguity, and introduces a new baseline achieving 78.1% accuracy.
Contribution
The paper provides new evidence that the SimpleQuestions dataset is nearly solved and introduces a strong baseline with state-of-the-art performance using standard methods.
Findings
Ambiguity limits performance to 83.4%.
New baseline achieves 78.1% accuracy.
The upper bound is loose: roughly a third of the remaining errors also cannot be resolved from the linguistic signal.
Abstract
The SimpleQuestions dataset is one of the most commonly used benchmarks for studying single-relation factoid questions. In this paper, we present new evidence that this benchmark can be nearly solved by standard methods. First, we show that ambiguity in the data bounds performance on this benchmark at 83.4%; there are often multiple answers that cannot be disambiguated from the linguistic signal alone. Second, we introduce a baseline that sets a new state-of-the-art performance level at 78.1% accuracy, despite using standard methods. Finally, we report an empirical analysis showing that the upper bound is loose; roughly a third of the remaining errors are also not resolvable from the linguistic signal. Together, these results suggest that the SimpleQuestions dataset is nearly solved.
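To make the idea of an ambiguity-induced upper bound concrete, here is a minimal sketch in Python. It is not the authors' procedure; it only illustrates the general principle under a simplifying assumption: each example is reduced to a (signal, gold label) pair, and any model that sees only the signal must give the same prediction for identical signals. Under that assumption, the best achievable accuracy is obtained by always predicting the most frequent gold label for each signal. The function name `ambiguity_upper_bound` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def ambiguity_upper_bound(examples):
    """Best accuracy attainable by any predictor that sees only the signal.

    `examples` is an iterable of (signal, gold_label) pairs. Examples that
    share a signal but differ in gold label are ambiguous: at most the
    majority label for that signal can be predicted correctly.
    """
    labels_by_signal = defaultdict(Counter)
    for signal, gold in examples:
        labels_by_signal[signal][gold] += 1

    total = sum(sum(counts.values()) for counts in labels_by_signal.values())
    if total == 0:
        raise ValueError("need at least one example")

    # For each distinct signal, only the majority label can be credited.
    best_correct = sum(max(counts.values()) for counts in labels_by_signal.values())
    return best_correct / total


# Toy data (made up for illustration): the same question text is annotated
# with two different relations, so no model can get all three right.
toy = [
    ("who wrote X", "book/author"),
    ("who wrote X", "film/writer"),
    ("who wrote X", "book/author"),
    ("where was Y born", "person/place_of_birth"),
]
bound = ambiguity_upper_bound(toy)
print(f"upper bound on accuracy: {bound:.2f}")
```

On the toy data, three of the four examples can be answered correctly at best, so the bound is 0.75. How examples are grouped into equivalent signals is the key design choice, and this sketch leaves it to the caller.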
