Finding the Answers with Definition Models

Jack Parry

arXiv:1809.00224 · cs.AI · September 5, 2018

Open Access

TL;DR

This paper enhances a neural definition model for answering crossword questions by applying bidirectional LSTMs, averaging states, expanding training data, and using sub-word units, leading to improved performance over previous models.

Contribution

It introduces specific extensions to the neural definition model, including bidirectional LSTMs and sub-word segmentation, to improve crossword question answering accuracy.

Findings

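01 The encoder extensions (averaging LSTM states and using a bidirectional LSTM) improve model performance on the definitions and crossword test sets.

02 Increasing the training data improves results on crossword questions.

03 Sub-word segmentation improves model accuracy.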
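
To make the idea of sub-word units concrete, here is a minimal sketch of byte-pair encoding (BPE), one common way to split words into sub-word units. This page does not say which segmentation method the dissertation uses, so BPE is an assumption, and the function names (`learn_bpe`, `segment`) and the toy word counts are hypothetical.

```python
import re
from collections import Counter

def pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    counts = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            counts[(a, b)] += freq
    return counts

def merge_pair(pair, vocab):
    """Replace every occurrence of the symbol pair with the merged symbol."""
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

def learn_bpe(word_freqs, num_merges):
    """Learn an ordered list of merges from a word-frequency table."""
    # Each word starts as space-separated characters plus an end-of-word marker.
    vocab = {" ".join(word) + " </w>": freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        counts = pair_counts(vocab)
        if not counts:
            break
        best = max(counts, key=counts.get)  # most frequent pair
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges

def segment(word, merges):
    """Apply the learned merges, in order, to a (possibly unseen) word."""
    symbols = list(word) + ["</w>"]
    for a, b in merges:
        i = 0
        while i < len(symbols) - 1:
            if symbols[i] == a and symbols[i + 1] == b:
                symbols[i:i + 2] = [a + b]
            else:
                i += 1
    return symbols

# Toy word counts (illustrative only, not taken from the paper).
word_freqs = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges = learn_bpe(word_freqs, num_merges=10)
print(segment("lowest", merges))
```

A word that never appeared in the training table, such as "lowest" here, is still split into units the model has seen, which is the usual motivation for sub-word vocabularies.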

Abstract

Inspired by a previous attempt to answer crossword questions using neural networks (Hill, Cho, Korhonen, & Bengio, 2015), this dissertation implements extensions to improve the performance of this existing definition model on the task of answering crossword questions. A discussion and evaluation of the original implementation identifies several ways in which the recurrent neural model could be extended. Insights from the related fields of neural language modeling and neural machine translation provide the justification and means required for these extensions. Two extensions are applied to the LSTM encoder: first, taking the average of LSTM states across the sequence, and second, using a bidirectional LSTM. Both serve to improve model performance on a definitions and crossword test set. In order to improve performance on crossword questions, the training data is increased…
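
The two encoder extensions named in the abstract can be sketched in a few lines. The following is a minimal pure-Python illustration, not the author's implementation: it runs one LSTM forward and one backward over a sequence, concatenates their states at each step, and averages across the sequence. The per-step concatenation, the weight initialisation, the sizes, and all function names are assumptions made for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_lstm(input_size, hidden_size, rng):
    """Random weights for the four gates: input, forget, output, candidate."""
    cols = input_size + hidden_size
    return {
        gate: {
            "W": [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                  for _ in range(hidden_size)],
            "b": [0.0] * hidden_size,
        }
        for gate in ("i", "f", "o", "g")
    }

def lstm_states(params, inputs, hidden_size):
    """Run an LSTM over `inputs` and return the hidden state at every step."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    states = []
    for x in inputs:
        xh = x + h  # concatenate the input with the previous hidden state
        pre = {
            gate: [sum(w * v for w, v in zip(row, xh)) + b
                   for row, b in zip(p["W"], p["b"])]
            for gate, p in params.items()
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        states.append(h)
    return states

def mean_pool(states):
    """Average the hidden states across the sequence, dimension by dimension."""
    n = len(states)
    return [sum(col) / n for col in zip(*states)]

def encode_bidirectional_mean(fwd, bwd, inputs, hidden_size):
    """Concatenate forward and backward states per step, then average them."""
    forward = lstm_states(fwd, inputs, hidden_size)
    # Run the second LSTM right to left, then restore left-to-right order.
    backward = lstm_states(bwd, inputs[::-1], hidden_size)[::-1]
    combined = [f + b for f, b in zip(forward, backward)]
    return mean_pool(combined)

rng = random.Random(0)
INPUT_SIZE, HIDDEN_SIZE = 3, 4
fwd = init_lstm(INPUT_SIZE, HIDDEN_SIZE, rng)
bwd = init_lstm(INPUT_SIZE, HIDDEN_SIZE, rng)
# Five hypothetical word vectors standing in for an embedded clue.
clue = [[rng.uniform(-1, 1) for _ in range(INPUT_SIZE)] for _ in range(5)]
encoding = encode_bidirectional_mean(fwd, bwd, clue, HIDDEN_SIZE)
print(len(encoding))
```

Compared with using only the final hidden state, averaging lets every position in the clue contribute directly to the encoding, and the backward pass gives each position context from the words that follow it.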

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory