How Optimal is Greedy Decoding for Extractive Question Answering?

Or Castel; Ori Ram; Avia Efrat; Omer Levy

arXiv:2108.05857·cs.CL·November 9, 2022·1 cites

Open Access · 1 Repo · 3 Models

TL;DR

This paper evaluates greedy decoding for extractive question answering and shows that, after training on only a few examples, greedy decoding closely approximates an algorithm that finds the most probable answer span, especially when the model is biased towards extractive answers.

Contribution

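The study introduces exact-extract, a decoding algorithm that efficiently finds the most probable answer span in the context, and compares it with greedy decoding on T5 in zero-shot and few-shot settings, showing that the gap between the two narrows quickly as training examples are added.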

Findings

01. Exact-extract outperforms greedy decoding in zero-shot settings.

02. Greedy decoding improves rapidly with few training examples.

03. Self-supervised training biases models towards extractive answers.

Abstract

Fine-tuned language models use greedy decoding to answer reading comprehension questions with relative success. However, this approach does not ensure that the answer is a span in the given passage, nor does it guarantee that it is the most probable one. Does greedy decoding actually perform worse than an algorithm that does adhere to these properties? To study the performance and optimality of greedy decoding, we present exact-extract, a decoding algorithm that efficiently finds the most probable answer span in the context. We compare the performance of T5 with both decoding algorithms on zero-shot and few-shot extractive question answering. When no training examples are available, exact-extract significantly outperforms greedy decoding. However, greedy decoding quickly converges towards the performance of exact-extract with the introduction of a few training examples, becoming more…
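
The two failure modes the abstract attributes to greedy decoding can be illustrated with a toy example. The sketch below is not the authors' exact-extract algorithm, which the abstract describes as efficient; it is a naive brute-force search that scores every span of the context and keeps the most probable one. The names `toy_model`, `greedy_decode`, and `best_span_brute_force`, and all probabilities in the table, are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

EOS = "<eos>"
# Assumption: a "model" is any function mapping a generated prefix to a
# next-token distribution. A real system would query T5 here instead.
NextTokenFn = Callable[[Tuple[str, ...]], Dict[str, float]]


def sequence_log_prob(next_token_probs: NextTokenFn, tokens: Sequence[str]) -> float:
    """Log-probability of generating `tokens` and then EOS."""
    total = 0.0
    prefix: Tuple[str, ...] = ()
    for tok in list(tokens) + [EOS]:
        p = next_token_probs(prefix).get(tok, 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
        prefix += (tok,)
    return total


def greedy_decode(next_token_probs: NextTokenFn, max_len: int = 10) -> List[str]:
    """Pick the single most probable token at every step until EOS."""
    prefix: Tuple[str, ...] = ()
    for _ in range(max_len):
        dist = next_token_probs(prefix)
        tok = max(dist, key=dist.get)
        if tok == EOS:
            break
        prefix += (tok,)
    return list(prefix)


def best_span_brute_force(
    next_token_probs: NextTokenFn, context: Sequence[str], max_len: int = 10
) -> Tuple[List[str], float]:
    """Score every contiguous span of `context`; return the most probable one.

    Naive enumeration over all spans, used here only to show the objective.
    """
    best: List[str] = []
    best_lp = float("-inf")
    for i in range(len(context)):
        for j in range(i + 1, min(len(context), i + max_len) + 1):
            span = list(context[i:j])
            lp = sequence_log_prob(next_token_probs, span)
            if lp > best_lp:
                best, best_lp = span, lp
    return best, best_lp


CONTEXT = ["the", "cat", "sat", "on", "the", "mat"]

# Hypothetical next-token table; unseen prefixes always emit EOS.
_TABLE: Dict[Tuple[str, ...], Dict[str, float]] = {
    (): {"a": 0.4, "the": 0.35, "mat": 0.25},
    ("a",): {"mat": 0.9, EOS: 0.1},
    ("the",): {"mat": 0.6, "cat": 0.3, EOS: 0.1},
}


def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    return _TABLE.get(prefix, {EOS: 1.0})


print(greedy_decode(toy_model))                      # ['a', 'mat']
print(best_span_brute_force(toy_model, CONTEXT)[0])  # ['mat']
```

In this toy setup greedy decoding produces "a mat", which never appears in the context, while the span search is restricted to substrings of the context and returns "mat". Note also that greedy decoding follows the locally best first token, so even among context spans it would not be guaranteed to find the highest-scoring one.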

Code & Models

Repositories

ocastel/exact-extract
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Inverse Square Root Schedule · SentencePiece · Adafactor · Dropout · Softmax