Answering Unseen Questions With Smaller Language Models Using Rationale Generation and Dense Retrieval

Tim Hartill; Diana Benavides-Prado; Michael Witbrock; Patricia J. Riddle

arXiv:2308.04711 · cs.CL · October 16, 2023 · 1 citation

TL;DR

This paper enhances small language models' reasoning on unseen questions by combining rationales from larger models with dense retrieval, significantly improving accuracy across multiple datasets.

Contribution

Evaluates two methods for improving smaller models' reasoning on unseen questions: Rationale Ranking ($RR$), which scores generated rationales and retrieved contexts before combining them, and training on retrieval-augmented training datasets ($RATD$).

Findings

01 Significant accuracy improvements on multiple datasets.

02 Outperforms larger models in few-shot reasoning tasks.

03 Effective combination of rationales and retrieval enhances reasoning.

Abstract

When provided with sufficient explanatory context, smaller Language Models have been shown to exhibit strong reasoning ability on challenging short-answer question-answering tasks where the questions are unseen in training. We evaluate two methods for further improvement in this setting. Both methods focus on combining rationales generated by a larger Language Model with longer contexts created from a multi-hop dense retrieval system. The first method ($RR$) involves training a Rationale Ranking model to score both generated rationales and retrieved contexts with respect to relevance and truthfulness. We then use the scores to derive combined contexts from both knowledge sources using a number of combinatory strategies. For the second method ($RATD$) we utilise retrieval-augmented training datasets developed by Hartill et al. (2023) to train a smaller Reasoning model…
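
The score-then-combine step described for $RR$ can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the `ScoredContext` type, the `combine_contexts` function, the 0.5 threshold, and the single threshold-with-fallback strategy are hypothetical and are not taken from the paper, which mentions "a number of combinatory strategies" without detailing them here. The scores stand in for the output of a Rationale Ranking model.

```python
from dataclasses import dataclass


@dataclass
class ScoredContext:
    """A piece of context plus a hypothetical RR score in [0, 1]."""
    text: str
    score: float


def combine_contexts(rationale: ScoredContext,
                     retrieved: ScoredContext,
                     threshold: float = 0.5) -> str:
    """One hypothetical combinatory strategy (not the paper's).

    Keep every source whose score meets the threshold. If neither
    does, fall back to the single higher-scoring source so the
    Reasoning model always receives some context.
    """
    sources = (rationale, retrieved)
    kept = [c for c in sources if c.score >= threshold]
    if not kept:
        # max() returns the first maximal element, so ties favour the rationale.
        kept = [max(sources, key=lambda c: c.score)]
    return " ".join(c.text for c in kept)
```

With this sketch, a highly scored rationale and a poorly scored retrieved context yield the rationale alone, while two sources above the threshold are concatenated with the rationale first.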

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
