Teaching Smaller Language Models To Generalise To Unseen Compositional Questions

Tim Hartill; Neset Tan; Michael Witbrock; Patricia J. Riddle

arXiv:2308.00946 · cs.CL · August 22, 2023

TL;DR

This paper demonstrates that smaller language models can generalize to unseen compositional questions by combining multitask supervised pretraining with retrieval-augmented training, improving reasoning capabilities without relying solely on large models.

Contribution

It introduces a method for enhancing small model generalization to unseen questions through multitask pretraining and retrieval-augmented datasets, addressing a less explored area in zero-shot reasoning.

Findings

01  Performance improved with retrieval-augmented training datasets.

02  Strong baselines established across multiple diverse datasets.

03  Retrieval-based training enhances reasoning abilities in smaller models.

Abstract

We equip a smaller Language Model to generalise to answering challenging compositional questions that have not been seen in training. To do so we propose a combination of multitask supervised pretraining on up to 93 tasks designed to instill diverse reasoning abilities, and a dense retrieval system that aims to retrieve a set of evidential paragraph fragments. Recent progress in question-answering has been achieved either through prompting methods against very large pretrained Language Models in zero or few-shot fashion, or by fine-tuning smaller models, sometimes in conjunction with information retrieval. We focus on the less explored question of the extent to which zero-shot generalisation can be enabled in smaller models with retrieval against a corpus within which sufficient information to answer a particular question may not exist. We establish strong baselines in this setting for…
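
The retrieval step mentioned in the abstract can be pictured with a small sketch. This is not the authors' retriever: the bag-of-words `embed` function below is a toy stand-in for a learned dense encoder, and the names `embed`, `cosine`, `retrieve`, `build_input`, `FRAGMENTS`, and `QUESTION` are all hypothetical. It only illustrates the general pattern of scoring paragraph fragments against a question and passing the top-ranked ones to a reader model as context.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a learned encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, fragments, k=2):
    """Return the k fragments most similar to the question, best first."""
    q = embed(question)
    ranked = sorted(fragments, key=lambda f: cosine(q, embed(f)), reverse=True)
    return ranked[:k]


def build_input(question, fragments, k=2):
    """Join the question and its retrieved context into one reader input."""
    return question + "\n" + " ".join(retrieve(question, fragments, k))


# Illustrative corpus and question (made up for this sketch).
FRAGMENTS = [
    "The Eiffel Tower is located in Paris.",
    "Paris is the capital of France.",
    "Mount Fuji is a volcano in Japan.",
]
QUESTION = "In which country is the Eiffel Tower located?"

if __name__ == "__main__":
    print(build_input(QUESTION, FRAGMENTS, k=2))
```

In the setting the abstract describes, the corpus may not contain enough information to answer a given question, so a reader model trained this way would have to cope with partial or irrelevant context as well as with complete evidence.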

Code & Models

Repositories

timhartill/unseen_questions (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Focus