APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning

Soumya Sanyal; Yichong Xu; Shuohang Wang; Ziyi Yang; Reid Pryzant; Wenhao Yu; Chenguang Zhu; Xiang Ren

arXiv:2212.09282·cs.CL·June 6, 2023·1 cites

Open Access

TL;DR

APOLLO is an adaptively pretrained language model that improves logical reasoning by continuing pretraining on reasoning-relevant text with self-supervised objectives. It performs comparably to baselines on ReClor and outperforms them on LogiQA.

Contribution

It introduces a simple, task-independent pretraining method that improves logical reasoning in language models without complex data processing.

Findings

01

APOLLO performs comparably to baselines on ReClor.

02

APOLLO outperforms baselines on LogiQA.

03

The training paradigm is simple and task-independent.

Abstract

Logical reasoning of text is an important ability that requires understanding the information present in the text, their interconnections, and then reasoning through them to infer new conclusions. Prior works on improving the logical reasoning ability of language models require complex processing of training data (e.g., aligning symbolic knowledge to text), yielding task-specific data augmentation solutions that restrict the learning of general logical reasoning skills. In this work, we propose APOLLO, an adaptively pretrained language model that has improved logical reasoning abilities. We select a subset of Wikipedia, based on a set of logical inference keywords, for continued pretraining of a language model. We use two self-supervised loss functions: a modified masked language modeling loss where only specific parts-of-speech words, that would likely require more reasoning than basic…
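The two steps the abstract describes before it is cut off, keyword-based selection of text and masking restricted to certain parts of speech, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the keyword list, the set of maskable part-of-speech tags, the masking probability, and all function names are assumptions made for the example, and part-of-speech tags are supplied by hand rather than produced by a tagger.

```python
import random
import re

# Hypothetical keyword list; the list used in the paper is not given on this page.
LOGICAL_KEYWORDS = ("therefore", "because", "hence", "thus", "consequently")

# Hypothetical set of part-of-speech tags that are eligible for masking.
MASKABLE_POS = frozenset({"ADJ", "ADV", "CONJ", "VERB"})


def contains_logical_keyword(sentence, keywords=LOGICAL_KEYWORDS):
    """Return True if any keyword appears as a whole word in the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return any(keyword in words for keyword in keywords)


def select_sentences(sentences, keywords=LOGICAL_KEYWORDS):
    """Keep only the sentences that contain at least one logical keyword."""
    return [s for s in sentences if contains_logical_keyword(s, keywords)]


def selective_mask(tagged_tokens, mask_prob=0.15, mask_token="[MASK]",
                   maskable=MASKABLE_POS, rng=None):
    """Mask only tokens whose POS tag is in `maskable`.

    `tagged_tokens` is a list of (token, pos_tag) pairs. Returns the masked
    input tokens and a parallel label list that holds the original token at
    masked positions and None everywhere else.
    """
    rng = rng or random.Random(0)
    inputs, labels = [], []
    for token, pos in tagged_tokens:
        if pos in maskable and rng.random() < mask_prob:
            inputs.append(mask_token)
            labels.append(token)
        else:
            inputs.append(token)
            labels.append(None)
    return inputs, labels
```

Matching keywords against whole words rather than substrings avoids false hits such as "thus" inside "enthusiasm". In a real pipeline the hand-written tags would come from a part-of-speech tagger and the masked positions would feed a standard masked language modeling loss.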

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Balanced Selection · Adaptive Parameter-wise Diagonal Quasi-Newton Method