APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning
Soumya Sanyal, Yichong Xu, Shuohang Wang, Ziyi Yang, Reid Pryzant, Wenhao Yu, Chenguang Zhu, Xiang Ren

TL;DR
APOLLO is a straightforward adaptively pretrained language model that enhances logical reasoning by focusing on reasoning-relevant text segments and using self-supervised tasks, achieving strong results on logical reasoning benchmarks.
Contribution
It introduces a simple, task-independent pretraining method that improves logical reasoning in language models without complex data processing.
Findings
APOLLO performs comparably to baselines on ReClor.
APOLLO outperforms baselines on LogiQA.
The training paradigm is simple and task-independent.
Abstract
Logical reasoning of text is an important ability that requires understanding the information present in the text, their interconnections, and then reasoning through them to infer new conclusions. Prior works on improving the logical reasoning ability of language models require complex processing of training data (e.g., aligning symbolic knowledge to text), yielding task-specific data augmentation solutions that restrict the learning of general logical reasoning skills. In this work, we propose APOLLO, an adaptively pretrained language model that has improved logical reasoning abilities. We select a subset of Wikipedia, based on a set of logical inference keywords, for continued pretraining of a language model. We use two self-supervised loss functions: a modified masked language modeling loss where only specific parts-of-speech words, that would likely require more reasoning than basic…
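The two data-side ideas named in the abstract, keyword-based sentence selection and masking restricted to certain parts of speech, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the keyword list, the toy part-of-speech lookup (standing in for a real tagger), the set of maskable tags, and the function names are hypothetical and are not taken from the paper. The second loss is not shown because its description is cut off in the abstract above.

```python
import random

# Hypothetical keyword list; the abstract does not enumerate the actual keywords.
LOGIC_KEYWORDS = {"therefore", "because", "hence", "thus", "consequently"}

# Toy part-of-speech lookup standing in for a real tagger (assumption).
TOY_POS = {
    "therefore": "ADV", "because": "SCONJ", "hence": "ADV",
    "rained": "VERB", "cancelled": "VERB", "wet": "ADJ",
    "is": "AUX", "was": "AUX",
    "the": "DET", "it": "PRON", "ground": "NOUN", "game": "NOUN",
}

# Hypothetical set of tags treated as reasoning-heavy and eligible for masking.
MASKABLE_POS = {"ADV", "SCONJ", "VERB", "ADJ"}


def tokenize(sentence):
    """Lowercase whitespace tokenizer that strips trailing punctuation."""
    tokens = (word.strip(".,;").lower() for word in sentence.split())
    return [tok for tok in tokens if tok]


def select_sentences(sentences, keywords=LOGIC_KEYWORDS):
    """Keep only sentences containing at least one logical-inference keyword."""
    return [s for s in sentences if keywords & set(tokenize(s))]


def selective_mask(tokens, mask_prob=0.5, rng=None, mask_token="[MASK]"):
    """Mask only tokens whose POS tag is in MASKABLE_POS.

    Returns (masked_tokens, labels); labels hold the original token at
    masked positions and None elsewhere, as in standard MLM targets.
    """
    rng = rng or random.Random(0)
    masked, labels = [], []
    for tok in tokens:
        if TOY_POS.get(tok) in MASKABLE_POS and rng.random() < mask_prob:
            masked.append(mask_token)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


corpus = ["It rained, therefore the ground is wet.", "The game was cancelled."]
selected = select_sentences(corpus)
masked, labels = selective_mask(tokenize(selected[0]), mask_prob=1.0)
```

With `mask_prob=1.0` every eligible token is masked, which makes the effect easy to inspect; in practice a lower probability and a real tagger would be used.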
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Balanced Selection · Adaptive Parameter-wise Diagonal Quasi-Newton Method
