Advancing Transformer's Capabilities in Commonsense Reasoning

Yu Zhou; Yunqiu Han; Hanyu Zhou; Yulun Wu

arXiv:2310.06803·cs.CL·October 11, 2023

TL;DR

This paper enhances pre-trained language models for commonsense reasoning by integrating ML techniques such as knowledge transfer, model ensembling, and a pairwise contrastive objective, improving on the strongest previous work by ~15% absolute in Pairwise Accuracy and ~8.7% absolute in Standard Accuracy.

Contribution

Introduces a systematic evaluation of ML-based methods to improve commonsense reasoning in pre-trained models, surpassing previous state-of-the-art results.

Findings

01  ~15% absolute gain in Pairwise Accuracy

02  ~8.7% absolute gain in Standard Accuracy

03  Effective combination of knowledge transfer, ensemble, and contrastive objectives

Abstract

Recent advances in general-purpose pre-trained language models have shown great potential in commonsense reasoning. However, existing approaches still perform poorly on standard commonsense reasoning benchmarks, including the Com2Sense dataset. We argue that this is due to a disconnect with current cutting-edge machine learning methods. In this work, we aim to bridge the gap by introducing current ML-based methods to improve general-purpose pre-trained language models on the task of commonsense reasoning. Specifically, we experiment with and systematically evaluate methods including knowledge transfer, model ensembling, and an additional pairwise contrastive objective. Our best model outperforms the strongest previous work by ~15% absolute in Pairwise Accuracy and ~8.7% absolute in Standard Accuracy.
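The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes the usual Com2Sense-style setup, in which statements come in complementary pairs and a pair counts as correct only when both of its statements are classified correctly. The function names and the convention that the two members of a pair sit at adjacent indices are assumptions made for illustration, not taken from the paper or its repository.

```python
from typing import Sequence


def standard_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of individual statements classified correctly."""
    if len(preds) != len(labels) or not labels:
        raise ValueError("preds and labels must be non-empty and equal in length")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def pairwise_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of pairs in which BOTH statements are classified correctly.

    Assumption (hypothetical layout): the two statements of pair k are
    stored at indices 2k and 2k + 1.
    """
    if len(preds) != len(labels) or not labels or len(labels) % 2 != 0:
        raise ValueError("need a non-empty, even number of aligned predictions")
    n_pairs = len(labels) // 2
    correct_pairs = sum(
        preds[i] == labels[i] and preds[i + 1] == labels[i + 1]
        for i in range(0, len(labels), 2)
    )
    return correct_pairs / n_pairs


# Three complementary pairs; the second pair has one wrong prediction.
labels = [1, 0, 1, 0, 0, 1]
preds = [1, 0, 1, 1, 0, 1]
std_acc = standard_accuracy(preds, labels)   # 5 of 6 statements correct
pair_acc = pairwise_accuracy(preds, labels)  # 2 of 3 pairs fully correct
```

Under this definition, pairwise accuracy can never exceed standard accuracy, since a single wrong statement invalidates its whole pair.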

Code & Models

Repositories

bryanzhou008/advancing-commonsense-reasoning (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies