A Simple Method for Commonsense Reasoning
Trieu H. Trinh, Quoc V. Le

TL;DR
This paper introduces a simple, unsupervised neural network approach using language models trained on large unlabeled datasets to improve commonsense reasoning, outperforming previous methods on challenging benchmarks without relying on annotated knowledge bases.
Contribution
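The paper presents an unsupervised method that uses language models, trained on large amounts of unlabeled text, to score the answer choices in commonsense reasoning tests. It reports large gains over previous methods on the Pronoun Disambiguation and Winograd Schema challenges.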
Findings
Outperforms previous state-of-the-art on Winograd Schema and Pronoun Disambiguation tasks.
Shows that diverse training data improves model performance.
The system effectively captures the contextual features that are important for reasoning.
Abstract
Commonsense reasoning is a long-standing challenge for deep learning. For example, it is difficult to use neural networks to tackle the Winograd Schema dataset (Levesque et al., 2011). In this paper, we present a simple method for commonsense reasoning with neural networks, using unsupervised learning. Key to our method is the use of language models, trained on a massive amount of unlabeled data, to score multiple choice questions posed by commonsense reasoning tests. On both Pronoun Disambiguation and Winograd Schema challenges, our models outperform previous state-of-the-art methods by a large margin, without using expensive annotated knowledge bases or hand-engineered features. We train an array of large RNN language models that operate at word or character level on LM-1-Billion, CommonCrawl, SQuAD, Gutenberg Books, and a customized corpus for this task and show that diversity of…
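The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each answer choice is substituted for the ambiguous pronoun and that the resulting sentence is scored by a language model, and it swaps the paper's large RNN language models for a tiny add-one-smoothed bigram model so the example stays self-contained. The `[PRONOUN]` placeholder, the toy corpus, and all function names are hypothetical.

```python
import math
from collections import Counter


def train_bigram_lm(corpus):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed log-probability of a sentence under the bigram model."""
    tokens = ["<s>"] + sentence.lower().split()
    vocab_size = len(unigrams) + 1  # +1 reserves mass for unseen words
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


def resolve(template, candidates, unigrams, bigrams):
    """Substitute each candidate into the pronoun slot; return the best one and all scores."""
    scores = {
        c: log_prob(template.replace("[PRONOUN]", c), unigrams, bigrams)
        for c in candidates
    }
    return max(scores, key=scores.get), scores


# Toy stand-in for the large unlabeled corpora named in the abstract (assumption).
corpus = [
    "the councilmen feared unrest",
    "the demonstrators advocated violence",
    "the councilmen refused a permit",
]
unigrams, bigrams = train_bigram_lm(corpus)

template = (
    "the councilmen refused the demonstrators a permit "
    "because [PRONOUN] feared violence"
)
best, scores = resolve(template, ["the councilmen", "the demonstrators"], unigrams, bigrams)
print(best, scores)
```

A bigram model only sees adjacent words, so this toy works only because the deciding word directly follows the substituted candidate. Real Winograd schemas usually place the deciding word further away, which is one reason a model with longer context, such as the RNN language models the abstract describes, is needed in practice.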
Code & Models
- 🤗 FacebookAI/roberta-base (model) · 14.6M dl · ♡ 578
- 🤗 FacebookAI/roberta-large (model) · 20.7M dl · ♡ 270
- 🤗 FacebookAI/roberta-large-mnli (model) · 295k dl · ♡ 210
- 🤗 facebook/data2vec-text-base (model) · 1.4k dl · ♡ 12
- 🤗 Giyaseddin/distilroberta-base-finetuned-short-answer-assessment (model) · 5 dl · ♡ 1
- 🤗 bhadi26/hadi-rebecca-test-model-public (model)
- 🤗 model-attribution-challenge/roberta-base (model) · 13 dl
- 🤗 GuenterBlaeser/roberta-base_imdb (model)
- 🤗 royleibov/roberta-base-ZipNN-Compressed (model) · 2 dl
- 🤗 Vivekbala/vivek_test (model) · 1 dl · ♡ 1
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
