Language Models of Code are Few-Shot Commonsense Learners

Aman Madaan; Shuyan Zhou; Uri Alon; Yiming Yang; Graham Neubig

arXiv:2210.07128 · cs.CL · December 7, 2022 · 5 cites

TL;DR

This paper demonstrates that pre-trained code generation language models excel at structured commonsense reasoning tasks when framed as code generation, outperforming natural language models in few-shot settings.

Contribution

The paper frames structured commonsense reasoning as a code generation task, so that LMs pre-trained on code can generate the output graph, and reports better few-shot performance than LMs of natural language.

Findings

01. Code LMs outperform natural language LMs in structured reasoning tasks.

02. Framing reasoning tasks as code generation improves few-shot learning performance.

03. Pre-trained code models generalize well to non-code reasoning tasks.

Abstract

We address the general task of structured commonsense reasoning: given a natural language input, the goal is to generate a graph such as an event graph or a reasoning graph. To employ large language models (LMs) for this task, existing approaches "serialize" the output graph as a flat list of nodes and edges. Although feasible, these serialized graphs strongly deviate from the natural language corpora that LMs were pre-trained on, hindering LMs from generating them correctly. In this paper, we show that when we instead frame structured commonsense reasoning tasks as code generation tasks, pre-trained LMs of code are better structured commonsense reasoners than LMs of natural language, even when the downstream task does not involve source code at all. We demonstrate our approach across three diverse structured commonsense reasoning tasks. In all these natural language tasks, we show that…
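
To make the contrast in the abstract concrete, the sketch below renders one small graph in two ways: as a flat list of nodes and edges, and as Python source text that a code LM could be prompted to complete. This is an illustration only, not the paper's actual prompt format. The function names, the `Plan` class, the `Node` constructor, and the `children` attribute are all hypothetical names chosen for this example.

```python
import ast
from typing import Dict, List, Tuple


def serialize_flat(nodes: Dict[str, str], edges: List[Tuple[str, str]]) -> str:
    """Serialize a graph as a flat list of nodes and edges (illustrative format)."""
    node_part = "; ".join(f"{node_id}: {text}" for node_id, text in nodes.items())
    edge_part = "; ".join(f"{src} -> {dst}" for src, dst in edges)
    return f"nodes: {node_part} | edges: {edge_part}"


def serialize_as_code(goal: str, nodes: Dict[str, str], edges: List[Tuple[str, str]]) -> str:
    """Render the same graph as Python source text.

    The class layout is an assumption made for this sketch; `Node` is never
    defined here because the result is only used as prompt text, not executed.
    """
    lines = [
        "class Plan:",
        f"    goal = {goal!r}",
        "",
        "    def __init__(self):",
    ]
    # One attribute per node, holding the node's natural language text.
    for node_id, text in nodes.items():
        lines.append(f"        self.{node_id} = Node({text!r})")
    # One statement per edge, linking the source node to the destination node.
    for src, dst in edges:
        lines.append(f"        self.{src}.children.append(self.{dst})")
    return "\n".join(lines)


def is_valid_python(source: str) -> bool:
    """Check that the rendered text parses as Python without running it."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    nodes = {"step0": "find a recipe", "step1": "gather ingredients"}
    edges = [("step0", "step1")]
    print(serialize_flat(nodes, edges))
    print(serialize_as_code("bake a cake", nodes, edges))
```

Both functions encode the same nodes and edges; only the surface form differs. The second form reads like ordinary source code, which is the kind of text a code LM was pre-trained on.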

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Multi-Head Attention · Softmax · Linear Warmup With Cosine Annealing · Attention Dropout