Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

Pan Lu; Swaroop Mishra; Tony Xia; Liang Qiu; Kai-Wei Chang; Song-Chun Zhu; Oyvind Tafjord; Peter Clark; Ashwin Kalyan

arXiv:2209.09513·cs.CL·October 18, 2022·214 cites

TL;DR

This paper introduces ScienceQA, a large multimodal science question benchmark annotated with explanations, and demonstrates that chain-of-thought reasoning improves language model performance on science questions, with the largest gains when explanations are supplied in the input.

Contribution

The paper presents ScienceQA, a new multimodal science question benchmark with annotations and explanations, and shows how chain-of-thought reasoning enhances model performance.

Findings

01

CoT improves GPT-3 few-shot accuracy by 1.20%.

02

Feeding explanations boosts GPT-3 performance by 18.96%.

03

Models benefit from explanations, achieving similar results with less data.

Abstract

When answering a question, humans utilize the information available across different modalities to synthesize a consistent and complete chain of thought (CoT). This process is normally a black box in the case of deep learning models like large-scale language models. Recently, science question benchmarks have been used to diagnose the multi-hop reasoning ability and interpretability of an AI system. However, existing datasets fail to provide annotations for the answers, or are restricted to the textual-only modality, small scales, and limited domain diversity. To this end, we present Science Question Answering (ScienceQA), a new benchmark that consists of ~21k multimodal multiple choice questions with a diverse set of science topics and annotations of their answers with corresponding lectures and explanations. We further design language models to learn to generate lectures and…
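As a rough illustration of the annotation structure the abstract describes (a multiple-choice question paired with a lecture and an explanation), the sketch below models one record and renders it as a chain-of-thought style text example. The class `ScienceQuestion`, its field names, the helper `format_cot_example`, and the prompt layout are all assumptions made for illustration; they are not the schema or prompt format of the official dataset or repository, and the sample question is invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScienceQuestion:
    """Hypothetical record layout; field names are illustrative only."""
    question: str
    choices: List[str]
    answer_index: int                  # index into `choices`
    lecture: str                       # general background knowledge
    explanation: str                   # reasoning specific to this question
    image_path: Optional[str] = None   # optional image context (unused here)


def format_cot_example(q: ScienceQuestion, include_answer: bool = True) -> str:
    """Render a record as text: question, lettered options, then the answer
    followed by lecture and explanation (an assumed layout)."""
    options = " ".join(
        f"({chr(ord('A') + i)}) {choice}" for i, choice in enumerate(q.choices)
    )
    lines = [f"Question: {q.question}", f"Options: {options}"]
    if include_answer:
        letter = chr(ord("A") + q.answer_index)
        lines.append(
            f"Answer: The answer is {letter}. BECAUSE: {q.lecture} {q.explanation}"
        )
    else:
        # Leave the answer open so a model can complete it.
        lines.append("Answer:")
    return "\n".join(lines)


# Invented toy record, not taken from the dataset.
sample = ScienceQuestion(
    question="Which is a solid?",
    choices=["water", "rock"],
    answer_index=1,
    lecture="Solids keep their shape.",
    explanation="A rock keeps its shape.",
)
print(format_cot_example(sample))
```

In a few-shot setting, several records rendered with `include_answer=True` could be concatenated ahead of a test record rendered with `include_answer=False`, so that the model is asked to produce both the answer and its supporting text.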

Code & Models

Repositories

lupantech/ScienceQA (PyTorch, official)

Videos

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering · SlidesLive

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques

Methods: Layer Normalization · Multi-Head Attention · Cosine Annealing · Linear Warmup With Cosine Annealing · Adam · Softmax · Dropout · Residual Connection · GPT-3