Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, Ashwin Kalyan

TL;DR
This paper introduces ScienceQA, a large multimodal science question benchmark with explanations, and demonstrates that chain-of-thought reasoning improves language model performance on science questions, especially with explanations.
Contribution
The paper presents ScienceQA, a new multimodal science question benchmark with annotations and explanations, and shows how chain-of-thought reasoning enhances model performance.
Findings
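Chain-of-thought (CoT) prompting improves few-shot GPT-3 accuracy by 1.20%.
Feeding explanations in the input, an upper-bound setting, boosts few-shot GPT-3 performance by 18.96%.
Like humans, language models benefit from explanations: they learn from less data, reaching the same performance with just 40% of the training data.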
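To make the chain-of-thought setup in the findings concrete, here is a minimal sketch of how a few-shot prompt could be assembled so that each demonstration shows the answer followed by a lecture and an explanation. The field names, the `ScienceExample` class, the helper functions, and the exact prompt wording are assumptions for illustration, not the authors' released code, and the sample question is invented rather than taken from ScienceQA.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScienceExample:
    """Hypothetical container for one multiple-choice science question."""
    question: str
    options: List[str]
    context: str = ""
    answer_index: int = -1   # index into options; -1 means unknown (test question)
    lecture: str = ""
    explanation: str = ""


def format_options(options: List[str]) -> str:
    """Render options as '(A) ... (B) ...'."""
    return " ".join(f"({chr(ord('A') + i)}) {opt}" for i, opt in enumerate(options))


def format_example(ex: ScienceExample, with_answer: bool) -> str:
    """Format one question; demonstrations also carry answer, lecture, explanation."""
    lines = [
        f"Question: {ex.question}",
        f"Context: {ex.context or 'N/A'}",
        f"Options: {format_options(ex.options)}",
    ]
    if with_answer:
        letter = chr(ord("A") + ex.answer_index)
        rationale = " ".join(part for part in (ex.lecture, ex.explanation) if part)
        lines.append(f"Answer: The answer is {letter}. BECAUSE: {rationale}")
    else:
        # Leave the answer open so the model generates answer plus rationale.
        lines.append("Answer:")
    return "\n".join(lines)


def build_cot_prompt(demos: List[ScienceExample], test: ScienceExample) -> str:
    """Concatenate solved demonstrations, then the unsolved test question."""
    blocks = [format_example(d, with_answer=True) for d in demos]
    blocks.append(format_example(test, with_answer=False))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    demo = ScienceExample(
        question="Which of these is a solid at room temperature?",
        options=["water vapor", "an ice-free iron nail"],
        answer_index=1,
        lecture="Solids keep their own shape and volume.",
        explanation="An iron nail keeps its shape, so it is a solid.",
    )
    query = ScienceExample(
        question="Which of these is a liquid at room temperature?",
        options=["milk", "a brick", "helium"],
    )
    print(build_cot_prompt([demo], query))
```

The resulting string would be sent to a language model as-is; dropping the lecture and explanation from the demonstrations gives the plain few-shot baseline that the CoT variant is compared against.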
Abstract
When answering a question, humans utilize the information available across different modalities to synthesize a consistent and complete chain of thought (CoT). This process is normally a black box in the case of deep learning models like large-scale language models. Recently, science question benchmarks have been used to diagnose the multi-hop reasoning ability and interpretability of an AI system. However, existing datasets fail to provide annotations for the answers, or are restricted to the textual-only modality, small scales, and limited domain diversity. To this end, we present Science Question Answering (ScienceQA), a new benchmark that consists of ~21k multimodal multiple choice questions with a diverse set of science topics and annotations of their answers with corresponding lectures and explanations. We further design language models to learn to generate lectures and…
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques
Methods: Layer Normalization · Multi-Head Attention · Cosine Annealing · Linear Warmup With Cosine Annealing · Adam · Softmax · Dropout · Residual Connection · GPT-3
