How Additional Knowledge can Improve Natural Language Commonsense Question Answering?
Arindam Mitra, Pratyay Banerjee, Kuntal Kumar Pal, Swaroop Mishra, and Chitta Baral

TL;DR
This paper investigates how integrating external commonsense knowledge sources can enhance the performance of language models in answering questions that require commonsense reasoning.
Contribution
It categorizes external knowledge sources and evaluates three strategies for incorporating this knowledge into language models for question answering.
Findings
Knowledge integration improves QA performance
Different strategies have varying effectiveness
Analysis suggests potential for further improvements
Abstract
Recently, several datasets have been proposed to encourage research in Question Answering domains where commonsense knowledge is expected to play an important role. Recent language models such as RoBERTa, BERT and GPT, which have been pre-trained on Wikipedia articles and books, have shown reasonable performance with little fine-tuning on several such Multiple Choice Question-Answering (MCQ) datasets. Our goal in this work is to develop methods to incorporate additional (commonsense) knowledge into language model-based approaches for better question answering in such domains. We first categorize external knowledge sources and show that performance does improve when such sources are used. We then explore three different strategies for knowledge incorporation and four different models for question answering using external commonsense knowledge. We analyze our predictions to explore the…
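The truncated abstract does not describe the three incorporation strategies, so the following is only a minimal sketch of one generic way external knowledge can be fed to a multiple-choice QA model: concatenating retrieved knowledge sentences with the question and each answer option, then scoring each resulting sequence. It is not the authors' method. The function names (`build_mcq_inputs`, `overlap_score`, `pick_answer`), the ` [SEP] ` separator, and the word-overlap scorer standing in for a fine-tuned language model are all assumptions made for illustration.

```python
from typing import List


def build_mcq_inputs(question: str, options: List[str],
                     knowledge: List[str], sep: str = " [SEP] ") -> List[str]:
    """Build one text sequence per answer option, prefixed with the knowledge.

    The separator string is an illustrative assumption, not a fixed format.
    """
    context = " ".join(k.strip() for k in knowledge if k.strip())
    sequences = []
    for option in options:
        # Drop the knowledge slot entirely when nothing was retrieved.
        parts = [context, question, option] if context else [question, option]
        sequences.append(sep.join(parts))
    return sequences


def overlap_score(option: str, knowledge: List[str]) -> int:
    """Toy stand-in for a language model score: count option words found in the knowledge."""
    knowledge_tokens = {
        token.strip(".,?!").lower() for k in knowledge for token in k.split()
    }
    return sum(1 for token in option.lower().split() if token in knowledge_tokens)


def pick_answer(options: List[str], knowledge: List[str]) -> int:
    """Return the index of the highest-scoring option (first one wins ties)."""
    scores = [overlap_score(option, knowledge) for option in options]
    return scores.index(max(scores))


if __name__ == "__main__":
    question = "Where would you put a wet towel to dry?"
    options = ["clothesline", "refrigerator", "bookshelf"]
    knowledge = ["A clothesline is used for drying wet laundry."]

    for sequence in build_mcq_inputs(question, options, knowledge):
        print(sequence)
    print("Predicted option:", options[pick_answer(options, knowledge)])
```

In a real system the overlap scorer would be replaced by a fine-tuned model that assigns a score to each knowledge-augmented sequence; the sketch only shows where the external knowledge enters the input.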
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Linear Layer · Cosine Annealing · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Linear Warmup With Cosine Annealing · Byte Pair Encoding · Dense Connections · Weight Decay · WordPiece
