FAT ALBERT: Finding Answers in Large Texts using Semantic Similarity Attention Layer based on BERT
Omar Mossad, Amgad Ahmed, Anandharaju Raju, Hari Karthikeyan, and Zayed Ahmed

TL;DR
This paper introduces FAT ALBERT, a BERT-based model with a semantic similarity attention layer that improves question answering on large texts, achieving top accuracy in the MovieQA challenge.
Contribution
It presents a novel semantic similarity attention layer for BERT, enhancing its ability to handle large texts in question answering tasks.
Findings
Outperforms leading models in MovieQA challenge
Achieves 87.79% test accuracy
First place in leaderboard
Abstract
Machine-based text comprehension has always been a significant research field in natural language processing. Once a full understanding of the text context and semantics is achieved, a deep learning model can be trained to solve a large subset of tasks, e.g. text summarization, classification, and question answering. In this paper we focus on the question answering problem, specifically multiple-choice questions. We develop a model based on BERT, a state-of-the-art transformer network. Moreover, we alleviate BERT's limitation in supporting large text corpora by extracting the highest-influence sentences through a semantic similarity model. Evaluations of our proposed model demonstrate that it outperforms the leading models in the MovieQA challenge, and we are currently ranked first on the leaderboard with a test accuracy of 87.79%. Finally, we discuss the model shortcomings and…
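The sentence-extraction idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses a semantic similarity model, whereas the sketch below substitutes a plain bag-of-words cosine similarity so that it runs with the standard library alone. The function names (`bow`, `cosine`, `top_k_sentences`), the value of `k`, and the sample plot text are all hypothetical.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words counts (a crude stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_sentences(question, sentences, k=2):
    """Return the k sentences most similar to the question, in original order."""
    q = bow(question)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: cosine(q, bow(sentences[i])),
        reverse=True,
    )
    # Restore document order so the reduced context still reads coherently.
    return [sentences[i] for i in sorted(ranked[:k])]


# Hypothetical plot text and question, for illustration only.
plot = [
    "Frodo inherits a ring from his uncle Bilbo.",
    "It rains often in a quiet village.",
    "Gandalf tells Frodo the ring must be destroyed in Mordor.",
    "Sam enjoys gardening.",
]
context = top_k_sentences("Where must the ring be destroyed?", plot, k=2)
```

In a pipeline of the kind the abstract describes, the reduced `context` would then be paired with the question and each candidate answer and passed to a BERT-based multiple-choice classifier, keeping the input within the model's length limit.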
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Linear Layer · Residual Connection · Softmax · Dense Connections · Linear Warmup With Linear Decay · Layer Normalization · Attention Dropout · Attention Is All You Need · Adam
