Utilizing Bidirectional Encoder Representations from Transformers for Answer Selection

Md Tahmid Rahman Laskar; Enamul Hoque; Jimmy Xiangji Huang

arXiv:2011.07208·cs.CL·November 17, 2020

TL;DR

This paper explores the use of BERT, a pre-trained transformer model, for answer selection in question answering and community question answering tasks, showing significant performance improvements over previous methods.

Contribution

It demonstrates the effectiveness of fine-tuning BERT for answer selection, achieving state-of-the-art results on multiple datasets.

Findings

01. Maximum 13.1% improvement in QA datasets

02. Maximum 18.7% improvement in CQA datasets

03. Fine-tuning BERT is highly effective for answer selection

Abstract

Pre-training a transformer-based model for the language modeling task on a large dataset and then fine-tuning it for downstream tasks has been found very useful in recent years. One major advantage of such pre-trained language models is that they can effectively absorb the context of each word in a sentence. However, for tasks such as the answer selection task, the pre-trained language models have not been extensively used yet. To investigate their effectiveness in such tasks, in this paper, we adopt the pre-trained Bidirectional Encoder Representations from Transformers (BERT) language model and fine-tune it on two Question Answering (QA) datasets and three Community Question Answering (CQA) datasets for the answer selection task. We find that fine-tuning the BERT model for the answer selection task is very effective and observe a maximum improvement of 13.1% in the QA datasets and…
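To make the task concrete, here is a minimal sketch of answer selection framed as pair scoring and ranking. It is not the authors' implementation. The function names (`build_pair_input`, `rank_answers`, `toy_overlap_score`) are hypothetical, and the word-overlap scorer is only a stand-in for a fine-tuned BERT model, which would instead tokenize the question-answer pair and produce a relevance score. The `[CLS] ... [SEP] ... [SEP]` layout mirrors the standard BERT sentence-pair input format.

```python
from typing import Callable, List, Tuple


def build_pair_input(question: str, answer: str) -> str:
    """Lay out a question/candidate pair in BERT's sentence-pair format."""
    return f"[CLS] {question} [SEP] {answer} [SEP]"


def toy_overlap_score(pair_text: str) -> float:
    """Placeholder scorer (assumption): fraction of question words found in the answer.

    A fine-tuned BERT classifier would replace this function in practice.
    """
    body = pair_text.replace("[CLS]", "").strip()
    question, answer = body.split("[SEP]")[:2]
    q_tokens = set(question.lower().split())
    a_tokens = set(answer.lower().split())
    return len(q_tokens & a_tokens) / max(len(q_tokens), 1)


def rank_answers(
    question: str,
    candidates: List[str],
    score_fn: Callable[[str], float],
) -> List[Tuple[float, str]]:
    """Score every candidate against the question and sort best-first."""
    scored = [(score_fn(build_pair_input(question, c)), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    ranked = rank_answers(
        "who wrote hamlet",
        ["paris is the capital of france", "hamlet was written by shakespeare"],
        toy_overlap_score,
    )
    for score, answer in ranked:
        print(f"{score:.2f}  {answer}")
```

Because the scorer is passed in as `score_fn`, the ranking logic stays the same if the placeholder is swapped for a model-backed scorer.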

Code & Models

Repositories

tahmedge/BERT-for-Answer-Selection (PyTorch, official)

Taxonomy

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Dropout · Softmax · Multi-Head Attention · Attention Dropout · Residual Connection · Dense Connections