TL;DR
This paper explores the use of BERT, a pre-trained transformer model, for answer selection in question answering and community question answering tasks, showing significant performance improvements over previous methods.
Contribution
It demonstrates the effectiveness of fine-tuning BERT for answer selection, achieving state-of-the-art results on multiple datasets.
Findings
Up to 13.1% improvement over previous methods on the QA datasets
Up to 18.7% improvement over previous methods on the CQA datasets
Fine-tuning BERT is highly effective for answer selection
Abstract
Pre-training a transformer-based model on the language modeling task with a large dataset and then fine-tuning it for downstream tasks has proven very useful in recent years. One major advantage of such pre-trained language models is that they can effectively capture the context of each word in a sentence. However, pre-trained language models have not yet been used extensively for tasks such as answer selection. To investigate their effectiveness on such tasks, we adopt the pre-trained Bidirectional Encoder Representations from Transformers (BERT) language model and fine-tune it on two Question Answering (QA) datasets and three Community Question Answering (CQA) datasets for the answer selection task. We find that fine-tuning the BERT model for the answer selection task is very effective and observe a maximum improvement of 13.1% in the QA datasets and…
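Answer selection, as described in the abstract, can be read as a ranking problem: each candidate answer is paired with the question, the pair is scored, and candidates are sorted by score. The sketch below illustrates only that framing and is not the paper's implementation. The function names (`build_pair_input`, `overlap_score`, `rank_answers`) are hypothetical, and the word-overlap scorer is a toy placeholder standing in for a fine-tuned BERT model, which would instead score the two-segment input `[CLS] question [SEP] answer [SEP]`.

```python
from typing import Callable, List, Tuple


def build_pair_input(question: str, answer: str) -> str:
    """Lay out a question/answer pair in BERT's two-segment text format."""
    return f"[CLS] {question} [SEP] {answer} [SEP]"


def overlap_score(question: str, answer: str) -> float:
    """Toy placeholder scorer (NOT BERT): share of question words found in the answer.

    A fine-tuned BERT model would replace this function and return a
    relevance score for the encoded question/answer pair.
    """
    q_words = set(question.lower().split())
    a_words = set(answer.lower().split())
    if not q_words:
        return 0.0
    return len(q_words & a_words) / len(q_words)


def rank_answers(
    question: str,
    candidates: List[str],
    scorer: Callable[[str, str], float] = overlap_score,
) -> List[Tuple[str, float]]:
    """Score every candidate against the question and sort best-first."""
    scored = [(candidate, scorer(question, candidate)) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    question = "where is the eiffel tower"
    candidates = [
        "The Louvre is a museum",
        "The Eiffel Tower is in Paris",
        "Bananas are yellow",
    ]
    for answer, score in rank_answers(question, candidates):
        print(f"{score:.2f}  {build_pair_input(question, answer)}")
```

Passing the scorer as a parameter keeps the ranking logic independent of the model, so a real pairwise classifier could be swapped in without changing `rank_answers`.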
Taxonomy
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Dropout · Softmax · Multi-Head Attention · Attention Dropout · Residual Connection · Dense Connections
