Simple Applications of BERT for Ad Hoc Document Retrieval
Wei Yang, Haotian Zhang, and Jimmy Lin

TL;DR
This paper demonstrates that applying BERT to individual sentences and aggregating their scores is an effective and simple method for ad hoc document retrieval, outperforming previous neural approaches on TREC datasets.
Contribution
The paper introduces a straightforward method of using BERT for document retrieval: it scores sentences individually and aggregates the sentence scores into a document score, which sidesteps BERT's input length limit for documents longer than the model was designed to handle.
Findings
Reports the highest average precision among neural approaches known to the authors on the TREC microblog and newswire test collections.
Simple sentence-level application of BERT outperforms previous neural approaches on these collections.
The method is effective despite BERT's input length constraints.
Abstract
Following recent successes in applying BERT to question answering, we explore simple applications to ad hoc document retrieval. This required confronting the challenge posed by documents that are typically longer than the length of input BERT was designed to handle. We address this issue by applying inference on sentences individually, and then aggregating sentence scores to produce document scores. Experiments on TREC microblog and newswire test collections show that our approach is simple yet effective, as we report the highest average precision on these datasets by neural approaches that we are aware of.
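The sentence-level scoring and aggregation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the regex sentence splitter, the `overlap_scorer` stand-in (a term-overlap placeholder where a fine-tuned BERT relevance model would go), the `top_k` parameter, and the choice of averaging the top-scoring sentences are all assumptions, since the text above does not specify the aggregation function.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def overlap_scorer(query: str, sentence: str) -> float:
    """Hypothetical placeholder for a BERT relevance model.

    Returns the fraction of query terms that appear in the sentence.
    A real system would score each (query, sentence) pair with BERT here.
    """
    q_terms = set(re.findall(r"\w+", query.lower()))
    s_terms = set(re.findall(r"\w+", sentence.lower()))
    return len(q_terms & s_terms) / len(q_terms) if q_terms else 0.0


def score_document(
    query: str,
    document: str,
    scorer: Callable[[str, str], float] = overlap_scorer,
    top_k: int = 3,
) -> float:
    """Score every sentence on its own, then aggregate into one document score.

    Aggregation (assumption): mean of the top_k sentence scores.
    """
    scores = sorted(
        (scorer(query, s) for s in split_sentences(document)), reverse=True
    )
    top = scores[:top_k]
    return sum(top) / len(top) if top else 0.0


if __name__ == "__main__":
    doc = (
        "BERT handles short inputs. Documents are long. "
        "We score each sentence with BERT."
    )
    print(score_document("BERT sentence", doc, top_k=2))
```

Because each sentence is scored independently, no single input to the scorer has to hold the whole document, which is how this style of approach avoids the input length limit. Swapping `overlap_scorer` for a real model only requires a callable with the same `(query, sentence) -> float` shape.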
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Expert finding and Q&A systems
Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece · Softmax
