TL;DR
This paper systematically evaluates transfer learning and BM25-based pseudo-labeling for BERT-based ranking models across five English datasets. It finds pseudo-labeling to be a competitive alternative to transfer learning and notes stability and effectiveness issues, such as few-shot fine-tuning degrading a pretrained model.
Contribution
It provides a comprehensive comparison of transfer learning and pseudo-labeling for BERT-based ranking models on five English datasets, highlighting pseudo-labeling as a practical alternative when source dataset licences prohibit commercial use.
Findings
Pseudo-labeling can outperform transfer learning in some cases.
Few-shot fine-tuning may degrade pretrained model performance.
Full-shot evaluation improves reliability of transferability results.
Abstract
Due to high annotation costs, making the best use of existing human-created training data is an important research direction. We, therefore, carry out a systematic evaluation of transferability of BERT-based neural ranking models across five English datasets. Previous studies focused primarily on zero-shot and few-shot transfer from a large dataset to a dataset with a small number of queries. In contrast, each of our collections has a substantial number of queries, which enables a full-shot evaluation mode and improves reliability of our results. Furthermore, since source dataset licences often prohibit commercial use, we compare transfer learning to training on pseudo-labels generated by a BM25 scorer. We find that training on pseudo-labels -- possibly with subsequent fine-tuning using a modest number of annotated queries -- can produce a competitive or better model compared to…
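To make the idea of BM25 pseudo-labels concrete, here is a minimal sketch in Python. The abstract does not say how the authors turn BM25 scores into training labels, so the labeling rule used here (top `n_pos` hits become positives, everything else negatives) is an assumption. The function names `bm25_scores` and `pseudo_label`, the parameter defaults, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def pseudo_label(query, docs, n_pos=1):
    """Assumed labeling rule: the top `n_pos` BM25 hits get label 1, the rest 0."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    positives = set(ranked[:n_pos])
    return [(i, 1 if i in positives else 0) for i in range(len(docs))]


# Toy collection (hypothetical data, whitespace-tokenized).
docs = [
    "bert ranking model for passage retrieval".split(),
    "cooking pasta with tomato sauce".split(),
    "neural ranking with transfer learning".split(),
]
labels = pseudo_label("bert ranking".split(), docs, n_pos=1)
```

The resulting `(doc_index, label)` pairs could then stand in for human relevance judgments when training a ranker, with optional fine-tuning on a small annotated set afterwards, as the abstract describes.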
