TL;DR
This study attempts to reproduce and evaluate the BERT-PLI model for cross-domain legal and patent document retrieval, revealing limited performance gains and emphasizing the importance of reproducibility and transparency in domain-specific IR research.
Contribution
The paper provides a detailed reproducibility analysis of BERT-PLI, clarifies experimental procedures, and explores cross-domain transfer, highlighting challenges and potential in legal and patent retrieval tasks.
Findings
Reproducing BERT-PLI was challenging due to missing details.
Domain-specific paragraph modeling did not outperform the original BERT.
Cross-domain transfer shows promising results at the document level.
Abstract
Domain-specific search has always been a challenging information retrieval task because of the domain-specific language, the unique task setting, and the lack of accessible queries and corresponding relevance judgements. In recent years, pretrained language models such as BERT have revolutionized web and news search. Naturally, the community aims to adapt these advancements to cross-domain transfer of retrieval models for domain-specific search. In the context of legal document retrieval, Shao et al. propose the BERT-PLI framework, which models Paragraph-Level Interactions with the language model BERT. In this paper we reproduce the original experiments, clarify pre-processing steps, add missing scripts for framework steps, and investigate different evaluation approaches; however, we are not able to reproduce the evaluation results. Contrary to the…
Taxonomy
Methods: Linear Layer · Linear Warmup With Linear Decay · Attention Dropout · Adam · Layer Normalization · Dense Connections · Weight Decay · WordPiece · Multi-Head Attention
