TL;DR
This paper introduces a multistage neural ranking system for multilingual COVID-19 information retrieval. It combines the Okapi BM25 retrieval algorithm with transformer-based bi-encoder and cross-encoder models to rank documents against a query across languages.
Contribution
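It proposes a sequential three-stage ranking pipeline that integrates Okapi BM25, a transformer-based bi-encoder, and a cross-encoder for multilingual semantic search. The authors report that it outperformed other approaches in the independently evaluated MLIA shared task.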
Findings
Reported to outperform other state-of-the-art approaches on the MLIA shared task for COVID-19 multilingual semantic search
Targets both high precision and high recall when ranking documents against a query
Validated by independent evaluation through participation in the Multilingual Information Access (MLIA) shared task
Abstract
The Coronavirus (COVID-19) pandemic has led to a rapidly growing 'infodemic' of health information online. This has motivated the need for accurate semantic search and retrieval of reliable COVID-19 information across millions of documents, in multiple languages. To address this challenge, this paper proposes a novel high precision and high recall neural Multistage BiCross encoder approach. It is a sequential three-stage ranking pipeline which uses the Okapi BM25 retrieval algorithm and transformer-based bi-encoder and cross-encoder to effectively rank the documents with respect to the given query. We present experimental results from our participation in the Multilingual Information Access (MLIA) shared task on COVID-19 multilingual semantic search. The independently evaluated MLIA results validate our approach and demonstrate that it outperforms other state-of-the-art approaches…
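The three-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the stage order (BM25, then bi-encoder, then cross-encoder), the cut-off sizes `keep_bm25` and `keep_bi`, and every function name are assumptions made for illustration. The two "encoder" scorers are toy stand-ins (bag-of-words cosine and token overlap) so that the example runs with the standard library only; in practice they would be replaced by transformer models.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real multilingual system needs more care.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document in `docs` for `query`."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def bi_encoder_score(query, doc):
    # Stand-in for a bi-encoder: query and document are embedded
    # independently (here as term-count vectors), then compared by cosine.
    u, v = Counter(tokenize(query)), Counter(tokenize(doc))
    dot = sum(u[t] * v[t] for t in u)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cross_encoder_score(query, doc):
    # Stand-in for a cross-encoder: the pair is scored jointly
    # (here by Jaccard overlap of the two token sets).
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q | d) if q | d else 0.0


def top_k(candidates, scores, k):
    # Keep the k highest-scoring candidates, best first.
    ranked = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
    return [c for c, _ in ranked[:k]]


def multistage_rank(query, docs, keep_bm25=3, keep_bi=2):
    """Return document indices, best first, after three ranking stages."""
    ids = list(range(len(docs)))
    # Stage 1: cheap lexical retrieval over the whole collection.
    stage1 = top_k(ids, bm25_scores(query, docs), keep_bm25)
    # Stage 2: re-rank the survivors with independently encoded vectors.
    stage2 = top_k(stage1, [bi_encoder_score(query, docs[i]) for i in stage1], keep_bi)
    # Stage 3: re-rank the short list with the joint query-document scorer.
    final_scores = [cross_encoder_score(query, docs[i]) for i in stage2]
    return top_k(stage2, final_scores, len(stage2))
```

The design choice this sketch tries to show is the narrowing funnel: each stage is more expensive per document than the one before, so it only sees the candidates that survived the previous stage.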
