MIA 2022 Shared Task Submission: Leveraging Entity Representations, Dense-Sparse Hybrids, and Fusion-in-Decoder for Cross-Lingual Question Answering

Zhucheng Tu; Sarguna Janani Padmanabhan

arXiv:2207.01940·cs.CL·July 19, 2022

TL;DR

This paper presents a two-stage cross-lingual question answering system that pairs hybrid dense-sparse passage retrieval with a Fusion-in-Decoder reader, reaching an average F1 of 32.73 on the development set and 31.61 on the test set of the MIA 2022 Shared Task.

Contribution

It combines a multilingual language model pretrained with entity representations, hybrid sparse and dense retrieval, and a Fusion-in-Decoder reader for cross-lingual open-retrieval QA.

Findings

01 Achieved 43.46 F1 on the XOR-TyDi QA development set.

02 Improved over the official baseline by more than 4 F1 points on both the development and test sets.

03 Demonstrated the effectiveness of hybrid dense-sparse retrieval and Fusion-in-Decoder.

Abstract

We describe our two-stage system for the Multilingual Information Access (MIA) 2022 Shared Task on Cross-Lingual Open-Retrieval Question Answering. The first stage consists of multilingual passage retrieval with a hybrid dense and sparse retrieval strategy. The second stage consists of a reader which outputs the answer from the top passages returned by the first stage. We show the efficacy of using a multilingual language model with entity representations in pretraining, sparse retrieval signals to help dense retrieval, and Fusion-in-Decoder. On the development set, we obtain 43.46 F1 on XOR-TyDi QA and 21.99 F1 on MKQA, for an average F1 score of 32.73. On the test set, we obtain 40.93 F1 on XOR-TyDi QA and 22.29 F1 on MKQA, for an average F1 score of 31.61. We improve over the official baseline by over 4 F1 points on both the development and test sets.
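The abstract does not say how the dense and sparse signals are combined, so the following is only a minimal sketch of one common approach: min-max normalize each retriever's scores and interpolate them linearly. The function names, the weight `alpha`, the passage IDs, and the toy scores are all hypothetical and are not taken from the paper. The last helper simply reproduces the averaging used for the reported scores.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {pid: 0.0 for pid in scores}
    return {pid: (s - lo) / (hi - lo) for pid, s in scores.items()}


def hybrid_rank(
    dense: Dict[str, float],
    sparse: Dict[str, float],
    alpha: float = 0.5,  # hypothetical interpolation weight, not from the paper
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Fuse dense and sparse scores by linear interpolation (an assumed scheme).

    A passage missing from one retriever's results contributes 0.0 for that
    retriever. Ties are broken by passage ID so the order is deterministic.
    """
    dense_n = min_max_normalize(dense)
    sparse_n = min_max_normalize(sparse)
    fused = {
        pid: alpha * dense_n.get(pid, 0.0) + (1 - alpha) * sparse_n.get(pid, 0.0)
        for pid in set(dense_n) | set(sparse_n)
    }
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))[:k]


def average_f1(scores: List[float]) -> float:
    """Unweighted mean of per-dataset F1 scores."""
    return sum(scores) / len(scores)


# Toy scores for illustration only.
dense_scores = {"p1": 0.9, "p2": 0.5, "p3": 0.1}
sparse_scores = {"p2": 12.0, "p4": 3.0}
top_passages = hybrid_rank(dense_scores, sparse_scores)

# Averages of the reported XOR-TyDi QA and MKQA scores.
dev_avg = average_f1([43.46, 21.99])
test_avg = average_f1([40.93, 22.29])
```

In a two-stage pipeline like the one described, the top passages from the fused ranking would then be passed to the reader. Here `p2` ranks first because both retrievers return it, which is the kind of case where a sparse signal can help dense retrieval.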

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications