# Cross-Lingual Question Answering over Knowledge Base as Reading Comprehension

**Authors:** Chen Zhang, Yuxuan Lai, Yansong Feng, Xingyu Shen, Haowei Du, Dongyan Zhao

arXiv: 2302.13241 · 2023-02-28

## TL;DR

This paper introduces a method for cross-lingual question answering over knowledge bases (xKBQA) that converts KB subgraphs into passages, so that multilingual pre-trained language models and reading comprehension techniques can be applied to answer questions across multiple languages.

## Contribution

The paper proposes a reading comprehension-based approach to xKBQA that narrows the gap between KB schemas and natural language questions and uses multilingual pre-trained language models, addressing the challenges of data scarcity and cross-lingual schema mapping.

## Key findings

- Outperforms baseline methods on two xKBQA datasets in 12 languages.
- Effective in few-shot and zero-shot learning scenarios.
- Leverages existing xMRC datasets for model fine-tuning.

## Abstract

Although many large-scale knowledge bases (KBs) claim to contain multilingual information, their support for many non-English languages is often incomplete. This incompleteness gives birth to the task of cross-lingual question answering over knowledge base (xKBQA), which aims to answer questions in languages different from that of the provided KB. One of the major challenges facing xKBQA is the high cost of data annotation, leading to limited resources available for further exploration. Another challenge is mapping KB schemas and natural language expressions in the questions under cross-lingual settings. In this paper, we propose a novel approach for xKBQA in a reading comprehension paradigm. We convert KB subgraphs into passages to narrow the gap between KB schemas and questions, which enables our model to benefit from recent advances in multilingual pre-trained language models (MPLMs) and cross-lingual machine reading comprehension (xMRC). Specifically, we use MPLMs, with considerable knowledge of cross-lingual mappings, for cross-lingual reading comprehension. Existing high-quality xMRC datasets can be further utilized to finetune our model, greatly alleviating the data scarcity issue in xKBQA. Extensive experiments on two xKBQA datasets in 12 languages show that our approach outperforms various baselines and achieves strong few-shot and zero-shot performance. Our dataset and code are released for further research.
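
To make the idea of converting KB subgraphs into passages concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names (`verbalize_predicate`, `subgraph_to_passage`), the dotted predicate format, and the toy triples are all assumptions chosen for illustration, and the verbalization rule is deliberately naive.

```python
from typing import List, Tuple

# A KB fact as (subject, predicate, object). Hypothetical toy format.
Triple = Tuple[str, str, str]


def verbalize_predicate(predicate: str) -> str:
    """Turn a schema-style predicate into plain words.

    Assumption: predicates are dotted paths with underscores, e.g.
    'people.person.place_of_birth' -> 'place of birth'.
    """
    last_segment = predicate.split(".")[-1]
    return last_segment.replace("_", " ")


def subgraph_to_passage(triples: List[Triple]) -> str:
    """Concatenate one simple sentence per triple into a passage."""
    sentences = [
        f"{subj} {verbalize_predicate(pred)} {obj}."
        for subj, pred, obj in triples
    ]
    return " ".join(sentences)


# Illustrative subgraph around one entity (made-up example data).
triples = [
    ("Barack Obama", "people.person.place_of_birth", "Honolulu"),
    ("Barack Obama", "people.person.spouse", "Michelle Obama"),
]

passage = subgraph_to_passage(triples)
print(passage)
# prints: Barack Obama place of birth Honolulu. Barack Obama spouse Michelle Obama.
```

Under this framing, the passage and a question in another language would be passed together to a multilingual reading comprehension model, which selects the answer from the passage text. That step is omitted here because the specific model and training setup are described only in the full paper.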

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.13241/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/2302.13241/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/2302.13241/full.md

---
Source: https://tomesphere.com/paper/2302.13241