SPBERT: An Efficient Pre-training BERT on SPARQL Queries for Question Answering over Knowledge Graphs
Hieu Tran, Long Phan, James Anibal, Binh T. Nguyen, and Truong-Son Nguyen

TL;DR
SPBERT is a transformer-based language model pre-trained on SPARQL query logs. It learns representations for both natural language and SPARQL queries, and reaches state-of-the-art BLEU scores on several knowledge graph question answering tasks.
Contribution
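The paper introduces SPBERT, a language model pre-trained on SPARQL query logs with masked language modeling and a word structural objective, and shows how to adapt it within an encoder-decoder architecture for knowledge graph question answering.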
Findings
Achieves state-of-the-art BLEU scores on several QA tasks
Effective in SPARQL query construction and answer verbalization
Demonstrates the benefit of pre-training on SPARQL logs
Abstract
In this paper, we propose SPBERT, a transformer-based language model pre-trained on massive SPARQL query logs. By incorporating masked language modeling objectives and the word structural objective, SPBERT can learn general-purpose representations in both natural language and SPARQL query language. We investigate how SPBERT and an encoder-decoder architecture can be adapted for knowledge-based QA corpora. We conduct exhaustive experiments on two additional tasks: SPARQL Query Construction and Answer Verbalization Generation. The experimental results show that SPBERT obtains promising results, achieving state-of-the-art BLEU scores on several of these tasks.
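The two pre-training objectives named in the abstract can be illustrated with a small sketch of how training examples might be built from a SPARQL query. This is not the authors' implementation. It assumes whitespace tokenization, a simplified masking rule that always substitutes `[MASK]`, and a word structural objective that shuffles one short span of tokens and asks the model to restore the original order. The function names, the masking probability, the span length of 3, and the example query are all hypothetical choices made for illustration.

```python
import random

MASK = "[MASK]"  # assumed mask symbol, as in BERT-style vocabularies


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Masked language modeling: hide some tokens and keep them as labels."""
    if rng is None:
        rng = random.Random(0)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(tok)   # the model must predict the original token
        else:
            inputs.append(tok)
            labels.append(None)  # position is ignored by the loss
    return inputs, labels


def shuffle_span(tokens, span=3, rng=None):
    """Word structural objective (assumed form): shuffle one span of tokens.

    Returns the perturbed sequence and the original sequence as the target.
    """
    if rng is None:
        rng = random.Random(0)
    if len(tokens) < span:
        return list(tokens), list(tokens)
    start = rng.randrange(len(tokens) - span + 1)
    window = tokens[start:start + span]
    rng.shuffle(window)
    perturbed = tokens[:start] + window + tokens[start + span:]
    return perturbed, list(tokens)


# Hypothetical query used only for illustration.
query = "SELECT ?x WHERE { ?x dbo:author dbr:Tolkien }".split()
masked, labels = mask_tokens(query, mask_prob=0.3, rng=random.Random(1))
shuffled, target = shuffle_span(query, span=3, rng=random.Random(1))
print(masked)
print(shuffled)
```

In both cases the model sees a corrupted query and is trained to recover the original, which is one plausible way a model could pick up SPARQL token identity and ordering from unlabeled logs.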
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Natural Language Processing Techniques
