Enhancing SPARQL Generation by Triplet-order-sensitive Pre-training

Chang Su; Jiexing Qi; He Yan; Kai Zou; Zhouhan Lin

arXiv:2410.05731 · cs.IR · October 10, 2024

TL;DR

This paper introduces a new pre-training stage with Triplet Order Correction to improve the accuracy of SPARQL query generation from natural language, addressing triplet order errors and enhancing performance.

Contribution

It proposes Triplet Order Correction as an additional pre-training objective to improve SPARQL syntax sensitivity in language models.

Findings

01 Achieves state-of-the-art results on three benchmarks.

02 Reduces triplet order errors in generated SPARQL queries.

03 Enhances model sensitivity to SPARQL syntax.

Abstract

Semantic parsing that translates natural language queries to SPARQL is of great importance for Knowledge Graph Question Answering (KGQA) systems. Although pre-trained language models like T5 have achieved significant success in the Text-to-SPARQL task, their generated outputs still exhibit notable errors specific to the SPARQL language, such as triplet flips. To address this challenge and further improve performance, we propose an additional pre-training stage with a new objective, Triplet Order Correction (TOC), used alongside the common Masked Language Modeling (MLM) objective, to jointly enhance the model's sensitivity to triplet order and SPARQL syntax. Our method achieves state-of-the-art performance on three widely used benchmarks.
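The abstract does not describe how TOC training pairs are built, so the following is only a minimal sketch of one plausible construction, not the authors' procedure. It assumes a query with a single `{ ... }` body whose triples are whitespace-tokenized and separated by ` . `. With some probability it permutes the subject, predicate, and object of each triple to form the corrupted input, and it keeps the original query as the target. The names `parse_triples`, `make_toc_example`, and `flip_prob` are hypothetical.

```python
import random
import re

# Assumption: the query has exactly one brace-delimited body.
WHERE_BODY = re.compile(r"\{(.*)\}", re.DOTALL)


def parse_triples(body):
    """Split a WHERE body into token lists, one per '.'-separated triple."""
    triples, current = [], []
    for token in body.split():
        if token == ".":
            if current:
                triples.append(current)
                current = []
        else:
            current.append(token)
    if current:
        triples.append(current)
    return triples


def make_toc_example(query, flip_prob=0.5, rng=None):
    """Return (corrupted_query, original_query) for an order-correction task.

    Illustrative corruption only: each 3-token triple is permuted with
    probability flip_prob. The paper's actual scheme may differ.
    """
    rng = rng or random.Random()
    match = WHERE_BODY.search(query)
    if match is None:
        raise ValueError("query has no brace-delimited body")
    corrupted = []
    for triple in parse_triples(match.group(1)):
        can_flip = len(triple) == 3 and len(set(triple)) > 1
        if can_flip and rng.random() < flip_prob:
            shuffled = triple[:]
            while shuffled == triple:  # force a real change of order
                rng.shuffle(shuffled)
            corrupted.append(shuffled)
        else:
            corrupted.append(triple)
    body = " . ".join(" ".join(t) for t in corrupted)
    source = query[:match.start(1)] + " " + body + " " + query[match.end(1):]
    return source, query


if __name__ == "__main__":
    q = "SELECT ?x WHERE { ?x wdt:P31 wd:Q5 . ?x wdt:P19 ?y }"
    src, tgt = make_toc_example(q, flip_prob=1.0, rng=random.Random(0))
    print("input :", src)
    print("target:", tgt)
```

A sequence-to-sequence model such as T5 would then be trained to map the corrupted input back to the target, which is one way a model could be pushed to attend to triplet order.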

Code & Models

Repositories

LUMIA-Group/TosT5 (official)

Taxonomy

Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Gated Linear Unit · Dense Connections · Attention Dropout · Inverse Square Root Schedule · Linear Layer · Residual Connection