Fine-tuned LLMs Know More, Hallucinate Less with Few-Shot Sequence-to-Sequence Semantic Parsing over Wikidata
Silei Xu, Shicheng Liu, Theo Culhane, Elizaveta Pertseva, Meng-Hsi Wu, Sina J. Semnani, Monica S. Lam

TL;DR
This paper introduces a few-shot sequence-to-sequence semantic parser for Wikidata, improving factual accuracy of LLMs in question answering by grounding them with Wikidata facts, and demonstrates significant performance gains over existing methods.
Contribution
The paper presents a novel few-shot semantic parsing approach for Wikidata, fine-tunes LLaMA with this method, and achieves state-of-the-art results in factual question answering benchmarks.
Findings
Achieved 76% and 65% answer accuracy on WikiWebQuestions dev and test sets.
Paired the semantic parser with GPT-3, combining verifiable results with qualified GPT-3 guesses, to give useful answers to 96% of the dev-set questions.
Outperformed the state of the art on the QALD-7 Wikidata dataset by 3.6% in F1 score.
Abstract
While large language models (LLMs) can answer many questions correctly, they can also hallucinate and give wrong answers. Wikidata, with its over 12 billion facts, can be used to ground LLMs to improve their factuality. This paper presents WikiWebQuestions, a high-quality question answering benchmark for Wikidata. Ported over from WebQuestions for Freebase, it consists of real-world data with SPARQL annotation. This paper presents a few-shot sequence-to-sequence semantic parser for Wikidata. We modify SPARQL to use the unique domain and property names instead of their IDs. We train the parser to use either the results from an entity linker or mentions in the query. We fine-tune LLaMA by adding the few-shot training data to that used to fine-tune Alpaca. Our experimental results demonstrate the effectiveness of this methodology, establishing a strong baseline of 76% and 65% answer…
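The abstract's step of rewriting SPARQL to use property names instead of IDs can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pids_to_names`, the underscore naming style, and the two-entry `PROPERTY_NAMES` table are assumptions made for illustration, and a real system would need the full property mapping from Wikidata.

```python
import re

# Hypothetical, hand-picked subset of Wikidata property labels (assumption:
# labels are written with underscores). Not taken from the paper.
PROPERTY_NAMES = {
    "P19": "place_of_birth",
    "P36": "capital",
}

# Matches property references such as "wdt:P19".
_PID = re.compile(r"\bwdt:(P\d+)\b")


def pids_to_names(sparql: str) -> str:
    """Replace wdt:P<id> with wdt:<name> when the ID is known.

    Unknown property IDs are left unchanged.
    """
    return _PID.sub(
        lambda m: "wdt:" + PROPERTY_NAMES.get(m.group(1), m.group(1)),
        sparql,
    )


# Example query; the entity ID is left untouched, only the property changes.
query = "SELECT ?x WHERE { wd:Q76 wdt:P19 ?x . }"
readable = pids_to_names(query)
print(readable)
```

The likely motivation for such a rewrite is that a sequence-to-sequence model can predict a descriptive name more easily than an opaque numeric ID. The sketch covers only property IDs and leaves entity references as they are.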
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Wikis in Education and Collaboration
Methods: Attention Is All You Need · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Cosine Annealing · Softmax
