MultiTEND: A Multilingual Benchmark for Natural Language to NoSQL Query Translation
Zhiqian Qin, Yuanfeng Song, Jinwei Lu, Yuanwei Song, Shuaimin Li, Chen Jason Zhang

TL;DR
This paper presents MultiTEND, a comprehensive multilingual benchmark for translating natural language to NoSQL queries across six languages, and introduces MultiLink, a framework that significantly improves translation accuracy by addressing linguistic challenges.
Contribution
The paper introduces MultiTEND, the first large-scale multilingual benchmark for NL to NoSQL translation, and proposes MultiLink, a novel framework that enhances multilingual query generation accuracy.
Findings
A 4%-6% accuracy gap separates English from non-English settings across fine-tuned SLM, zero-shot LLM, and RAG-for-LLM scenarios.
MultiLink improves accuracy by about 15% in English and 10% on average for other languages.
MultiTEND reveals diverse linguistic challenges in NL to NoSQL translation.
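To make the task concrete, the sketch below shows one intent phrased in three of the benchmark's languages, all mapping to a single NoSQL query. It is an illustration only: the questions, the `PRODUCTS` collection, the field names, and the MongoDB-style filter syntax are assumptions made for this example and are not taken from MultiTEND. The small `matches` helper is a hypothetical in-memory evaluator, not part of any database driver.

```python
# Hypothetical example: the same question in three languages should
# translate to one and the same query. Nothing here comes from the benchmark.
QUESTIONS = {
    "English": "Which products cost more than 100?",
    "German": "Welche Produkte kosten mehr als 100?",
    "French": "Quels produits coûtent plus de 100 ?",
}

# Assumed target: a MongoDB-style filter document.
TARGET_FILTER = {"price": {"$gt": 100}}

# Toy collection, invented for this sketch.
PRODUCTS = [
    {"name": "lamp", "price": 40},
    {"name": "desk", "price": 250},
    {"name": "chair", "price": 120},
]


def matches(doc, flt):
    """Check one document against a filter (supports equality, $gt, $lt only)."""
    for field, cond in flt.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            for op, operand in cond.items():
                if op == "$gt":
                    ok = value is not None and value > operand
                elif op == "$lt":
                    ok = value is not None and value < operand
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
        elif value != cond:
            return False
    return True


def run_query(docs, flt):
    """Return the documents that satisfy the filter, in their original order."""
    return [d for d in docs if matches(d, flt)]


if __name__ == "__main__":
    for language, question in QUESTIONS.items():
        names = [d["name"] for d in run_query(PRODUCTS, TARGET_FILTER)]
        print(f"{language}: {question} -> {TARGET_FILTER} -> {names}")
```

A multilingual system is correct here only if every phrasing yields the same filter; lexical and syntactic differences between the questions are what make that mapping harder outside English.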
Abstract
Natural language interfaces for NoSQL databases are increasingly vital in the big data era, enabling users to interact with complex, unstructured data without deep technical expertise. However, most recent advances focus on English, leaving a gap in multilingual support. This paper introduces MultiTEND, the first and largest multilingual benchmark for natural language to NoSQL query generation, covering six languages: English, German, French, Russian, Japanese, and Mandarin Chinese. Using MultiTEND, we analyze the challenges of translating natural language to NoSQL queries across diverse linguistic structures, including lexical and syntactic differences. Experiments show that accuracy remains relatively low in both English and non-English settings, with a 4%-6% gap across scenarios such as fine-tuned SLM, zero-shot LLM, and RAG for LLM. To address the aforementioned challenges,…
Taxonomy
Topics: Cloud Computing and Resource Management · Service-Oriented Architecture and Web Services · Semantic Web and Ontologies
Methods: Attention Is All You Need · Byte Pair Encoding · Adam · Softmax · Dropout · Weight Decay · BART · WordPiece · Layer Normalization
