Draft-Refine-Optimize: Self-Evolved Learning for Natural Language to MongoDB Query Generation
Mingwei Ye, Jiaxi Zhuang, Mingjun Xu, Linfeng Zhang, Guolin Ke, Hengxing Cai

TL;DR
EvoMQL introduces a self-evolved, iterative framework for translating natural language to MongoDB queries, significantly improving accuracy by leveraging execution feedback and dynamic evidence construction.
Contribution
The paper presents EvoMQL, a self-evolved approach to NL2MQL that combines evidence-grounded context construction with execution-driven learning and outperforms strong open-source baselines.
Findings
Achieved 76.6% accuracy on in-distribution benchmarks.
Attained 83.1% accuracy on out-of-distribution benchmarks.
Outperformed strong open-source baselines by up to 9.5%.
Abstract
Natural Language to MongoDB Query Language (NL2MQL) is essential for democratizing access to modern document-centric databases. Unlike Text-to-SQL, NL2MQL faces unique challenges from MQL's procedural aggregation pipelines, deeply nested schemas, and ambiguous value grounding. Existing approaches use static prompting or one-shot refinement, which inadequately model these complex contexts and fail to systematically leverage execution feedback for persistent improvement. We propose EvoMQL, a self-evolved framework that unifies evidence-grounded context construction with execution-driven learning through iterative Draft-Refine-Optimize (DRO) cycles. Each cycle uses draft queries to trigger query-aware retrieval, dynamically building compact evidence contexts that resolve schema ambiguities and ground nested paths to concrete values. The model then undergoes online policy optimization with…
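As a rough illustration of the draft and refine steps described in the abstract, the sketch below uses a draft query to select evidence (the nested paths that match the field names it mentions) and lets an empty execution result trigger a rewrite. It is a toy under stated assumptions, not the authors' implementation: the in-memory executor handles only equality `$match`, `rule_refine` is a hypothetical rule standing in for the model, and every function name is invented for this example. The policy-optimization step is not shown.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]
Pipeline = List[Dict[str, Any]]


def collect_paths(doc: Doc, prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested document into dotted paths, each with a sample value."""
    paths: Dict[str, Any] = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            paths.update(collect_paths(value, path + "."))
        else:
            paths[path] = value
    return paths


def retrieve_evidence(draft: Pipeline, sample: Doc) -> Dict[str, Any]:
    """Toy query-aware retrieval: keep paths whose leaf name the draft mentions."""
    mentioned = {
        field.split(".")[-1]
        for stage in draft
        for field in stage.get("$match", {})
    }
    return {
        path: value
        for path, value in collect_paths(sample).items()
        if path.split(".")[-1] in mentioned
    }


def lookup(doc: Doc, path: str) -> Any:
    """Resolve a dotted path; return None when any segment is missing."""
    current: Any = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def run_match(pipeline: Pipeline, docs: List[Doc]) -> List[Doc]:
    """Toy executor (assumption): supports only equality $match stages."""
    result = docs
    for stage in pipeline:
        cond = stage.get("$match", {})
        result = [d for d in result if all(lookup(d, k) == v for k, v in cond.items())]
    return result


def rule_refine(query: Pipeline, evidence: Dict[str, Any]) -> Pipeline:
    """Hypothetical stand-in for the model: rewrite fields to full evidence paths."""
    by_leaf = {path.split(".")[-1]: path for path in evidence}
    return [
        {"$match": {by_leaf.get(f.split(".")[-1], f): v
                    for f, v in stage.get("$match", {}).items()}}
        for stage in query
    ]


def draft_refine(draft: Pipeline, docs: List[Doc],
                 refine: Callable[[Pipeline, Dict[str, Any]], Pipeline],
                 max_rounds: int = 2) -> Pipeline:
    """Execute the query; on an empty result, gather evidence and refine."""
    query = draft
    for _ in range(max_rounds):
        if run_match(query, docs):  # non-empty result is the (simplified) success signal
            break
        query = refine(query, retrieve_evidence(query, docs[0]))
    return query


DOCS = [
    {"name": "A", "address": {"city": "Paris"}},
    {"name": "B", "address": {"city": "Rome"}},
]
DRAFT = [{"$match": {"city": "Rome"}}]  # wrong: "city" is nested under "address"
final = draft_refine(DRAFT, DOCS, rule_refine)
```

Here the draft filters on a top-level `city` field that does not exist, so execution returns nothing; the evidence exposes `address.city`, and the refined query uses that path. Treating an empty result as the only feedback signal is a simplification made for this sketch.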
