HyGenar: An LLM-Driven Hybrid Genetic Algorithm for Few-Shot Grammar Generation

Weizhi Tang; Yixuan Li; Chris Sypherd; Elizabeth Polgreen; Vaishak Belle

arXiv:2505.16978·cs.AI·June 3, 2025

TL;DR

This paper introduces HyGenar, a hybrid genetic algorithm driven by large language models, designed to enhance few-shot grammar generation by improving the syntactic and semantic correctness of generated grammars.

Contribution

The paper presents a novel LLM-driven hybrid genetic algorithm, HyGenar, which significantly improves grammar generation quality in few-shot learning scenarios.

Findings

01. Existing LLMs perform sub-optimally on few-shot grammar generation.

02. HyGenar improves the syntactic correctness of generated grammars.

03. HyGenar improves the semantic correctness of generated grammars.

Abstract

Grammar plays a critical role in natural language processing and text/code generation by enabling the definition of syntax, the creation of parsers, and guiding structured outputs. Although large language models (LLMs) demonstrate impressive capabilities across domains, their ability to infer and generate grammars has not yet been thoroughly explored. In this paper, we aim to study and improve the ability of LLMs for few-shot grammar generation, where grammars are inferred from sets of a small number of positive and negative examples and generated in Backus-Naur Form. To explore this, we introduced a novel dataset comprising 540 structured grammar generation challenges, devised 6 metrics, and evaluated 8 various LLMs against it. Our findings reveal that existing LLMs perform sub-optimally in grammar generation. To address this, we propose an LLM-driven hybrid genetic algorithm, namely…
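To make the task setup concrete, the following is a minimal sketch of checking whether a candidate grammar is consistent with a few-shot example set: it should accept every positive example and reject every negative one. This is not HyGenar itself and not one of the paper's metrics. The toy grammar, the dictionary encoding of BNF rules, and the names `make_recognizer` and `is_consistent` are all assumptions made for illustration. The recognizer also assumes the grammar has no empty alternatives and no cycles of single-symbol rules.

```python
from functools import lru_cache

# Hypothetical toy grammar, equivalent to the BNF:
#   <expr>  ::= <digit> | <digit> "+" <expr>
#   <digit> ::= "0" | "1"
# Keys are nonterminals; any symbol that is not a key is treated as a terminal.
GRAMMAR = {
    "expr": [["digit"], ["digit", "+", "expr"]],
    "digit": [["0"], ["1"]],
}


def make_recognizer(grammar, start):
    """Build a membership test for `grammar` (illustrative, not the paper's code).

    Assumes no empty alternatives and no unit-rule cycles, so every symbol
    consumes at least one character and the recursion terminates.
    """

    @lru_cache(maxsize=None)
    def derives(symbol, text):
        if symbol not in grammar:  # terminal: must match the text exactly
            return symbol == text
        return any(match_seq(tuple(alt), text) for alt in grammar[symbol])

    @lru_cache(maxsize=None)
    def match_seq(symbols, text):
        if not symbols:
            return text == ""
        head, rest = symbols[0], symbols[1:]
        if not rest:
            return derives(head, text)
        # Try every split that leaves at least one character per remaining symbol.
        for cut in range(1, len(text) - len(rest) + 1):
            if derives(head, text[:cut]) and match_seq(rest, text[cut:]):
                return True
        return False

    return lambda text: derives(start, text)


def is_consistent(grammar, start, positives, negatives):
    """True if the grammar accepts all positives and rejects all negatives."""
    accepts = make_recognizer(grammar, start)
    return all(accepts(p) for p in positives) and not any(accepts(n) for n in negatives)


if __name__ == "__main__":
    positives = ["1", "0+1", "1+1+0"]
    negatives = ["", "+1", "1+", "11"]
    print(is_consistent(GRAMMAR, "expr", positives, negatives))
```

A check of this kind only tells us whether a grammar fits the given examples. It says nothing about how a candidate grammar is produced or refined, which is the part the paper addresses.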

Code & Models

Repositories

rutatang/hygenar (official)

Videos

HyGenar: An LLM-Driven Hybrid Genetic Algorithm for Few-Shot Grammar Generation (Underline)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems