Kajal: Extracting Grammar of a Source Code Using Large Language Models
Mohammad Jalili Torkamani

TL;DR
Kajal uses large language models with prompt engineering and few-shot learning to infer domain-specific language (DSL) grammars from code snippets automatically, reducing manual effort. It reaches 60% accuracy with few-shot learning, compared with 45% without it.
Contribution
This paper introduces Kajal, a novel method that uses LLMs and few-shot learning to automatically extract DSL grammars from code, enhancing automation in software engineering tasks.
Findings
Kajal achieves 60% accuracy with few-shot learning.
Accuracy drops to 45% without few-shot learning.
Few-shot learning improves grammar extraction accuracy by 15 percentage points (60% versus 45%).
Abstract
Understanding and extracting the grammar of a domain-specific language (DSL) is crucial for various software engineering tasks; however, manually creating these grammars is time-intensive and error-prone. This paper presents Kajal, a novel approach that automatically infers grammar from DSL code snippets by leveraging Large Language Models (LLMs) through prompt engineering and few-shot learning. Kajal dynamically constructs input prompts, using contextual information to guide the LLM in generating the corresponding grammars, which are iteratively refined through a feedback-driven approach. Our experiments show that Kajal achieves 60% accuracy with few-shot learning and 45% without it, demonstrating the significant impact of few-shot learning on the tool's effectiveness. This approach offers a promising solution for automating DSL grammar extraction, and future work will explore using…
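The prompt construction and feedback-driven refinement described in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: the names `build_prompt`, `infer_grammar`, `fake_llm`, and `fake_validate`, the prompt wording, and the `max_rounds` limit are all hypothetical, and the LLM and grammar validator are replaced by toy stand-ins so the example runs on its own.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical types for illustration; none of these names come from the paper.
Example = Tuple[str, str]                        # (DSL snippet, known grammar)
LLM = Callable[[str], str]                       # prompt -> candidate grammar
Validator = Callable[[str, str], Optional[str]]  # (grammar, snippet) -> error or None


def build_prompt(snippet: str, examples: List[Example],
                 feedback: Optional[str] = None) -> str:
    """Assemble a prompt from few-shot examples, optional feedback, and the target snippet."""
    parts = ["Infer a grammar for the DSL snippet below."]
    for code, grammar in examples:               # few-shot demonstrations (may be empty)
        parts.append(f"Snippet:\n{code}\nGrammar:\n{grammar}")
    if feedback:                                 # error from the previous attempt, if any
        parts.append(f"The previous grammar failed: {feedback}. Please fix it.")
    parts.append(f"Snippet:\n{snippet}\nGrammar:")
    return "\n\n".join(parts)


def infer_grammar(snippet: str, examples: List[Example], llm: LLM,
                  validate: Validator, max_rounds: int = 3) -> Tuple[str, bool]:
    """Ask the LLM for a grammar, re-prompting with validator feedback until it passes."""
    feedback: Optional[str] = None
    grammar = ""
    for _ in range(max_rounds):
        grammar = llm(build_prompt(snippet, examples, feedback))
        feedback = validate(grammar, snippet)    # None means the grammar was accepted
        if feedback is None:
            return grammar, True
    return grammar, False                        # last attempt, still failing


# Toy stand-ins (assumptions): a real setup would call an LLM API and
# check the candidate grammar by parsing the snippet with it.
def fake_llm(prompt: str) -> str:
    return "start: stmt+" if "previous grammar failed" in prompt else "stmt+"


def fake_validate(grammar: str, snippet: str) -> Optional[str]:
    return None if grammar.startswith("start:") else "missing start rule"


grammar, ok = infer_grammar("x = 1", [("y = 2", "start: assign")],
                            fake_llm, fake_validate)
```

Passing an empty `examples` list gives the zero-shot variant of the same loop, which is the setting the abstract contrasts with few-shot prompting.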
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis
