ConCodeEval: Evaluating Large Language Models for Code Constraints in Domain-Specific Languages

Mehant Kammakomati; Sameer Pimparkhede; Srikanth Tamilselvam; Prince Kumar; Pushpak Bhattacharyya

arXiv:2407.03387 · cs.SE · March 25, 2025

TL;DR

ConCodeEval is a benchmark designed to evaluate large language models' ability to understand and adhere to code constraints in domain-specific languages like JSON and YAML, revealing significant challenges in controllability.

Contribution

This work introduces the first benchmark for assessing LLMs' understanding of code constraints in domain-specific languages across multiple representations.

Findings

01. LLMs struggle with code constraints in DSLs.

02. High performance in normal code tasks does not translate to constraint understanding.

03. LLMs show limited controllability over code constraints.

Abstract

Recent work shows that Large Language Models (LLMs) struggle to understand natural language constraints for various text generation tasks in zero- and few-shot settings. In the code domain, meanwhile, constraints expressed in code format are widely used to maintain the integrity of code written in Domain-Specific Languages (DSLs) like JSON and YAML, which are common in system-level programming tasks in enterprises. Given that LLMs are increasingly used for system-level code tasks, evaluating whether they can comprehend these code constraints is crucial. However, no work has been done to evaluate their controllability over code constraints. Hence, we introduce ConCodeEval, a first-of-its-kind benchmark with two novel tasks for code constraints across five representations. Our findings suggest that language models struggle with code constraints. Code languages that perform excellently for normal…
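To make "constraints in code format" concrete, here is a minimal Python sketch of a schema-like constraint over a JSON document and a check of whether a sample satisfies it. The schema, the field names (`name`, `replicas`), and the `violations` helper are all hypothetical illustrations, not taken from the benchmark. The checker covers only a small subset of what a real schema validator does.

```python
import json

# Hypothetical constraint, written in a JSON-Schema-like style.
# Illustrative only; not an item from the ConCodeEval benchmark.
SCHEMA = json.loads("""
{
  "type": "object",
  "required": ["name", "replicas"],
  "properties": {
    "name": {"type": "string", "maxLength": 10},
    "replicas": {"type": "integer", "minimum": 1, "maximum": 5}
  }
}
""")


def violations(sample, schema):
    """Return a list of constraint violations (small subset of checks)."""
    if not isinstance(sample, dict):
        return ["sample is not an object"]

    errors = []
    for key in schema.get("required", []):
        if key not in sample:
            errors.append(f"missing required field: {key}")

    for key, rule in schema.get("properties", {}).items():
        if key not in sample:
            continue
        value = sample[key]
        if rule["type"] == "string":
            if not isinstance(value, str):
                errors.append(f"{key}: expected string")
            elif "maxLength" in rule and len(value) > rule["maxLength"]:
                errors.append(
                    f"{key}: length {len(value)} exceeds maxLength {rule['maxLength']}"
                )
        elif rule["type"] == "integer":
            # bool is a subclass of int in Python, so exclude it explicitly.
            if not isinstance(value, int) or isinstance(value, bool):
                errors.append(f"{key}: expected integer")
            else:
                if "minimum" in rule and value < rule["minimum"]:
                    errors.append(f"{key}: {value} is below minimum {rule['minimum']}")
                if "maximum" in rule and value > rule["maximum"]:
                    errors.append(f"{key}: {value} is above maximum {rule['maximum']}")
    return errors


if __name__ == "__main__":
    print(violations({"name": "api", "replicas": 3}, SCHEMA))
    print(violations({"name": "api", "replicas": 9}, SCHEMA))
```

A sample that satisfies every constraint yields an empty list; each violated constraint adds one message. Fine-grained conditions of this kind (required fields, length limits, numeric ranges) are an example of what a model would have to respect when producing or judging JSON or YAML against a schema.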

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling