PathOCL: Path-Based Prompt Augmentation for OCL Generation with GPT-4
Seif Abukhalaf, Mohammad Hamdaqa, Foutse Khomh

TL;DR
PathOCL is a novel prompt augmentation method that improves OCL generation from UML models by effectively managing prompt size and relevance, enabling GPT-4 to produce more valid constraints on large models.
Contribution
Introduces PathOCL, a path-based prompt augmentation technique that improves OCL generation from UML models by working within LLM token limits and filtering the prompt for relevance.
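This summary does not spell out how PathOCL computes or ranks paths, so the following is only an illustrative sketch of the general idea: pick a path of associated UML classes that is relevant to the English specification and small enough to fit a prompt budget. The toy class model, the word-overlap score, the class-count budget, and all function names are assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations

# Hypothetical toy class model: class name -> directly associated classes.
ASSOCIATIONS = {
    "Customer": {"Order"},
    "Order": {"Customer", "OrderLine"},
    "OrderLine": {"Order", "Product"},
    "Product": {"OrderLine", "Supplier"},
    "Supplier": {"Product"},
}

# Hypothetical English specification to be turned into an OCL constraint.
SPEC = "Every order of a customer must contain at least one product."


def simple_paths(graph, start, goal, path=None):
    """Yield every simple path (no repeated class) from start to goal."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for nxt in sorted(graph[start]):
        if nxt not in path:
            yield from simple_paths(graph, nxt, goal, path)


def all_paths(graph):
    """Single-class paths plus simple paths between every pair of classes."""
    paths = [[c] for c in sorted(graph)]
    for a, b in combinations(sorted(graph), 2):
        paths.extend(simple_paths(graph, a, b))
    return paths


def hits(path, specification):
    """Assumed relevance score: classes on the path named in the spec."""
    words = {w.strip(".,").lower() for w in specification.split()}
    return sum(1 for c in path if c.lower() in words)


def select_path(graph, specification, max_classes):
    """Most relevant path within the budget; shorter paths win ties."""
    candidates = [p for p in all_paths(graph) if len(p) <= max_classes]
    return max(candidates, key=lambda p: (hits(p, specification), -len(p)))


def build_prompt(path, specification):
    """Augment the prompt with the selected classes only, not the full model."""
    return (
        "UML classes (connected in this order): " + " - ".join(path) + "\n"
        "Specification: " + specification + "\n"
        "Write the OCL constraint."
    )


if __name__ == "__main__":
    print(build_prompt(select_path(ASSOCIATIONS, SPEC, 4), SPEC))
```

In this sketch, tightening `max_classes` shrinks the selected path, which is a crude stand-in for a token limit; a real system would count tokens and would likely use a stronger relevance measure than word overlap.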
Findings
PathOCL generates more valid and correct OCL constraints than full UML augmentation.
Prompt size with PathOCL decreases as UML model size increases.
PathOCL improves GPT-4's ability to handle large UML models for OCL generation.
Abstract
The rapid progress of AI-powered programming assistants, such as GitHub Copilot, has facilitated the development of software applications. These assistants rely on large language models (LLMs), which are foundation models (FMs) that support a wide range of tasks related to understanding and generating language. LLMs have demonstrated their ability to express UML model specifications using formal languages like the Object Constraint Language (OCL). However, the context size of the prompt is limited by the number of tokens an LLM can process. This limitation becomes significant as the size of UML class models increases. In this study, we introduce PathOCL, a novel path-based prompt augmentation technique designed to facilitate OCL generation. PathOCL addresses the limitations of LLMs, specifically their token processing limit and the challenges posed by large UML class models. PathOCL is…
Taxonomy
Topics: Software Testing and Debugging Techniques · Logic, programming, and type systems
Methods: Dense Connections · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Residual Connection · Absolute Position Encodings · Byte Pair Encoding · Adam · Dropout · Softmax
