PELLI: Framework to effectively integrate LLMs for quality software generation

Rasmus Krebs; Somnath Mazumdar

arXiv:2602.10808·cs.SE·February 12, 2026

Open Access

TL;DR

This paper introduces PELLI, a comprehensive framework for evaluating and integrating LLMs into software development by assessing multiple nonfunctional quality metrics across different domains.

Contribution

The paper presents PELLI, a novel iterative assessment framework that evaluates LLM-generated code on maintainability, performance, and reliability, extending prior work focused mainly on reliability.

Findings

01. GPT-4T and Gemini performed slightly better across metrics.

02. Prompt design significantly influences code quality.

03. Application domains show varied scores across metrics.

Abstract

Recent studies have revealed that LLMs, when appropriately prompted and configured, demonstrate mixed results, often meeting or exceeding baseline performance. However, these comparisons have two primary issues. First, they mostly considered only reliability as a comparison metric. Second, they selected only a few LLMs (such as Codex and ChatGPT) for comparison. This paper proposes a comprehensive code quality assessment framework called Programmatic Excellence via LLM Iteration (PELLI). PELLI is an iterative analysis-based process that upholds high-quality code changes. We extended the state-of-the-art by performing a comprehensive evaluation that generates quantitative metrics for analyzing three primary nonfunctional requirements (maintainability, performance, and reliability) across five popular LLMs. For PELLI's applicability, we selected three application…

Taxonomy

Topics: Software Engineering Research · Artificial Intelligence in Healthcare and Education · Software Engineering Techniques and Practices