Augmenting the Generality and Performance of Large Language Models for Software Engineering

Fabian C. Peña

arXiv:2506.11548·cs.SE·June 16, 2025

TL;DR

This paper explores enhancing large language models for broader software engineering tasks beyond code, focusing on understanding their capabilities, evaluating their knowledge, and detecting hallucinations to improve their utility.

Contribution

The expected contributions are new benchmarks on foundational SE knowledge, a variety of LLMs trained and evaluated on domain-specific datasets for non-code SE tasks, and methods for detecting hallucinations in SE statements.

Findings

01

Promising initial performance improvements on various non-code SE tasks

02

Effective hallucination detection methods

03

Evaluation of LLMs as sources of foundational SE knowledge

Abstract

Large Language Models (LLMs) are revolutionizing software engineering (SE), with special emphasis on code generation and analysis. However, their applications to broader SE practices, including conceptualization, design, and other non-code tasks, remain underexplored. This research aims to augment the generality and performance of LLMs for SE by (1) advancing the understanding of how LLMs with different characteristics perform on various non-code tasks, (2) evaluating them as sources of foundational knowledge in SE, and (3) effectively detecting hallucinations in SE statements. The expected contributions include a variety of LLMs trained and evaluated on domain-specific datasets, new benchmarks on foundational knowledge in SE, and methods for detecting hallucinations. Initial results, measured as performance improvements on various non-code tasks, are promising.

Taxonomy

Topics: Topic Modeling · Software System Performance and Reliability · Business Process Modeling and Analysis