Morescient GAI for Software Engineering (Extended Version)
Marcus Kessel, Colin Atkinson

TL;DR
This paper proposes a new class of Generative AI models, called 'Morescient', that are trained on both the semantic and static facets of software, with the aim of improving trustworthiness in software engineering tasks that depend on software semantics.
Contribution
It introduces the concept of 'Morescient' GAI models that incorporate semantic awareness and outlines a roadmap for their development and open dissemination.
Findings
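Identifies a weakness shared by most existing code models: they are trained exclusively on the syntactic facet of software, which lowers their trustworthiness in tasks that depend on software semantics.
Proposes a new class of models trained on both the semantic and static facets of software.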
Suggests a new platform for generating structured execution observations.
Abstract
The ability of Generative AI (GAI) technology to automatically check, synthesize and modify software engineering artifacts promises to revolutionize all aspects of software engineering. Using GAI for software engineering tasks is consequently one of the most rapidly expanding fields of software engineering research, with over a hundred LLM-based code models having been published since 2021. However, the overwhelming majority of existing code models share a major weakness - they are exclusively trained on the syntactic facet of software, significantly lowering their trustworthiness in tasks dependent on software semantics. To address this problem, a new class of "Morescient" GAI is needed that is "aware" of (i.e., trained on) both the semantic and static facets of software. This, in turn, will require a new generation of software observation platforms capable of generating large…
Taxonomy
Topics: Distributed and Parallel Computing Systems
