An Empirical Study on Capability of Large Language Models in Understanding Code Semantics

Thu-Trang Nguyen; Thanh Trong Vu; Hieu Dinh Vo; Son Nguyen

arXiv:2407.03611·cs.SE·July 8, 2024

Open Access

TL;DR

This study systematically evaluates large language models' ability to understand code semantics by testing their robustness and sensitivity to code transformations across multiple software engineering tasks.

Contribution

Introduces EMPICA, a framework for empirically assessing code LLMs' semantic understanding through controlled code modifications.

Findings

01. Models are more robust to semantic-preserving transformations than they are sensitive to non-semantic-preserving ones.

02. Sensitivity to non-semantic-preserving transformations varies across tasks.

03. Significant gaps remain in models' understanding of code semantics.
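To make the two properties in these findings concrete, here is a minimal Python sketch of how robustness and sensitivity could be scored. It assumes a simple definition that is not taken from the paper: robustness is the share of semantic-preserving transformations that leave the model's prediction unchanged, and sensitivity is the share of non-semantic-preserving transformations that change it. The `Trial` record, the function names, and the toy values are hypothetical placeholders, not results from the study.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One (original, transformed) input pair. Hypothetical record, not from the paper."""
    preserves_semantics: bool   # kind of transformation that was applied
    prediction_changed: bool    # did the model's output change after the transformation?


def _rate(flags):
    """Fraction of True values; refuses to score an empty group."""
    flags = list(flags)
    if not flags:
        raise ValueError("no trials of this kind")
    return sum(flags) / len(flags)


def robustness(trials):
    # Assumed definition: prediction stays the same under semantic-preserving changes.
    return _rate(not t.prediction_changed for t in trials if t.preserves_semantics)


def sensitivity(trials):
    # Assumed definition: prediction changes under non-semantic-preserving changes.
    return _rate(t.prediction_changed for t in trials if not t.preserves_semantics)


# Toy placeholder values for illustration only; these are not measurements.
toy_trials = [
    Trial(True, False), Trial(True, False), Trial(True, False), Trial(True, True),
    Trial(False, True), Trial(False, True), Trial(False, False), Trial(False, False),
]

print(f"robustness  = {robustness(toy_trials):.2f}")
print(f"sensitivity = {sensitivity(toy_trials):.2f}")
```

Scoring the two groups separately keeps a model from looking good simply by never changing its output: such a model would score perfectly on robustness and zero on sensitivity.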

Abstract

Large Language Models for Code (code LLMs) have demonstrated remarkable performance across various software engineering (SE) tasks, increasing the application of code LLMs in software development. Despite this success, there remain significant concerns about the actual capabilities and reliability of these models: "whether these models really learn the semantics of code from the training data and leverage the learned knowledge to perform the SE tasks". In this paper, we introduce EMPICA, a comprehensive framework designed to systematically and empirically evaluate the capabilities of code LLMs in understanding code semantics. Specifically, EMPICA systematically introduces controlled modifications/transformations into the input code and examines the models' responses. Generally, code LLMs must be robust to semantically equivalent code inputs and be sensitive to non-equivalent…
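The idea of a controlled transformation can be illustrated with a short Python sketch. It is not EMPICA's implementation: the sample function `max_of`, the two transformer classes, and the helper names are assumptions chosen for illustration, built on Python's standard `ast` module (Python 3.9+ for `ast.unparse`). One transformation renames a variable and leaves behavior intact; the other flips a comparison operator and changes behavior.

```python
import ast

# Hypothetical sample input; any small function would do.
SOURCE = """
def max_of(a, b):
    if a > b:
        return a
    return b
"""


class RenameVariable(ast.NodeTransformer):
    """Semantic-preserving: consistently rename one identifier."""

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):
        if node.arg == self.old:
            node.arg = self.new
        return node


class FlipComparison(ast.NodeTransformer):
    """Non-semantic-preserving: replace every `>` with `<`."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.Lt() if isinstance(op, ast.Gt) else op for op in node.ops]
        return node


def transform(source, transformer):
    """Parse the source, apply one transformer, and return the new source text."""
    tree = transformer.visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


def run(source, *args):
    """Execute the source and call its `max_of` function with the given arguments."""
    namespace = {}
    exec(source, namespace)
    return namespace["max_of"](*args)


renamed = transform(SOURCE, RenameVariable("a", "x"))
flipped = transform(SOURCE, FlipComparison())

print(renamed)
print(flipped)
print(run(SOURCE, 3, 5), run(renamed, 3, 5), run(flipped, 3, 5))
```

In an evaluation of this kind, each variant would be passed to the model under test in place of the original: a model that understands the code should give an equivalent answer for `renamed` and a different one for `flipped`.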

Taxonomy

Topics: Topic Modeling · Semantic Web and Ontologies · Natural Language Processing Techniques