Modeling Code: Is Text All You Need?

Daniel Nichols; Konstantinos Parasyris; Harshitha Menon; Brian R. Bartoldson; Giorgis Georgakoudis; Tal Ben-Nun; Abhinav Bhatele

arXiv:2507.11467 · cs.AI · July 16, 2025

Open Access

TL;DR

This paper explores combining text-based and structured representations in code models to enhance reasoning and generative capabilities, addressing limitations of current transformer-based models in understanding code structure.

Contribution

The paper introduces an approach that combines text-based and structured modeling of code, aiming to pair the generative capabilities and scale of LLMs with the ability of structured methods to capture properties such as control and data flow.

Findings

01. Enhanced code understanding and generation capabilities

02. Improved reasoning about control and data flow in code

03. Combines strengths of text-based and structured modeling

Abstract

Code LLMs have become extremely popular recently for modeling source code across a variety of tasks, such as generation, translation, and summarization. However, transformer-based models are limited in their ability to reason about structured, analytical properties of code, such as control and data flow. Previous work has explored the modeling of these properties with structured data and graph neural networks. However, these approaches lack the generative capabilities and scale of modern LLMs. In this work, we introduce a novel approach that combines the strengths of modeling code both as text and in more structured forms.

Taxonomy

Topics: Model-Driven Software Engineering Techniques · Natural Language Processing Techniques · Engineering and Information Technology