Better Modeling the Programming World with Code Concept Graphs-augmented Multi-modal Learning
Martin Weyssow, Houari Sahraoui, Bang Liu

TL;DR
This paper proposes enhancing code modeling by integrating concept graphs with pretrained language models, aiming to leverage high-level domain relationships for improved software engineering tasks.
Contribution
It introduces a novel approach that combines concept graphs with pretrained language models of code to better capture high-level relationships between domain concepts in software engineering tasks.
Findings
Preliminary results show improved code search effectiveness.
Joint-learning with concept graphs enhances model understanding.
The results encourage further exploration of multi-modal learning in code modeling.
Abstract
Progress in code modeling has been tremendous in recent years, thanks to natural language processing learning approaches built on state-of-the-art model architectures. Nevertheless, we believe that the current state of the art does not focus enough on the full potential that data may bring to a learning process in software engineering. Our vision centers on the idea of leveraging multi-modal learning approaches to model the programming world. In this paper, we investigate one of the underlying ideas of our vision, whose objective, based on concept graphs of identifiers, is to leverage high-level relationships between domain concepts manipulated through particular language constructs. In particular, we propose to enhance an existing pretrained language model of code by jointly training it with a graph neural network based on our concept graphs. We conducted a…
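The abstract does not say how the concept graphs are built or how the two models are combined, so the following is only a toy sketch of the general idea, not the authors' method. It assumes (hypothetically) that concepts are the sub-words of identifiers and that two concepts are linked when they appear in the same identifier. A single untrained mean-aggregation step stands in for a graph neural network, and plain concatenation stands in for joint learning with a pretrained language model. All function names are illustrative.

```python
import re
from itertools import combinations


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase concept tokens."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", name)
    return [p.lower() for p in parts]


def build_concept_graph(identifiers):
    """Assumed construction: link concepts that co-occur in one identifier."""
    graph = {}
    for name in identifiers:
        concepts = sorted(set(split_identifier(name)))
        for concept in concepts:
            graph.setdefault(concept, set())  # keep isolated concepts as nodes
        for a, b in combinations(concepts, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def mean_aggregate(graph, features):
    """One message-passing step: average a node's vector with its neighbors'."""
    updated = {}
    for node, neighbors in graph.items():
        group = [features[node]] + [features[n] for n in sorted(neighbors)]
        dim = len(features[node])
        updated[node] = [sum(vec[i] for vec in group) / len(group) for i in range(dim)]
    return updated


def fuse(code_embedding, node_features):
    """Mean-pool the node vectors and concatenate them with a code embedding.

    The code embedding is a placeholder for the output of a pretrained
    language model of code; concatenation is an assumption of this sketch.
    """
    vectors = list(node_features.values())
    dim = len(vectors[0])
    graph_embedding = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return list(code_embedding) + graph_embedding


# Hypothetical identifiers extracted from a code snippet.
graph = build_concept_graph(["getUserName", "user_id", "setUserName"])
features = {concept: [float(len(concept))] for concept in graph}  # dummy features
joint_vector = fuse([0.5, 0.5], mean_aggregate(graph, features))
```

In a real system the node features, the aggregation weights, and the fusion layer would all be learned jointly with the language model, which is the part this sketch deliberately leaves out.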
Taxonomy
Methods: Graph Neural Network
