Unveiling the potential of large language models in generating semantic and cross-language clones
Palash R. Roy, Ajmain I. Alam, Farouq Al-omari, Banani Roy, Chanchal K. Roy, Kevin A. Schneider

TL;DR
This paper evaluates GPT-3's ability to generate semantic and cross-language code clones, demonstrating promising accuracy and potential for aiding code reuse, comprehension, and refactoring across different programming languages.
Contribution
It introduces a novel evaluation of GPT-3 for semantic and cross-language clone generation using SemanticCloneBench, highlighting its strengths and challenges in this domain.
Findings
GPT-3 achieves 62.14% accuracy in semantic clone generation.
GPT-3 attains 91.25% accuracy in cross-language clone generation.
Few-shot prompt engineering enhances GPT-3's code variant quality.
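As an illustration of what few-shot prompting for clone generation might look like, here is a minimal sketch that assembles a prompt from a handful of (original, clone) example pairs followed by the fragment to be cloned. The function name `build_few_shot_prompt`, the `### Original` / `### Semantic clone` section headers, and the example pairs are all hypothetical; the summary above does not give the actual prompt format used in the paper.

```python
def build_few_shot_prompt(examples, target_fragment, target_language):
    """Assemble a few-shot prompt from (original, clone) example pairs.

    The section headers below are an assumed layout, not the paper's format.
    """
    parts = []
    for original, clone in examples:
        # Each demonstration shows the model an input and the desired output.
        parts.append(
            f"### Original\n{original}\n"
            f"### Semantic clone ({target_language})\n{clone}\n"
        )
    # The final block leaves the clone empty for the model to complete.
    parts.append(
        f"### Original\n{target_fragment}\n"
        f"### Semantic clone ({target_language})\n"
    )
    return "\n".join(parts)


# Hypothetical example pair: same behaviour, different implementation.
demo_examples = [
    (
        "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s",
        "def total(xs):\n    return sum(xs)",
    ),
]
demo_prompt = build_few_shot_prompt(
    demo_examples,
    "def biggest(xs):\n    return sorted(xs)[-1]",
    "Python",
)
```

The resulting string would then be sent to a text-completion model; for cross-language generation, the same layout could be reused with `target_language` set to a language different from that of the original fragment.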
Abstract
Semantic and cross-language code clone generation may be useful for code reuse, code comprehension, refactoring and benchmarking. OpenAI's GPT model has potential in such clone generation, as GPT is used for text generation. When developers copy and paste code from Stack Overflow (SO) or within a system, inconsistent changes may follow, leading to unexpected behaviours. Similarly, if someone has a code snippet in one programming language but needs equivalent functionality in a different language, a semantic cross-language code clone generation approach could provide valuable assistance. In this study, using SemanticCloneBench as a vehicle, we evaluated how well the GPT-3 model could help generate semantic and cross-language clone variants for a given fragment. We have compiled a diverse set of code fragments and assessed GPT-3's performance in generating code…
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Scientific Computing and Data Management
Methods: Multi-Head Attention · Cosine Annealing · Softmax · Dropout · Linear Warmup With Cosine Annealing · Layer Normalization · Linear Layer
