I Can't Share Code, but I need Translation -- An Empirical Study on Code Translation through Federated LLM
Jahnavi Kumar, Venkata Lakshmana Sasaank Janapati, Mokshith Reddy Tanguturi, Sridhar Chimalakonda

TL;DR
This paper presents an empirical study on federated learning for code translation using large language models, demonstrating improved translation accuracy while preserving data privacy.
Contribution
It introduces a novel federated LLM approach for code translation, enabling collaborative training without sharing sensitive code data.
Findings
Over 40% improvement in CodeBLEU score with FedLLM
Effective translation between C# and Java
Federated approach preserves data privacy while enhancing performance
Abstract
Owing to the rapid evolution of technologies and project requirements, organizations need to upgrade the code base in their software projects to a new version of the programming language or even translate it to an entirely new one. However, code translation is resource-intensive and requires expertise in both the source and target languages. While researchers have made progress in automating translations between legacy and modern languages, recent work has increasingly turned to pre-trained Large Language Models (LLMs) to translate efficiently. Given the proprietary nature of code, organizations prefer fine-tuning LLMs locally rather than relying on external APIs. This is one of the first empirical studies that proposes a Federated LLM-based approach for code translation. The proposed approach enables clients to jointly train a code translator without sharing sensitive data. This study…
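The idea of clients jointly training a translator without exchanging code can be illustrated with a minimal sketch of weighted parameter averaging in the style of FedAvg. This excerpt does not say which aggregation rule the paper uses, so FedAvg is an assumption here, and the function name `federated_average`, the parameter name `adapter`, and the toy values are hypothetical. Each client fine-tunes on its private code and sends only parameter values to the server, which averages them in proportion to each client's local dataset size.

```python
from typing import Dict, List

# A model is represented as named lists of floats (a stand-in for real tensors).
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Average client parameters, weighted by local dataset size.

    Assumption: FedAvg-style aggregation; the paper's actual rule is not
    stated in this excerpt. Only parameters reach the server, never code.
    """
    total = sum(client_sizes)
    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Two hypothetical clients with privately fine-tuned parameters.
client_a = {"adapter": [1.0, 2.0]}   # trained on 1 local example
client_b = {"adapter": [3.0, 4.0]}   # trained on 3 local examples
global_weights = federated_average([client_a, client_b], [1, 3])
print(global_weights)
```

In a full training loop the server would send `global_weights` back to the clients for another round of local fine-tuning; that loop is omitted here to keep the sketch short.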
Taxonomy
Topics: Digital Rights Management and Security
