Open-Source AI-based SE Tools: Opportunities and Challenges of Collaborative Software Learning
Zhihao Lin, Wei Ma, Tao Lin, Yaowen Zheng, Jingquan Ge, Jun Wang, Jacques Klein, Tegawende Bissyande, Yang Liu, Li Li

TL;DR
This paper explores how federated learning can enable open-source AI-based software engineering tools to access diverse data sources while respecting privacy and security, addressing key collaboration challenges.
Contribution
It introduces a federated learning governance framework and guidelines for the collaborative development of AI-based SE tools, taking data heterogeneity into account.
Findings
Federated learning can facilitate privacy-preserving data sharing for AI SE tools.
Data heterogeneity impacts federated learning performance in software engineering.
Guidelines improve collaboration and model maintenance in open-source AI SE projects.
Abstract
Large Language Models (LLMs) have become instrumental in advancing software engineering (SE) tasks, showcasing their efficacy in code understanding and beyond. As with traditional SE tools, open-source collaboration is key to realising excellent products. With AI models, however, the essential need is data. Collaboration on these AI-based SE models hinges on maximising the sources of high-quality data. Yet data, especially high-quality data, often holds commercial or sensitive value, making it less accessible for open-source AI-based SE projects. This reality presents a significant barrier to the development and enhancement of AI-based SE tools within the software engineering community. Therefore, researchers need to find solutions that enable open-source AI-based SE models to tap into resources held by different organisations. Addressing this challenge, our position paper…
Taxonomy
Topics: Software Engineering Techniques and Practices · Collaboration in agile enterprises · Business Process Modeling and Analysis
