What Makes Code Generation Ethically Sourced?
Zhuolin Xu, Chenglin Li, Qiushi Li, Shin Hwei Tan

TL;DR
This paper introduces the concept of Ethically Sourced Code Generation (ES-CodeGen), providing a comprehensive taxonomy based on literature review and practitioner surveys to promote responsible AI practices in code generation.
Contribution
It develops a novel taxonomy of ES-CodeGen by analyzing 803 papers and surveying practitioners, highlighting the importance of ethical and sustainable practices in code generation.
Findings
Identified 11 key dimensions of ES-CodeGen, including code quality.
Most practitioners overlook social dimensions despite their importance.
The survey improved practitioners' understanding of ethical sourcing in code generation.
Abstract
Several code generation models have been proposed to help reduce time and effort in solving software-related tasks. To ensure responsible AI, there is growing interest in various ethical issues (e.g., unclear licensing, privacy, fairness, and environmental impact). These studies share the overarching goal of ensuring ethically sourced generation, which has gained growing attention in speech synthesis and image generation. In this paper, we introduce the novel notion of Ethically Sourced Code Generation (ES-CodeGen) to refer to managing all processes involved in code generation model development, from data collection to post-deployment, via ethical and sustainable practices. To build a taxonomy of ES-CodeGen, we perform a two-phase literature review where we read 803 papers across various domains and specific to AI-based code generation. We identified 71 relevant papers with 10 initial…
