Dev2vec: Representing Domain Expertise of Developers in an Embedding Space
Arghavan Moradi Dakhel, Michel C. Desmarais, Foutse Khomh

TL;DR
This paper introduces dev2vec, a method that uses embedding vectors derived from various developer activity sources to accurately represent and assess developers' domain expertise across multiple projects.
Contribution
The paper presents dev2vec, a novel approach employing doc2vec to encode developers' expertise from diverse sources, outperforming existing methods in representing technical specialization.
Findings
Encoding developer expertise as embeddings improves F1-score by up to 21%.
Issue resolving history is the most informative source for expertise representation.
Dev2vec outperforms state-of-the-art methods in accuracy.
Abstract
Accurate assessment of the domain expertise of developers is important for assigning the proper candidate to contribute to a project or to fill a job role. Since potential candidates can come from a large pool, the automated assessment of this domain expertise is a desirable goal. While previous methods have had some success within a single software project, the assessment of a developer's domain expertise from contributions across multiple projects is more challenging. In this paper, we employ doc2vec to represent the domain expertise of developers as embedding vectors. These vectors are derived from different sources that contain evidence of developers' expertise, such as the descriptions of repositories that they contributed to, their issue resolving history, and API calls in their commits. We name it dev2vec and demonstrate its effectiveness in representing the technical…
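The pipeline the abstract describes (collect text evidence per developer, turn it into one vector, compare vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: to stay dependency-free, doc2vec is replaced here by a plain L2-normalised bag-of-words vector, and the developers, texts, and function names (`developer_document`, `embed_all`, `cosine`) are all hypothetical. In practice, the token lists built below would instead be used as tagged training documents for a doc2vec model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def developer_document(repo_descriptions, resolved_issues, api_calls):
    """Merge the three evidence sources into one token list per developer."""
    tokens = []
    for text in list(repo_descriptions) + list(resolved_issues):
        tokens.extend(tokenize(text))
    # API calls are kept whole (e.g. "torch.nn.linear") rather than split.
    tokens.extend(call.lower() for call in api_calls)
    return tokens


def embed_all(documents):
    """Stand-in for doc2vec: one L2-normalised count vector per developer."""
    vocab = sorted({tok for doc in documents.values() for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = {}
    for name, doc in documents.items():
        vec = [0.0] * len(vocab)
        for tok, count in Counter(doc).items():
            vec[index[tok]] = float(count)
        norm = math.sqrt(sum(v * v for v in vec))
        vectors[name] = [v / norm for v in vec] if norm else vec
    return vectors


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical developers and activity records, for illustration only.
documents = {
    "alice": developer_document(
        ["Image classification with neural networks"],
        ["Fix gradient overflow in training loop"],
        ["torch.nn.Linear", "numpy.dot"],
    ),
    "bob": developer_document(
        ["Neural networks for text classification"],
        ["Training loop crashes on empty batch"],
        ["torch.nn.Linear"],
    ),
    "carol": developer_document(
        ["Responsive dashboard frontend"],
        ["Button misaligned on mobile layout"],
        ["react.useState"],
    ),
}

vectors = embed_all(documents)
print("alice vs bob:  ", round(cosine(vectors["alice"], vectors["bob"]), 3))
print("alice vs carol:", round(cosine(vectors["alice"], vectors["carol"]), 3))
```

In this toy data, alice and bob share several machine-learning terms and an API call, so their vectors are closer to each other than either is to carol's. A learned embedding such as doc2vec would additionally capture similarity between related but non-identical terms, which a count vector cannot.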
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Open Source Software Innovations
