A Systematic Mapping Study of Crowd Knowledge Enhanced Software Engineering Research Using Stack Overflow
Minaoar Tanzil, Shaiful Chowdhury, Somayeh Modaberi, Gias Uddin, and Hadi Hemmati

TL;DR
This systematic mapping study analyzes how Stack Overflow data has been utilized in software engineering research, highlighting trends, key domains, and future research opportunities based on 384 papers.
Contribution
It provides a comprehensive categorization and analysis of SO-based SE research, revealing dominant themes, domains, and potential impactful areas for future investigation.
Findings
SO contributes to 85% of SE research using Q&A sites
Recommender Systems and API Design are the most studied domains
Deep Learning and Code Cloning have high research impact potential
Abstract
Developers continuously interact on crowd-sourced, community-based question-answer (Q&A) sites. Reportedly, 30% of all software professionals visit the most popular Q&A site, StackOverflow (SO), every day. Software engineering (SE) research studies are also increasingly using SO data. To identify the trends, implications, impact, and future research potential of SO data, a systematic mapping study needs to be conducted. Following a rigorous, reproducible mapping study approach, we collected 384 SO-based research articles from 18 reputed SE journals and conferences and categorized them into 10 facets (i.e., themes). We found that SO contributes to 85% of SE research that uses Q&A sites, compared with other popular Q&A sites such as Quora and Reddit. We found that 18 SE domains directly benefited from SO data, with the Recommender Systems and the API Design and Evolution domains using SO data the most (15% and 16% of…
Taxonomy
Topics: Open Source Software Innovations · Technology Adoption and User Behaviour · Online Learning and Analytics
