Stack Overflow Meets Replication: Security Research Amid Evolving Code Snippets (Extended Version)
Alfusainey Jallow, Sven Bugiel

TL;DR
This paper investigates how the evolving nature of Stack Overflow code snippets affects the stability of research findings, emphasizing the importance of considering temporal dynamics in future studies.
Contribution
It systematically analyzes the impact of code evolution on research reproducibility and provides guidelines for incorporating time series analysis in Stack Overflow-based studies.
Findings
Four out of six replicated studies showed significantly different results with newer data.
Certain code snippet aspects are non-stationary over time, affecting research conclusions.
Recommendations are provided for treating Stack Overflow data as a time series source.
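As a rough illustration of what "significantly different results with newer data" can mean in practice, the sketch below compares a single proportion-style metric (for example, the share of snippets flagged by some detector) between two dataset snapshots using a two-proportion z-test. The choice of test, the function name `two_proportion_z_test`, and all counts are assumptions made for illustration; none of them are taken from the paper.

```python
import math


def two_proportion_z_test(hits_a, total_a, hits_b, total_b):
    """Return (z, two-sided p-value) for H0: both snapshots share one proportion.

    hits_*  : number of snippets with the property of interest in a snapshot
    total_* : number of snippets examined in that snapshot
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("snapshot sizes must be positive")

    p_a = hits_a / total_a
    p_b = hits_b / total_b

    # Pooled proportion under the null hypothesis of no change between snapshots.
    pooled = (hits_a + hits_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("pooled proportion is 0 or 1; the test is undefined")

    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts for an older and a newer snapshot (not data from the paper).
older_hits, older_total = 300, 1000
newer_hits, newer_total = 240, 1000

z, p = two_proportion_z_test(older_hits, older_total, newer_hits, newer_total)
print(f"z = {z:.3f}, p = {p:.4f}")
```

Repeating such a comparison across several dated snapshots, rather than just two, is one simple way to view the metric as a time series instead of a single measurement.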
Abstract
We study the impact of Stack Overflow code evolution on the stability of prior research findings derived from Stack Overflow data and provide recommendations for future studies. We systematically reviewed papers published between 2005 and 2023 to identify key aspects of Stack Overflow that can affect study results, such as the language or context of code snippets. Our analysis reveals that certain aspects are non-stationary over time, which could lead to different conclusions if experiments are repeated at different times. We replicated six studies using a more recent dataset to demonstrate this risk. Our findings show that four of the replications produced results that differ significantly from the original findings, so the same conclusions cannot be drawn from a newer dataset version. Consequently, we recommend treating Stack Overflow as a time series data source to provide context for…
Taxonomy
Topics: Information and Cyber Security · Web Application Security Vulnerabilities · Digital and Cyber Forensics
