# On Code Reuse from StackOverflow: An Exploratory Study on Jupyter Notebook

**Authors:** Mingke Yang, Yuming Zhou, Bixin Li, Yutian Tang

arXiv: 2302.11732 · 2023-02-24

## TL;DR

This study explores how data scientists reuse code snippets from Stack Overflow in Jupyter Notebooks, revealing both the prevalence of reuse and its potential negative impacts on code quality and security.

## Contribution

It provides the first large-scale analysis of how Jupyter Notebooks reuse code snippets from Stack Overflow, highlighting the code quality violations associated with that reuse and the motivations of the developers behind it.

## Key findings

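- Identified 1,097,470 Jupyter Notebook clone pairs that reuse Stack Overflow code snippets.
- The average code snippet has 7.91 code quality violations.
- Offers insight into why Jupyter Notebook developers reuse code and the potential drawbacks of doing so.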
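
To make the term "clone pair" concrete, here is a minimal sketch of how a Stack Overflow snippet and a notebook cell could be flagged as a pair. This summary does not say which clone detector the authors used, so the token-set Jaccard similarity, the 0.8 threshold, and the names `token_set`, `jaccard`, and `is_clone_pair` are all illustrative assumptions, not the paper's method.

```python
import re

# Assumption: a crude lexer that splits code into identifiers, integers,
# and single punctuation characters. A real clone detector would use a
# proper tokenizer and normalization.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def token_set(code: str) -> set:
    """Return the set of distinct tokens found in a code string."""
    return set(TOKEN_RE.findall(code))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the two token sets (1.0 if both are empty)."""
    ta, tb = token_set(a), token_set(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def is_clone_pair(snippet: str, cell: str, threshold: float = 0.8) -> bool:
    """Flag a (snippet, cell) pair as a clone if similarity meets the threshold.

    The 0.8 threshold is an arbitrary value chosen for this illustration.
    """
    return jaccard(snippet, cell) >= threshold


# Hypothetical example data: a Stack Overflow answer and a notebook cell
# that differ only in the file name.
so_snippet = "df = pd.read_csv('data.csv')\ndf.head()"
notebook_cell = "df = pd.read_csv('train.csv')\ndf.head()"
unrelated_cell = "import os\nprint(os.getcwd())"

print(is_clone_pair(so_snippet, notebook_cell))
print(is_clone_pair(so_snippet, unrelated_cell))
```

Under these assumptions the near-identical cell is flagged and the unrelated one is not. Comparing sets of tokens ignores token order and frequency, which keeps the sketch short but would over-match on very small snippets.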

## Abstract

Jupyter Notebook is a popular tool among data analysts and scientists for working with data. It provides a way to combine code, documentation, and visualizations in a single, interactive environment, facilitating code reuse. While code reuse can improve programming efficiency, it can also decrease readability, security, and overall performance. We conduct a large-scale exploratory study of code reuse practices in the Jupyter Notebook development community on the Stack Overflow platform to understand the potential negative impacts of code reuse. Our findings identified 1,097,470 Jupyter Notebook clone pairs that reuse Stack Overflow code snippets, and the average code snippet has 7.91 code quality violations. Through our research, we gain insight into the reasons behind Jupyter Notebook developers' decision to reuse code and the potential drawbacks of this practice.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.11732/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/2302.11732/full.md

## References

72 references — full list in the complete paper: https://tomesphere.com/paper/2302.11732/full.md

---
Source: https://tomesphere.com/paper/2302.11732