# R-miss-tastic: a unified platform for missing values methods and workflows

**Authors:** Imke Mayer, Aude Sportisse, Julie Josse, Nicholas Tierney, Nathalie Vialaneix

arXiv: 1908.04822 · 2024-06-18

## TL;DR

R-miss-tastic is a platform that gathers methods, workflows, and teaching resources for handling missing data in statistical analysis, helping practitioners choose a suitable method and keep their analyses reproducible.

## Contribution

It provides a unified, systematic overview of missing data methods, workflows, and implementations in R and Python, including educational and reproducible analysis pipelines.

## Key findings

- Developed standardized analysis workflows for missing data
- Implemented pipelines in R and Python for various statistical tasks
- Organized extensive resources including bibliography, tutorials, and didactic materials

## Abstract

Missing values are unavoidable when working with data. Their occurrence is exacerbated as more data from different sources become available. However, most statistical models and visualization methods require complete data, and improper handling of missing data results in information loss or biased analyses. Since the seminal work of Rubin (1976), a burgeoning literature on missing values has arisen, with heterogeneous aims and motivations. This led to the development of various methods, formalizations, and tools. For practitioners, it remains nevertheless challenging to decide which method is most suited for their problem, partially due to a lack of systematic covering of this topic in statistics or data science curricula.

To help address this challenge, we have launched the "R-miss-tastic" platform, which aims to provide an overview of standard missing values problems, methods, and relevant implementations of methodologies. Beyond gathering and organizing a large majority of the material on missing data (bibliography, courses, tutorials, implementations), "R-miss-tastic" covers the development of standardized analysis workflows. Indeed, we have developed several pipelines in R and Python to allow for hands-on illustration of and recommendations on missing values handling in various statistical tasks such as matrix completion, estimation and prediction, while ensuring reproducibility of the analyses. Finally, the platform is dedicated to users who analyze incomplete data, researchers who want to compare their methods and search for an up-to-date bibliography, and also teachers who are looking for didactic materials (notebooks, video, slides).
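The abstract's claim that improper handling "results in information loss or biased analyses" can be illustrated with a small sketch. The example below is not taken from the platform's pipelines: the helper names (`make_mcar`, `complete_cases`, `mean_impute`), the simulated data, and the 30% missingness rate are all assumptions chosen for illustration. It contrasts two naive strategies on values that are missing completely at random: listwise deletion, which discards rows, and mean imputation, which shrinks the variability of the imputed variable.

```python
import random
import statistics


def make_mcar(values, missing_rate, rng):
    """Set each value to None independently with probability missing_rate (MCAR)."""
    return [None if rng.random() < missing_rate else v for v in values]


def complete_cases(rows):
    """Listwise deletion: keep only rows that have no missing entry."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_impute(column):
    """Fill missing entries of one column with the mean of its observed entries."""
    observed = [v for v in column if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in column]


# Hypothetical simulated data: two independent Gaussian variables.
rng = random.Random(0)
n = 1000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [rng.gauss(5.0, 2.0) for _ in range(n)]

# Assumed setting: 30% of each variable is removed completely at random.
x_miss = make_mcar(x, 0.3, rng)
y_miss = make_mcar(y, 0.3, rng)

kept = complete_cases(list(zip(x_miss, y_miss)))
y_observed = [v for v in y_miss if v is not None]
y_imputed = mean_impute(y_miss)

print(f"rows kept by listwise deletion: {len(kept)} of {n}")
print(f"std of y, observed values only: {statistics.pstdev(y_observed):.3f}")
print(f"std of y, after mean imputation: {statistics.pstdev(y_imputed):.3f}")
```

Mean imputation leaves the column mean unchanged but adds points with zero deviation from it, so the standard deviation of the imputed column is always smaller than that of the observed values. This is one concrete way a seemingly harmless fix distorts downstream estimates.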

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.04822/full.md

## Figures

23 figures with captions in the complete paper: https://tomesphere.com/paper/1908.04822/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/1908.04822/full.md

---
Source: https://tomesphere.com/paper/1908.04822