MissMecha: An All-in-One Python Package for Studying Missing Data Mechanisms
Youran Zhou, Mohamed Reda Bouadjenek, Sunil Aryal

TL;DR
MissMecha is a comprehensive Python toolkit that enables simulation, visualization, and evaluation of missing data mechanisms in mixed-type tabular datasets, facilitating research and education on data quality issues.
Contribution
It introduces a unified, open-source platform supporting mechanism-aware missing data simulation, diagnostics, and imputation evaluation for both numerical and categorical data.
Findings
Supports MCAR, MAR, MNAR mechanisms
Includes visual diagnostics and testing utilities
Enables benchmarking and educational use
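The three mechanisms named above differ in what the probability of missingness depends on: nothing (MCAR), other observed variables (MAR), or the unobserved value itself (MNAR). A minimal NumPy sketch of that distinction follows. It does not use MissMecha's own API; the function names, the logistic MAR model, and the quantile-based MNAR rule are assumptions chosen for illustration.

```python
import numpy as np

def mcar_mask(X, rate, rng):
    # MCAR: every cell is missing with the same probability,
    # independent of any data value.
    return rng.random(X.shape) < rate

def mar_mask(X, target_col, driver_col, rate, rng):
    # MAR (illustrative): missingness in target_col depends only on the
    # fully observed driver_col, through a logistic model.
    mask = np.zeros(X.shape, dtype=bool)
    driver = X[:, driver_col]
    z = (driver - driver.mean()) / driver.std()
    p = 1.0 / (1.0 + np.exp(-z))
    # Rescale so the average missing probability is roughly `rate`.
    p = np.clip(p * rate / p.mean(), 0.0, 1.0)
    mask[:, target_col] = rng.random(X.shape[0]) < p
    return mask

def mnar_mask(X, target_col, rate):
    # MNAR (illustrative self-masking): the largest values of target_col
    # are the ones that go missing.
    mask = np.zeros(X.shape, dtype=bool)
    threshold = np.quantile(X[:, target_col], 1.0 - rate)
    mask[:, target_col] = X[:, target_col] > threshold
    return mask

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))

X_mcar = np.where(mcar_mask(X, 0.2, rng), np.nan, X)
X_mar = np.where(mar_mask(X, 0, 1, 0.3, rng), np.nan, X)
X_mnar = np.where(mnar_mask(X, 0, 0.3), np.nan, X)
```

Under the MAR mask, rows with a missing column 0 tend to have larger values in column 1, which is the kind of dependence a visual diagnostic would aim to surface.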
Abstract
Incomplete data is a persistent challenge in real-world datasets, often governed by complex and unobservable missing mechanisms. Simulating missingness has become a standard approach for understanding its impact on learning and analysis. However, existing tools are fragmented, mechanism-limited, and typically focus only on numerical variables, overlooking the heterogeneous nature of real-world tabular data. We present MissMecha, an open-source Python toolkit for simulating, visualizing, and evaluating missing data under MCAR, MAR, and MNAR assumptions. MissMecha supports both numerical and categorical features, enabling mechanism-aware studies across mixed-type tabular datasets. It includes visual diagnostics, MCAR testing utilities, and type-aware imputation evaluation metrics. Designed to support data quality research, benchmarking, and education, MissMecha offers a unified platform…
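To make "type-aware imputation evaluation" concrete, the sketch below scores numerical columns with RMSE and categorical columns with accuracy, using only the cells that were masked. The abstract does not say which metrics MissMecha uses, so the metric choice, the function name `evaluate_imputation`, and the boolean-mask convention are all assumptions, not the package's API.

```python
import numpy as np
import pandas as pd

def evaluate_imputation(truth, imputed, mask):
    """Score each column on its masked cells only (hypothetical helper).

    truth, imputed: DataFrames with the same columns and index.
    mask: boolean DataFrame, True where a value was removed and imputed.
    """
    scores = {}
    for col in truth.columns:
        m = mask[col].to_numpy()
        if not m.any():
            continue  # nothing was masked in this column
        t = truth.loc[m, col].to_numpy()
        p = imputed.loc[m, col].to_numpy()
        if pd.api.types.is_numeric_dtype(truth[col]):
            # Numerical column: root mean squared error (assumed metric).
            err = t.astype(float) - p.astype(float)
            scores[col] = ("rmse", float(np.sqrt(np.mean(err ** 2))))
        else:
            # Categorical column: share of exact matches (assumed metric).
            scores[col] = ("accuracy", float(np.mean(t == p)))
    return scores

truth = pd.DataFrame({"age": [20.0, 30.0, 40.0, 50.0],
                      "city": ["a", "b", "a", "c"]})
mask = pd.DataFrame({"age": [True, False, True, False],
                     "city": [False, True, False, True]})
imputed = truth.copy()
imputed.loc[0, "age"] = 23.0   # off by 3
imputed.loc[2, "age"] = 36.0   # off by 4
imputed.loc[3, "city"] = "a"   # wrong category; row 1 stays correct

scores = evaluate_imputation(truth, imputed, mask)
```

Restricting the score to masked cells keeps untouched values from inflating the result, and splitting by column type avoids applying a distance metric to labels.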
