Minimalist Data Wrangling with Python
Marek Gagolewski

TL;DR
This textbook is a beginner-friendly guide to data wrangling in Python. It covers essential techniques for cleaning, transforming, analyzing, and visualizing data, and is aimed at students and newcomers.
Contribution
It provides an accessible, high-level overview of data science concepts with practical Python methods, serving as an educational resource for beginners.
Findings
Effective data cleaning and transformation techniques
Methods for exploratory data analysis and clustering
Guidance on pattern modeling and reporting
Abstract
Minimalist Data Wrangling with Python is envisaged as a student's first introduction to data science, providing a high-level overview as well as discussing key concepts in detail. We explore methods for cleaning data gathered from different sources, transforming, selecting, and extracting features, performing exploratory data analysis and dimensionality reduction, identifying naturally occurring data clusters, modelling patterns in data, comparing data between groups, and reporting the results. This textbook is a non-profit project. Its online and PDF versions are freely available at https://datawranglingpy.gagolewski.com/.
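As a rough illustration of the kind of workflow the abstract describes, the following minimal sketch uses NumPy to clean, transform, and compare a tiny dataset. It is not an excerpt from the book: the data values and the variable names (heights, groups, group_means) are made up for this example.

```python
import numpy as np

# Hypothetical toy data: heights in cm (one value missing) and a group label.
heights = np.array([160.0, 172.5, np.nan, 181.0, 168.0, 175.5])
groups = np.array(["a", "b", "a", "b", "a", "b"])

# Cleaning: drop observations whose measurement is missing.
keep = ~np.isnan(heights)
heights_clean = heights[keep]
groups_clean = groups[keep]

# Transforming: standardise to zero mean and unit (sample) standard deviation.
z = (heights_clean - heights_clean.mean()) / heights_clean.std(ddof=1)

# Comparing data between groups: arithmetic mean of each group.
group_means = {
    str(g): float(heights_clean[groups_clean == g].mean())
    for g in np.unique(groups_clean)
}

# Reporting: print a short summary.
print("observations kept:", heights_clean.size)
print("standardised values:", np.round(z, 2))
print("group means:", group_means)
```

Dropping incomplete observations is only one possible way of handling missing values; it is chosen here to keep the sketch short.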
Taxonomy
Topics: Computational Physics and Python Applications
