Teaching Statistics at Google Scale
Nicholas Chamandy, Omkar Muralidharan, Stefan Wager

TL;DR
This paper argues that training in classical statistical concepts is central to solving large-scale data problems at Google, and illustrates the point with three industrial applications in which statistical thinking overcame modern data challenges.
Contribution
It makes the case for classical statistical training in modern data science education, supported by three Google-scale industrial applications.
Findings
Statistical thinking effectively addresses massive data challenges.
Classical concepts remain central in modern data science.
Three industrial applications show statistical thinking overcoming significant modern data challenges.
Abstract
Modern data and applications pose very different challenges from those of the 1950s or even the 1980s. Students contemplating a career in statistics or data science need to have the tools to tackle problems involving massive, heavy-tailed data, often interacting with live, complex systems. However, despite the deepening connections between engineering and modern data science, we argue that training in classical statistical concepts plays a central role in preparing students to solve Google-scale problems. To this end, we present three industrial applications where significant modern data challenges were overcome by statistical thinking.
Taxonomy
Topics: Statistics Education and Methodologies · Evolutionary Algorithms and Applications · Genetics, Bioinformatics, and Biomedical Research
