Lessons from the German Tank Problem
George Clark, Alex Gonye, Steven J Miller

TL;DR
This paper revisits the historical German Tank Problem: it derives the classic statistical estimate of total tank production from observed serial numbers, and introduces a generalization to the case where serial numbers do not start at 1. It serves as an illustration of statistical methods applied to a real-world problem.
Contribution
It reproduces known estimates for the German Tank Problem and introduces a novel generalization to serial numbers that do not start at 1, with educational insights on regression and functional relationships.
Findings
Derived the classic estimate for total tanks using observed serial numbers.
Compared the statistical estimate's effectiveness to intelligence gathered by spies.
Presented a new generalization for cases with arbitrary starting serial numbers.
Abstract
During World War II the German army used tanks to devastating advantage. The Allies needed accurate estimates of their tank production and deployment. They used two approaches to find these values: spies, and statistics. This note describes the statistical approach. Assuming the tanks are labeled consecutively starting at 1, if we observe $k$ serial numbers from an unknown number $N$ of tanks, with the maximum observed value $m$, then the best estimate for $N$ is $m(1 + 1/k) - 1$. This is now known as the German Tank Problem, and is a terrific example of the applicability of mathematics and statistics in the real world. The first part of the paper reproduces known results, specifically deriving this estimate and comparing its effectiveness to that of the spies. The second part presents a result we have not found in print elsewhere, the generalization to the case where the smallest value…
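As a minimal sketch of the classic estimate described in the abstract, the Python below computes $m(1 + 1/k) - 1$ from a list of observed serial numbers and checks it by simulation. The function names `estimate_total` and `simulate`, the sampling setup (serial numbers drawn uniformly without replacement from 1 to `n_true`), and the parameter values are illustrative assumptions, not taken from the paper.

```python
import random


def estimate_total(serials):
    """Classic estimate m * (1 + 1/k) - 1, assuming serials start at 1."""
    k = len(serials)
    if k == 0:
        raise ValueError("need at least one observed serial number")
    m = max(serials)
    return m * (1 + 1 / k) - 1


def simulate(n_true, k, trials, seed=0):
    """Average the estimate over many random samples (illustrative check).

    Assumption: each sample is k distinct serial numbers drawn uniformly
    from 1..n_true, i.e. every tank is equally likely to be observed.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(range(1, n_true + 1), k)
        total += estimate_total(sample)
    return total / trials


if __name__ == "__main__":
    # Four observed serial numbers with maximum 60: 60 * (1 + 1/4) - 1
    print(estimate_total([19, 40, 42, 60]))  # 74.0
    # Averaged over many samples, the estimate should land close to n_true.
    print(simulate(n_true=1000, k=10, trials=5000))
```

Under the uniform-sampling assumption, the average of the estimate over many trials should sit close to the true count, which is one way to see why the correction factor $1 + 1/k$ is applied to the observed maximum rather than using $m$ directly.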
