On the Worst-Case Complexity of TimSort
Nicolas Auger, Vincent Jugé, Cyril Nicaud, Carine Pivoteau

TL;DR
This paper provides a rigorous proof of the worst-case time complexity of TimSort, clarifies differences between Python and Java implementations, and uncovers a bug in Java's version.
Contribution
It gives a pedagogical proof of TimSort's O(n log n) worst-case complexity, compares the Python and Java variants of the algorithm, and identifies a bug in Java's version.
Findings
Python's TimSort runs in O(n log n) worst-case time.
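Java's TimSort has a bug that can cause the sorting method to fail during execution.
The running time of Python's TimSort is in O(n + n log ρ), where ρ is the number of runs (maximal monotonic sequences) in the input.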
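A short worked bound may help relate these findings. It assumes the run-sensitive estimate O(n + n log ρ) from the paper's abstract, with n the input length, ρ the number of runs, and logarithms in base 2:

```latex
% \rho = number of runs, n = input length, so 1 \le \rho \le n
\begin{align*}
\rho = 1 &\;\Longrightarrow\; n + n\log\rho = n, \\
\rho \le n &\;\Longrightarrow\; n + n\log\rho \;\le\; n + n\log n = O(n\log n).
\end{align*}
```

Under this estimate, an input that consists of a single run costs O(n), and no input costs more than O(n log n).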
Abstract
TimSort is an intriguing sorting algorithm designed in 2002 for Python, whose worst-case complexity was announced, but not proved until our recent preprint. In fact, there are two slightly different versions of TimSort that are currently implemented in Python and in Java respectively. We propose a pedagogical and insightful proof that the Python version runs in O(n log n). The approach we use in the analysis also applies to the Java version, although not without very involved technical details. As a byproduct of our study, we uncover a bug in the Java implementation that can cause the sorting method to fail during the execution. We also give a proof that Python's TimSort running time is in O(n + n log ρ), where ρ is the number of runs (i.e. maximal monotonic sequences), which is quite a natural parameter here and part of the explanation for the good…
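To make the parameter ρ concrete, here is a minimal sketch of how the runs of an input could be counted. It is not taken from the paper or from CPython. The function name `count_runs` is hypothetical, and the sketch assumes a greedy left-to-right split into non-decreasing or strictly decreasing runs, ignoring the minimum run length that real implementations enforce.

```python
def count_runs(values):
    """Greedily split `values` into monotonic runs and return their count.

    Assumption: a run is a maximal stretch that is either non-decreasing
    or strictly decreasing, detected from left to right.
    """
    n = len(values)
    runs = 0
    i = 0
    while i < n:
        j = i + 1
        if j < n and values[j] < values[i]:
            # Extend a strictly decreasing run.
            while j < n and values[j] < values[j - 1]:
                j += 1
        else:
            # Extend a non-decreasing run (also covers a single element).
            while j < n and values[j] >= values[j - 1]:
                j += 1
        runs += 1
        i = j  # The next run starts where this one ended.
    return runs


if __name__ == "__main__":
    print(count_runs([1, 2, 3, 2, 1, 5, 6]))  # runs: [1,2,3], [2,1], [5,6]
```

A fully sorted or fully reversed input yields a single run, which is the case where a run-sensitive bound is most favorable.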
Taxonomy
Topics: Algorithms and Data Compression · semigroups and automata theory · Cellular Automata and Applications
