Detecting Figures and Part Labels in Patents: Competition-Based Development of Image Processing Algorithms
Christoph Riedl, Richard Zanibbi, Marti A. Hearst, Siyu Zhu, Michael Menietti, Jason Crusan, Ivan Metelsky, Karim R. Lakhani

TL;DR
This paper presents the results of a competitive challenge to develop algorithms for detecting figures and labels in patent images, achieving near state-of-the-art accuracy under strict constraints.
Contribution
It introduces a large labeled dataset and benchmark for patent figure detection, along with the top algorithms' performance and insights from a competitive development process.
Findings
Top system achieved 88.57% f-measure for figure detection
70.98% f-measure for part label recognition
Competition data and solutions are publicly available
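For readers unfamiliar with the metric, the f-measure scores above can be read as a harmonic mean of precision and recall. The sketch below assumes the standard balanced definition; the function name `f_measure` and the input values are illustrative and are not taken from the paper or the competition's scoring code.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced f-measure: the harmonic mean of precision and recall.

    Assumes the standard definition; the competition's exact scoring
    procedure (e.g. how detections are matched to ground truth) is not
    reproduced here.
    """
    if precision + recall == 0:
        # Avoid division by zero when nothing was detected correctly.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision and recall, for illustration only.
score = f_measure(0.90, 0.80)
print(f"f-measure: {score:.4f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a system cannot reach a high f-measure by maximizing precision or recall alone.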
Abstract
We report the findings of a month-long online competition in which participants developed algorithms for augmenting the digital version of patent documents published by the United States Patent and Trademark Office (USPTO). The goal was to detect figures and part labels in U.S. patent drawing pages. The challenge drew 232 teams of two, of which 70 teams (30%) submitted solutions. Collectively, teams submitted 1,797 solutions that were compiled on the competition servers. Participants reported spending an average of 63 hours developing their solutions, resulting in a total of 5,591 hours of development time. A manually labeled dataset of 306 patents was used for training, online system tests, and evaluation. The design and performance of the top-5 systems are presented, along with a system developed after the competition which illustrates that winning teams produced near state-of-the-art…
