TL;DR
This paper reports on a competition on large-scale retrieval of historical handwritten document images by writing style. Combinations of methods outperformed single methods, and most submissions used traditional techniques rather than deep learning.
Contribution
It introduces a benchmark of 20 000 document images from about 10 000 writers and evaluates the submitted retrieval methods, most of which are traditional (non-deep-learning) approaches. The results show how difficult writer retrieval is, especially for letters.
Findings
Combined methods outperform single methods.
Letters are more difficult to retrieve than manuscripts.
Most teams submitted traditional methods that do not use deep learning.
Abstract
This competition investigates the performance of large-scale retrieval of historical document images based on writing style. The data are drawn from large image data sets provided by cultural heritage institutions and digital libraries and comprise a total of 20 000 document images representing about 10 000 writers, divided into three types: writers of (i) manuscript books, (ii) letters, and (iii) charters and legal documents. We focus on the task of automatic image retrieval to simulate common scenarios of humanities research, such as writer retrieval. Most teams submitted traditional methods that do not use deep learning techniques. The competition results show that a combination of methods outperforms single methods. Furthermore, letters are much more difficult to retrieve than manuscripts.
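To make the retrieval setting concrete, the following is a minimal sketch of writer retrieval with a combination of methods. The abstract does not say how the participants combined their methods, so the fusion rule here (min-max normalization followed by averaging of distance matrices) is an assumption, as are all function names, the Euclidean distance, and the toy feature values.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(features):
    """Pairwise distances between all document feature vectors."""
    n = len(features)
    return [[euclidean(features[i], features[j]) for j in range(n)]
            for i in range(n)]

def min_max_normalize(dist):
    """Rescale a distance matrix to [0, 1] so that methods are comparable."""
    flat = [d for row in dist for d in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for constant matrices
    return [[(d - lo) / span for d in row] for row in dist]

def fuse(dist_matrices):
    """Assumed fusion rule: average the normalized distances of all methods."""
    normed = [min_max_normalize(d) for d in dist_matrices]
    n = len(normed[0])
    return [[sum(m[i][j] for m in normed) / len(normed) for j in range(n)]
            for i in range(n)]

def rank(dist, query):
    """All other documents, ordered from most to least similar to the query."""
    others = [j for j in range(len(dist)) if j != query]
    return sorted(others, key=lambda j: dist[query][j])

def top1_accuracy(dist, writers):
    """Fraction of queries whose nearest document is by the same writer."""
    hits = sum(writers[rank(dist, q)[0]] == writers[q]
               for q in range(len(writers)))
    return hits / len(writers)

# Toy data (hypothetical): four documents, two writers, two feature extractors.
writers = ["A", "A", "B", "B"]
features_a = [[0.0], [1.0], [5.0], [6.0]]
features_b = [[0.0], [0.5], [4.0], [4.2]]

fused = fuse([distance_matrix(features_a), distance_matrix(features_b)])
ranking_for_first_doc = rank(fused, 0)
accuracy = top1_accuracy(fused, writers)
```

Each document serves in turn as a query, and the remaining documents are ranked by fused distance. The toy data are too small to show whether fusion helps; they only illustrate the mechanics.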
