Callico: a Versatile Open-Source Document Image Annotation Platform
Christopher Kermorvant, Eva Bardou, Manon Blanco, Bastien Abadie

TL;DR
Callico is an open-source, web-based platform that streamlines document image annotation for machine learning, supporting various annotation types and collaborative workflows to improve data quality in document recognition tasks.
Contribution
The paper introduces a versatile, open-source annotation platform with dual-display support and collaborative features tailored for diverse document recognition applications.
Findings
Supports multiple annotation types including layout and key-value annotations.
Enables collaborative annotation workflows.
Proven effective in real-world projects involving municipal records and census data.
Abstract
This paper presents Callico, a web-based, open-source platform designed to simplify the annotation process in document recognition projects. The move towards data-centric AI in machine learning and deep learning underscores the importance of high-quality data and the need for specialised tools that increase the efficiency and effectiveness of generating such data. For document image annotation, Callico offers dual-display annotation for digitised documents, enabling simultaneous visualisation and annotation of scanned images and text. This capability is critical for OCR and HTR model training, document layout analysis, named entity recognition, form-based key-value annotation, and hierarchical structure annotation with element grouping. The platform supports collaborative annotation with versatile features backed by a commitment to open source development, high-quality code standards and…
Taxonomy
Topics: Image Retrieval and Classification Techniques · Handwritten Text Recognition Techniques
