Détection d'Objets dans les documents numérisés par réseaux de neurones profonds (Object Detection in Digitized Documents with Deep Neural Networks)
Mélodie Boillet

TL;DR
This thesis develops two deep neural network models for object detection in documents, addressing challenges such as limited training data and domain adaptation. The models generalize well to out-of-sample documents, and sample selection keeps annotation effort low.
Contribution
The thesis introduces a pixel-level detection model and a Transformer-based object-level detection model tailored for document analysis, together with strategies for data collection, model efficiency, and adaptation with minimal annotation.
Findings
High generalization to out-of-sample documents
Fast models with few parameters
Effective sample selection reduces annotation effort
Abstract
In this thesis, we study several tasks related to document layout analysis, such as the detection of text lines, the splitting into acts, and the detection of the writing support. We aim to propose an object detection model that accounts for the difficulties of document processing, including the limited amount of training data available. To this end, we propose two deep neural models following two different approaches: a pixel-level detection model and an object-level detection model. The first is a detection model with few parameters that is fast at prediction time and can produce accurate prediction masks from a small amount of training data. We implemented a strategy for collecting and standardizing many datasets, which are used to train a single line detection model that demonstrates high generalization capabilities to out-of-sample…
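The abstract mentions prediction masks produced by the pixel-level model but does not say how individual objects are read out of them. As a minimal illustrative sketch only, and not the thesis's method, one common way to turn a per-pixel probability map into object boxes is to threshold it and group neighboring pixels into connected components. The function name `mask_to_boxes`, the `threshold` and `min_pixels` parameters, and the toy probability grid are all assumptions made for this example.

```python
from collections import deque


def mask_to_boxes(prob, threshold=0.5, min_pixels=1):
    """Group pixels with probability >= threshold into 4-connected regions.

    Hypothetical helper for illustration. `prob` is a list of rows of floats.
    Returns one (x_min, y_min, x_max, y_max) box per region, in scan order.
    """
    height = len(prob)
    width = len(prob[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []

    for y in range(height):
        for x in range(width):
            if seen[y][x] or prob[y][x] < threshold:
                continue
            # Breadth-first search over the region that contains (y, x).
            queue = deque([(y, x)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            count = 0
            while queue:
                cy, cx = queue.popleft()
                count += 1
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and not seen[ny][nx] and prob[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Discard regions smaller than min_pixels as likely noise.
            if count >= min_pixels:
                boxes.append((x_min, y_min, x_max, y_max))
    return boxes


# Toy 3x4 probability map with two candidate "text lines" (made-up values).
toy_mask = [
    [0.9, 0.8, 0.1, 0.0],
    [0.1, 0.1, 0.1, 0.0],
    [0.0, 0.7, 0.9, 0.6],
]
boxes = mask_to_boxes(toy_mask)
```

On the toy grid, the two regions above the threshold come back as two separate boxes. Raising `min_pixels` filters out small regions, which is one simple way to suppress isolated false-positive pixels.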
Taxonomy
Topics: Handwritten Text Recognition Techniques
