Détection d'Objets dans les documents numérisés par réseaux de neurones profonds (Object Detection in Digitized Documents Using Deep Neural Networks)

Mélodie Boillet

arXiv:2301.11753·cs.CV·January 30, 2023

Open Access

TL;DR

This thesis develops two deep neural network models for object detection in documents. It addresses limited training data and domain adaptation, and reports high generalization to unseen documents along with annotation-efficient adaptation strategies.

Contribution

It introduces two object detection models tailored for document analysis, one pixel-level and one Transformer-based, together with strategies for data collection, model efficiency, and adaptation with minimal annotation.

Findings

01. High generalization to out-of-sample documents

02. Fast models with few parameters

03. Effective sample selection reduces annotation effort

Abstract

In this thesis, we study multiple tasks related to document layout analysis such as the detection of text lines, the splitting into acts or the detection of the writing support. Thus, we propose two deep neural models following two different approaches. We aim at proposing a model for object detection that considers the difficulties associated with document processing, including the limited amount of training data available. In this respect, we propose a pixel-level detection model and a second object-level detection model. We first propose a detection model with few parameters, fast in prediction, and which can obtain accurate prediction masks from a reduced number of training data. We implemented a strategy of collection and uniformization of many datasets, which are used to train a single line detection model that demonstrates high generalization capabilities to out-of-sample…

Taxonomy

Topics: Handwritten Text Recognition Techniques