The Socface Project: Large-Scale Collection, Processing, and Analysis of a Century of French Censuses
Mélodie Boillet, Solène Tarride, Manon Blanco, Valentin Rigal, Yoann Schneider, Bastien Abadie, Lionel Kesztenbaum, and Christopher Kermorvant

TL;DR
The Socface project developed a comprehensive workflow for extracting and analyzing data from a century of French census records, enabling large-scale digitization, public access, and social research.
Contribution
The paper introduces a novel automated handwritten table recognition system capable of processing diverse census tables at scale.
Findings
Processed over 450,000 images from French archives.
Achieved accurate extraction of individual and household data.
Enabled public access to historical census records.
Abstract
This paper presents a complete processing workflow for extracting information from French census lists from 1836 to 1936. These lists contain information about individuals living in France and their households. We aim to extract all the information contained in these tables using automatic handwritten table recognition. At the end of the Socface project, of which this work is a part, the extracted information will be redistributed to the departmental archives, and the nominative lists will be freely available to the public, allowing anyone to browse hundreds of millions of records. The extracted data will be used by demographers to analyze social change over time, significantly improving our understanding of French economic and social structures. For this project, we developed a complete processing workflow: large-scale data collection from French departmental archives,…
Taxonomy
Topics: Image Retrieval and Classification Techniques
