Introducing a new high-resolution handwritten digits data set with writer characteristics
Cédric Beaulac, Jeffrey S. Rosenthal

TL;DR
This paper introduces a high-resolution handwritten digit dataset with writer characteristics, enabling new research avenues in classification, semi-supervised learning, and style generation, and provides initial benchmarks and analyses.
Contribution
It presents a novel high-resolution handwritten digit dataset with writer characteristics, not available in MNIST, and analyzes its potential for classification, semi-supervised learning, and image generation.
Findings
Predictability of writer characteristics assessed
Higher resolution improves classification accuracy
Semi-supervised methods enhance performance
Abstract
The contributions in this article are two-fold. First, we introduce a new handwritten digit data set that we collected. It contains high-resolution images of handwritten digits together with various writer characteristics which are not available in the well-known MNIST database. The multiple writer characteristics gathered are a novelty of our data set and create new research opportunities. The data set is publicly available online. Second, we analyse this new data set. We begin with simple supervised tasks. We assess the predictability of the writer characteristics gathered, the effect of using some of those characteristics as predictors in classification tasks and the effect of higher resolution images on…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Machine Learning and Data Classification
