First Steps Toward CNN based Source Classification of Document Images Shared Over Messaging App

Sharad Joshi; Suraj Saxena; Nitin Khanna

arXiv:1808.05941 · cs.MM · June 18, 2019

TL;DR

This paper introduces a CNN-based method for identifying the source smartphone of document images shared over messaging apps, using a new dataset of printed documents captured by various smartphones.

Contribution

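It takes first steps toward a CNN-based approach for source smartphone classification of printed document images that are captured by smartphone cameras and shared over a messaging platform, and it introduces a new image dataset because no publicly available dataset addressed this problem.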

Findings

01 CNN-based system performs as well as or better than the state-of-the-art system based on handcrafted features

02 New dataset of 315 images from 21 smartphones across three fonts

03 Results hold in all tested scenarios on document images shared over WhatsApp

Abstract

Knowledge of the source smartphone corresponding to a document image can be helpful in a variety of applications, including copyright infringement, ownership attribution, leak identification, and usage restriction. In this letter, we investigate a convolutional neural network-based approach to solve the source smartphone identification problem for printed text documents that have been captured by smartphone cameras and shared over a messaging platform. In the absence of any publicly available dataset addressing this problem, we introduce a new image dataset consisting of 315 images of documents printed in three different fonts, captured using 21 smartphones, and shared over WhatsApp. Experiments conducted on this dataset demonstrate that, in all scenarios, the proposed system performs as well as or better than the state-of-the-art system based on handcrafted features and classification of letters…
