CompTLL-UNet: Compressed Domain Text-Line Localization in Challenging Handwritten Documents using Deep Feature Learning from JPEG Coefficients

Bulla Rajesh; Sk Mahafuz Zaman; Mohammed Javed; P. Nagabhushan

arXiv:2308.06142 · cs.CV · August 14, 2023

Open Access

TL;DR

This paper introduces CompTLL-UNet, a deep learning model that localizes text lines directly from JPEG compressed images, achieving state-of-the-art results while reducing storage and computational costs.

Contribution

It presents a novel approach that performs text-line localization directly in the JPEG compressed domain using a modified U-Net architecture, avoiding full decompression.
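
To make "JPEG compressed domain" concrete, the following is a minimal sketch of the kind of representation such a model would consume: 8x8 block DCT coefficients. It is an illustration under stated assumptions, not the authors' pipeline. A real compressed-domain system would obtain the quantized coefficients by partially decoding the JPEG stream; here they are computed from pixels with NumPy purely so the example is self-contained, and quantization is omitted. The function names `dct_matrix` and `blockwise_dct` are hypothetical.

```python
import numpy as np

BLOCK = 8  # JPEG transforms the image in 8x8 pixel blocks


def dct_matrix(n: int = BLOCK) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    x = np.arange(n).reshape(1, -1)  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)     # DC row uses a different scale factor
    return mat


def blockwise_dct(image: np.ndarray) -> np.ndarray:
    """Apply a 2D DCT to every 8x8 block of a grayscale image.

    Assumption for this sketch: height and width are multiples of 8
    and no quantization is applied (real JPEG data is quantized).
    """
    h, w = image.shape
    if h % BLOCK or w % BLOCK:
        raise ValueError("image dimensions must be multiples of 8")
    d = dct_matrix()
    # Split into blocks: (h/8, 8, w/8, 8) -> (h/8, w/8, 8, 8)
    blocks = image.astype(np.float64).reshape(
        h // BLOCK, BLOCK, w // BLOCK, BLOCK
    ).transpose(0, 2, 1, 3)
    blocks = blocks - 128.0          # JPEG level shift for 8-bit samples
    coeffs = d @ blocks @ d.T        # matmul broadcasts over the block grid
    # Reassemble the coefficient blocks into an h x w array
    return coeffs.transpose(0, 2, 1, 3).reshape(h, w)


if __name__ == "__main__":
    page = np.full((16, 24), 136, dtype=np.uint8)  # flat synthetic "page"
    coeffs = blockwise_dct(page)
    print(coeffs.shape)
```

The resulting array has the same spatial size as the input, with each 8x8 tile holding one DC term (top-left) and 63 AC terms. A segmentation network such as a U-Net variant could take an array of this form as input in place of raw pixels.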

Findings

01 Achieves state-of-the-art performance on benchmark datasets

02 Reduces storage and computational costs

03 Effective in handling complex handwritten document issues

Abstract

Automatic localization of text-lines in handwritten documents is still an open and challenging research problem. Various writing issues, such as uneven spacing between the lines, oscillating and touching text, and the presence of skew, become much more challenging when complex handwritten document images are segmented directly in their compressed representation. The conventional way of processing compressed documents is through decompression; in this paper, we instead propose employing deep feature learning directly on the JPEG compressed coefficients, without full decompression, to accomplish text-line localization in the JPEG compressed domain. A modified U-Net architecture, the Compressed Text-Line Localization Network (CompTLL-UNet), is designed for this purpose. The model is trained and tested with JPEG…

Taxonomy

Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Image Processing and 3D Reconstruction

Methods: Concatenated Skip Connection · Convolution · Max Pooling · U-Net