Unified Line and Paragraph Detection by Graph Convolutional Networks

Shuang Liu; Renshen Wang; Michalis Raptis; Yasuhisa Fujii

arXiv:2203.09638·cs.CV·March 21, 2022

TL;DR

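This paper introduces a unified approach that uses a graph convolutional network to detect both text lines and paragraphs in documents. It treats layout analysis as a two-level clustering problem: word-level detection boxes cluster into lines, and lines cluster into paragraphs. The approach is efficient while reaching state-of-the-art paragraph detection quality.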

Contribution

The paper presents a unified method in which a graph convolutional network predicts the relations between text detection boxes, and both lines and paragraphs are then built from those predictions as the two levels of one hierarchical clustering.

Findings

01. Achieves state-of-the-art paragraph detection accuracy

02. Demonstrates high efficiency in processing document layouts

03. Effective on both public benchmarks and real-world images

Abstract

We formulate the task of detecting lines and paragraphs in a document as a unified two-level clustering problem. Given a set of text detection boxes that roughly correspond to words, a text line is a cluster of boxes and a paragraph is a cluster of lines. These clusters form a two-level tree that represents a major part of the layout of a document. We use a graph convolutional network to predict the relations between text detection boxes and then build both levels of clusters from these predictions. Experimentally, we demonstrate that the unified approach can be highly efficient while still achieving state-of-the-art quality for detecting paragraphs in public benchmarks and real-world images.
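To make the two-level clustering concrete, here is a minimal Python sketch of how pairwise predictions between boxes could be turned into a lines-within-paragraphs tree. It is an illustration under stated assumptions, not the authors' implementation: the graph convolutional network is left out entirely, its output is assumed to arrive as two lists of box-index pairs (`same_line_links` and `same_paragraph_links`, both hypothetical names), and clusters are formed as connected components with a union-find, which the abstract does not confirm as the paper's procedure.

```python
from collections import defaultdict


def cluster(num_items, links):
    """Group items 0..num_items-1 into connected components of `links`."""
    parent = list(range(num_items))

    def find(i):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in links:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller index as root so the output order is stable.
            parent[max(ra, rb)] = min(ra, rb)

    groups = defaultdict(list)
    for i in range(num_items):
        groups[find(i)].append(i)
    return sorted(groups.values())


def build_layout(num_boxes, same_line_links, same_paragraph_links):
    """Return paragraphs as lists of lines, each line a list of box indices.

    Assumption: both link lists hold (box, box) pairs that a relation
    predictor has already accepted; here they are supplied by hand.
    """
    # Level 1: boxes -> lines.
    lines = cluster(num_boxes, same_line_links)
    box_to_line = {box: li for li, line in enumerate(lines) for box in line}

    # Level 2: lift box-level paragraph links to line level, then cluster lines.
    line_links = [(box_to_line[a], box_to_line[b]) for a, b in same_paragraph_links]
    paragraphs = cluster(len(lines), line_links)

    return [[lines[li] for li in para] for para in paragraphs]


# Five word boxes: 0-1 share a line, 2-3 share a line, 4 stands alone.
# A paragraph link between boxes 1 and 2 joins the first two lines.
layout = build_layout(5, [(0, 1), (2, 3)], [(1, 2)])
print(layout)
```

The result is a nested list that mirrors the two-level tree described in the abstract: the outer list holds paragraphs, each paragraph holds lines, and each line holds box indices.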
