Ensemble of Anchor-Free Models for Robust Bangla Document Layout Segmentation

U Mong Sain Chak; Md. Asib Rahman

arXiv:2308.14397·cs.CV·August 30, 2023

TL;DR

This paper presents an ensemble of anchor-free YOLOv8 models for Bangla document layout segmentation. The models are trained on augmented data, and the ensemble's confidence and IoU thresholds are chosen by Bayesian optimization.

Contribution

The paper introduces an ensemble of anchor-free YOLOv8 models whose confidence and IoU thresholds are tuned by Bayesian optimization, with the aim of more accurate Bangla document layout segmentation.

Findings

01. Data augmentation and model ensembling improved cross-validation scores.

02. Bayesian optimization was used to select the confidence and IoU thresholds for the ensemble.

03. Anchor-free models proved robust on complex document layouts.

Abstract

In this research paper, we introduce a novel approach designed for the purpose of segmenting the layout of Bangla documents. Our methodology involves the utilization of a sophisticated ensemble of YOLOv8 models, which were trained for the DL Sprint 2.0 - BUET CSE Fest 2023 Competition focused on Bangla document layout segmentation. Our primary emphasis lies in enhancing various aspects of the task, including techniques such as image augmentation, model architecture, and the incorporation of model ensembles. We deliberately reduce the quality of a subset of document images to enhance the resilience of model training, thereby resulting in an improvement in our cross-validation score. By employing Bayesian optimization, we determine the optimal confidence and Intersection over Union (IoU) thresholds for our model ensemble. Through our approach, we successfully demonstrate the effectiveness…
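
The abstract mentions deliberately reducing the quality of a subset of document images. The excerpt does not say how the degradation was done, so the following is only a sketch under stated assumptions: it uses block averaging (a crude stand-in for downscaling and upscaling) on grayscale images stored as nested lists, and applies it to a random fraction of the set. The names `degrade`, `augment_subset`, `factor`, and `fraction` are hypothetical.

```python
import random
from typing import List

Image = List[List[int]]  # grayscale pixels in 0-255, hypothetical format


def degrade(img: Image, factor: int = 2) -> Image:
    """Lower image quality by replacing each factor x factor block with its mean.

    This is an assumed degradation, not the one used in the paper.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for by in range(0, h, factor):
        for bx in range(0, w, factor):
            ys = range(by, min(by + factor, h))
            xs = range(bx, min(bx + factor, w))
            mean = sum(img[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


def augment_subset(
    images: List[Image], fraction: float, rng: random.Random
) -> List[Image]:
    """Degrade a random `fraction` of the images and leave the rest untouched."""
    n_degrade = round(len(images) * fraction)
    chosen = set(rng.sample(range(len(images)), n_degrade))
    return [degrade(im) if i in chosen else im for i, im in enumerate(images)]


# Toy 2 x 4 "page" with a checkerboard pattern (made-up data).
page = [[0, 255, 0, 255], [255, 0, 255, 0]]
blurred = degrade(page)
mixed = augment_subset([page] * 4, fraction=0.5, rng=random.Random(0))
```

Degrading only part of the set keeps clean examples in training while exposing the model to low-quality scans, which matches the robustness motivation stated in the abstract.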

Taxonomy

Topics: Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Image Processing and 3D Reconstruction

Methods: You Only Look Once · Focus