Bengali Document Layout Analysis -- A YOLOV8 Based Ensembling Approach
Nazmus Sakib Ahmed, Saad Sakib Noor, Ashraful Islam Shanto Sikder, Abhijit Paul

TL;DR
This paper presents a novel ensemble approach using YOLOv8 and post-processing techniques to improve Bengali Document Layout Analysis, addressing script-specific challenges and outperforming existing methods on the BaDLAD dataset.
Contribution
Introduces a two-stage ensemble model with post-processing for Bengali DLA, enhancing accuracy and robustness over individual architectures.
Findings
Outperforms baseline models on BaDLAD dataset
Effective data augmentation improves model robustness
Two-stage prediction enhances element segmentation accuracy
Abstract
This paper focuses on enhancing Bengali Document Layout Analysis (DLA) using the YOLOv8 model and innovative post-processing techniques. We tackle challenges unique to the complex Bengali script by employing data augmentation for model robustness. After meticulous evaluation on a validation set, we fine-tune our approach on the complete dataset, leading to a two-stage prediction strategy for accurate element segmentation. Our ensemble model, combined with post-processing, outperforms individual base architectures, addressing issues identified in the BaDLAD dataset. By leveraging this approach, we aim to advance Bengali document analysis, contributing to improved OCR and document comprehension. BaDLAD serves as a foundational resource for this endeavor, aiding future research in the field. Furthermore, our experiments provided key insights to incorporate new strategies into the established…
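The abstract does not spell out how the ensemble combines its members, so the following is only a minimal sketch of one common option: averaging per-pixel mask probabilities from several models and thresholding the result, followed by a simple area filter as a stand-in for post-processing. The function name `ensemble_masks`, the `threshold` and `min_area` parameters, and the averaging rule itself are all assumptions for illustration, not the authors' method. Plain nested lists are used so the sketch has no dependencies.

```python
def ensemble_masks(prob_maps, threshold=0.5, min_area=1):
    """Combine per-model probability maps for ONE layout class.

    prob_maps: list of 2-D lists of floats in [0, 1], all the same shape,
               one map per model (hypothetical input format).
    Returns a 2-D list of booleans (the ensembled binary mask).
    """
    n_models = len(prob_maps)
    height = len(prob_maps[0])
    width = len(prob_maps[0][0])

    # Pixel-wise average across models, then binarize at `threshold`.
    mask = [
        [
            sum(pm[y][x] for pm in prob_maps) / n_models >= threshold
            for x in range(width)
        ]
        for y in range(height)
    ]

    # Assumed post-processing step: drop the whole mask if it is too small.
    area = sum(cell for row in mask for cell in row)
    if area < min_area:
        return [[False] * width for _ in range(height)]
    return mask


# Two hypothetical models that agree on a 2x2 block in the top-left corner;
# only the second model fires (weakly, after averaging) on the bottom-right pixel.
model_a = [[0.9, 0.9, 0.1], [0.9, 0.9, 0.1], [0.1, 0.1, 0.1]]
model_b = [[0.7, 0.7, 0.1], [0.7, 0.7, 0.1], [0.1, 0.1, 0.8]]
combined = ensemble_masks([model_a, model_b], threshold=0.5, min_area=4)
```

Averaging suppresses the pixel that only one model predicted, while the region both models agree on survives. A real pipeline would apply this per class and per detected instance, and would likely use array operations rather than nested lists.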
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Music and Audio Processing
Methods: You Only Look Once · Balanced Selection
