Advancing Question Answering on Handwritten Documents: A State-of-the-Art Recognition-Based Model for HW-SQuAD
Aniket Pal, Ajoy Mondal, C.V. Jawahar

TL;DR
This paper introduces a recognition-based model for question answering on handwritten documents. It combines transformer-based document retrieval with model-level ensembling and reaches 82.02% Exact Match on HW-SQuAD and 69% on BenthamQA.
Contribution
It presents a novel recognition-based approach with transformer retrieval and ensemble techniques that outperforms previous state-of-the-art models in handwritten document question answering.
Findings
Achieved Exact Match scores of 82.02% on HW-SQuAD and 69% on BenthamQA
Boosted top-5 document retrieval accuracy from 90% to 95.30%
Surpassed the previous best recognition-based approach by 10.89% on HW-SQuAD and 3% on BenthamQA
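For readers unfamiliar with the Exact Match metric reported above, the following is a minimal sketch of how it is commonly computed for SQuAD-style datasets. It assumes the normalization used by the standard SQuAD v1.1 evaluation (lowercasing, stripping punctuation and English articles, collapsing whitespace); this page does not say whether the paper uses exactly this procedure, and the function names are illustrative.

```python
import re
import string
from typing import List, Sequence


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: mirrors the usual SQuAD-style normalization, not necessarily
    the exact evaluation code used in the paper.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: Sequence[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def exact_match_score(predictions: List[str],
                      references: List[Sequence[str]]) -> float:
    """Percentage of questions whose prediction exactly matches a gold answer."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

Under this definition a prediction earns credit only when the entire normalized string agrees with a gold answer, so a single recognition error inside the answer span turns a hit into a miss.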
Abstract
Question answering on handwritten documents is a challenging task with numerous real-world applications. This paper proposes a novel recognition-based approach that improves upon the previous state-of-the-art on the HW-SQuAD and BenthamQA datasets. Our model incorporates transformer-based document retrieval and ensemble methods at the model level, achieving Exact Match scores of 82.02% and 69% on the HW-SQuAD and BenthamQA datasets, respectively, surpassing the previous best recognition-based approach by 10.89% and 3%. We also enhance the document retrieval component, boosting the top-5 retrieval accuracy from 90% to 95.30%. Our results demonstrate the significance of our proposed approach in advancing question answering on handwritten documents. The code and trained models will be publicly available to facilitate future research in this critical area of natural language processing.
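The top-5 retrieval accuracy mentioned in the abstract can be read as the share of questions whose source document appears among the five highest-ranked retrieved documents. The sketch below illustrates that reading; the function name, the document-ID representation, and the assumption of one gold document per question are all hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def top_k_retrieval_accuracy(ranked_doc_ids: List[Sequence[str]],
                             gold_doc_ids: List[str],
                             k: int = 5) -> float:
    """Percentage of questions whose gold document is in the top-k results.

    ranked_doc_ids[i] is the retriever's ranking (best first) for question i;
    gold_doc_ids[i] is the single document that contains its answer
    (an assumption made for this illustration).
    """
    if not ranked_doc_ids:
        return 0.0
    hits = sum(
        gold in ranked[:k]
        for ranked, gold in zip(ranked_doc_ids, gold_doc_ids)
    )
    return 100.0 * hits / len(ranked_doc_ids)


# Toy usage: the first question's document is ranked 2nd, the second's is 3rd.
rankings = [["d3", "d1", "d7"], ["d2", "d9", "d4"]]
gold = ["d1", "d4"]
top2 = top_k_retrieval_accuracy(rankings, gold, k=2)
top3 = top_k_retrieval_accuracy(rankings, gold, k=3)
```

Because the reader model only sees retrieved documents, this number acts as a ceiling on end-to-end answer accuracy in a retrieve-then-read pipeline.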
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques
