Leveraging Deep Learning with Multi-Head Attention for Accurate Extraction of Medicine from Handwritten Prescriptions
Usman Ali, Sahil Ranmbail, Muhammad Nadeem, Hamid Ishfaq, Muhammad Umer Ramzan, Waqas Ali

TL;DR
This paper introduces a novel deep learning approach combining Mask R-CNN and Transformer-based OCR with Multi-Head Attention to accurately extract medicine names from handwritten prescriptions, addressing variability in handwriting styles.
Contribution
It presents a new integrated model and a diverse dataset for extracting medicine names from handwritten prescriptions, improving accuracy over existing methods.
Findings
Achieved a CER of 1.4% on standard benchmarks.
Successfully handled diverse handwriting styles from Pakistani prescriptions.
Demonstrated robustness and efficiency in extracting medicine names.
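For readers unfamiliar with the metric behind the reported figure, character error rate is conventionally the character-level edit distance between the predicted and reference strings, divided by the reference length. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function names `levenshtein` and `cer` are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

Under this definition, a CER of 1.4% corresponds to roughly one or two character edits per hundred reference characters.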
Abstract
Extracting medication names from handwritten doctor prescriptions is challenging due to the wide variability in handwriting styles and prescription formats. This paper presents a robust method for extracting medicine names using a combination of Mask R-CNN and Transformer-based Optical Character Recognition (TrOCR) with Multi-Head Attention and Positional Embeddings. A novel dataset, featuring diverse handwritten prescriptions from various regions of Pakistan, was utilized to fine-tune the model on different handwriting styles. The Mask R-CNN model segments the prescription images to focus on the medicinal sections, while the TrOCR model, enhanced by Multi-Head Attention and Positional Embeddings, transcribes the isolated text. The transcribed text is then matched against a pre-existing database for accurate identification. The proposed approach achieved a character error rate (CER) of…
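The final step described above, matching the transcribed text against a database of known medicines, could be sketched as follows. This is an illustrative assumption rather than the paper's implementation: the abstract does not say which matching method was used, so the example substitutes Python's standard-library `difflib` fuzzy matcher. The `MEDICINE_DB` list, the `match_medicine` function, and the `cutoff` value are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical stand-in for the medicine database mentioned in the abstract.
MEDICINE_DB = ["Paracetamol", "Amoxicillin", "Ibuprofen", "Omeprazole"]


def match_medicine(transcribed, database=MEDICINE_DB, cutoff=0.6):
    """Return the database entry closest to the OCR output, or None.

    `cutoff` is an assumed similarity threshold in [0, 1]; candidates
    scoring below it are rejected.
    """
    # Compare case-insensitively, but return the canonical spelling.
    lookup = {name.lower(): name for name in database}
    hits = get_close_matches(transcribed.strip().lower(), list(lookup),
                             n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None
```

A lookup of this kind lets a transcription with a small OCR error, such as a dropped letter, still resolve to a canonical medicine name, while strings that resemble nothing in the database are rejected.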
Taxonomy
Topics: Handwritten Text Recognition Techniques
Methods: Attention Is All You Need · Linear Layer · Region Proposal Network · RoIAlign · Dense Connections · Residual Connection · Multi-Head Attention · Position-Wise Feed-Forward Layer · Convolution · Layer Normalization
