Methods of Weighted Combination for Text Field Recognition in a Video Stream
Olga Petrova, Konstantin Bulatov, Vladimir L. Arlazarov

TL;DR
This paper introduces a weighted method for combining per-frame text recognition results from a video stream captured by a mobile device, using multiple frames of the same document to offset image distortions and improve recognition accuracy.
Contribution
It proposes a weighted method for combining text string recognition results across video frames, together with weighting criteria, and validates both with experimental data.
Findings
Weighted combination improves recognition accuracy.
The method is effective under various image distortions.
Experimental results confirm the approach's validity.
Abstract
Due to a noticeable expansion of document recognition applicability, there is a high demand for recognition on mobile devices. A mobile camera, unlike a scanner, cannot always ensure the absence of various image distortions; therefore, the task of improving recognition precision is relevant. The advantage of mobile devices over scanners is the ability to use video stream input, which makes it possible to obtain multiple images of the document being recognized. Despite this, not enough attention is currently paid to combining the recognition results obtained from different frames when using video stream input. In this paper we propose a method for weighted combination of text string recognition results, along with weighting criteria, and provide experimental data verifying their validity and effectiveness. Based on the obtained results, it is concluded that the use of such weighted combination is appropriate…
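To make the idea of weighted combination concrete, here is a minimal Python sketch. It is an illustration only, not the authors' algorithm: the function name `combine_weighted`, the per-frame weights (which could stand for something like a focus estimate or a recognizer confidence), and the simplifying assumption that every frame yields a string of the same length are all hypothetical. The excerpt above does not specify the paper's weighting criteria or how results of different lengths are aligned.

```python
from collections import defaultdict


def combine_weighted(frame_results):
    """Combine per-frame recognition results by weighted per-position voting.

    frame_results: list of (text, weight) pairs, one per video frame.
    Assumption for this sketch: all texts have the same length, so no
    alignment step is needed. A real pipeline would have to align strings.
    """
    if not frame_results:
        return ""
    length = len(frame_results[0][0])
    if any(len(text) != length for text, _ in frame_results):
        raise ValueError("this sketch assumes equal-length strings")

    combined = []
    for pos in range(length):
        # Sum the weights of the frames that voted for each character.
        votes = defaultdict(float)
        for text, weight in frame_results:
            votes[text[pos]] += weight
        # Keep the character with the largest accumulated weight.
        combined.append(max(votes, key=votes.get))
    return "".join(combined)


# Hypothetical data: two low-weight (e.g. blurry) frames misread "I" as "1",
# while one high-weight (e.g. sharp) frame reads the field correctly.
frames = [("SM1TH", 0.2), ("SMITH", 0.9), ("SM1TH", 0.3)]
result = combine_weighted(frames)
print(result)
```

In this toy input an unweighted majority vote would keep the misread "1" in the third position, because two of the three frames agree on it. With the weights applied, the single high-weight frame outweighs them, which is the kind of effect a weighted combination is meant to capture.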
