Levenshtein Distance Technique in Dictionary Lookup Methods: An Improved Approach
Rishin Haldar, Debajyoti Mukhopadhyay

TL;DR
This paper introduces an improved Levenshtein distance method for dictionary lookup that groups similar-looking letters and lowers the weighted difference within each group, improving the recovery of ambiguous OCR letters over the traditional Levenshtein distance.
Contribution
The paper proposes a novel modification to the Levenshtein distance technique by grouping similar characters, leading to better performance in dictionary lookup tasks.
Findings
Marked improvement over traditional Levenshtein distance
A simple metric that avoids the search overhead of a priori probability calculation in dictionary lookup
Enhanced recognition of ambiguous OCR characters
Abstract
Dictionary lookup methods are popular for handling ambiguous letters that Optical Character Readers fail to recognize. However, a robust dictionary lookup method can be complex, because a priori probability calculation or a large dictionary size increases the overhead and the cost of searching. In this context, Levenshtein distance is a simple metric that can serve as an effective string approximation tool. Having observed the effectiveness of this method, the authors improved it by grouping some similar-looking letters and reducing the weighted difference among members of the same group. The results showed marked improvement over the traditional Levenshtein distance technique.
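The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the letter groups in SIMILAR_GROUPS, the reduced weight SAME_GROUP_COST, and the function names are all assumptions chosen for illustration, since the abstract does not state the actual groups or weights. The sketch uses the standard dynamic-programming Levenshtein recurrence and only changes the substitution cost for letters in the same group.

```python
from typing import Iterable, List

# Hypothetical groups of similar-looking uppercase letters (illustrative only;
# the abstract does not list the groups the authors actually used).
SIMILAR_GROUPS = [set("ODQ"), set("IJLT"), set("UV"), set("FP"), set("CG")]

# Assumed reduced substitution weight for two letters in the same group.
SAME_GROUP_COST = 0.5


def substitution_cost(a: str, b: str) -> float:
    """Return 0 for identical letters, a reduced cost within a group, else 1."""
    if a == b:
        return 0.0
    for group in SIMILAR_GROUPS:
        if a in group and b in group:
            return SAME_GROUP_COST
    return 1.0


def grouped_levenshtein(s: str, t: str) -> float:
    """Levenshtein distance with a cheaper substitution inside a letter group."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [float(i)]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1.0,                          # delete a from s
                curr[j - 1] + 1.0,                      # insert b into s
                prev[j - 1] + substitution_cost(a, b),  # substitute a with b
            ))
        prev = curr
    return prev[-1]


def closest_words(word: str, dictionary: Iterable[str]) -> List[str]:
    """Return every dictionary entry at the minimum grouped distance."""
    scored = [(grouped_levenshtein(word, entry), entry) for entry in dictionary]
    best = min(distance for distance, _ in scored)
    return [entry for distance, entry in scored if distance == best]
```

Under these assumed groups, a misread token such as "OUIT" is equally far from "QUIT" and "SUIT" under the traditional distance (one substitution each), while the grouped variant ranks "QUIT" first because O and Q share a group. This is the kind of tie the grouping is meant to break.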
Taxonomy
Topics: Natural Language Processing Techniques · Lexicography and Language Studies · Algorithms and Data Compression
