Implementation of an Index Optimize Technology for Highly Specialized Terms based on the Phonetic Algorithm Metaphone
V. Buriachok, M. Hadzhyiev, V. Sokolov, P. Skladannyi, L. Kuzmenko

TL;DR
This paper presents an implementation of an index optimization technology based on the Metaphone phonetic algorithm, tailored to handle highly specialized and linguistically diverse last names, improving data deduplication and search accuracy.
Contribution
It introduces a customized Metaphone-based phonetic algorithm optimized for Ukrainian and Russian names, enhancing fuzzy search and data deduplication in multilingual databases.
Findings
Improved accuracy in matching Ukrainian and Russian last names.
Reduced errors in data entry and deduplication processes.
Enhanced phonetic indexing tailored to specific linguistic rules.
Abstract
When compiling databases, for example to meet the needs of healthcare establishments, a common problem arises with the entry and further processing of the first and last names of doctors and patients, which are highly specialized in both pronunciation and spelling. This is because people's names are not guaranteed to be unique, their notation does not follow any rules of phonetics, and their length may differ between languages. With the advent of the Internet, this situation has become critical and can lead to multiple copies of e-mails being sent to one address. The problem can be addressed with phonetic algorithms for comparing words, such as Daitch-Mokotoff, Soundex, NYSIIS, Polyphone, and Metaphone, as well as the Levenshtein and Jaro algorithms and Q-gram-based algorithms, which make it possible to find distances between words.…
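To make the two families of techniques named in the abstract concrete, the sketch below pairs a phonetic key with an edit distance. It is illustrative only: it uses classic American Soundex as a stand-in and does not reproduce the paper's customized Metaphone variant, which this summary does not specify. The function names `soundex` and `levenshtein` and the sample surnames are assumptions chosen for the example.

```python
import string

# Classic American Soundex digit groups (a stand-in, not the paper's algorithm).
_SOUNDEX_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _SOUNDEX_CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex key of a Latin-script name."""
    letters = [c for c in name.lower() if c in string.ascii_lowercase]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (key + "000")[:4]


def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev_row = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        row = [i]
        for j, cb in enumerate(b, 1):
            row.append(min(prev_row[j] + 1,                # deletion
                           row[j - 1] + 1,                 # insertion
                           prev_row[j - 1] + (ca != cb)))  # substitution
        prev_row = row
    return prev_row[-1]


# Two transliterations of the same surname share a phonetic key
# and differ by a single edit.
print(soundex("Shevchenko"), soundex("Shevchenco"))  # S125 S125
print(levenshtein("Shevchenko", "Shevchenco"))       # 1
```

In a deduplication setting, the phonetic key would typically serve as an index to group candidate records cheaply, with the edit distance applied only within each group to rank or confirm matches. Soundex targets English pronunciation, which is one reason a language-specific variant like the one the paper describes may be needed for Ukrainian and Russian names.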
