Algorithme de recherche approximative dans un dictionnaire fondé sur une distance d'édition définie par blocs (An approximate dictionary lookup algorithm based on a block-defined edit distance)
Pascal Vaillant

TL;DR
The paper introduces an approximate string matching algorithm for dictionaries. It uses a customizable divergence function whose costs are defined on character blocks rather than on individual characters, which makes it more flexible than classical edit distances.
Contribution
It presents a novel approximate string search algorithm utilizing a block-based divergence function tailored to specific corpora, enhancing matching accuracy.
Findings
The algorithm effectively matches altered strings within a threshold.
It adapts to different corpora through customizable block costs.
Performance surpasses traditional character-based edit distances.
Abstract
We propose an algorithm for approximate dictionary lookup, where altered strings are matched against reference forms. The algorithm makes use of a divergence function between strings, broadly belonging to the family of edit distances; it finds dictionary entries whose distance to the search string is below a certain threshold. The divergence function is not the classical edit distance (DL distance); it is adaptable to a particular corpus, and is based on elementary alteration costs defined on character blocks, rather than on individual characters.
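To make the idea of block-level alteration costs concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the cost tables or the search strategy, so the rule format, the sample costs in `SAMPLE_COSTS`, the default unit cost for single-character operations, and the linear scan over the dictionary are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical rule table: (block in the altered string, block in the
# reference form) -> cost of rewriting one into the other.
BlockCosts = Dict[Tuple[str, str], float]

# Illustrative values only; a real table would be tuned to a corpus.
SAMPLE_COSTS: BlockCosts = {("f", "ph"): 0.2, ("k", "qu"): 0.3}


def block_divergence(query: str, entry: str, block_costs: BlockCosts,
                     default_cost: float = 1.0) -> float:
    """Cheapest way to rewrite `query` into `entry`.

    Single-character insert/delete/substitute cost `default_cost`
    (an assumption); block rules from `block_costs` may override them.
    """
    n, m = len(query), len(entry)
    inf = float("inf")
    # d[i][j] = cheapest cost of rewriting query[:i] into entry[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = inf
            if i > 0:  # delete one character of the query
                best = min(best, d[i - 1][j] + default_cost)
            if j > 0:  # insert one character of the entry
                best = min(best, d[i][j - 1] + default_cost)
            if i > 0 and j > 0:  # match or substitute one character
                step = 0.0 if query[i - 1] == entry[j - 1] else default_cost
                best = min(best, d[i - 1][j - 1] + step)
            # Block rules: both prefixes must end with the rule's blocks.
            for (src, dst), cost in block_costs.items():
                if query.endswith(src, 0, i) and entry.endswith(dst, 0, j):
                    best = min(best, d[i - len(src)][j - len(dst)] + cost)
            d[i][j] = best
    return d[n][m]


def lookup(query: str, dictionary: List[str], block_costs: BlockCosts,
           threshold: float) -> List[Tuple[float, str]]:
    """Entries whose divergence from `query` is below `threshold`, best first."""
    scored = ((block_divergence(query, w, block_costs), w) for w in dictionary)
    return sorted((c, w) for c, w in scored if c < threshold)


if __name__ == "__main__":
    words = ["pharmacie", "farine", "photo"]
    print(lookup("farmacie", words, SAMPLE_COSTS, threshold=1.0))
```

With the sample rule rewriting "f" as "ph" at a low cost, the altered form "farmacie" falls under the threshold for "pharmacie", whereas unit-cost single-character edits alone would charge one substitution plus one insertion. A linear scan is used here only for brevity; it says nothing about how the paper organizes its dictionary search.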
Taxonomy
Topics: Algorithms and Data Compression · Rough Sets and Fuzzy Logic · Semigroups and Automata Theory
