Newer method of string comparison: the Modified Moving Contracting Window Pattern Algorithm
Tiago Tresoldi

TL;DR
This paper introduces the Modified Moving Contracting Window Pattern Algorithm (CMCWPM), a string comparison algorithm for field similarity that treats characters already consumed by a match as boundaries between subfields. A reference Python implementation is provided.
Contribution
It presents a string comparison algorithm that builds on Yang et al. (2001) and fixes a boundary-handling flaw in that earlier work, which occasionally inflated field similarity scores.
Findings
Avoids the higher-than-expected field similarity scores that the earlier algorithm occasionally produced.
Correct handling of pattern boundaries in string comparison.
Provides a reference Python implementation.
Abstract
This paper presents a new algorithm, the Modified Moving Contracting Window Pattern Algorithm (CMCWPM), for the calculation of field similarity. It builds heavily on the work of Yang et al. (2001) and corrects a flaw in it: characters marked as inaccessible for further pattern matching were not treated as boundaries between subfields, which occasionally led to higher-than-expected field similarity scores. A reference Python implementation is provided.
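The boundary idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's reference implementation. It assumes a greedy strategy (repeatedly remove the longest substring shared by any pair of subfields and split both subfields at the match) and a score of sqrt(SSNC) / (|X| + |Y|), where SSNC accumulates (2n)^2 for each match of length n. The names `field_similarity` and `_longest_common_run`, the tie-breaking order, and the handling of two empty fields are all hypothetical choices made for illustration.

```python
from math import sqrt


def _longest_common_run(xs, ys):
    """Find the longest substring shared by any subfield of xs and any of ys.

    Returns (length, i, start_x, j, start_y), or None if nothing matches.
    Ties are resolved in favour of the first match found (an assumption).
    """
    best = None
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            # Standard dynamic programme for the longest common substring.
            prev = [0] * (len(y) + 1)
            for a in range(1, len(x) + 1):
                cur = [0] * (len(y) + 1)
                for b in range(1, len(y) + 1):
                    if x[a - 1] == y[b - 1]:
                        cur[b] = prev[b - 1] + 1
                        if best is None or cur[b] > best[0]:
                            best = (cur[b], i, a - cur[b], j, b - cur[b])
                prev = cur
    return best


def field_similarity(field_x, field_y):
    """Sketch of a boundary-aware field similarity score in [0, 1]."""
    total = len(field_x) + len(field_y)
    if total == 0:
        return 1.0  # assumption: two empty fields count as identical

    # Each field starts as a single subfield; matches split it further.
    xs, ys = [field_x], [field_y]
    ssnc = 0  # sum of squares of the number of shared characters

    while True:
        match = _longest_common_run(xs, ys)
        if match is None:
            break
        n, i, sx, j, sy = match
        ssnc += (2 * n) ** 2

        # Replace each matched subfield with the non-empty pieces on either
        # side of the match. The pieces are never rejoined, so no later match
        # can span the removed characters.
        x, y = xs[i], ys[j]
        xs[i:i + 1] = [p for p in (x[:sx], x[sx + n:]) if p]
        ys[j:j + 1] = [p for p in (y[:sy], y[sy + n:]) if p]

    return sqrt(ssnc) / total
```

For example, comparing "abcde" with "aebcd" under these assumptions first removes "bcd", leaving "a" and "e" as separate subfields of the first string. They then match only as two single characters, so SSNC is 36 + 4 + 4 = 44 and the score is about 0.663. A variant that ignored the boundary and treated the leftover "a" and "e" as the contiguous string "ae" would reach SSNC = 36 + 16 = 52 (about 0.721), which is the kind of inflated score the abstract describes.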
Taxonomy
Topics: Algorithms and Data Compression · Web Data Mining and Analysis · Natural Language Processing Techniques
