Neural String Edit Distance

Jindřich Libovický; Alexander Fraser

arXiv:2104.08388 · cs.CL · April 28, 2022

TL;DR

The paper introduces a neural string edit distance model that improves string-pair matching and transduction by integrating learnable, differentiable edit distance into neural networks, balancing performance and interpretability.

Contribution

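The paper recasts the original expectation-maximization training of learned edit distance as a differentiable loss function, so that the edit distance can be integrated into a neural network that provides a contextual representation of the input. The resulting model is applied to string-pair matching and string transduction.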

Findings

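01 With contextual representations, which are difficult to interpret, the model matches the performance of state-of-the-art string-pair matching models.

02 With static embeddings and a slightly different loss function, the model is forced to be interpretable, at the expense of an accuracy drop.

03 The framework is evaluated on cognate detection, transliteration, and grapheme-to-phoneme conversion, and allows trading off performance against interpretability.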

Abstract

We propose the neural string edit distance model for string-pair matching and string transduction based on learnable string edit distance. We modify the original expectation-maximization learned edit distance algorithm into a differentiable loss function, allowing us to integrate it into a neural network providing a contextual representation of the input. We evaluate on cognate detection, transliteration, and grapheme-to-phoneme conversion, and show that we can trade off between performance and interpretability in a single framework. Using contextual representations, which are difficult to interpret, we match the performance of state-of-the-art string-pair matching models. Using static embeddings and a slightly different loss function, we force interpretability, at the expense of an accuracy drop.
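
The abstract builds on classic learned edit distance, where the probability of a string pair is a sum over all sequences of insertions, deletions, and substitutions. The following is a minimal sketch of that forward dynamic program in log space, not the authors' implementation. The function names (`forward_log_prob`, `pair_loss`) and the toy scoring functions are hypothetical, and the termination probability of the original formulation is omitted for brevity. In a neural variant, the per-operation log-probabilities would presumably be produced by a network and the same recurrence would be run under an autodiff framework, so that the negative log-probability can serve as a loss.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_prob(src, tgt, log_ins, log_del, log_sub):
    """Log-probability of the pair (src, tgt), summed over all edit paths.

    log_del(a):    log-probability of deleting source symbol a
    log_ins(b):    log-probability of inserting target symbol b
    log_sub(a, b): log-probability of rewriting a as b (includes copying)
    All three scoring functions are illustrative assumptions.
    """
    n, m = len(src), len(tgt)
    # alpha[i][j] = log-probability of producing src[:i] and tgt[:j]
    alpha = [[-math.inf] * (m + 1) for _ in range(n + 1)]
    alpha[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            terms = []
            if i > 0:  # delete src[i-1]
                terms.append(alpha[i - 1][j] + log_del(src[i - 1]))
            if j > 0:  # insert tgt[j-1]
                terms.append(alpha[i][j - 1] + log_ins(tgt[j - 1]))
            if i > 0 and j > 0:  # substitute or copy
                terms.append(alpha[i - 1][j - 1] + log_sub(src[i - 1], tgt[j - 1]))
            alpha[i][j] = logsumexp(terms)
    return alpha[n][m]


def pair_loss(src, tgt, log_ins, log_del, log_sub):
    """Negative log-probability of the pair, usable as a training loss."""
    return -forward_log_prob(src, tgt, log_ins, log_del, log_sub)


# Toy scores: copying is likely, everything else is less likely.
toy_ins = lambda b: math.log(0.25)
toy_del = lambda a: math.log(0.25)
toy_sub = lambda a, b: math.log(0.5) if a == b else math.log(0.05)

if __name__ == "__main__":
    for pair in [("a", "a"), ("a", "b"), ("kitten", "sitting")]:
        print(pair, pair_loss(pair[0], pair[1], toy_ins, toy_del, toy_sub))
```

With these toy scores, an identical pair receives a lower loss than a mismatched pair of the same length, which is the behaviour a string-pair matching model would rely on. Because every step is a sum or a log-sum-exp, the whole computation is differentiable with respect to the operation scores.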

Code & Models

Repositories

jlibovicky/neural-string-edit-distance
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis