Approximating solution structure of the Weighted Sentence Alignment problem
Antonina Kolokolova, Renesa Nizamee

TL;DR
This paper investigates the computational complexity of approximating the structure of solutions in weighted sentence alignment problems, showing that achieving high agreement with optimal alignments is NP-hard.
Contribution
It establishes NP-hardness lower bounds for approximating the solution structure of weighted sentence alignment under Hamming distance, and generalizes them to the edit distance metric.
Findings
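For phrases-to-words alignment, computing a string that agrees with the optimal sentence partition on more than half of the positions (plus an arbitrarily small polynomial fraction) is NP-hard.
For general weighted sentence alignment, the same hardness holds for agreement on a little over 2/3 of the bits.
Similar lower bounds hold when closeness to the optimal solution is measured by edit distance rather than Hamming distance.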
Abstract
We study the complexity of approximating the solution structure of the bijective weighted sentence alignment problem of DeNero and Klein (2008). In particular, we consider the complexity of finding an alignment that has a significant overlap with an optimal alignment. We discuss ways of representing the solution for the general weighted sentence alignment problem as well as for the phrases-to-words alignment problem, and show that computing a string which agrees with the optimal sentence partition on more than half (plus an arbitrarily small polynomial fraction) of the positions for the phrases-to-words alignment is NP-hard. For the general weighted sentence alignment we obtain such a bound from agreement on a little over 2/3 of the bits. Additionally, we generalize the Hamming distance approximation of a solution structure to approximating it with respect to the edit distance metric, obtaining similar lower…
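To make the two notions of closeness in the abstract concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a solution is written as a bit string; the paper's actual encoding of a sentence partition is not given in this summary. The helper names `hamming_agreement` and `edit_distance` are hypothetical and not taken from the paper.

```python
def hamming_agreement(candidate: str, optimal: str) -> float:
    """Fraction of positions where two equal-length strings match."""
    if len(candidate) != len(optimal):
        raise ValueError("strings must have equal length")
    if not optimal:
        return 1.0
    matches = sum(c == o for c, o in zip(candidate, optimal))
    return matches / len(optimal)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Assumed toy encodings: a candidate that is the optimal string shifted by one.
optimal = "1010101"
candidate = "0101010"
print(hamming_agreement(candidate, optimal))  # agreement as a fraction
print(edit_distance(candidate, optimal))      # number of edits
```

On this toy pair the shifted candidate agrees with the optimal string on no position, yet it is only two edits away (drop the leading symbol, append one at the end). This is why closeness under edit distance is a distinct question from positional agreement.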
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Algorithms and Data Compression
