The parallel texts of books translations in the quality evaluation of basic models and algorithms for the similarity of symbol strings

Sergej V. Znamenskij

arXiv:1805.09776·cs.IR·February 22, 2019

TL;DR

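This paper proposes a numeric evaluation method for string similarity metrics: given a paragraph in one language, all paragraphs of the document in the other language are ranked by similarity to it, and the rank of the correct translation serves as the score. This gives an objective, reproducible way to assess the metrics and identify the most accurate ones.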

Contribution

It introduces a reproducible evaluation approach for string similarity metrics that uses parallel texts of book translations as test data.

Findings

01

Identifies the most accurate of the known string similarity metrics under the proposed evaluation.

02

Provides an objective, reproducible method for assessing the quality of string similarity metrics.

03

Demonstrates the effectiveness of the proposed evaluation approach.

Abstract

This numeric evaluation of string metric accuracy is based on the following idea: take a paragraph of text in one language, sort all paragraphs of the document in the other language by similarity to the given paragraph string, and take the position of the correct translation as the evaluation score. Such a search for the correct translation provides an objective and reproducible quality assessment of known similarity metrics and shows which are the most accurate.
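
The ranking idea in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation. The function names `translation_rank` and `mean_rank` are hypothetical, the use of `difflib.SequenceMatcher` as the similarity metric is a stand-in, and averaging ranks over paragraphs is an assumed way to aggregate scores. The sketch also assumes that paragraph i of the source document is aligned with paragraph i of the translation.

```python
from difflib import SequenceMatcher
from typing import Callable, Sequence

Similarity = Callable[[str, str], float]


def ratio_similarity(a: str, b: str) -> float:
    """Stand-in metric (an assumption, not from the paper): difflib ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def translation_rank(
    source: str,
    candidates: Sequence[str],
    true_index: int,
    similarity: Similarity = ratio_similarity,
) -> int:
    """Return the 1-based position of the correct translation after sorting
    all candidate paragraphs by decreasing similarity to `source`.
    For a distance-like metric (lower is closer), pass its negation."""
    scores = [similarity(source, c) for c in candidates]
    # sorted() is stable, so tied candidates keep their document order.
    order = sorted(range(len(candidates)), key=lambda i: -scores[i])
    return order.index(true_index) + 1


def mean_rank(
    sources: Sequence[str],
    targets: Sequence[str],
    similarity: Similarity = ratio_similarity,
) -> float:
    """Average rank over all paragraphs; lower means a more accurate metric.
    Averaging is an assumed aggregation, chosen here for illustration."""
    ranks = [
        translation_rank(s, targets, i, similarity)
        for i, s in enumerate(sources)
    ]
    return sum(ranks) / len(ranks)
```

A mean rank of 1.0 would mean the metric always places the correct translation first. Comparing `mean_rank` across several `similarity` functions on the same pair of parallel texts gives one way to order the metrics by accuracy.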

Taxonomy

Topics: Semantic Web and Ontologies · Data Quality and Management · Natural Language Processing Techniques