TL;DR
This paper presents a ranking-based training strategy for multilingual BERT models that detects fine-grained semantic divergences across languages without supervision, and that identifies sentence-level divergences more accurately than a strong sentence-level similarity model.
Contribution
It introduces a training approach that learns to rank synthetic divergent examples of varying granularity, and it releases the Rationalized English-French Semantic Divergences dataset: English-French sentence pairs annotated with semantic divergence classes and token-level rationales.
Findings
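Learning to rank synthetic divergent examples improves the detection of fine-grained sentence-level divergences.
Token-level predictions may help distinguish divergences of different granularity.
The ranking-trained models are more accurate than a strong sentence-level similarity baseline.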
Abstract
Detecting fine-grained differences in content conveyed in different languages matters for cross-lingual NLP and multilingual corpora analysis, but it is a challenging machine learning problem since annotation is expensive and hard to scale. This work improves the prediction and annotation of fine-grained semantic divergences. We introduce a training strategy for multilingual BERT models by learning to rank synthetic divergent examples of varying granularity. We evaluate our models on the Rationalized English-French Semantic Divergences, a new dataset released with this work, consisting of English-French sentence-pairs annotated with semantic divergence classes and token-level rationales. Learning to rank helps detect fine-grained sentence-level divergences more accurately than a strong sentence-level similarity model, while token-level predictions have the potential of further…
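The learning-to-rank idea in the abstract can be illustrated with a margin-based pairwise loss. The following is a minimal sketch, not the authors' implementation: the hinge form, the default margin of 1.0, the function names, and the convention that a higher score means "more equivalent" are all assumptions made for illustration. In practice the scores would come from a multilingual BERT model; here they are plain floats.

```python
from typing import Iterable, Tuple


def margin_ranking_loss(score_less_divergent: float,
                        score_more_divergent: float,
                        margin: float = 1.0) -> float:
    """Hinge loss for one ranked pair (assumed form, not taken from the paper).

    The loss is zero once the less divergent sentence pair outscores the
    more divergent one by at least `margin`.
    """
    return max(0.0, margin - (score_less_divergent - score_more_divergent))


def mean_ranking_loss(pairs: Iterable[Tuple[float, float]],
                      margin: float = 1.0) -> float:
    """Average the hinge loss over (less_divergent, more_divergent) score pairs."""
    losses = [margin_ranking_loss(a, b, margin) for a, b in pairs]
    if not losses:
        raise ValueError("at least one score pair is required")
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical scores: the first pair is ranked correctly with enough
    # margin, the second pair is ranked the wrong way round.
    example_pairs = [(2.0, 0.5), (0.25, 0.5)]
    print(mean_ranking_loss(example_pairs))
```

Ranking synthetic examples against each other only requires knowing which of two examples is more divergent, not an absolute divergence label, which is why this kind of objective fits a setting where annotation is expensive.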
Taxonomy
Methods: Linear Layer · Layer Normalization · Dense Connections · WordPiece · Multi-Head Attention · Dropout · Linear Warmup With Linear Decay · Attention Dropout · Weight Decay · Attention Is All You Need
