What Knowledge Is Needed? Towards Explainable Memory for kNN-MT Domain Adaptation

Wenhao Zhu; Shujian Huang; Yunzhe Lv; Xin Zheng; Jiajun Chen

arXiv:2211.04052·cs.CL·December 21, 2022

TL;DR

This paper explores the interpretability of kNN-MT for domain adaptation by introducing local correctness, enabling more explainable memory and identifying conditions where the model may fail.

Contribution

It introduces the concept of local correctness (LAC) to analyze knowledge needs in kNN-MT and proposes pruning methods for more explainable memory.

Findings

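01

Pruning the datastore according to local correctness yields a lighter and more explainable memory.

02

The local-correctness analysis identifies conditions under which the NMT model is likely to fail and needs related knowledge.

03

The approach is evaluated on six target domains and two language pairs.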

Abstract

kNN-MT presents a new paradigm for domain adaptation by building an external datastore, which usually saves all target language token occurrences in the parallel corpus. As a result, the constructed datastore is usually large and possibly redundant. In this paper, we investigate the interpretability issue of this approach: what knowledge does the NMT model need? We propose the notion of local correctness (LAC) as a new angle, which describes the potential translation correctness for a single entry and for a given neighborhood. Empirical study shows that our investigation successfully finds the conditions where the NMT model could easily fail and need related knowledge. Experiments on six diverse target domains and two language pairs show that pruning according to local correctness yields a lighter and more explainable memory for kNN-MT domain adaptation.
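The pruning idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes that entry-level correctness means the NMT model alone already predicts the stored target token, and that neighborhood-level correctness can be approximated by counting how many of an entry's nearest neighbors are entry-correct before the first incorrect one. The function names (`entry_correct`, `neighborhood_margin`, `prune`) and the parameters `min_margin` and `max_k` are hypothetical, and the brute-force distance computation is only suitable for toy data.

```python
import numpy as np


def entry_correct(nmt_pred, values):
    """Assumed entry-level check: the NMT model's own prediction
    matches the target token stored in the datastore entry."""
    return np.asarray(nmt_pred) == np.asarray(values)


def neighborhood_margin(keys, correct, max_k=4):
    """Assumed neighborhood-level score: for each entry, count its nearest
    neighbors (closest first, up to max_k) that are entry-correct,
    stopping at the first incorrect neighbor."""
    keys = np.asarray(keys, dtype=float)
    # Brute-force pairwise squared L2 distances (toy-sized datastores only).
    diff = keys[:, None, :] - keys[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)  # an entry is not its own neighbor
    order = np.argsort(dist, axis=1)[:, :max_k]
    margins = np.zeros(len(keys), dtype=int)
    for i, neighbors in enumerate(order):
        for j in neighbors:
            if not correct[j]:
                break
            margins[i] += 1
    return margins


def prune(keys, values, nmt_pred, min_margin=2, max_k=4):
    """Drop entries that are correct themselves and sit in a neighborhood
    the NMT model already handles; keep everything else."""
    keys = np.asarray(keys, dtype=float)
    values = np.asarray(values)
    correct = entry_correct(nmt_pred, values)
    margins = neighborhood_margin(keys, correct, max_k=max_k)
    keep = ~(correct & (margins >= min_margin))
    return keys[keep], values[keep], keep


if __name__ == "__main__":
    # Toy datastore: 1-D keys, integer token ids as values.
    toy_keys = np.array([[0.0], [1.0], [2.0], [3.0], [10.0], [11.0]])
    toy_values = np.array([5, 5, 5, 5, 7, 8])
    toy_nmt_pred = np.array([5, 5, 5, 5, 9, 8])  # model is wrong on entry 4
    _, kept_values, keep_mask = prune(
        toy_keys, toy_values, toy_nmt_pred, min_margin=2, max_k=2
    )
    print(keep_mask, kept_values)
```

In this sketch, entries the model gets wrong, and entries near such failures, remain in the datastore, while regions the model already translates correctly are removed. A real datastore would need an approximate nearest-neighbor index rather than a dense distance matrix.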

Code & Models

Repositories

njunlp/knn-box
pytorch

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning

Methods: fail · Pruning