Explaining Translationese: why are Neural Classifiers Better and what do they Learn?

Kwabena Amponsah-Kaakyire; Daria Pylypenko; Josef van Genabith; Cristina España-Bonet

arXiv:2210.13391·cs.CL·October 25, 2022

Open Access

TL;DR

Neural classifiers like BERT outperform traditional methods in translationese detection mainly due to the richer features they learn, which include topic and correlation information, rather than the classifier architecture itself.

Contribution

The study disentangles the effects of features and classifiers, showing that feature quality drives neural classifier performance and revealing what neural models learn in translationese tasks.

Findings

01

An SVM fed with BERT representations performs at the level of the best BERT classifiers.

02

Handcrafted features are a subset of BERT's learned features.

03

BERT captures topic and spurious correlations influencing results.

Abstract

Recent work has shown that neural feature- and representation-learning, e.g. BERT, achieves superior performance over traditional manual feature engineering based approaches, with e.g. SVMs, in translationese classification tasks. Previous research did not show (i) whether the difference is because of the features, the classifiers or both, and (ii) what the neural classifiers actually learn. To address (i), we carefully design experiments that swap features between BERT- and SVM-based classifiers. We show that an SVM fed with BERT representations performs at the level of the best BERT classifiers, while BERT learning and using handcrafted features performs at the level of an SVM using handcrafted features. This shows that the performance differences are due to the features. To address (ii) we use integrated gradients and find that (a) there is indication that information…
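The abstract names integrated gradients as the attribution method but does not describe it. As a rough illustration only, the sketch below applies the technique to a toy logistic classifier rather than to BERT: each input feature is credited with its difference from a baseline multiplied by the gradient averaged along the straight path from baseline to input. The model, weights, inputs, and all function names are hypothetical and are not taken from the paper.

```python
import math

def model(x, w, b):
    """Toy differentiable classifier (an assumption): sigmoid of a linear score."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def model_grad(x, w, b):
    """Analytic gradient of the sigmoid output with respect to each input feature."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # position on the baseline-to-input path
        point = [b0 + alpha * (xi - b0) for xi, b0 in zip(x, baseline)]
        grad = model_grad(point, w, b)
        total = [t + g for t, g in zip(total, grad)]
    # Scale the averaged gradient by each feature's distance from the baseline.
    return [(xi - b0) * t / steps for xi, b0, t in zip(x, baseline, total)]

# Hypothetical weights and input, chosen only for illustration.
weights = [1.5, -2.0, 0.0]
bias = 0.1
x = [1.0, 0.5, 3.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, weights, bias)
output_change = model(x, weights, bias) - model(baseline, weights, bias)
print(attributions, sum(attributions), output_change)
```

The attributions should sum approximately to the change in model output between baseline and input, which is the completeness property of integrated gradients. A feature with zero weight receives zero attribution. For a model like BERT the same idea would typically be applied to token embeddings with automatic differentiation instead of an analytic gradient.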

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Machine Learning and Data Classification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Residual Connection · Dropout · WordPiece · Dense Connections · Softmax