# Machine translationese of large language models: Dependency triplets, text classification, and SHAP analysis

**Authors:** Shukang Zhang, Chaoyong Zhao, Alessio Luschi

PMC · DOI: 10.1371/journal.pone.0339769 · PLOS One · 2026-01-09

## TL;DR

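This study trains machine learning classifiers on dependency triplet features to tell human translations from LLM-generated ones. The best model, an SVM, reaches a mean F1-score of 93%, and SHAP analysis identifies the dependency patterns that distinguish the two.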

## Contribution

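The study introduces dependency triplet features for detecting LLM-generated translations, compares 16 machine learning classifiers on them, and uses SHAP analysis to interpret which features separate human from machine output.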

## Key findings

- Under 10-fold cross-validation, the SVM model achieved the highest mean F1-score (93%) of the 16 classifiers evaluated in distinguishing human and machine translations.
- SHAP analysis identified key dependency features that differentiate human and machine translations.
- The authors present the findings as practical insights for translation quality assessment and for refining translation models across various languages and text genres.
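
For readers unfamiliar with the metric behind the 93% figure, the sketch below shows how a mean F1-score over cross-validation folds can be computed from per-fold confusion counts. It is a minimal illustration, not the authors' evaluation code: the function names `f1_score` and `mean_f1` are hypothetical, the binary (single positive class) form of F1 is an assumption, and the counts in the usage lines are made up for illustration rather than taken from the paper.

```python
def f1_score(tp, fp, fn):
    """Binary F1 from true positives, false positives, and false negatives."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined),
        # so F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1(fold_counts):
    """Average the per-fold F1 over an iterable of (tp, fp, fn) tuples."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in fold_counts]
    return sum(scores) / len(scores)


# Illustrative counts only (not results from the paper): two folds.
example_folds = [(8, 2, 2), (10, 0, 0)]
example_mean = mean_f1(example_folds)
```

With 10-fold cross-validation, `fold_counts` would hold ten tuples, one per held-out fold.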

## Abstract

This study addresses the challenge of distinguishing human translations from those generated by Large Language Models (LLMs) by utilizing dependency triplet features and evaluating 16 machine learning classifiers. Using 10-fold cross-validation, the SVM model achieves the highest mean F1-score of 93%, while all other classifiers consistently differentiate between human and machine translations. SHAP analysis helps identify key dependency features that distinguish human and machine translations, improving our understanding of how LLMs produce translationese. The findings provide practical insights for enhancing translation quality assessment and refining translation models across various languages and text genres, contributing to the advancement of natural language processing techniques. The dataset and implementation code of our study are available at: https://github.com/KiemaG5/LLM-translationese.
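
To make "dependency triplet features" concrete, here is a minimal sketch of how such features could be derived from an already-parsed sentence. Several things are assumptions, not details confirmed by the abstract: the triplet is taken to be (head POS tag, dependency relation, dependent POS tag), the `Token` layout is hypothetical, and the features are normalized to relative frequencies. The authors' actual definition and implementation are in their repository linked above.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    """Hypothetical parsed-token layout (not the authors' data format)."""
    idx: int    # 1-based position in the sentence
    word: str
    pos: str    # part-of-speech tag
    head: int   # index of the syntactic head; 0 marks the root
    rel: str    # dependency relation to the head


def dependency_triplets(tokens):
    """Return (head POS, relation, dependent POS) for every non-root token."""
    by_idx = {tok.idx: tok for tok in tokens}
    triplets = []
    for tok in tokens:
        if tok.head == 0:
            continue  # the root has no head, so it yields no triplet
        head = by_idx[tok.head]
        triplets.append((head.pos, tok.rel, tok.pos))
    return triplets


def relative_frequencies(triplets):
    """Turn a list of triplets into a {triplet: share of all triplets} map."""
    counts = Counter(triplets)
    total = sum(counts.values())
    return {triplet: n / total for triplet, n in counts.items()}


# Toy sentence "The cat sleeps", parsed by hand for illustration.
sentence = [
    Token(1, "The", "DET", 2, "det"),
    Token(2, "cat", "NOUN", 3, "nsubj"),
    Token(3, "sleeps", "VERB", 0, "root"),
]
features = relative_frequencies(dependency_triplets(sentence))
```

Feature maps like `features`, computed per text, would then be vectorized and passed to a classifier such as an SVM.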

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12788636/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12788636/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/PMC12788636/full.md

---
Source: https://tomesphere.com/paper/PMC12788636