LLM-BT-Terms: Back-Translation as a Framework for Terminology Standardization and Dynamic Semantic Embedding
Li Weigang, Pedro Carvalho Brom

TL;DR
This paper introduces LLM-BT, a back-translation framework using large language models to automate multilingual terminology standardization and semantic alignment, addressing challenges in rapidly evolving technical fields.
Contribution
The paper presents a novel back-translation framework that validates terminology consistency, supports multi-path verification workflows, and reinterprets back-translation as dynamic semantic embedding.
Findings
Over 90% of terms preserved, exactly or semantically, across models
BLEU scores exceeding 0.45 indicate strong cross-lingual robustness
Portuguese term accuracy reaches 100% in case studies
Abstract
The rapid expansion of English technical terminology presents a significant challenge to traditional expert-based standardization, particularly in rapidly developing areas such as artificial intelligence and quantum computing. Manual approaches face difficulties in maintaining consistent multilingual terminology. To address this, we introduce LLM-BT, a back-translation framework powered by large language models (LLMs) designed to automate terminology verification and standardization through cross-lingual semantic alignment. Our key contributions include: (1) term-level consistency validation: by performing English -> intermediate language -> English back-translation, LLM-BT achieves high term consistency across different models (such as GPT-4, DeepSeek, and Grok). Case studies demonstrate over 90 percent of terms are preserved either exactly or semantically; (2) multi-path verification…
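The English -> intermediate language -> English check described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`normalize`, `back_translate`, `term_preservation_rate`) are hypothetical, and the toy dictionaries stand in for the LLM translation calls, which the abstract does not specify. Only exact matches (after case and whitespace normalization) are counted here; the semantic-match criterion mentioned in the abstract would need a separate comparison step.

```python
from typing import Callable, List

Translator = Callable[[str], str]


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count as drift."""
    return " ".join(term.lower().split())


def back_translate(term: str, forward: Translator, backward: Translator) -> str:
    """Round-trip a term: source language -> intermediate language -> source language."""
    return backward(forward(term))


def term_preservation_rate(terms: List[str], forward: Translator, backward: Translator) -> float:
    """Fraction of terms whose round trip returns the original term exactly (after normalization)."""
    if not terms:
        raise ValueError("terms must not be empty")
    preserved = [
        t for t in terms
        if normalize(back_translate(t, forward, backward)) == normalize(t)
    ]
    return len(preserved) / len(terms)


# Toy stand-ins for LLM translation calls (assumption: a real system would query a model here).
EN_TO_PT = {
    "neural network": "rede neural",
    "quantum computing": "computação quântica",
    "large language model": "modelo de linguagem grande",
    "fine-tuning": "ajuste fino",
}
PT_TO_EN = {
    "rede neural": "neural network",
    "computação quântica": "quantum computing",
    "modelo de linguagem grande": "big language model",  # deliberate drift for illustration
    "ajuste fino": "fine-tuning",
}


def to_pt(term: str) -> str:
    return EN_TO_PT[normalize(term)]


def to_en(term: str) -> str:
    return PT_TO_EN[term]


rate = term_preservation_rate(list(EN_TO_PT), to_pt, to_en)
print(f"exact-match preservation rate: {rate:.2f}")
```

In this toy data, "large language model" comes back as "big language model", so it counts as not preserved under exact matching even though the meaning is arguably intact. That gap is why a semantic comparison would be needed to reproduce the "exactly or semantically" criterion.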
Taxonomy
Topics: Natural Language Processing Techniques · Biomedical Text Mining and Ontologies · Linguistics and Terminology Studies
Methods: Absolute Position Encodings · Layer Normalization · Byte Pair Encoding · Label Smoothing · Softmax · Dropout · Dense Connections · Transformer · GPT-4
