It Takes Two: A Dual Stage Approach for Terminology-Aware Translation
Akshat Singh Jaswal

TL;DR
This paper presents DuTerm, a dual-stage translation system combining a terminology-aware NMT model with a prompt-based LLM for post-editing, improving terminology adherence and translation quality across multiple language pairs.
Contribution
The paper introduces a novel two-stage architecture that integrates a fine-tuned NMT model with an LLM for enhanced terminology-aware translation.
Findings
LLM-based post-editing improves translation quality
Flexible, context-driven terminology handling outperforms strict constraints
Trade-off identified between constraint enforcement and translation quality
Abstract
This paper introduces DuTerm, a novel two-stage architecture for terminology-constrained machine translation. Our system combines a terminology-aware NMT model, adapted via fine-tuning on large-scale synthetic data, with a prompt-based LLM for post-editing. The LLM stage refines NMT output and enforces terminology adherence. We evaluate DuTerm on English-to-German, English-to-Spanish, and English-to-Russian with the WMT 2025 Terminology Shared Task corpus. We demonstrate that flexible, context-driven terminology handling by the LLM consistently yields higher quality translations than strict constraint enforcement. Our results highlight a critical trade-off, revealing that LLMs work best for high-quality translation as context-driven mutators rather than generators.
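The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`build_post_edit_prompt`, `term_adherence`, `dual_stage_translate`), the prompt wording, and the substring-based adherence check are all assumptions made for the example. The NMT model and the LLM are passed in as plain callables so the sketch runs without any external service.

```python
from typing import Callable, Dict

# Hypothetical glossary type: source term -> required target term.
Terminology = Dict[str, str]


def build_post_edit_prompt(
    source: str, draft: str, terms: Terminology, strict: bool = False
) -> str:
    """Assemble a post-editing prompt (wording is illustrative only)."""
    glossary = "\n".join(f"- {src} -> {tgt}" for src, tgt in terms.items())
    # The abstract contrasts strict enforcement with flexible,
    # context-driven handling; the flag switches between the two.
    policy = (
        "Use every glossary term exactly as given."
        if strict
        else "Prefer the glossary terms, adapting them to context where needed."
    )
    return (
        f"Source: {source}\n"
        f"Draft translation: {draft}\n"
        f"Glossary:\n{glossary}\n"
        f"{policy}\n"
        "Return only the revised translation."
    )


def term_adherence(translation: str, terms: Terminology) -> float:
    """Fraction of glossary target terms found in the translation.

    Assumption: a case-insensitive substring match stands in for
    whatever adherence metric the shared task actually uses.
    """
    if not terms:
        return 1.0
    text = translation.lower()
    hits = sum(1 for tgt in terms.values() if tgt.lower() in text)
    return hits / len(terms)


def dual_stage_translate(
    source: str,
    terms: Terminology,
    nmt: Callable[[str, Terminology], str],
    llm: Callable[[str], str],
    strict: bool = False,
) -> str:
    """Stage 1: terminology-aware NMT draft. Stage 2: LLM post-edit."""
    draft = nmt(source, terms)
    return llm(build_post_edit_prompt(source, draft, terms, strict))
```

In use, `nmt` would wrap the fine-tuned translation model and `llm` a call to the prompted language model; comparing `term_adherence` on the stage-one draft and on the final output gives a rough view of how much the post-editing stage changes terminology coverage.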
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Topic Modeling
