HY-MT1.5 Technical Report
Mao Zheng, Zheng Li, Tao Chen, Mingyang Song, Di Wang

TL;DR
This paper introduces HY-MT1.5 translation models with a holistic training framework, achieving high performance and efficiency across multiple translation benchmarks and supporting advanced translation constraints.
Contribution
The paper presents a new family of translation models with a holistic training approach, outperforming larger open-source and commercial models at similar sizes.
Findings
HY-MT1.5-1.8B outperforms larger open-source and commercial models.
HY-MT1.5-7B achieves state-of-the-art results for its size class.
Models support advanced translation constraints like terminology and context.
Abstract
In this report, we introduce our latest translation models, HY-MT1.5-1.8B and HY-MT1.5-7B, a new family of machine translation models developed through a holistic training framework tailored for high-performance translation. Our methodology orchestrates a multi-stage pipeline that integrates general and MT-oriented pre-training, supervised fine-tuning, on-policy distillation, and reinforcement learning. HY-MT1.5-1.8B, the 1.8B-parameter model demonstrates remarkable parameter efficiency, comprehensively outperforming significantly larger open-source baselines (e.g., Tower-Plus-72B, Qwen3-32B) and mainstream commercial APIs (e.g., Microsoft Translator, Doubao Translator) in standard Chinese-foreign and English-foreign tasks. It achieves approximately 90% of the performance of ultra-large proprietary models such as Gemini-3.0-Pro, while marginally trailing Gemini-3.0-Pro on WMT25 and…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
- 🤗tencent/HY-MT1.5-1.8Bmodel· 20k dl· ♡ 59120k dl♡ 591
- 🤗tencent/HY-MT1.5-7Bmodel· 5.4k dl· ♡ 1455.4k dl♡ 145
- 🤗tencent/HY-MT1.5-7B-GPTQ-Int4model· 945 dl· ♡ 10945 dl♡ 10
- 🤗tencent/HY-MT1.5-1.8B-FP8model· 586 dl· ♡ 15586 dl♡ 15
- 🤗tencent/HY-MT1.5-7B-FP8model· 472 dl· ♡ 14472 dl♡ 14
- 🤗tencent/HY-MT1.5-1.8B-GPTQ-Int4model· 992 dl· ♡ 12992 dl♡ 12
- 🤗Mungert/HY-MT1.5-7B-GGUFmodel· 78 dl78 dl
- 🤗cuisw/HY-MT1.5-1.8Bmodel· 10 dl10 dl
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Big Data and Digital Economy · Topic Modeling
