The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities
MediaTek Research: Chan-Jan Hsu, Chia-Sheng Liu, Meng-Hsi Chen, Muxi Chen, Po-Chun Hsu, Yi-Chang Chen, Da-Shan Shiu

TL;DR
Breeze2 is a suite of multi-modal Traditional Chinese language models based on Llama, extended with vision and function-calling capabilities. The authors report the strongest results in its size class for Traditional Chinese function calling and image understanding, and the models are released for public use.
Contribution
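The paper introduces Breeze2, a suite of Traditional Chinese LLMs in 3B and 8B parameter configurations with vision and function-calling features, built on Llama 3.2. It reports best-in-class performance on Traditional Chinese function calling and image understanding, and releases the models publicly.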
Findings
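Breeze2 outperforms existing models in its size class on Traditional Chinese function calling, absent reasoning-inducing prompts.
Breeze2 is reported as the strongest model in its size class on Traditional Chinese image understanding.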
Models are publicly available and demonstrated on mobile platforms.
Abstract
Llama-Breeze2 (hereinafter referred to as Breeze2) is a suite of advanced multi-modal language models, available in 3B and 8B parameter configurations, specifically designed to enhance Traditional Chinese language representation. Building upon the Llama 3.2 model family, we continue the pre-training of Breeze2 on an extensive corpus to enhance its coverage of the linguistic and cultural heritage of Traditional Chinese. In addition to language modeling capabilities, we significantly augment the models with function calling and vision understanding capabilities. At the time of this publication, as far as we are aware, absent reasoning-inducing prompts, the Breeze2 models are the strongest-performing models in Traditional Chinese function calling and image understanding in their size class. The effectiveness of Breeze2 is benchmarked across various tasks, including Taiwan general knowledge, instruction-following, long…
Code & Models
- 🤗 MediaTek-Research/BreezyVoice · model · ♡ 52
- 🤗 MediaTek-Research/Llama-Breeze2-3B-Instruct · model · 273 dl · ♡ 34
- 🤗 MediaTek-Research/Llama-Breeze2-8B-Instruct · model · 1.4k dl · ♡ 51
- 🤗 Qwe1325/Llama-Breeze2-8B-Instruct_4bit · model · 1 dl
- 🤗 Qwe1325/Llama-Breeze2-3B-Instruct_4bit · model · 1 dl
- 🤗 Qwe1325/Llama-Breeze2-8B-Instruct_8bit · model · 3 dl
- 🤗 Qwe1325/Llama-Breeze2-3B-Instruct_8bit · model · 3 dl
- 🤗 twinkle-ai/Llama-3.2-3B-F1-Instruct · model · 382 dl · ♡ 23
- 🤗 ThanatosDi/Llama-Breeze2-8B-Instruct · model · 1 dl
Taxonomy
Topics: Subtitles and Audiovisual Media
Methods: LLaMA
