BlueLM-2.5-3B Technical Report

Baojiao Xiong; Boheng Chen; Chengzhi Wang; Daxiong Luo; Dongsheng Xu; Dongyang Liu; Fan Yang; Fangyuan Li; Fei Teng; Feng Wang; Fukang Qin; Fuquan Peng; Guanxin Tan; Guozhi Wang; Haibo Yu; Haohao Gao; Heng Liu; Hongbo Yang; Hongjian Zou; Houzheng Shen; Hu Meng; Huan Li; Hui Tan; Jiali Chen; Jianzhao Chen; Jinliang Zhu; Kai Wang; Lei Wu; Liangbing Liu; Liuyang Bian; Liyan He; Long Liu; Peiwen Li; Penggang Shi; Qi Ding; Rui Hu; Shuai Cao; Shuai Ren; Shuang Peng; Teng Xie; Weiji Chen; Weilin Xiang; Weixin Wu; Xi Yin; Xiaoxin Chen; Xu Chen; Yafei Wen; Yan Hu; Yanzhou Yang; Yina Xie; Yinghao Chen; Yixuan Liao; Yu Geng; Yuanjiang Ouyang; Yuanzhuo Yang; Yuehua He; Yushuai Peng; Zhaoxiong Wang; Zheng Wang; Zhibo Zhou; Ziyang Wu

arXiv:2507.05934·cs.AI·July 9, 2025

TL;DR

BlueLM-2.5-3B is a compact, versatile multimodal large language model that supports both thinking and non-thinking modes. It is optimized for edge deployment and performs strongly across multimodal and text-only benchmarks.

Contribution

To the authors' knowledge, the first 3B-scale MLLM to support both thinking and non-thinking modes with explicit control over the thinking token budget. It is developed through diversified data curation, key data resampling, and hybrid heterogeneous reinforcement learning.

Findings

01. In thinking mode, achieves performance comparable to larger models, e.g. Qwen3-4B on text-only benchmarks.

02. Outperforms smaller models in non-thinking mode on multimodal benchmarks.

03. Exhibits high data efficiency with less training data than larger counterparts.

Abstract

We present BlueLM-2.5-3B, a compact and unified dense Multimodal Large Language Model (MLLM) designed for efficient edge-device deployment, offering strong general-purpose and reasoning capabilities. To the best of our knowledge, this is the first 3B-scale MLLM to support both thinking and non-thinking modes, while also enabling explicit control over thinking token budget. BlueLM-2.5-3B is developed through diversified data curation, key data resampling, hybrid heterogeneous reinforcement learning, and a high-performance training infrastructure. Our model achieves superior multimodal capacity while preserving competitive pure-text performance with only 2.9 billion parameters. We conduct comprehensive evaluations across a broad range of multimodal and text-only benchmarks. In thinking mode, BlueLM-2.5-3B achieves comparable performance to Qwen3-4B on text-only benchmarks, and trails the…

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Advanced Neural Network Applications