Bielik-Minitron-7B: Compressing Large Language Models via Structured Pruning and Knowledge Distillation for the Polish Language

Remigiusz Kinas; Paweł Kiszczak; Sergio P. Perez; Krzysztof Ociepa; Łukasz Flis; Krzysztof Wróbel; Adrian Gwoździej

arXiv:2603.11881 · cs.CL · March 13, 2026

TL;DR

This paper presents Bielik-Minitron-7B, a Polish language model compressed from Bielik-11B-v3.0 through structured pruning and knowledge distillation. It recovers approximately 90% of the baseline model's performance with 33.4% fewer parameters and up to 50% inference speedup.

Contribution

The paper applies a two-stage compression method, inspired by the NVIDIA Minitron approach, that combines structured pruning with knowledge distillation and is tailored to a Polish language model.
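To make the structured-pruning stage concrete, here is a minimal toy sketch of width pruning on a single feed-forward layer. It is not the authors' procedure: the paper uses the NVIDIA Model Optimizer, whose importance criterion is not described on this page. The L2-norm importance score, the function names (`neuron_importance`, `prune_hidden_neurons`), and the tiny weight matrices are all assumptions made for illustration.

```python
import math

# Toy structured (width) pruning of one feed-forward layer.
# Assumption: importance = L2 norm of each hidden neuron's input weights.
# This is a common heuristic, not the criterion used in the paper.

def neuron_importance(w_in):
    """w_in[j] is the weight row feeding hidden neuron j."""
    return [math.sqrt(sum(x * x for x in row)) for row in w_in]

def prune_hidden_neurons(w_in, w_out, keep):
    """Keep the `keep` most important hidden neurons.

    w_in:  hidden x d_in  (one row per hidden neuron)
    w_out: d_out x hidden (one column per hidden neuron)
    """
    scores = neuron_importance(w_in)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original neuron order
    new_w_in = [w_in[j] for j in kept]
    new_w_out = [[row[j] for j in kept] for row in w_out]
    return kept, new_w_in, new_w_out

# Hypothetical 4-neuron hidden layer; neurons 1 and 3 have near-zero weights.
w_in = [[0.9, -0.8], [0.01, 0.02], [0.5, 0.4], [-0.03, 0.0]]
w_out = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
kept, new_w_in, new_w_out = prune_hidden_neurons(w_in, w_out, keep=2)
print(kept)  # indices of the surviving hidden neurons
```

Because whole neurons are removed, rows of the input matrix and the matching columns of the output matrix are dropped together. The pruned layer is then a smaller dense layer, which is what allows structured pruning to reduce inference cost without sparse kernels.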

Findings

1. Reduced model size by 33.4%, from 11.04B to 7.35B parameters.

2. Recovered approximately 90% of baseline performance.

3. Achieved up to 50% inference speedup.
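The size reduction reported above can be checked with a few lines of arithmetic. The two parameter counts come from the paper's abstract; the variable names are illustrative.

```python
# Parameter counts as reported in the abstract.
params_before = 11.04e9  # Bielik-11B-v3.0
params_after = 7.35e9    # Bielik-Minitron-7B

removed = params_before - params_after
reduction = removed / params_before

print(f"Parameters removed: {removed / 1e9:.2f}B")
print(f"Relative reduction: {reduction:.1%}")
```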

Abstract

This report details the creation of Bielik-Minitron-7B, a compressed 7.35B parameter version of the Bielik-11B-v3.0 model, specifically optimized for European languages. By leveraging a two-stage compression methodology inspired by the NVIDIA Minitron approach, we combined structured hybrid pruning and knowledge distillation to reduce the model's parameter count by 33.4%, from 11.04B to 7.35B. We utilized the NVIDIA Model Optimizer for structural pruning and the NVIDIA NeMo Framework for logit-based distillation for quality recovery. Following distillation, the model underwent a rigorous alignment pipeline consisting of Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO-P), and Reinforcement Learning (GRPO). Our final model successfully recovered approximately 90% of the baseline model's performance while providing up to 50% inference speedup. This approach demonstrates…
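As a rough illustration of what logit-based distillation means, the sketch below computes a per-token loss that pulls the student's output distribution toward the teacher's. The abstract states only that the NVIDIA NeMo Framework was used for logit-based distillation; the forward KL divergence, the temperature parameter, the function names, and the example logits are assumptions, not details confirmed by the source.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one token's vocabulary logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=1.0):
    """Forward KL(teacher || student) for a single token position.

    Assumption: forward KL with a shared temperature; the paper's exact
    loss is not specified in the abstract.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi))
               for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits over a 3-token vocabulary.
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]

loss_same = kd_loss(teacher, teacher)  # identical distributions
loss_diff = kd_loss(teacher, student)  # mismatched distributions
print(loss_same, loss_diff)
```

In training, a loss of this kind would be averaged over all token positions in a batch and minimized with respect to the student's weights, so that the pruned model recovers the behavior of the original one.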

Code & Models

Models

speakleash/Bielik-Minitron-7B-v3.0-Instruct (Hugging Face model · 3.7k downloads · 17 likes)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Big Data and Digital Economy