Bielik 11B v2 Technical Report
Krzysztof Ociepa, Łukasz Flis, Krzysztof Wróbel, Adrian Gwoździej, Remigiusz Kinas

TL;DR
Bielik 11B v2 is an efficient Polish language model whose training techniques help it outperform larger models on Polish benchmarks, setting a reference point for resource-efficient Polish language processing.
Contribution
The paper introduces Weighted Instruction Cross-Entropy Loss and Adaptive Learning Rate, enhancing training effectiveness and model performance for Polish language modeling.
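As a rough illustration of the Adaptive Learning Rate idea, the sketch below scales a base learning rate with the number of tokens in the current batch. The text here only says that the rate adjusts to context length, so the square-root scaling rule, the function name `adaptive_learning_rate`, and the `reference_tokens` parameter are all assumptions made for this example, not details confirmed by this summary.

```python
import math


def adaptive_learning_rate(base_lr: float, tokens_in_batch: int, reference_tokens: int) -> float:
    """Scale base_lr by sqrt(tokens_in_batch / reference_tokens).

    ASSUMPTION: square-root scaling is an illustrative choice; the summary
    only states that the rate depends on context length.
    """
    if tokens_in_batch <= 0 or reference_tokens <= 0:
        raise ValueError("token counts must be positive")
    return base_lr * math.sqrt(tokens_in_batch / reference_tokens)


if __name__ == "__main__":
    # Hypothetical values: a batch with 4x the reference token count
    # gets twice the base learning rate under this rule.
    print(adaptive_learning_rate(1e-5, 16384, 4096))
```

Under this assumed rule, batches made of short examples take smaller steps than batches made of long ones, which keeps the update size roughly in proportion to how much text contributed to the gradient.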
Findings
Outperforms larger models with fewer parameters
Surpasses other Polish language models on multiple benchmarks
Demonstrates strong cross-lingual capabilities
Abstract
We present Bielik 11B v2, a state-of-the-art language model optimized for Polish text processing. Built on the Mistral 7B v0.2 architecture and scaled to 11B parameters using depth up-scaling, this model demonstrates exceptional performance across Polish language benchmarks while maintaining strong cross-lingual capabilities. We introduce two key technical innovations: Weighted Instruction Cross-Entropy Loss, which optimizes learning across diverse instruction types by assigning quality-based weights to training examples, and Adaptive Learning Rate, which dynamically adjusts based on context length. Comprehensive evaluation across multiple benchmarks demonstrates that Bielik 11B v2 outperforms many larger models, including those with 2-6 times more parameters, and significantly surpasses other specialized Polish language models on tasks ranging from linguistic understanding to complex…
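The Weighted Instruction Cross-Entropy Loss described in the abstract can be sketched as a token-level cross-entropy in which each training example carries a quality weight. The version below is a minimal, dependency-free illustration: the data layout, the response-token `mask`, and the normalization by the sum of weights are assumptions for this sketch, not the report's confirmed formulation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one token's logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def weighted_instruction_ce(examples):
    """Quality-weighted cross-entropy over response tokens.

    Each example is a dict with (hypothetical layout):
      logits  - list of per-token logit lists
      targets - list of target token ids
      mask    - list of 0/1 flags, 1 marks a token that counts toward the loss
      weight  - quality weight for the whole example
    ASSUMPTION: the loss is normalized by the total weight of counted tokens.
    """
    total, norm = 0.0, 0.0
    for ex in examples:
        for logits, target, keep in zip(ex["logits"], ex["targets"], ex["mask"]):
            if not keep:
                continue  # skip tokens excluded from the loss (e.g. the prompt)
            nll = -log_softmax(logits)[target]
            total += ex["weight"] * nll
            norm += ex["weight"]
    return total / norm if norm > 0 else 0.0


if __name__ == "__main__":
    batch = [
        {"logits": [[0.0, 0.0]], "targets": [0], "mask": [1], "weight": 1.0},
        {"logits": [[2.0, 0.0]], "targets": [1], "mask": [1], "weight": 0.5},
    ]
    print(weighted_instruction_ce(batch))
```

With this weighting, a lower-quality example still contributes to the gradient but pulls the model less strongly than a high-quality one, rather than being dropped from the training set.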
Code & Models
- 🤗 speakleash/Bielik-11B-v2.3-Instruct (model, 11k downloads, 52 likes)
- 🤗 speakleash/Bielik-11B-v2.5-Instruct (model, 26 downloads, 9 likes)
- 🤗 speakleash/Bielik-11B-v2.6-Instruct (model, 1.6k downloads, 43 likes)
- 🤗 adgw/quality_classifier_pl (model, 4 likes)
- 🤗 speakleash/Bielik-11B-v3.0-Instruct (model, 369k downloads, 56 likes)
- 🤗 safestack/Bielik-11B-v3.0-Instruct (model, 22 downloads)
- 🤗 gepardzik/Bielik-11B-v3.0-Instruct-heretic-MPOA (model, 1 download)
- 🤗 websystemspl/Bielik-11B-v3.0-Instruct-128k (model, 2 downloads)
Taxonomy
Topics: Transport and Economic Policies
