Falcon2-11B Technical Report
Quentin Malartic, Nilabhra Roy Chowdhury, Ruxandra Cojocaru, Mugariya Farooq, Giulia Campesan, Yasser Abdelaziz Dahou Djilali, Sanath Narayan, Ankit Singh, Maksim Velikanov, Basma El Amel Boussaha, Mohammed Al-Yafeai, Hamza Alobeidli, Leen Al Qadi, Mohamed El Amine Seddik

TL;DR
Falcon2-11B and its multimodal version, Falcon2-11B-vlm, are large foundation models that perform strongly across a range of benchmarks and tasks. The report also describes the multi-stage training strategy and the effects of batch size and learning rate on training.
Contribution
Introduces the Falcon2-11B and Falcon2-11B-vlm models and details their training process, evaluation, and open-source release, advancing large-scale foundation and multimodal models.
Findings
Strong generalization across benchmarks
Higher average scores than similar open-source models
Training insights on batch size and learning rate effects
Abstract
We introduce Falcon2-11B, a foundation model trained on over five trillion tokens, and its multimodal counterpart, Falcon2-11B-vlm, a vision-to-text model. We report our findings from training Falcon2-11B, which follows a multi-stage approach: the early stages are distinguished by their context length, and the final stage uses a curated, high-quality dataset. Additionally, we report the effect of doubling the batch size mid-training and how training loss spikes are affected by the learning rate. The downstream performance of the foundation model is evaluated on established benchmarks, including multilingual and code datasets. The foundation model shows strong generalization across all the tasks, which makes it suitable for downstream finetuning use cases. For the vision language model, we report the performance on several benchmarks and show that our model…
Taxonomy
Topics: Advanced Measurement and Detection Methods · Aerospace and Aviation Technology · Advanced Sensor Technologies Research
