Introducing cosmosGPT: Monolingual Training for Turkish Language Models
H. Toprak Kesgin, M. Kaan Yuce, Eren Dogan, M. Egemen Uzun, Atahan Uz, H. Emre Seyrek, Ahmed Zeer, M. Fatih Amasyali

TL;DR
This paper introduces cosmosGPT, a Turkish language model trained solely on Turkish data, demonstrating promising performance despite its smaller size compared to multilingual models.
Contribution
The study presents cosmosGPT, a monolingual Turkish language model, along with new datasets for fine-tuning and evaluation, and compares its capabilities to existing models.
Findings
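- cosmosGPT shows promising performance despite being about 10 times smaller than the models it is compared against
- New fine-tuning and evaluation datasets improve the evaluation of Turkish language models
- Monolingual training is a viable alternative to continuing the training of multilingual models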
Abstract
The number of open-source language models that can generate Turkish text is growing steadily, as it is for other languages. Base versions of such models are usually created by continuing the training of multilingual models on Turkish corpora. The alternative is to train a model on Turkish corpora only. In this study, we first introduce the cosmosGPT models, which we created with this alternative method. We then introduce new fine-tuning datasets that enable base language models to fulfill user requests, and new evaluation datasets for measuring the capabilities of Turkish language models. Finally, we present a comprehensive comparison of the adapted Turkish language models across different capabilities. The results show that the language models we built with the monolingual corpus achieve promising performance despite being about 10 times smaller than the others.
Code & Models
- 🤗 ytu-ce-cosmos/turkish-gpt2 (model, 1.8k downloads, 15 likes)
- 🤗 ytu-ce-cosmos/turkish-gpt2-medium (model, 294 downloads, 10 likes)
- 🤗 ytu-ce-cosmos/turkish-gpt2-large (model, 571 downloads, 44 likes)
- 🤗 ytu-ce-cosmos/turkish-gpt2-large-750m-instruct-v0.1 (model, 182 downloads, 41 likes)
- 🤗 ytu-ce-cosmos/turkish-gpt2-medium-350m-instruct-v0.1 (model, 284 downloads, 12 likes)
- 🤗 RichardErkhov/ytu-ce-cosmos_-_turkish-gpt2-large-4bits (model)
- 🤗 RichardErkhov/ytu-ce-cosmos_-_turkish-gpt2-large-8bits (model)
- 🤗 RichardErkhov/ytu-ce-cosmos_-_turkish-gpt2-large-750m-instruct-v0.1-4bits (model, 1 download)
- 🤗 RichardErkhov/ytu-ce-cosmos_-_turkish-gpt2-large-750m-instruct-v0.1-8bits (model)
- 🤗 ytu-ce-cosmos/previous-token-prediction-turkish-gpt2-large (model, 7 downloads, 8 likes)
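The checkpoints listed above can probably be tried locally. The following is a minimal sketch, not the authors' documented usage: it assumes the Hugging Face `transformers` library is installed and that the `ytu-ce-cosmos/turkish-gpt2` checkpoint loads as a standard causal language model through the `text-generation` pipeline (the model name suggests a GPT-2 architecture, but this page does not confirm it). The helper name `generate_turkish`, the prompt, and the sampling settings are illustrative choices only.

```python
# Assumption: model ID taken from the list above; it is not verified here
# that the checkpoint is compatible with the text-generation pipeline.
MODEL_ID = "ytu-ce-cosmos/turkish-gpt2"


def generate_turkish(prompt, model_id=MODEL_ID, max_new_tokens=40):
    """Continue a Turkish prompt with a base (non-instruct) checkpoint.

    Hypothetical helper for illustration. The import is done lazily so that
    merely defining this function does not require transformers or a download.
    """
    from transformers import pipeline

    # Downloads the weights on first use, then builds a generation pipeline.
    generator = pipeline("text-generation", model=model_id)

    # Sampling parameters are illustrative, not values from the paper.
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.95,
    )
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]


if __name__ == "__main__":
    # Example prompt ("The capital of Türkiye"); output varies between runs.
    print(generate_turkish("Türkiye'nin başkenti"))
```

Swapping `model_id` for one of the instruct variants in the list should work the same way mechanically, although those models may expect a specific prompt format that is not described on this page.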
Taxonomy
Topics: Natural Language Processing Techniques
