FootGPT: A Large Language Model Development Experiment on a Minimal Setting

Eren Unlu

arXiv:2308.08610·cs.CL·August 21, 2023

Open Access

TL;DR

This paper explores developing a specialized language model for soccer data using minimal resources, emphasizing dataset content and training strategy over model size or training duration.

Contribution

It demonstrates that a purpose-specific language model can be effectively trained with limited data and resources, focusing on dataset curation and training methods.

Findings

01

Effective fine-tuning with a small dataset is possible.

02

Dataset quality and content are crucial for model performance.

03

Short training duration suffices for minimal setting experiments.

Abstract

Recent empirical observations suggest that the most significant factors in developing accurate language models may be proper dataset content and training strategy, rather than the number of neural parameters, training duration, or dataset size. Following this argument, we fine-tuned a pretrained, general-purpose causal language model with one billion parameters on a dataset curated from team statistics for the first ten game weeks of the Italian football league, using low-rank adaptation. The limited training dataset was compiled using a framework in which a powerful commercial large language model provides distilled paragraphs and question-answer pairs as intended. The training duration was kept relatively short to provide a basis for our minimal setting exploration. We share our key observations on the process of developing a specific-purpose language model which…
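
The abstract names low-rank adaptation as the fine-tuning method. As a rough illustration of why that suits a minimal setting, the sketch below shows the general idea in pure Python: the pretrained weight W stays frozen and only two small matrices B and A are trained, so the effective weight is W + (alpha / rank) * B A. The matrix sizes, rank, and alpha here are hypothetical values chosen for illustration; the abstract does not state the settings actually used, and the helper names are not from the paper.

```python
# Minimal low-rank adaptation (LoRA) sketch in pure Python.
# All sizes and hyperparameters below are illustrative assumptions,
# not values reported in the paper.
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in
    lora = rank * (d_out + d_in)  # B is d_out x rank, A is rank x d_in
    return full, lora


def lora_effective_weight(w, a, b, alpha, rank):
    """Compute W + (alpha / rank) * B @ A, the weight used in the forward pass."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


random.seed(0)
d_out, d_in, rank, alpha = 4, 6, 2, 4  # hypothetical toy sizes

# Frozen pretrained weight (random stand-in for illustration).
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable factors: A gets a small random init, B starts at zero,
# so the adapted model initially matches the pretrained one.
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
b = [[0.0] * rank for _ in range(d_out)]

w_eff = lora_effective_weight(w, a, b, alpha, rank)

# Parameter counts for a hypothetical 2048 x 2048 layer at rank 8.
full, lora = lora_param_counts(2048, 2048, 8)
```

With B initialised to zero, `w_eff` equals `w` before any training step. For the hypothetical 2048 x 2048 layer at rank 8, the trainable count drops from `full` to `lora`, a 128-fold reduction for that one matrix.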

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications