Fine-tuning GPT-3 for Russian Text Summarization
Alexandr Nikolich, Arina Puchkova

TL;DR
This paper shows that ruGPT-3, fine-tuned for Russian news summarization, can outperform existing models, although its summaries still have problems with factual accuracy and entity preservation.
Contribution
It shows that fine-tuning ruGPT-3 on Russian news data with hyperparameter tuning improves summarization performance without changing the model architecture.
Findings
Outperforms state-of-the-art models on Russian summarization tasks
Hyperparameter tuning reduces randomness and improves relevance
Model still struggles with entity accuracy and factual consistency
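The finding that hyperparameter tuning reduces randomness can be illustrated with a small sketch. This summary does not name the hyperparameters that were tuned, so the example assumes sampling temperature, a common decoding setting, purely for illustration. The logits are made up and the function names are hypothetical. Dividing the logits by a lower temperature concentrates probability on the highest-scoring token, which lowers the entropy of the next-token distribution.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; lower values mean less random sampling."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical next-token scores (not taken from the paper).
logits = [2.0, 1.0, 0.5, 0.1]

for t in (1.5, 1.0, 0.5):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: top prob={max(probs):.3f}, entropy={entropy(probs):.3f}")
```

In a real generation pipeline the same idea would typically be applied through the decoding settings of the generation library, together with related controls such as top-k or nucleus sampling.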
Abstract
Automatic summarization techniques aim to shorten and generalize information given in the text while preserving its core message and the most relevant ideas. This task can be approached with a variety of methods; however, few attempts have been made to produce solutions specifically for the Russian language, despite existing localizations of the state-of-the-art models. In this paper, we aim to showcase ruGPT3's ability to summarize texts by fine-tuning it on corpora of Russian news with their corresponding human-generated summaries. Additionally, we employ hyperparameter tuning so that the model's output becomes less random and more closely tied to the original text. We evaluate the resulting texts with a set of metrics, showing that our solution can surpass the state-of-the-art model's performance without additional changes in architecture or loss function. Despite being able…
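The abstract mentions evaluating generated summaries with a set of metrics but, as excerpted here, does not list them. As a minimal sketch of how such overlap-based evaluation can work, the example below computes a ROUGE-1-style F1 score from unigram overlap. The choice of metric, the whitespace tokenization, and the function name are assumptions made for illustration, not the authors' evaluation code.

```python
from collections import Counter


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between a generated summary and a reference summary."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Toy English strings stand in for a generated and a human-written summary.
score = rouge1_f1("the cat sat", "the cat slept on the mat")
print(f"ROUGE-1 F1: {score:.3f}")
```

Whitespace splitting is a simplification. For Russian text, a practical evaluation would usually add proper tokenization and possibly lemmatization, since rich inflection otherwise lowers the overlap between words that share a lemma.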
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
