Uzbek text summarization based on TF-IDF
Khabibulla Madatov, Shukurla Bekchanov, Jernej Vičič

TL;DR
This paper introduces a TF-IDF based method for Uzbek text summarization, demonstrating its effectiveness in extracting key information from texts in an under-resourced language.
Contribution
The study presents a novel application of TF-IDF and n-gram techniques for Uzbek text summarization, addressing the lack of existing methods for this language.
Findings
Effective extraction of important text segments using TF-IDF
Summarization approach performs well on the Uzbek 'School corpus'
Potential applications in information retrieval and NLP for Uzbek
Abstract
The volume of information is increasing at an incredible rate with the rapid development of the Internet and electronic information services. Due to time constraints, we do not have the opportunity to read all of this information. Even analyzing textual data related to a single field requires a lot of work. Text summarization helps to solve these problems. This article presents an experiment on the summarization task for the Uzbek language; the methodology is based on text abstracting with the TF-IDF algorithm. Using this density function, semantically important parts of the text are extracted. We summarize the given text by applying the n-gram method to the important parts of the whole text. The authors used a specially handcrafted corpus called "School corpus" to evaluate the performance of the proposed method. The results show that the proposed approach is effective in extracting…
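To make the TF-IDF step in the abstract concrete, the following is a minimal sketch of extractive summarization by TF-IDF sentence scoring. It is not the authors' implementation: treating each sentence as a "document", summing term weights to score a sentence, and the names `tokenize`, `tf_idf_scores`, and `summarize` are all assumptions made for illustration. The n-gram step and any Uzbek-specific preprocessing are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; an inner apostrophe is kept because Uzbek
    # Latin script writes letters such as o' and g' with one (assumption:
    # plain ASCII apostrophes only).
    return re.findall(r"\w+(?:'\w+)*", text.lower())


def tf_idf_scores(documents):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for doc in documents:
        df.update(set(doc))
    scores = []
    for doc in documents:
        tf = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        scores.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return scores


def summarize(text, n_sentences=2):
    """Pick the n highest-scoring sentences, returned in original order."""
    # Hypothetical choice: split on sentence-final punctuation and treat
    # every sentence as its own document for the IDF computation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    weights = tf_idf_scores([tokenize(s) for s in sentences])
    sentence_score = [sum(w.values()) for w in weights]
    top = sorted(
        range(len(sentences)), key=lambda i: sentence_score[i], reverse=True
    )[:n_sentences]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(text, n_sentences=3)` on a passage returns its three highest-weighted sentences. A term that occurs in every sentence gets weight zero, since log(N/N) = 0, so sentences built only from such terms rank last.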
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Educational Technology and Assessment
