Abstractive Text Summarization for Contemporary Sanskrit Prose: Issues and Challenges
Shagun Sinha

TL;DR
This thesis develops initial abstractive text summarization models for contemporary Sanskrit prose, addressing unique linguistic challenges and establishing a foundational pipeline for future research in low-resource Sanskrit NLP.
Contribution
It introduces the first pipeline for Sanskrit abstractive summarization, highlighting specific challenges and solutions in data preparation, model training, and inference for this low-resource language.
Findings
Identified key challenges in data collection and preprocessing for Sanskrit.
Developed initial models demonstrating the feasibility of Sanskrit abstractive summarization.
Documented challenges and solutions at each stage of model development.
Abstract
This thesis presents abstractive text summarization (TS) models for contemporary Sanskrit prose. The first chapter, Introduction, presents the motivation behind this work, the research questions, and the conceptual framework. Sanskrit is a low-resource inflectional language. The key research question that this thesis investigates is what the challenges are in developing an abstractive TS system for Sanskrit. To answer this key question, sub-questions based on four different themes have been posed in this work. The second chapter, Literature Review, surveys previous work. The third chapter, Data Preparation, answers the remaining three questions from the third theme. It reports the data collection and preprocessing challenges for training both the language model and the summarization model. The fourth chapter reports the training and inference of the models and the results obtained…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques
Methods: Spatio-temporal stability analysis
