Survey of Pseudonymization, Abstractive Summarization & Spell Checker for Hindi and Marathi

Rasika Ransing; Mohammed Amaan Dhamaskar; Ayush Rajpurohit; Amey Dhoke; Sanket Dalvi

arXiv:2412.18163·cs.CL·December 25, 2024

Open Access

TL;DR

This survey reviews NLP tools for Hindi and Marathi, focusing on pseudonymization, abstractive summarization, and spell checking, to support enterprise and consumer needs in India's regional languages.

Contribution

It compiles and analyzes existing NLP tools for Hindi and Marathi, highlighting the current state and challenges in developing language-specific NLP applications.

Findings

01. Limited NLP resources for Hindi and Marathi.

02. Existing tools mainly focus on basic text processing.

03. Need for integrated platforms for regional languages.

Abstract

India's vast linguistic diversity presents unique challenges and opportunities for technological advancement, especially in Natural Language Processing (NLP). While NLP applications for widely spoken languages have progressed significantly, India's regional languages, such as Marathi and Hindi, remain underserved. NLP research for Indian regional languages is at a formative stage and holds immense significance. The paper aims to build a platform that lets users apply features such as text anonymization, abstractive text summarization, and spell checking in English, Hindi, and Marathi. These tools are intended to serve enterprise and consumer clients who predominantly use Indian regional languages.

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling