Linguistically-Informed Neural Architectures for Lexical, Syntactic and Semantic Tasks in Sanskrit

Jivnesh Sandhan

arXiv:2308.08807 · cs.CL · August 21, 2023 · 1 citation

TL;DR

This thesis develops linguistically-informed neural models for Sanskrit NLP tasks, addressing challenges such as rich morphology and low-resource constraints, and introduces a toolkit to make Sanskrit manuscripts more accessible.

Contribution

It proposes novel neural architectures tailored for Sanskrit, demonstrating state-of-the-art results and providing a web-based toolkit for real-time linguistic analysis.

Findings

01 Achieved state-of-the-art performance in Sanskrit NLP tasks

02 Developed a web-based toolkit for real-time Sanskrit analysis

03 Enhanced accessibility of Sanskrit manuscripts through NLP technologies

Abstract

The primary focus of this thesis is to make Sanskrit manuscripts more accessible to end users through natural language technologies. The morphological richness, compounding, relatively free word order, and low-resource nature of Sanskrit pose significant challenges for developing deep learning solutions. We identify four fundamental tasks that are crucial for developing robust NLP technology for Sanskrit: word segmentation, dependency parsing, compound type identification, and poetry analysis. The first task, Sanskrit Word Segmentation (SWS), is a fundamental text processing step for any downstream application. However, it is challenging because of the sandhi phenomenon, which modifies characters at word boundaries. Similarly, existing dependency parsing approaches struggle with morphologically rich and low-resource languages like Sanskrit. Compound type identification is also…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Focus