Tamil Language Computing: the Present and the Future
Kengatharaiyer Sarveswaran

TL;DR
This paper reviews the current state and future prospects of Tamil language computing, emphasizing advancements, resources, challenges, and the need for further research to improve digital communication in Tamil.
Contribution
It provides a comprehensive overview of Tamil language processing, highlighting recent technological developments and identifying gaps and future directions for research and application.
Findings
The transition from ASCII to Unicode improved digital communication in Tamil.
Computational resources such as annotated data and grammars have been developed.
High-quality data and advanced models are still needed for Tamil language processing.
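To make the encoding finding concrete, here is a minimal Python sketch of how Tamil text is represented in Unicode. It is an illustration only, not code from the paper: the sample word and the helper names `describe` and `in_tamil_block` are assumptions introduced here. It relies only on the standard `unicodedata` module and on the fact that the Unicode Tamil block spans U+0B80 to U+0BFF.

```python
import unicodedata

# Illustrative sample (an assumption, not taken from the paper):
# the word "தமிழ்" (Tamil), written with explicit code point escapes.
word = "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD"

def describe(text):
    """Return (code point, Unicode character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

def in_tamil_block(text):
    """True if every character lies in the Unicode Tamil block U+0B80..U+0BFF."""
    return all(0x0B80 <= ord(ch) <= 0x0BFF for ch in text)

for code_point, name in describe(word):
    print(code_point, name)

# Each code point in this block needs three bytes in UTF-8.
utf8_bytes = word.encode("utf-8")
print(len(word), "code points,", len(utf8_bytes), "UTF-8 bytes")

# Plain ASCII cannot represent these characters at all.
try:
    word.encode("ascii")
except UnicodeEncodeError:
    print("not representable in ASCII")
```

Because every Tamil character has its own standard code point, the same byte sequence decodes to the same text on any Unicode-aware system, without depending on a particular font.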
Abstract
This paper delves into the text processing aspects of Language Computing, which enables computers to understand, interpret, and generate human language. Focusing on tasks such as speech recognition, machine translation, sentiment analysis, text summarization, and language modelling, language computing integrates disciplines including linguistics, computer science, and cognitive psychology to create meaningful human-computer interactions. Recent advancements in deep learning have made computers more accessible and capable of independent learning and adaptation. In examining the landscape of language computing, the paper emphasises foundational work like encoding, where Tamil transitioned from ASCII to Unicode, enhancing digital communication. It discusses the development of computational resources, including raw data, dictionaries, glossaries, annotated data, and computational grammars,…
Taxonomy
Topics: Algorithms and Data Compression · Natural Language Processing Techniques
