Multichannel LSTM-CNN for Telugu Technical Domain Identification
Sunil Gundapu, Radhika Mamidi

TL;DR
This paper introduces a Multichannel LSTM-CNN model for identifying technical domains in Telugu text. In the ICON shared task TechDOfication 2020 (task h), the system reached an F1 score of 69.9% on the test set and 90.01% on the validation set.
Contribution
The paper presents a Multichannel LSTM-CNN architecture for Telugu technical domain identification and evaluates it in the ICON shared task TechDOfication 2020 (task h).
Findings
F1 score of 69.9% on test data
F1 score of 90.01% on validation data
Effective for Telugu NLP domain identification
Abstract
With the rapid growth of text information, retrieving domain-oriented information from text data has a broad range of applications in Information Retrieval and Natural Language Processing. Thematic keywords give a compressed representation of the text. Domain Identification plays a significant role in Machine Translation, Text Summarization, Question Answering, Information Extraction, and Sentiment Analysis. In this paper, we propose the Multichannel LSTM-CNN methodology for Technical Domain Identification for Telugu. This architecture was used and evaluated in the context of the ICON shared task TechDOfication 2020 (task h), and our system achieved an F1 score of 69.9% on the test dataset and 90.01% on the validation set.
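The reported figures are F1 scores for a multi-class domain classifier. As a reading aid, here is a minimal sketch of how a macro-averaged F1 could be computed from gold and predicted domain labels. The abstract does not say which averaging the shared task used, so macro averaging is an assumption here, and the label names (`cse`, `phy`, `bio`) and the helper names `per_class_f1` and `macro_f1` are hypothetical.

```python
from collections import Counter


def per_class_f1(gold, pred):
    """Return {label: F1} for every label that appears in gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true class but was missed
    scores = {}
    for label in set(gold) | set(pred):
        # F1 = 2*TP / (2*TP + FP + FN), defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 (assumed averaging, not confirmed by the source)."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical domain labels, for illustration only
gold = ["cse", "phy", "cse", "bio", "phy", "cse"]
pred = ["cse", "cse", "cse", "bio", "phy", "phy"]

print(round(macro_f1(gold, pred), 4))  # 0.7222
```

Macro averaging weights every domain equally regardless of how many examples it has. If the task instead used micro or weighted averaging, the per-class scores from `per_class_f1` would be combined differently.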
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
