A Feature Engineering Approach for Literary and Colloquial Tamil Speech Classification using 1D-CNN

M. Nanmalar; S. Johanan Joysingh; P. Vijayalakshmi; T. Nagarajan

arXiv:2409.14348 · eess.AS · September 24, 2024

TL;DR

This paper presents a lightweight 1D-CNN classifier that distinguishes literary from colloquial Tamil speech using handcrafted features and MFCCs, reaching an F1 score of 0.9946 when the two feature types are combined.

Contribution

It introduces a feature engineering approach combined with a 1D-CNN for classifying the form of Tamil speech (literary or colloquial), improving accuracy over using MFCC alone.
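To make the idea of a lightweight 1D-CNN over engineered features more concrete, the following is a minimal pure-Python sketch of a forward pass (convolution, ReLU, global max pooling, linear output). It is not the authors' architecture: the layer sizes, the random weights, the placeholder feature values, the choice of 13 MFCCs, and the mapping of the score's sign to a label are all assumptions made for illustration.

```python
import random


def conv1d(signal, kernel, bias=0.0):
    """Valid-mode 1D cross-correlation of a sequence with a single kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(xs):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in xs]


def classify(features, kernels, weights, bias=0.0):
    """Tiny forward pass: conv -> ReLU -> global max pool -> linear -> sign."""
    pooled = [max(relu(conv1d(features, k))) for k in kernels]
    score = sum(w * p for w, p in zip(weights, pooled)) + bias
    # Assumption: non-negative score means "literary"; the mapping is arbitrary.
    return "literary" if score >= 0 else "colloquial"


# Hypothetical input: a few handcrafted values concatenated with MFCCs.
handcrafted = [0.2, -0.1, 0.7]           # placeholder values, not real features
mfcc = [0.05 * i for i in range(13)]     # placeholder for 13 coefficients
features = handcrafted + mfcc

# Untrained random parameters, seeded so the sketch is reproducible.
rng = random.Random(0)
kernels = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
weights = [rng.uniform(-1, 1) for _ in range(4)]

label = classify(features, kernels, weights)
```

Because the parameters here are untrained, the predicted label carries no meaning; in practice the kernels and output weights would be learned from labelled literary and colloquial utterances.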

Findings

01 Handcrafted features outperform MFCC alone in classification accuracy.

02 Combining top handcrafted features with MFCC yields the highest F1 score of 0.9946.

03 The proposed method is effective for real-time language form identification in HCI.
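For readers unfamiliar with the metric quoted in the findings, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from confusion counts for one class; the counts in the example are hypothetical and are not taken from the paper, and whether the reported 0.9946 is a per-class or averaged F1 is not stated in this summary.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only.
example_f1 = f1_score(tp=8, fp=2, fn=2)
```

With these example counts, precision and recall are both 0.8, so the F1 score is also approximately 0.8.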

Abstract

In ideal human computer interaction (HCI), the colloquial form of a language would be preferred by most users, since it is the form used in their day-to-day conversations. However, there is also an undeniable necessity to preserve the formal literary form. By embracing the new and preserving the old, both service to the common man (practicality) and service to the language itself (conservation) can be rendered. Hence, it is ideal for computers to have the ability to accept, process, and converse in both forms of the language, as required. To address this, it is first necessary to identify the form of the input speech, which in the current work is between literary and colloquial Tamil speech. Such a front-end system must consist of a simple, effective, and lightweight classifier that is trained on a few effective features that are capable of capturing the underlying patterns of the…

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
