A Hybrid Approach Towards Two Stage Bengali Question Classification Utilizing Smart Data Balancing Technique
Md. Hasibur Rahman, Chowdhury Rafeed Rahman, Ruhul Amin, Md. Habibur Rahman Sifat, Afra Anika

TL;DR
This paper presents a two-stage Bengali question classification system using CNN and SGD classifiers, enhanced by smart data balancing and word embeddings, improving accuracy in classifying factoid questions.
Contribution
It introduces a novel hybrid two-stage classification approach for Bengali questions, combining CNN with SGD classifiers and a smart data balancing technique.
Findings
Effective classification of Bengali factoid questions achieved
Smart data balancing improves CNN training
Two-stage approach enhances fine-grained classification accuracy
Abstract
Question classification (QC) is the primary step of a Question Answering (QA) system. A QC system assigns questions to particular classes so that the QA system can provide correct answers. Our system categorizes factoid questions asked in natural language after extracting features of the questions. We present a two-stage QC system for Bengali. In the first stage, it uses a one-dimensional convolutional neural network (1D CNN) to classify questions into coarse classes. Word2vec representations of the words in the question corpus have been constructed and used to assist the 1D CNN. A smart data balancing technique has been employed to give the data-hungry convolutional neural network a greater number of effective samples to learn from. For each coarse class, a separate Stochastic Gradient Descent…
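The two-stage routing described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a tiny bag-of-words perceptron (trained with per-sample updates) stands in for both the 1D CNN of the first stage and the per-class SGD classifiers of the second stage, and word2vec features and data balancing are omitted. The names `LinearSGD` and `TwoStageClassifier`, the English placeholder tokens, and the class labels are all hypothetical.

```python
from collections import defaultdict


class LinearSGD:
    """Stand-in classifier: multi-class perceptron with per-sample updates.

    Assumption: replaces both the paper's 1D CNN and its SGD classifiers.
    """

    def __init__(self, epochs=10):
        self.epochs = epochs
        self.w = defaultdict(float)  # (label, token) -> weight
        self.labels = []

    def _score(self, tokens, label):
        return sum(self.w[(label, t)] for t in tokens)

    def predict(self, tokens):
        # Ties are broken in favour of the alphabetically first label.
        return max(self.labels, key=lambda lab: self._score(tokens, lab))

    def fit(self, samples):
        self.labels = sorted({lab for _, lab in samples})
        for _ in range(self.epochs):
            for tokens, gold in samples:
                guess = self.predict(tokens)
                if guess != gold:  # update weights only on a mistake
                    for t in tokens:
                        self.w[(gold, t)] += 1.0
                        self.w[(guess, t)] -= 1.0
        return self


class TwoStageClassifier:
    """Stage 1 predicts a coarse class; stage 2 uses that class's own model."""

    def __init__(self):
        self.coarse = LinearSGD()
        self.fine = {}  # coarse label -> fine-grained classifier

    def fit(self, data):
        # data: list of (tokens, coarse_label, fine_label)
        self.coarse.fit([(t, c) for t, c, _ in data])
        by_coarse = defaultdict(list)
        for tokens, c, f in data:
            by_coarse[c].append((tokens, f))
        for c, samples in by_coarse.items():
            self.fine[c] = LinearSGD().fit(samples)
        return self

    def predict(self, tokens):
        coarse = self.coarse.predict(tokens)
        return coarse, self.fine[coarse].predict(tokens)


# Hypothetical toy data; real input would be tokenized Bengali questions.
data = [
    (["who", "wrote", "book"], "PERSON", "PERSON:author"),
    (["who", "leads", "team"], "PERSON", "PERSON:leader"),
    (["where", "is", "city"], "LOCATION", "LOCATION:city"),
    (["where", "is", "river"], "LOCATION", "LOCATION:river"),
]
model = TwoStageClassifier().fit(data)
print(model.predict(["who", "wrote", "poem"]))
```

Training one fine-grained classifier per coarse class means each second-stage model only has to separate the fine classes within its own coarse class, which is the design the abstract outlines.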
Taxonomy
Methods: Stochastic Gradient Descent
