A Comparative Study of Machine Learning Methods for Verbal Autopsy Text Classification
Samuel Danso, Eric Atwell, Owen Johnson

TL;DR
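This study compares feature value representations, classification algorithms, and feature reduction strategies for classifying Verbal Autopsy texts by Cause of Death. It finds that normalised term frequency and TFiDF features perform comparably, that Support Vector Machine is the strongest classifier tested, and that a semi-supervised feature reduction approach improves accuracy.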
Contribution
The paper compares feature value representations, classifiers, and feature reduction strategies specifically for Verbal Autopsy text classification, and identifies the approaches best suited to the task.
Findings
Normalised term frequency and standard TFiDF features achieve comparable performance across a number of classifiers.
Support Vector Machine outperforms the other classification algorithms evaluated.
Semi-supervised feature reduction enhances classification accuracy.
Abstract
A Verbal Autopsy is the record of an interview about the circumstances of an uncertified death. In developing countries, if a death occurs away from health facilities, a field-worker interviews a relative of the deceased about the circumstances of the death; this Verbal Autopsy can be reviewed off-site. We report on a comparative study of the processes involved in Text Classification applied to classifying Cause of Death: feature value representation; machine learning classification algorithms; and feature reduction strategies in order to identify the suitable approaches applicable to the classification of Verbal Autopsy text. We demonstrate that normalised term frequency and the standard TFiDF achieve comparable performance across a number of classifiers. The results also show Support Vector Machine is superior to other classification algorithms employed in this research. Finally, we…
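To make the two feature value representations named in the abstract concrete, here is a minimal sketch of normalised term frequency and TFiDF weighting. It is an illustration only: the abstract does not state the exact formulas used in the study, so the length normalisation and the `log(N / df)` inverse document frequency below are assumptions (one common textbook variant), and the three short narratives are invented toy data, not Verbal Autopsy records.

```python
import math
from collections import Counter


def normalised_tf(tokens):
    """Term counts divided by document length (assumed normalisation)."""
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    return {term: n / total for term, n in counts.items()}


def tfidf(docs):
    """Normalised TF multiplied by idf = log(N / df) (assumed variant)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        tf = normalised_tf(doc)
        weighted.append(
            {term: w * math.log(n_docs / df[term]) for term, w in tf.items()}
        )
    return weighted


# Hypothetical toy narratives, invented for illustration.
docs = [
    "fever and cough for two weeks".split(),
    "fever then convulsions".split(),
    "fell from a tree".split(),
]

tf_weights = [normalised_tf(doc) for doc in docs]
tfidf_weights = tfidf(docs)

for tf_row, tfidf_row in zip(tf_weights, tfidf_weights):
    print({t: round(w, 3) for t, w in tf_row.items()})
    print({t: round(w, 3) for t, w in tfidf_row.items()})
```

In this sketch, a term that appears in several documents (such as "fever") keeps its normalised term frequency but receives a smaller TFiDF weight than a term of equal frequency that appears in only one document. Either set of weights could then be passed as feature vectors to a classifier.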
Taxonomy
Topics: Advanced Text Analysis Techniques · Text and Document Classification Technologies · Topic Modeling
