On the effectiveness of feature set augmentation using clusters of word embeddings

Georgios Balikas; Ioannis Partalas

arXiv:1705.01265·cs.CL·July 31, 2018·1 cites

Open Access

TL;DR

This paper systematically evaluates how augmenting feature sets with word cluster membership features affects performance on four NLP tasks: named entity segmentation, named entity classification, five-point sentiment classification, and sentiment quantification.

Contribution

It analyzes the impact of cluster membership features on four NLP tasks, clarifying their role and effectiveness.

Findings

01

Cluster features improve task performance

02

Systematic evaluation across four NLP tasks

03

Supports use of cluster features in feature engineering

Abstract

Word clusters have been empirically shown to offer important performance improvements on various tasks. Despite their importance, their incorporation in the standard feature engineering pipeline relies largely on trial and error, where one evaluates several hyper-parameters, such as the number of clusters to use. To better understand the role of such features, we systematically evaluate their effect on four tasks: named entity segmentation and classification, and five-point sentiment classification and quantification. Our results strongly suggest that cluster membership features improve performance.
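
To make the idea of cluster membership features concrete, here is a minimal sketch, not the authors' implementation. It assumes plain k-means over word embeddings and a simple "bag of clusters" count appended to ordinary word features; the abstract does not specify either choice. The toy embedding values and the names `kmeans`, `word_to_cluster`, and `augment` are hypothetical, and `k` stands in for the number-of-clusters hyper-parameter mentioned above.

```python
from collections import Counter
from math import dist

# Toy 2-D "embeddings" (made-up values, for illustration only).
EMBEDDINGS = {
    "good": (0.9, 0.8),
    "great": (1.0, 0.9),
    "bad": (-0.9, -0.8),
    "awful": (-1.0, -0.9),
}


def kmeans(points, k, iters=10):
    """Plain k-means; the first k points serve as initial centroids."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group (keep it if empty).
        centroids = [
            tuple(sum(xs) / len(g) for xs in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def word_to_cluster(embeddings, k):
    """Map every word to the id of its nearest centroid."""
    words = list(embeddings)
    centroids = kmeans([embeddings[w] for w in words], k)
    return {
        w: min(range(k), key=lambda c: dist(embeddings[w], centroids[c]))
        for w in words
    }


def augment(tokens, clusters):
    """Word-count features plus cluster-membership counts for known words."""
    feats = Counter(f"word={t}" for t in tokens)
    feats.update(f"cluster={clusters[t]}" for t in tokens if t in clusters)
    return dict(feats)


clusters = word_to_cluster(EMBEDDINGS, k=2)
features = augment(["great", "movie"], clusters)
```

In this sketch, "good" and "great" fall into one cluster and "bad" and "awful" into the other, so a classifier can share evidence between words it has rarely seen. Words without an embedding, such as "movie" here, contribute only their word feature.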

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques