Searching for Discriminative Words in Multidimensional Continuous Feature Space

Marius Sajgalik; Michal Barla; Maria Bielikova

arXiv:2211.14631·cs.CL·November 29, 2022

TL;DR

This paper introduces a novel method for extracting discriminative keywords from word feature vectors to improve text categorization and topical inference, achieving state-of-the-art results with only a small set of keywords.

Contribution

It presents a new approach to identify discriminative words using word feature vectors, enhancing document representation for better topic discrimination.

Findings

01 Achieved state-of-the-art results on text categorization datasets.

02 Demonstrated that discriminative keywords improve topical inference.

03 Showed that small sets of keywords can effectively represent document topics.

Abstract

Word feature vectors have been shown to improve many NLP tasks. Recent advances in unsupervised learning of these feature vectors have made it possible to train them on much more data, which in turn has improved the quality of the learned features. Because such training learns the joint probability of latent features of words, it requires no prior knowledge of the target task. We aim to evaluate the universal applicability of feature vectors, which has already been shown to hold for many standard NLP tasks such as part-of-speech tagging and syntactic parsing. In our case, we want to understand the topical focus of text documents and design an efficient representation suitable for discriminating between topics. Discriminativeness can be evaluated adequately on a text categorisation task. We propose a novel method to extract discriminative…
