EEMC: Embedding Enhanced Multi-tag Classification

Yanlin Li; Shi An; Ruisheng Zhang

arXiv:2009.13826·cs.LG·September 30, 2020

TL;DR

This paper introduces a novel method using representation learning to generate virtual data in a low-dimensional space, significantly enhancing multi-tag classifier performance, especially on small sample datasets.

Contribution

The paper proposes a new approach that creates virtual data through linear operations in representation space to improve classifier accuracy.
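As a rough illustration of what "linear operations in representation space" could look like, the sketch below builds virtual samples as weighted averages of pairs of embedding vectors. The summary here does not say which linear operation the paper uses, so the pairing rule (only vectors with identical tag sets are combined), the mixing weight `alpha`, and the function name `make_virtual_samples` are all assumptions made for illustration.

```python
from itertools import combinations


def make_virtual_samples(vectors, tags, alpha=0.5):
    """Create virtual samples by linearly combining pairs of embeddings.

    Assumption (not taken from the paper): only pairs that carry exactly the
    same tag set are combined, and the virtual sample inherits that tag set.
    """
    # Group embedding vectors by their (hashable) tag set.
    by_tags = {}
    for vec, tag_set in zip(vectors, tags):
        by_tags.setdefault(frozenset(tag_set), []).append(vec)

    virtual = []
    for tag_set, group in by_tags.items():
        # Every unordered pair inside a group yields one virtual vector.
        for u, v in combinations(group, 2):
            mixed = [alpha * a + (1 - alpha) * b for a, b in zip(u, v)]
            virtual.append((mixed, set(tag_set)))
    return virtual


# Toy 2-dimensional embeddings standing in for learned node vectors.
embeddings = [[0.0, 2.0], [2.0, 0.0], [4.0, 4.0], [1.0, 1.0]]
labels = [{"a"}, {"a"}, {"a", "b"}, {"b"}]

virtual = make_virtual_samples(embeddings, labels)
print(virtual)
```

Only the two vectors tagged `{"a"}` form a pair in this toy input, so a single virtual sample is produced. In practice the original vectors and the virtual ones would be concatenated into one training set for the multi-tag classifier.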

Findings

1. Macro F1 score increased by up to 450%

2. Average F1 score increased by up to 224%

3. Virtual data significantly boosts classifier performance

Abstract

Representation learning has recently achieved strong results in NLP and complex networks, and it is becoming a fundamental technology in machine learning and data mining. How to use representation learning to improve classifier performance is a significant research direction. We use representation learning to map raw data (nodes of a graph) to a low-dimensional feature space. In this space, each raw data point obtains a low-dimensional vector representation; we apply simple linear operations to those vectors to produce virtual data, and use both the vectors and the virtual data to train a multi-tag classifier. We then measure classifier performance by F1 score (Macro F1 and Micro F1). Our method raises Macro F1 by 28% to 450% and the average F1 score by 12% to 224%. By contrast, we trained the classifier directly with the lower…
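For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how Macro F1 and Micro F1 are commonly computed for multi-tag predictions. It follows the standard definitions rather than the paper's evaluation code; the helper names and the toy data are hypothetical.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(y_true, y_pred, all_tags):
    """Return (macro_f1, micro_f1) for lists of true and predicted tag sets."""
    counts = {t: [0, 0, 0] for t in all_tags}  # per tag: [tp, fp, fn]
    for true_set, pred_set in zip(y_true, y_pred):
        for t in all_tags:
            if t in pred_set and t in true_set:
                counts[t][0] += 1
            elif t in pred_set:
                counts[t][1] += 1
            elif t in true_set:
                counts[t][2] += 1

    # Macro: average the per-tag F1 scores, so every tag weighs the same.
    macro = sum(f1(*c) for c in counts.values()) / len(counts)

    # Micro: pool the counts over all tags, then compute a single F1.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1(tp, fp, fn)
    return macro, micro


# Toy example: the classifier always predicts tag "a" and never tag "b".
y_true = [{"a"}, {"a", "b"}, {"b"}, {"a"}]
y_pred = [{"a"}, {"a"}, {"a"}, {"a"}]

macro, micro = macro_micro_f1(y_true, y_pred, ["a", "b"])
print(round(macro, 4), round(micro, 4))
```

In this toy case the never-predicted tag `"b"` contributes an F1 of 0 to the macro average, so Macro F1 (3/7) falls well below Micro F1 (2/3). This sensitivity to rare tags is the usual reason for reporting both numbers.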

Taxonomy

Topics: Advanced Graph Neural Networks · Domain Adaptation and Few-Shot Learning · Text and Document Classification Technologies