Joining datasets via data augmentation in the label space for neural networks

Jake Zhao (Junbo); Mingfeng Ou; Linji Xue; Yunkai Cui; Sai Wu; Gang Chen

arXiv:2106.09260·cs.LG·June 18, 2021

TL;DR

This paper introduces a novel method for joining datasets in neural network training by augmenting in the label space using knowledge graphs and reinforcement learning, addressing label discrepancies.

Contribution

It presents a new label space augmentation technique leveraging knowledge graphs, RNNs, and policy gradients for dataset integration in neural networks.

Findings

01. Effective on image classification tasks
02. Validates approach on text classification
03. Improves dataset utilization in neural training

Abstract

Most, if not all, modern deep learning systems restrict themselves to a single dataset for neural network training and inference. In this article, we are interested in systematic ways to join datasets that are built for similar purposes. Unlike previously published works, which ubiquitously conduct dataset joining in the uninterpretable latent vector space, the core of our method is an augmentation procedure in the label space. The primary challenge in using the label space for dataset joining is the discrepancy between labels: non-overlapping label annotation sets, different labeling granularity or hierarchy, etc. Notably, we propose a new technique leveraging an artificially created knowledge graph, recurrent neural networks, and policy gradient that successfully achieves dataset joining in the label space. Empirical results on both image and text classification justify the…
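To make the label discrepancy concrete, here is a minimal Python sketch of two datasets with different labeling granularity being expressed in one shared label graph. It is an illustration only, not the authors' implementation: the `PARENT` graph, the `label_path` helper, and the example labels are all hypothetical, and the recurrent network and policy gradient components mentioned in the abstract are not modeled.

```python
# Hypothetical hand-written label graph (child -> parent edges), for illustration only.
PARENT = {
    "husky": "dog",
    "poodle": "dog",
    "dog": "animal",
    "cat": "animal",
    "animal": "entity",
}


def label_path(label, parent=PARENT):
    """Return the root-to-label path for a label in the toy graph."""
    path = [label]
    # Walk upward until we reach a node with no parent (the root).
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]


# Assumed toy data: dataset A uses coarse labels, dataset B uses fine-grained ones.
dataset_a = [("img_a1", "dog"), ("img_a2", "cat")]
dataset_b = [("img_b1", "husky"), ("img_b2", "poodle")]

# In the joined dataset each example carries a path in the shared graph
# instead of a flat, dataset-specific label.
joined = [(x, label_path(y)) for x, y in dataset_a + dataset_b]

for example, path in joined:
    print(example, " -> ".join(path))
```

In this sketch a coarse label such as `dog` and a fine label such as `husky` share the prefix `entity -> animal -> dog`, which is one way labels of different granularity can coexist in a single target space.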

Taxonomy

Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Advanced Graph Neural Networks