Neural Network Architecture for Database Augmentation Using Shared Features
William C. Sleeman IV, Rishabh Kapoor, Preetam Ghosh

TL;DR
This paper introduces a neural network architecture designed to augment datasets by leveraging shared features across disparate datasets, facilitating data integration and improving learning in domains with heterogeneous data sources.
Contribution
The proposed architecture enables data augmentation using shared features, addressing challenges of merging non-matching datasets in various domains.
Findings
Effective for image data augmentation
Works with tabular data augmentation
Improves data integration in heterogeneous datasets
Abstract
The popularity of learning from data with machine learning and neural networks has led to the creation of many new datasets for almost every problem domain. However, even within a single domain, these datasets are often collected with disparate features, sampled from different sub-populations, and recorded at different time points. Even with the plethora of individual datasets, large data science projects can be difficult because merging these smaller datasets is often not trivial. Inherent challenges in some domains, such as medicine, also make it very difficult to create large single-source datasets or multi-source datasets with identical features. Instead of trying to merge these non-matching datasets directly, we propose a neural network architecture that can provide data augmentation using features common between these datasets. Our results show that this style of data augmentation…
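To make "features common between these datasets" concrete, the following minimal sketch finds the shared columns of two toy tables and restricts each table to them. It is not the authors' architecture: the `Table` format, the function names `shared_features` and `project`, and the column names and values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical toy format: column name -> list of column values.
Table = Dict[str, List[float]]


def shared_features(primary: Table, auxiliary: Table) -> List[str]:
    """Return column names present in both tables, in the primary table's order."""
    return [name for name in primary if name in auxiliary]


def project(table: Table, columns: Sequence[str]) -> List[List[float]]:
    """Return the rows of `table` restricted to `columns`."""
    if not columns:
        return []
    n_rows = len(table[columns[0]])
    return [[table[c][i] for c in columns] for i in range(n_rows)]


# Made-up data: two datasets from one domain with only partly matching features.
primary = {"age": [61.0, 47.0], "dose": [2.0, 1.8], "stage": [3.0, 1.0]}
auxiliary = {
    "age": [55.0, 70.0, 38.0],
    "stage": [2.0, 4.0, 1.0],
    "weight": [80.0, 65.0, 72.0],
}

common = shared_features(primary, auxiliary)
primary_shared = project(primary, common)      # primary rows, shared columns only
auxiliary_shared = project(auxiliary, common)  # auxiliary rows, shared columns only
```

Under these assumptions, the two projections have the same column layout even though the original tables do not, which is the precondition for any model component that consumes samples from both sources.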
Taxonomy
Topics: AI in cancer detection · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
