TL;DR
This paper introduces two neural network architectures inspired by BERT and GPT for modeling multivariate time series in tabular datasets, enabling representation learning, downstream tasks, and synthetic data generation.
Contribution
It proposes novel Transformer-based models tailored for hierarchical tabular time series, supporting both representation learning and synthetic sequence generation.
Findings
Effective fraud detection on synthetic credit card data
Accurate atmospheric pollutant prediction
Successful synthetic data generation for tabular sequences
Abstract
Tabular datasets are ubiquitous in data science applications. Given their importance, it seems natural to apply state-of-the-art deep learning algorithms in order to fully unlock their potential. Here we propose neural network models that represent tabular time series that can optionally leverage their hierarchical structure. This results in two architectures for tabular time series: one for learning representations that is analogous to BERT and can be pre-trained end-to-end and used in downstream tasks, and one that is akin to GPT and can be used for generation of realistic synthetic tabular sequences. We demonstrate our models on two datasets: a synthetic credit card transaction dataset, where the learned representations are used for fraud detection and synthetic data generation, and on a real pollution dataset, where the learned encodings are used to predict atmospheric pollutant…
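To make the BERT analogy concrete, here is a minimal sketch of how a window of table rows could be flattened into field tokens and partially masked, so that a model can be pre-trained to recover the hidden fields. This is an illustration and not the authors' implementation. The field names, the example values, the `rows_to_tokens` and `mask_tokens` helpers, and the masking rate are all assumptions, and the masking is simplified to a plain replacement with a `[MASK]` token.

```python
import random

MASK = "[MASK]"  # placeholder token, as in BERT-style masked pre-training


def rows_to_tokens(rows, fields):
    """Flatten a window of table rows into field-tagged tokens, row by row."""
    return [f"{name}={row[name]}" for row in rows for name in fields]


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens.

    Returns (inputs, targets). targets[i] holds the original token where
    inputs[i] was masked, and None everywhere else.
    """
    rng = random.Random(seed)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)
        else:
            inputs.append(tok)
            targets.append(None)
    return inputs, targets


# Hypothetical three-transaction window; field names and values are invented.
fields = ["amount_bin", "merchant", "hour"]
rows = [
    {"amount_bin": "low", "merchant": "grocery", "hour": 9},
    {"amount_bin": "high", "merchant": "electronics", "hour": 23},
    {"amount_bin": "low", "merchant": "cafe", "hour": 8},
]

tokens = rows_to_tokens(rows, fields)
inputs, targets = mask_tokens(tokens, mask_prob=0.3, seed=1)
```

A pre-training objective would then score the model only on positions where `targets` is not `None`. Keeping each row's fields adjacent in the token list preserves the row boundaries that a hierarchical model could exploit.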
Taxonomy
Methods: Linear Layer · Cosine Annealing · Linear Warmup With Cosine Annealing · Discriminative Fine-Tuning · Byte Pair Encoding · Softmax · Dense Connections · WordPiece · GPT · Linear Warmup With Linear Decay
