Bidirectional Generative Pre-training for Improving Healthcare Time-series Representation Learning

Ziyang Song; Qincheng Lu; He Zhu; David Buckeridge; Yue Li

arXiv:2402.09558 · cs.AI · August 27, 2024 · 2 cites

TL;DR

This paper introduces BiTimelyGPT, a bidirectional pre-training model for healthcare time-series data that improves representation learning and predictive performance by using alternating next and previous token prediction tasks.

Contribution

The paper presents a novel bidirectional pre-training architecture for healthcare time-series data that enhances representation quality and predictive accuracy over existing unidirectional methods.

Findings

1. BiTimelyGPT outperforms existing models in healthcare prediction tasks.
2. The model's attention maps identify key discriminative segments.
3. Bidirectional pre-training better preserves the original data distribution of the time series.

Abstract

Learning time-series representations for discriminative tasks, such as classification and regression, has been a long-standing challenge in the healthcare domain. Current pre-training methods are limited to either unidirectional next-token prediction or randomly masked token prediction. We propose a novel architecture called Bidirectional Timely Generative Pre-trained Transformer (BiTimelyGPT), which pre-trains on biosignals and longitudinal clinical records by both next-token and previous-token prediction in alternating transformer layers. This pre-training task preserves the original distribution and data shapes of the time series. Additionally, the full-rank forward and backward attention matrices exhibit more expressive representation capabilities. Using biosignals and longitudinal clinical records, BiTimelyGPT demonstrates superior performance in predicting neurological functionality,…
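
The alternating next-token and previous-token scheme described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the official repository for that). It assumes that even-numbered layers run forward and odd-numbered layers run backward, and that standard softmax attention is used; the abstract only states that directions alternate, and the model's actual attention mechanism may differ. All function names here are hypothetical. The sketch also shows why a triangular attention matrix with a positive diagonal is full rank: its determinant is the product of its diagonal entries, which are all nonzero.

```python
import math


def attention_mask(seq_len, direction):
    """Boolean mask where mask[i][j] is True if position i may attend to j."""
    if direction == "forward":
        # Causal: each position sees itself and earlier positions.
        return [[j <= i for j in range(seq_len)] for i in range(seq_len)]
    if direction == "backward":
        # Anti-causal: each position sees itself and later positions.
        return [[j >= i for j in range(seq_len)] for i in range(seq_len)]
    raise ValueError(f"unknown direction: {direction!r}")


def layer_direction(layer_index):
    # Assumption for illustration: even layers forward, odd layers backward.
    return "forward" if layer_index % 2 == 0 else "backward"


def masked_softmax(scores, mask):
    """Row-wise softmax that assigns zero weight to masked-out positions."""
    rows = []
    for score_row, mask_row in zip(scores, mask):
        exps = [math.exp(s) if m else 0.0 for s, m in zip(score_row, mask_row)]
        total = sum(exps)  # never zero: the diagonal is always unmasked
        rows.append([e / total for e in exps])
    return rows


def prediction_targets(tokens, direction):
    """Pair each input token with the token it is trained to predict."""
    if direction == "forward":
        # Next-token prediction: token t predicts token t + 1.
        return list(zip(tokens[:-1], tokens[1:]))
    # Previous-token prediction: token t predicts token t - 1.
    return list(zip(tokens[1:], tokens[:-1]))


def triangular_determinant(matrix):
    # Valid only for triangular matrices: determinant = product of diagonal.
    return math.prod(matrix[i][i] for i in range(len(matrix)))


# Uniform scores keep the example easy to check by hand.
seq_len = 4
scores = [[0.0] * seq_len for _ in range(seq_len)]
forward_attn = masked_softmax(scores, attention_mask(seq_len, "forward"))
backward_attn = masked_softmax(scores, attention_mask(seq_len, "backward"))

# A nonzero determinant means the attention matrix is full rank.
forward_det = triangular_determinant(forward_attn)
backward_det = triangular_determinant(backward_attn)
```

In a real model the mask would typically be built with tensor operations and applied inside each attention layer; the list-based version here is only meant to make the direction of information flow explicit.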

Code & Models

Repositories

li-lab-mcgill/bitimelygpt
PyTorch · Official

Taxonomy

Topics: Time Series Analysis and Forecasting · Neural Networks and Applications

Methods: Position-Wise Feed-Forward Layer · Attention Is All You Need · Dropout · Linear Layer · Dense Connections · Label Smoothing · Absolute Position Encodings · Softmax · Byte Pair Encoding · Multi-Head Attention