Self-Supervised Contrastive Pre-Training for Multivariate Point Processes
Xiao Shou, Dharmashankar Subramanian, Debarun Bhattacharjya, Tian Gao, Kristin P. Bennett

TL;DR
This paper introduces a novel self-supervised contrastive pre-training method for multivariate point processes using a transformer, improving next-event prediction accuracy by up to 20% on synthetic and real datasets.
Contribution
It proposes a new self-supervised pre-training strategy with masking and void insertion for continuous-time event data, enhancing downstream predictive performance.
Findings
Up to 20% performance improvement on next-event prediction
Effective pre-training with masking and void insertion strategies
Validated on synthetic and real-world datasets
Abstract
Self-supervision is one of the hallmarks of representation learning in the increasingly popular suite of foundation models including large language models such as BERT and GPT-3, but it has not been pursued in the context of multivariate event streams, to the best of our knowledge. We introduce a new paradigm for self-supervised learning for multivariate point processes using a transformer encoder. Specifically, we design a novel pre-training strategy for the encoder where we not only mask random event epochs but also insert randomly sampled "void" epochs where an event does not occur; this differs from the typical discrete-time pretext tasks such as word-masking in BERT but expands the effectiveness of masking to better capture continuous-time dynamics. To improve downstream tasks, we introduce a contrasting module that compares real events to simulated void instances. The pre-trained…
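The masking and void-insertion step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `corrupt_sequence`, the sentinel labels `MASK` and `VOID`, the uniform sampling of void times, and the default `mask_prob` and `n_voids` values are all hypothetical. The abstract does not say whether masking hides the event type, the timestamp, or both; this sketch hides only the event type.

```python
import random

MASK, VOID = -1, -2  # hypothetical sentinel labels for this sketch


def corrupt_sequence(times, marks, mask_prob=0.15, n_voids=3, seed=0):
    """Return a corrupted copy of one event sequence.

    times: sorted event timestamps; marks: event-type ids (same length).
    Each output item is (time, input_mark, target_mark, is_event).
    """
    if len(times) != len(marks):
        raise ValueError("times and marks must have the same length")
    if len(times) < 2 or times[0] == times[-1]:
        raise ValueError("need at least two distinct event times")

    rng = random.Random(seed)

    # 1) Mask random event epochs: hide the input mark, keep the true
    #    mark as the reconstruction target.
    events = []
    for t, k in zip(times, marks):
        shown = MASK if rng.random() < mask_prob else k
        events.append((t, shown, k, 1))

    # 2) Insert "void" epochs: times inside the observed window at which
    #    no event occurred (sampled uniformly here, as an assumption).
    observed = set(times)
    voids = []
    while len(voids) < n_voids:
        t = rng.uniform(times[0], times[-1])
        if t not in observed:
            voids.append((t, VOID, VOID, 0))

    # Merge real and void epochs back into chronological order.
    return sorted(events + voids)


if __name__ == "__main__":
    for row in corrupt_sequence([0.0, 1.0, 2.5, 4.0], [0, 1, 0, 2]):
        print(row)
```

The `is_event` flag in each tuple is what a contrasting module could use to tell real events from simulated voids, while `target_mark` supplies the label for reconstructing masked events.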
Taxonomy
Topics: Point processes and geometric inequalities
Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Byte Pair Encoding · Residual Connection · Linear Warmup With Cosine Annealing · Dense Connections · WordPiece · Dropout
