Transformer Hawkes Process
Simiao Zuo, Haoming Jiang, Zichong Li, Tuo Zhao, Hongyuan Zha

TL;DR
The paper introduces the Transformer Hawkes Process (THP), a model that uses self-attention to effectively capture long-term dependencies in event sequence data, outperforming existing models in prediction accuracy.
Contribution
The paper presents a novel Transformer-based point process model that improves long-term dependency modeling and prediction performance over traditional recurrent neural network approaches.
Findings
THP outperforms existing models in likelihood and event prediction accuracy.
THP effectively captures long-term dependencies in event sequences.
Incorporating structural knowledge enhances THP's performance in learning multiple point processes.
Abstract
Modern data acquisition routinely produces massive amounts of event sequence data in various domains, such as social media, healthcare, and financial markets. These data often exhibit complicated short-term and long-term temporal dependencies. However, most of the existing recurrent neural network based point process models fail to capture such dependencies, and yield unreliable prediction performance. To address this issue, we propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long-term dependencies and meanwhile enjoys computational efficiency. Numerical experiments on various datasets show that THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin. Moreover, THP is quite general and can incorporate additional structural knowledge. We provide a concrete example, where THP…
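To make the self-attention idea in the abstract concrete, here is a minimal sketch of single-head scaled dot-product attention with a causal mask, so that each event attends only to itself and earlier events. It is an illustration under stated assumptions, not the authors' implementation: the function name `causal_self_attention` is hypothetical, the query, key, and value projections are taken to be the identity, and the temporal encoding and intensity function of THP are omitted.

```python
import math


def causal_self_attention(x):
    """Single-head scaled dot-product self-attention with a causal mask.

    x is a list of n event embeddings, each a list of d floats.
    Assumption for brevity: queries, keys, and values are all x itself
    (identity projections), which a trained model would replace with
    learned linear maps.
    Returns (outputs, weights), where weights[i][j] is how much event i
    attends to event j; weights[i][j] is 0.0 for every future event j > i.
    """
    n, d = len(x), len(x[0])
    outputs, weights = [], []
    for i in range(n):
        # Score event i against events 0..i only; future events are masked out.
        scores = [
            sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        # Numerically stable softmax over the visible history.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps] + [0.0] * (n - i - 1)
        weights.append(row)
        # Output for event i is the attention-weighted sum of value vectors.
        outputs.append([sum(row[j] * x[j][k] for j in range(n)) for k in range(d)])
    return outputs, weights


# Usage: three toy event embeddings of dimension 2.
events = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden, attn = causal_self_attention(events)
```

Because every event scores directly against every earlier event, a distant event can influence the current hidden state in one step, rather than through a chain of recurrent updates.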
Taxonomy
Topics: Point processes and geometric inequalities · Diffusion and Search Dynamics · Morphological variations and asymmetry
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
