From Hawkes Processes to Attention: Time-Modulated Mechanisms for Event Sequences
Xinzi Tan, Kejian Zhang, Junhan Yu, Doudou Zhou

TL;DR
This paper introduces Hawkes Attention, an attention mechanism derived from multivariate Hawkes process theory. It is designed to capture heterogeneous, type-specific temporal effects in event sequences and is reported to outperform existing Transformer-based methods.
Contribution
The paper proposes Hawkes Attention, an attention operator that unifies event timing and content interaction by using learnable per-type neural kernels to modulate the query, key, and value projections, with the aim of better modeling marked temporal point processes.
Findings
Hawkes Attention outperforms baseline models on marked temporal point process (MTPP) tasks.
The method captures type-specific excitation patterns learned from the data.
The method also applies to time series forecasting, with improved results.
Abstract
Marked Temporal Point Processes (MTPPs) arise naturally in medical, social, commercial, and financial domains. However, existing Transformer-based methods mostly inject temporal information only via positional encodings and rely on shared or parametric decay structures, which limits their ability to capture heterogeneous and type-specific temporal effects. Motivated by this observation, we derive a novel attention operator, called Hawkes Attention, from multivariate Hawkes process theory for MTPPs. It uses learnable per-type neural kernels to modulate the query, key, and value projections, replacing the corresponding parts of traditional attention. Thanks to this design, Hawkes Attention unifies event timing and content interaction, learning both the time-relevant behavior and type-specific excitation patterns from the data. The experimental results show that our method…
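For readers unfamiliar with the starting point, the sketch below computes the conditional intensity of a classical multivariate Hawkes process with exponential kernels, lambda_k(t) = mu_k + sum over past events j of alpha[k, k_j] * exp(-beta[k, k_j] * (t - t_j)). This is the standard textbook parametric form, shown only to illustrate what "type-specific excitation" means; it is not the paper's Hawkes Attention operator, which the abstract says replaces such fixed decays with learnable neural kernels. The function name hawkes_intensity and the parameter names mu, alpha, and beta are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def hawkes_intensity(t, event_times, event_types, mu, alpha, beta):
    """Intensity of every event type at time t (classical exponential-kernel form).

    Assumed, illustrative parameterization (not from the paper):
      mu[k]        baseline rate of type k, shape (K,)
      alpha[k, m]  excitation that a past type-m event adds to type k, shape (K, K)
      beta[k, m]   decay rate of that excitation, shape (K, K)
    """
    event_times = np.asarray(event_times, dtype=float)
    event_types = np.asarray(event_types, dtype=int)
    mu = np.asarray(mu, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)

    past = event_times < t              # only strictly earlier events contribute
    dt = t - event_times[past]          # elapsed time since each past event
    src = event_types[past]             # type of each past event

    # Column j holds the contribution of past event j to every target type.
    excitation = alpha[:, src] * np.exp(-beta[:, src] * dt)
    return mu + excitation.sum(axis=1)


# Two event types; type 0 excites both types, type 1 excites only itself.
mu = np.array([0.1, 0.2])
alpha = np.array([[0.5, 0.0],
                  [0.3, 0.4]])
beta = np.ones((2, 2))

rates = hawkes_intensity(2.0, event_times=[0.0, 1.0], event_types=[0, 1],
                         mu=mu, alpha=alpha, beta=beta)
print(rates)
```

Here each pair of types shares one fixed exponential decay shape. The limitation the abstract describes is that such shared or parametric decay structures cannot express heterogeneous temporal effects, which is what motivates learning the kernels from data instead.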
Taxonomy
Topics: Point processes and geometric inequalities · Tensor decomposition and applications · Generative Adversarial Networks and Image Synthesis
