Embedding-Aware Feature Discovery: Bridging Latent Representations and Interpretable Features in Event Sequences
Artem Sakhno, Ivan Sergeev, Alexey Shestov, Omar Zoloev, Elizaveta Kovtun, Gleb Gusev, Andrey Savchenko, Maksim Makarenko

TL;DR
This paper introduces EAFD, a framework that combines pretrained embeddings with feature discovery to improve interpretability and predictive performance in event sequence analysis, outperforming existing methods.
Contribution
EAFD is a novel unified approach that couples embeddings with an LLM-driven feature generator, bridging the gap between learned representations and interpretable features.
Findings
EAFD outperforms embedding-only baselines by up to 5.8%.
Achieves state-of-the-art results on event-sequence datasets.
Demonstrates robustness in industrial transaction benchmarks.
Abstract
Industrial financial systems operate on temporal event sequences such as transactions, user actions, and system logs. While recent research emphasizes representation learning and large language models, production systems continue to rely heavily on handcrafted statistical features due to their interpretability, robustness under limited supervision, and strict latency constraints. This creates a persistent disconnect between learned embeddings and feature-based pipelines. We introduce Embedding-Aware Feature Discovery (EAFD), a unified framework that bridges this gap by coupling pretrained event-sequence embeddings with a self-reflective LLM-driven feature generation agent. EAFD iteratively discovers, evaluates, and refines features directly from raw event sequences using two complementary criteria: alignment, which explains information already encoded in embeddings, and…
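To make the alignment idea concrete, here is a minimal sketch of one way a candidate feature could be scored against pretrained embeddings. It is an assumption for illustration, not the scoring rule used by EAFD: the function name `alignment_score`, the choice of in-sample R² from a linear least-squares fit, and the synthetic data are all hypothetical.

```python
import numpy as np


def alignment_score(embeddings: np.ndarray, feature: np.ndarray) -> float:
    """Hypothetical alignment proxy (not the paper's definition).

    Returns the in-sample R^2 of a linear least-squares fit that predicts
    a candidate feature from sequence embeddings. A value near 1 means the
    feature is linearly recoverable from the embeddings.
    """
    # Append a constant column so the fit includes an intercept.
    X = np.column_stack([embeddings, np.ones(len(embeddings))])
    coef, *_ = np.linalg.lstsq(X, feature, rcond=None)
    residual = feature - X @ coef
    centered = feature - feature.mean()
    total = float(centered @ centered)
    if total == 0.0:
        # A constant feature carries no variance to explain.
        return 0.0
    return 1.0 - float(residual @ residual) / total


# Synthetic stand-ins for embeddings and two candidate features.
rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 8))
aligned = 2.0 * emb[:, 0] - emb[:, 3]   # linear function of the embedding
unrelated = rng.normal(size=200)        # independent noise

score_aligned = alignment_score(emb, aligned)
score_unrelated = alignment_score(emb, unrelated)
print(f"aligned: {score_aligned:.3f}, unrelated: {score_unrelated:.3f}")
```

Under this proxy, a feature that is a linear function of the embedding scores close to 1, while an independent feature scores close to 0. A nonlinear probe or a held-out split would be natural substitutes if a linear in-sample fit is too permissive.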
Taxonomy
Topics: Machine Learning in Healthcare · Data Quality and Management · Time Series Analysis and Forecasting
