FEAT: A Linear-Complexity Foundation Model for Extremely Large Structured Data
Zhenghang Song, Tang Qian, Lu Chen, Yushuai Li, Zhengke Hu, Bingbing Fang, Yumeng Song, Junbo Zhao, Sheng Zhang, and Tianyi Li

TL;DR
FEAT introduces a scalable, linear-complexity foundation model for large structured data that outperforms existing models in speed and accuracy, addressing key scalability and generalization challenges.
Contribution
The paper proposes FEAT, a novel linear-complexity model with dual-axis encoding and causal pre-training, enabling efficient processing of large structured datasets.
Findings
FEAT achieves up to 50x faster inference.
It outperforms existing structured data foundation models (SFMs) on zero-shot tasks across 12 benchmarks.
FEAT scales linearly with data size, supporting extremely large datasets.
Abstract
Structured data is widely used in domains such as healthcare, finance, and scientific data management. Recent studies on structured data foundation models (SFMs) aim to support data analysis and mining tasks over such data, but still face scalability and generalization challenges when applied to real-world enterprise databases. First, many SFMs rely on full self-attention, which introduces an O(N^2) computational bottleneck and limits the number of tuples that can be processed jointly. Second, directly replacing attention with linear-complexity sequence models may conflict with the permutation-invariant nature of structured data, introducing artificial order bias and degrading representation quality. Moreover, models trained only on synthetic data may struggle to generalize to the heavy-tailed and heterogeneous distributions commonly found in real-world databases. To address these…
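The two obstacles named in the abstract can be illustrated with a toy sketch. This is not FEAT's architecture: `recurrent_summary` is a hypothetical exponential-decay scan standing in for a generic linear-time sequence model, `mean_summary` is plain mean pooling, and all function names are assumptions made for this illustration. The sketch counts the work done by full self-attention versus a single-pass scan, then shows that the scan's output changes when the same tuples are reordered, while a permutation-invariant summary does not.

```python
from itertools import permutations


def attention_pair_count(n_tuples):
    """Query-key score computations in full self-attention over n tuples: N^2."""
    return n_tuples * n_tuples


def scan_step_count(n_tuples):
    """State updates in a single-pass recurrent scan over n tuples: N."""
    return n_tuples


def recurrent_summary(values, decay=0.5):
    """Toy linear-time sequence model (assumed for illustration only).

    Later rows are weighted more heavily than earlier ones, so the result
    depends on the order in which the rows are visited.
    """
    state = 0.0
    for v in values:
        state = decay * state + (1.0 - decay) * v
    return state


def mean_summary(values):
    """Permutation-invariant summary: mean pooling over the rows."""
    return sum(values) / len(values)


# Three "tuples", each reduced to a single numeric feature for simplicity.
rows = [1.0, 2.0, 4.0]

# Collect the distinct outputs obtained over every ordering of the same rows.
recurrent_outputs = {round(recurrent_summary(p), 6) for p in permutations(rows)}
mean_outputs = {round(mean_summary(p), 6) for p in permutations(rows)}

if __name__ == "__main__":
    n = 1_000_000
    print("attention score computations:", attention_pair_count(n))
    print("scan state updates:", scan_step_count(n))
    print("distinct scan outputs across orderings:", len(recurrent_outputs))
    print("distinct mean outputs across orderings:", len(mean_outputs))
```

Under these assumptions, the scan yields several different summaries for one and the same set of rows, which is the "artificial order bias" the abstract refers to. The mean yields a single value regardless of ordering.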
Taxonomy
Topics: Machine Learning in Healthcare · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
