LOBERT: Generative AI Foundation Model for Limit Order Book Messages
Eljas Linna, Kestutis Baltakys, Alexandros Iosifidis, Juho Kanniainen

TL;DR
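LOBERT is a general-purpose, encoder-only foundation model for limit order book (LOB) message data. Its tokenization scheme treats each multi-dimensional message as a single token, which improves performance on tasks such as mid-price movement and next-message prediction while reducing the required context length.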
Contribution
Introduces LOBERT, a BERT-based encoder model with a new tokenization scheme for LOB data, enabling better adaptability and performance in downstream financial tasks.
Findings
Achieves leading performance in mid-price movement prediction and next-message prediction.
Reduces the context length required for accurate predictions compared to previous methods.
Demonstrates versatility across multiple LOB-related tasks.
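To make the mid-price movement task concrete, here is a minimal sketch of how such labels are commonly built. It is not taken from the paper: the function names, the `horizon` and `threshold` parameters, and the three-class labelling are illustrative assumptions.

```python
import numpy as np

def mid_price(best_bid, best_ask):
    """Mid-price: the average of the best bid and best ask."""
    return (np.asarray(best_bid, dtype=float) + np.asarray(best_ask, dtype=float)) / 2.0

def movement_labels(mid, horizon=1, threshold=0.0):
    """Label each step by the relative mid-price change `horizon` steps ahead.

    Returns +1 (up), 0 (stationary), or -1 (down). `horizon` (>= 1) and
    `threshold` are hypothetical knobs, not values from the paper.
    """
    mid = np.asarray(mid, dtype=float)
    change = (mid[horizon:] - mid[:-horizon]) / mid[:-horizon]
    labels = np.zeros(change.shape, dtype=int)
    labels[change > threshold] = 1
    labels[change < -threshold] = -1
    return labels

# Toy quotes: mids are 100, 101, 101, 99
mids = mid_price([99, 100, 100, 98], [101, 102, 102, 100])
print(movement_labels(mids, horizon=1).tolist())  # [1, 0, -1]
```

A non-zero `threshold` widens the stationary class, which is one common way to keep tiny tick-level fluctuations from dominating the labels.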
Abstract
Modeling the dynamics of financial Limit Order Books (LOB) at the message level is challenging due to irregular event timing, rapid regime shifts, and the reactions of high-frequency traders to visible order flow. Previous LOB models require cumbersome data representations and lack adaptability outside their original tasks, leading us to introduce LOBERT, a general-purpose encoder-only foundation model for LOB data suitable for downstream fine-tuning. LOBERT adapts the original BERT architecture for LOB data by using a novel tokenization scheme that treats complete multi-dimensional messages as single tokens while retaining continuous representations of price, volume, and time. With these methods, LOBERT achieves leading performance in tasks such as predicting mid-price movements and next messages, while reducing the required context length compared to previous methods.
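The idea of treating a complete message as one token while keeping price, volume, and time continuous can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the field set, the embedding width, the log scaling, and the additive combination of categorical and continuous parts are all hypothetical choices, and the randomly initialised parameters stand in for weights that would be learned.

```python
import numpy as np

rng = np.random.default_rng(0)

D_MODEL = 16        # hypothetical embedding width
N_EVENT_TYPES = 4   # assumed categories, e.g. add / cancel / execute / modify
N_SIDES = 2         # bid / ask

# Stand-ins for learned parameters (random here, for illustration only)
event_table = rng.normal(size=(N_EVENT_TYPES, D_MODEL))
side_table = rng.normal(size=(N_SIDES, D_MODEL))
W_cont = rng.normal(size=(3, D_MODEL))
b_cont = np.zeros(D_MODEL)

def embed_messages(event_type, side, price, volume, dt):
    """Map T messages to T token vectors, one vector per whole message.

    Categorical fields use lookup tables; price, volume, and inter-event
    time stay continuous and enter through a linear projection instead of
    being binned into a discrete vocabulary.
    """
    cont = np.stack([price, np.log1p(volume), np.log1p(dt)], axis=-1)  # (T, 3)
    return event_table[event_type] + side_table[side] + cont @ W_cont + b_cont

event_type = np.array([0, 1, 2])
side = np.array([0, 1, 0])
price = np.array([-1.0, 2.0, 0.0])       # assumed: ticks relative to the mid-price
volume = np.array([100.0, 50.0, 10.0])
dt = np.array([0.001, 0.5, 0.02])        # assumed: seconds since previous message

tokens = embed_messages(event_type, side, price, volume, dt)
print(tokens.shape)  # (3, 16)
```

Because each message yields exactly one vector, a sequence of T messages occupies T positions in the encoder input, rather than several positions per message as it would if every field were tokenized separately. That is the intuition behind a shorter required context.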
Taxonomy
Topics: Complex Systems and Time Series Analysis · Stock Market Forecasting Methods · Financial Markets and Investment Strategies
