Modeling Insider Filing Delays in Financial Markets with an Interpretable XGBoost Framework
Cheng Huang, Yao Ma, Fan Gao, Yutong Liu, Yadi Liu, Xiaoli Ma, Ye Aung Moe, Yuhan Zhang, Weizheng Xie, Zeyu Han, Xiangxiang Wang, Hao Wang, Yongbin Yu

TL;DR
This paper develops an interpretable machine learning framework using XGBoost to predict insider filing delays, leveraging a large dataset to improve transparency and regulatory oversight in financial markets.
Contribution
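It introduces a hybrid model that pairs a state-space encoder with an XGBoost classifier, together with a dataset of more than four million Form 4 transactions (2002 to 2025), and reports that the model outperforms existing approaches at predicting filing delays while remaining interpretable.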
Findings
The framework outperforms statistical models, deep sequence learners, and large language model baselines, with balanced gains in precision, recall, and F1-score.
Insider history and governance signals are key predictors.
The dataset and model serve as a benchmark for regulatory compliance studies.
Abstract
Timely disclosure of insider transactions is a cornerstone of market transparency, yet delays in filing remain widespread and challenging to monitor at scale. This study introduces a comprehensive insider filing delay dataset spanning more than four million Form 4 transactions from 2002 to 2025, enriched with annotations on insider roles, governance attributes, and firm-level indicators. Building on these data, we present a hybrid framework that integrates a state-space encoder with an XGBoost classifier to capture temporal trading patterns while retaining interpretability essential for regulatory auditing. The framework consistently outperforms statistical models, deep sequence learners, and large language model baselines, achieving balanced gains in precision, recall, and F1-score. Feature ablation analyses highlight the predictive importance of insider history, spatiotemporal…
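As a rough illustration of the hybrid idea in the abstract, the sketch below compresses an insider's past filing delays into one number with a scalar linear state-space recursion, then places it next to static attributes in a feature row. It is not the authors' encoder: the recursion, the `decay` and `gain` values, and the feature names (`is_officer`, `log_firm_size`) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class StateSpaceEncoder:
    """Scalar linear state-space recursion: h_t = decay * h_{t-1} + gain * x_t.

    Both coefficients are illustrative assumptions, not values from the paper.
    """
    decay: float = 0.8  # share of the previous state that is retained
    gain: float = 0.2   # weight given to the newest observation

    def encode(self, delays: Sequence[float]) -> float:
        """Fold a chronological sequence of past delays (in days) into one state."""
        state = 0.0
        for delay in delays:
            state = self.decay * state + self.gain * delay
        return state


def build_features(
    past_delays: Sequence[float],
    is_officer: bool,        # hypothetical insider-role flag
    log_firm_size: float,    # hypothetical firm-level indicator
) -> List[float]:
    """Combine the encoded history with static attributes into one feature row."""
    encoder = StateSpaceEncoder()
    return [
        encoder.encode(past_delays),  # temporal summary of filing behaviour
        float(len(past_delays)),      # number of prior filings seen
        float(is_officer),
        log_firm_size,
    ]


if __name__ == "__main__":
    # An insider with two on-time filings followed by one five-day delay.
    print(build_features([0.0, 0.0, 5.0], is_officer=True, log_firm_size=3.0))
```

Rows built this way could then be passed to a gradient-boosted classifier such as `xgboost.XGBClassifier`. Because every column has a plain meaning, the tree model's feature importances stay readable, which is the kind of interpretability the abstract emphasizes.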
Taxonomy
Topics: Complex Systems and Time Series Analysis · Credit Risk and Financial Regulations · Financial Markets and Investment Strategies
