TL;DR
This paper introduces a bias-correction method for rolling-window factor pipelines in Chinese A-share markets. It keeps non-tradable (limit-hit) closing prices out of every factor computation, so the model is trained only on returns it can actually trade, which improves realised trading performance.
Contribution
It presents a mask-first data processing approach combined with GPU acceleration, novel loss functions, and portfolio optimization techniques, with an open-source implementation.
Findings
Enforcing the mask contract improves realised Sharpe ratio by 0.44 points
The full system achieves an annualised Sharpe ratio of 2.05 on synthetic data
On real data, the proposed system achieves a Sharpe ratio of 1.63
Abstract
Rolling-window factor pipelines for Chinese A-share markets contain a subtle but costly flaw: daily price-move limits (±10% main-board, ±20% STAR/ChiNext) render a fraction of closing prices non-executable, yet standard implementations ingest these values before any row-filtering runs. The contaminated aggregates propagate silently through moving averages, correlations, and ranks, a failure mode we term "upstream contamination". On real A-share data it inflates apparent information coefficient by 18% while reducing realised Sharpe by 0.44 points, because the model learns to predict returns it cannot trade. We resolve this with a mask-first design: a Boolean tradability mask is constructed at data load time and threaded through every operator, so that no window ever reads a non-tradable price. Built on this foundation, the system adds (i) a GPU-vectorised 213-factor engine via…
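The mask-first idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names `tradability_mask` and `masked_rolling_mean`, the tolerance `tol`, and the toy prices are all assumptions made for illustration. Real limit detection would compare against the exchange's rounded limit prices rather than a raw percentage move.

```python
import numpy as np

def tradability_mask(close, prev_close, limit=0.10, tol=1e-4):
    """Hypothetical helper: True where the close sits strictly inside the
    daily price-limit band (simplified; ignores exchange price rounding)."""
    move = close / prev_close - 1.0
    return np.abs(move) < limit - tol

def masked_rolling_mean(values, mask, window):
    """Rolling mean that reads only tradable observations.
    Returns NaN where the window is incomplete or has no tradable rows."""
    vals = np.where(mask, values, 0.0)   # masked prices never enter the sum
    out = np.full(values.shape, np.nan)
    for t in range(window - 1, len(values)):
        sl = slice(t - window + 1, t + 1)
        n = mask[sl].sum()
        if n > 0:
            out[t] = vals[sl].sum() / n
    return out

# Toy series (assumed data): day 2 closes limit-up at +10% and is not executable.
close = np.array([10.0, 10.2, 11.22, 11.3, 11.4])
prev_close = np.array([9.9, 10.0, 10.2, 11.22, 11.3])

mask = tradability_mask(close, prev_close)       # built once, at load time
out = masked_rolling_mean(close, mask, window=3)  # mask threaded into the operator
naive = np.convolve(close, np.ones(3) / 3, mode="valid")  # unmasked 3-day mean

print(mask)      # the limit-up day is flagged non-tradable
print(out[2:])   # masked means skip the limit-up close
print(naive)     # unmasked means absorb it
```

In this sketch the unmasked 3-day mean at day 2 includes the limit-up close of 11.22, whereas the masked version averages only the two tradable closes. Filtering rows after the rolling mean has been computed would not undo this, which is the "upstream contamination" the abstract describes.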
