Random Controlled Differential Equations
Francesco Piatti, Thomas Cass, William F. Turner

TL;DR
This paper introduces a training-efficient framework for time-series learning that combines random features with controlled differential equations (CDEs). Only a linear readout is trained, and the models achieve competitive or state-of-the-art results on time-series benchmarks.
Contribution
It presents two random-feature CDE variants, Random Fourier CDEs (RF-CDEs) and Random Rough DEs (R-RDEs), connects them to kernel methods and signature theory, and demonstrates their effectiveness on benchmarks.
Findings
Achieves competitive or state-of-the-art performance on time-series benchmarks.
Provides a unified theoretical perspective linking random features, kernels, and path signatures.
Offers practical, scalable models that retain inductive biases of signature methods.
Abstract
We introduce a training-efficient framework for time-series learning that combines random features with controlled differential equations (CDEs). In this approach, large randomly parameterized CDEs act as continuous-time reservoirs, mapping input paths to rich representations. Only a linear readout layer is trained, resulting in fast, scalable models with strong inductive bias. Building on this foundation, we propose two variants: (i) Random Fourier CDEs (RF-CDEs): these lift the input signal using random Fourier features prior to the dynamics, providing a kernel-free approximation of RBF-enhanced sequence models; (ii) Random Rough DEs (R-RDEs): these operate directly on rough-path inputs via a log-ODE discretization, using log-signatures to capture higher-order temporal interactions while remaining stable and efficient. We prove that in the infinite-width limit, these models induce the…
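The reservoir idea in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the helper names (`make_rff_lift`, `make_random_cde`, `ridge_readout`), the tanh vector fields, the Gaussian initialization and its scaling, the explicit Euler discretization (the paper's R-RDE variant uses a log-ODE scheme instead), the ridge-regression readout, and the toy sine-wave data.

```python
import numpy as np

def make_rff_lift(d, m, sigma=1.0, seed=0):
    """Hypothetical pointwise random Fourier lift approximating an RBF kernel."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((d, m)) / sigma      # random frequencies
    c = rng.uniform(0.0, 2.0 * np.pi, size=m)    # random phases
    def lift(path):
        # path: (length, d) -> lifted path: (length, m)
        return np.sqrt(2.0 / m) * np.cos(path @ W + c)
    return lift

def make_random_cde(d, n, seed=0):
    """Hypothetical fixed random CDE: dz = sum_i tanh(A_i z + b_i) dx^i."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((d, n, n)) / np.sqrt(n)  # one random field per channel
    b = rng.standard_normal((d, n))
    z0 = rng.standard_normal(n)
    def featurize(path):
        z = z0.copy()
        for dx in np.diff(path, axis=0):             # increments of the driving path
            z = z + dx @ np.tanh(A @ z + b)          # explicit Euler step
        return z                                     # terminal state = feature vector
    return featurize

def ridge_readout(Z, Y, lam=1e-3):
    """Closed-form ridge regression: the only trained part of the model."""
    return np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ Y)

# Toy data (assumed): time-augmented sine paths, class = frequency.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 50)
labels = np.arange(40) % 2
paths = [
    np.stack([t, np.sin(2 * np.pi * (1 + y) * t + rng.uniform(0, 0.5))], axis=1)
    for y in labels
]

lift = make_rff_lift(d=2, m=8)           # lift before the dynamics (RF-CDE style)
featurize = make_random_cde(d=8, n=32)   # random reservoir, never trained
features = np.stack([featurize(lift(p)) for p in paths])
targets = np.eye(2)[labels]              # one-hot class targets
weights = ridge_readout(features, targets)
predictions = (features @ weights).argmax(axis=1)
```

Because the random weights are drawn once and then frozen, each path costs a single pass over its increments, and fitting reduces to one linear solve in the feature dimension.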
Peer Reviews
Decision: ICLR 2026 Poster
- The paper bridges two compelling areas: reservoir computing and the infinite-width limit of neural CDEs. To me, this connection seems novel.
- The work is technically sound and has solid theoretical grounding. It proves asymptotic kernel equivalence and establishes existence and uniqueness of the limiting dynamics.
- The authors provide a computational complexity analysis; the proposed linear-readout-only training offers efficiency gains.
- Although the paper is primarily theoretical, it inc
I do not identify any critical drawbacks in this paper. However:
- The experimental section focuses mainly on ablations of the proposed models and their variants. This improves completeness, but practitioners may be unsure about the broader practical advantages.
- In particular, while the motivation is framed in terms of neural CDEs + PRC, the paper does not compare against neural CDEs or other end-to-end trained deep learning architectures.
1. Clean theoretical framing with meaningful limits: RF-CDE -> RBF-lifted signature kernel and R-RDE -> rough signature kernel in the infinite-width limit; the results connect random differential equations to signature kernels in a principled way.
2. Only a linear head is trained on top of fixed random dynamics, which is simple and fast to fit in practice.
3. Clear accounting for feature-extraction costs; random-feature models scale linearly in sequence length l.
1. The main theorems characterize infinite-width limits. There are no non-asymptotic approximation or generalization bounds to explain when a few hundred features suffice.
2. All evaluations are classification on UEA (16 datasets). No forecasting, imputation, irregular sampling, or robustness to noise/missingness.
3. While RF-CDE averages best among random-feature models, R-RDE trails (avg. 0.708). Some difficult datasets (EigenWorms, Handwriting) show large gaps to SigPDE/RFSF.
4. Ablations are
- The writing is high-quality and mathematically literate. The background sections on rough paths, signatures, and kernels are compact, accurate, and helpful for positioning the work.
- The paper cleanly recalls the R-CDE limit to the signature kernel and extends it with two variants whose limits are the RBF-lifted signature kernel (RF-CDE) and the rough signature kernel (R-RDE). The statements (Theorems 3.2 and 3.4) are explicit.
- The paper correctly frames the models as training-efficient r
- The main theorems are in the infinite-width setting. The paper does not provide non-asymptotic approximation rates or generalization/error bounds for practical feature counts.
- Only UEA classification is considered. There are no forecasting or irregular sampling experiments, despite the continuous-time claim. The asymptotic table is helpful, but there are no runtime or memory measurements on the UEA suite to validate the linear-in-length advantage or to quantify the cubic term in R-RDE.
-
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Reservoir Computing · Generative Adversarial Networks and Image Synthesis
