Diagnostic Certificates of Data Quality and Regression Identifiability for Koopman Identification
Yue Wu

TL;DR
This paper introduces a diagnostic framework for assessing data quality and regression identifiability in Koopman-based control systems, addressing issues like state coverage, feature degeneracy, and regression stability.
Contribution
It develops theoretical certificates that diagnose failures in data quality for Koopman regression, with practical scoring methods and experimental validation on dynamical systems.
Findings
Certificates effectively identify data quality issues in Koopman regression.
Experimental results show the importance of multiple diagnostics for prediction and control.
The framework is meant to guide data collection and the interpretation of results; it does not claim universal optimality.
Abstract
Classical persistent excitation criteria usually assess whether an input or regressor signal is sufficiently rich. In Koopman and EDMD with control (EDMDc), however, data quality is determined by the concatenation of lifted state features and control inputs. Input-rich data can still visit a narrow state region, well-spread state samples can still produce degenerate lifted features, and both can fail to condition the final regression problem. This paper develops a diagnostic certificate framework for locating these failures. The certificates separate state-space coverage and clustering, lifted-feature nondegeneracy, and the final regression spectrum. The regression-spectrum certificate is the layer with direct theoretical guarantees: it controls the active standardized design's smallest singular value, has Fisher-information and one-step EDMDc stability interpretations, and admits a…
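To make the regression-spectrum idea concrete, the following is a minimal sketch of how the smallest singular value of a standardized EDMDc design matrix might be computed. The polynomial dictionary in `lift`, the column standardization, the `tol` threshold for "active" columns, and all function names are assumptions chosen for illustration; they are not the paper's definitions. The example contrasts well-spread state samples with states confined to a line, both driven by the same rich input.

```python
import numpy as np

def lift(x):
    # Hypothetical lifting dictionary (assumption): monomials up to degree 2.
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([x1, x2, x1**2, x1 * x2, x2**2])

def smallest_singular_value(X, U, tol=1e-12):
    # Design matrix: lifted state features concatenated with control inputs.
    Z = np.hstack([lift(X), U])
    # Center each column, then drop columns with (near-)zero variance.
    Z = Z - Z.mean(axis=0)
    std = Z.std(axis=0)
    active = std > tol
    # Scale the remaining "active" columns to unit variance.
    Zs = Z[:, active] / std[active]
    # Dividing by sqrt(N) makes the squared singular values the eigenvalues
    # of the sample correlation matrix of the active columns.
    s = np.linalg.svd(Zs / np.sqrt(len(Zs)), compute_uv=False)
    return s.min(), int(active.sum())

rng = np.random.default_rng(0)
N = 500
U = rng.normal(size=(N, 1))  # rich input in both cases

# Case 1: states spread over the square [-1, 1]^2.
X_wide = rng.uniform(-1, 1, size=(N, 2))

# Case 2: states confined to the line x2 = x1, so lifted columns coincide.
t = rng.uniform(-1, 1, size=N)
X_line = np.column_stack([t, t])

sigma_wide, n_wide = smallest_singular_value(X_wide, U)
sigma_line, n_line = smallest_singular_value(X_line, U)
print("wide coverage :", sigma_wide, "active columns:", n_wide)
print("line coverage :", sigma_line, "active columns:", n_line)
```

In this sketch the input signal is identical in both cases, yet the line-confined data yields a smallest singular value near zero because several lifted columns are exact copies of one another. This is the kind of failure that an input-only richness check would not flag.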
