Categorical Exploratory Data Analysis: From Multiclass Classification and Response Manifold Analytics perspectives of baseball pitching dynamics
Fushing Hsieh, Elizabeth P. Chou

TL;DR
This paper introduces Categorical Exploratory Data Analysis (CEDA) for analyzing MLB pitching dynamics. It couples multiclass classification with response manifold analytics to reveal geometric patterns in the data and to map out the uncertainty of popular machine learning approaches.
Contribution
It develops a novel CEDA framework combining MCC and RMA perspectives, uncovering geometric structures and localities in pitching data for improved understanding and inference.
Findings
Identifies asymmetry in label mixing geometries.
Reveals multi-order geometric pattern categories.
Demonstrates physical principles through manifold localities.
Abstract
From two coupled perspectives, Multiclass Classification (MCC) and Response Manifold Analytics (RMA), we develop Categorical Exploratory Data Analysis (CEDA) on the PITCHf/x database to extract the information content of Major League Baseball's (MLB) pitching dynamics. The MCC and RMA information contents are represented, respectively, by a collection of multiscale pattern categories from mixing geometries and a collection of global-to-local geometric localities from response-covariate manifolds. These collections shed light on the pitching dynamics and map out the uncertainty of popular machine learning approaches. In the MCC setting, a label embedding tree based on an indirect distance measure leads to the discovery of asymmetry in the mixing geometries among the labels' point-clouds. A selected chain of complementary covariate feature groups collectively brings out multi-order mixing geometric pattern categories. Such…
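The idea of a label embedding tree built from mixing between label point-clouds can be illustrated with a minimal sketch. The abstract does not define its indirect distance measure, so the code below swaps in a simple nearest-neighbour mixing rate as a stand-in; it is not the authors' method. The function names `mixing_matrix` and `label_tree`, the pitch-type labels, and the synthetic two-dimensional data are all hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import cdist, squareform


def mixing_matrix(points_by_label):
    """Return (labels, M), where M[a, b] is the fraction of label a's points
    whose nearest neighbour in label b is closer than their nearest
    neighbour within label a. M is asymmetric in general.
    ASSUMPTION: a stand-in for the paper's indirect distance measure."""
    labels = list(points_by_label)
    M = np.zeros((len(labels), len(labels)))
    for i, a in enumerate(labels):
        A = points_by_label[a]
        d_within = cdist(A, A)
        np.fill_diagonal(d_within, np.inf)  # a point is not its own neighbour
        nearest_within = d_within.min(axis=1)
        for j, b in enumerate(labels):
            if i == j:
                continue
            nearest_across = cdist(A, points_by_label[b]).min(axis=1)
            M[i, j] = np.mean(nearest_across < nearest_within)
    return labels, M


def label_tree(M):
    """Symmetrize the mixing rates into a dissimilarity and build an
    average-linkage tree over the labels."""
    D = 1.0 - (M + M.T) / 2.0
    np.fill_diagonal(D, 0.0)
    return linkage(squareform(D, checks=False), method="average")


# Synthetic point-clouds (hypothetical): two overlapping labels, one isolated.
rng = np.random.default_rng(0)
clouds = {
    "FF": rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2)),
    "SL": rng.normal(loc=[0.5, 0.0], scale=1.5, size=(60, 2)),
    "CU": rng.normal(loc=[10.0, 10.0], scale=1.0, size=(100, 2)),
}
labels, M = mixing_matrix(clouds)
Z = label_tree(M)
print(labels)
print(np.round(M, 3))  # compare M[a, b] with M[b, a] to see the asymmetry
print(Z)
```

Keeping the matrix `M` unsymmetrized until the tree is built preserves the directional information: a sparse cloud can be heavily mixed into a dense one without the reverse being true, which is the kind of asymmetry the abstract refers to.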
