Parsimonious Feature Extraction Methods: Extending Robust Probabilistic Projections with Generalized Skew-t
Dorota Toczydlowska, Gareth W. Peters, Pavel V. Shevchenko

TL;DR
This paper introduces a flexible, robust feature extraction framework based on generalized skew-t distributions. The framework models asymmetric data and tail dependence, handles missing values, and is demonstrated on cryptocurrency data.
Contribution
It extends Student-t probabilistic PCA to account for asymmetry and grouped degree-of-freedom structures, and it separates the tail effects of the error terms from those of the factors, yielding a more versatile feature extraction method.
Findings
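The framework models asymmetric observation data and groups of marginal tail dependence.
The methods are derived in an incomplete-data setting, so missing values in the observation vector are handled efficiently.
Applicability is illustrated on a data set of the cryptocurrencies with the highest market capitalisation.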
Abstract
We propose a novel generalisation of the Student-t Probabilistic Principal Component methodology which: (1) accounts for an asymmetric distribution of the observation data; (2) is a framework for grouped and generalised multiple-degree-of-freedom structures, which provides a more flexible approach to modelling groups of marginal tail dependence in the observation data; and (3) separates the tail effect of the error terms and factors. The new feature extraction methods are derived in an incomplete data setting to efficiently handle the presence of missing values in the observation vector. We discuss various special cases of the algorithm that result from simplifying assumptions on the process generating the data. The applicability of the new framework is illustrated on a data set that consists of the cryptocurrencies with the highest market capitalisation.
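As a rough illustration of point (3) and of the incomplete data setting, the sketch below simulates a linear factor model in which the factors and the error terms have separate Student-t degrees of freedom, then hides a random fraction of the entries. It is a simplified assumption, not the paper's model: the asymmetry (skew) component and the grouped degrees of freedom are omitted, and the function name, parameters, and the 0.1 noise scale are hypothetical choices made for this example.

```python
import numpy as np

def sample_heavy_tailed_factor_data(n, d, k, nu_factor, nu_error,
                                    missing_rate, seed=0):
    """Simulate x = W z + e with Student-t factors z and Student-t errors e.

    Hypothetical helper for illustration only: symmetric tails, zero mean,
    one degree-of-freedom parameter for the factors and one for the errors.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(d, k))  # loading matrix (assumed, drawn at random)

    # Student-t as a Gaussian scale mixture:
    # t_nu = N(0, I) / sqrt(u), with u ~ Gamma(shape=nu/2, rate=nu/2).
    # NumPy parameterises by scale, so scale = 2 / nu.
    u = rng.gamma(shape=nu_factor / 2, scale=2 / nu_factor, size=(n, 1))
    v = rng.gamma(shape=nu_error / 2, scale=2 / nu_error, size=(n, 1))

    z = rng.normal(size=(n, k)) / np.sqrt(u)        # heavy-tailed factors
    e = 0.1 * rng.normal(size=(n, d)) / np.sqrt(v)  # heavy-tailed errors

    x = z @ W.T + e

    # Incomplete data: mark entries as missing completely at random.
    mask = rng.random(size=(n, d)) < missing_rate
    x_obs = np.where(mask, np.nan, x)
    return x_obs, z, W
```

Calling `sample_heavy_tailed_factor_data(200, 5, 2, nu_factor=4.0, nu_error=8.0, missing_rate=0.1)` returns a 200 by 5 array with NaN marking the missing entries. Using two independent mixing variables (`u` and `v`) is what lets the tail heaviness of the factors differ from that of the errors; a single shared mixing variable would tie them together.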
