A New Projection Pursuit Index for Big Data
Yajie Duan, Javier Cabrera, Birol Emir

TL;DR
This paper introduces a novel Projection Pursuit index designed for big data visualization, enabling the detection of hidden structures like clusters and outliers through a combination of Guided Tour and Data Nuggets methods.
Contribution
A new computationally feasible PP index for big data is proposed, integrating Data Nuggets for data compression to facilitate visualization of complex structures.
Findings
Effective detection of nonlinear structures in large datasets
Simulations and real-data examples demonstrate the index's utility
Enables static and dynamic visualization tools for big data
Abstract
Visualizing extremely large datasets in static or dynamic form is a major challenge because most traditional methods cannot cope with data at that scale. A new visualization method for big data is proposed, based on Projection Pursuit, Guided Tour, and Data Nuggets methods, that helps display interesting hidden structures such as clusters, outliers, and other nonlinear structures. The Guided Tour is a dynamic graphical tool for high-dimensional data that combines Projection Pursuit and Grand Tour methods. It displays a dynamic sequence of low-dimensional projections obtained by using Projection Pursuit (PP) index functions to navigate the data space. Different PP indices have been developed to detect interesting structures in multivariate data, but the original guided tour with these indices runs into computational problems on big data. A new PP index is developed to…
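The general idea in the abstract (compress the data, then search for a projection that maximizes an index computed on the compressed data) can be illustrated with a small sketch. Everything here is an assumption made for illustration: `make_nuggets` is a crude stand-in for Data Nuggets (random centers weighted by how many observations fall nearest to each), `weighted_index` is a simple weighted kurtosis-based index rather than the index proposed in the paper, and `pursue` is a plain random search rather than a guided tour.

```python
import numpy as np

def make_nuggets(X, n_nuggets, rng):
    """Hypothetical stand-in for data compression: sample centers from the
    data and weight each by the number of observations nearest to it."""
    centers = X[rng.choice(len(X), size=n_nuggets, replace=False)]
    # Squared distance from every observation to every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    weights = np.bincount(labels, minlength=n_nuggets).astype(float)
    return centers, weights

def weighted_index(direction, centers, weights):
    """Illustrative 1-D index: 3 minus the weighted kurtosis of the projected
    centers. It is near 0 for Gaussian-looking projections and grows when the
    projection is bimodal (for example, two clusters)."""
    a = direction / np.linalg.norm(direction)
    z = centers @ a
    w = weights / weights.sum()
    mu = (w * z).sum()
    var = (w * (z - mu) ** 2).sum()
    kurt = (w * (z - mu) ** 4).sum() / var ** 2
    return 3.0 - kurt

def pursue(centers, weights, rng, n_iter=2000, step=0.3):
    """Random-search hill climbing over unit directions."""
    a = rng.normal(size=centers.shape[1])
    a /= np.linalg.norm(a)
    best = weighted_index(a, centers, weights)
    for _ in range(n_iter):
        cand = a + step * rng.normal(size=a.size)
        cand /= np.linalg.norm(cand)
        val = weighted_index(cand, centers, weights)
        if val > best:  # keep the candidate only if the index improves
            a, best = cand, val
    return a, best

# Synthetic example: two clusters separated along the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 5))
X[:1500, 0] += 6.0

centers, weights = make_nuggets(X, n_nuggets=300, rng=rng)
direction, score = pursue(centers, weights, rng)
print("direction:", np.round(direction, 2))
print("index value:", round(score, 3))
```

The point of the sketch is the cost structure: every index evaluation touches only the 300 weighted centers instead of all 3000 observations, which is what makes a search over many candidate projections affordable on large data.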
Taxonomy
Topics: Data Visualization and Analytics · Time Series Analysis and Forecasting · Advanced Statistical Methods and Models
