Cluster analysis of stocks using price movements of high frequency data from National Stock Exchange
Charu Sharma (Shiv Nadar University, UP), Amber Habib (Shiv Nadar University, UP), Sunil Bowry (Shiv Nadar University, UP)

TL;DR
This paper introduces a novel approach using Kernel PCA and Functional PCA to analyze high-frequency stock data, revealing sector-based clusters and interactions that can inform intraday trading strategies.
Contribution
It applies KPCA and FPCA to high-frequency NSE data to identify stock clusters and sector interactions, offering new insights beyond traditional correlation analysis.
Findings
Identified two main clusters: banking and IT sectors.
Detected smaller clusters from automobile and energy sectors.
Observed interactions between IT sector and smaller clusters.
Abstract
This paper aims to develop new techniques to describe the joint behavior of stocks, beyond regression and correlation. For example, we want to identify clusters of stocks that move together. Our work is based on applying Kernel Principal Component Analysis (KPCA) and Functional Principal Component Analysis (FPCA) to high-frequency data from the NSE. Since we dealt with high-frequency data with a tick size of 30 seconds, FPCA seems to be an ideal choice. FPCA is a functional variant of PCA in which each sample point is considered to be a function in the Hilbert space L^2. KPCA, on the other hand, is an extension of PCA using kernel methods. Results obtained from FPCA and Gaussian Kernel PCA seem to be in synergy, but with a lag. There were two prominent clusters that showed up in our analysis, one corresponding to the banking sector and another corresponding to the IT sector. The other smaller…
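To make the Gaussian Kernel PCA step in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The data are synthetic (two groups of three simulated "stocks", each group driven by its own common factor), the 750 intervals per series and the kernel width `gamma = 1 / T` are assumptions chosen for illustration, and the function name `gaussian_kernel_pca` is hypothetical.

```python
import numpy as np


def gaussian_kernel_pca(X, n_components=2, gamma=None):
    """Project the rows of X onto the leading Gaussian-kernel principal components."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if gamma is None:
        gamma = 1.0 / d  # assumed kernel width, not taken from the paper

    # Pairwise squared Euclidean distances between rows (one row per stock).
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off

    # Gaussian (RBF) kernel matrix.
    K = np.exp(-gamma * sq_dists)

    # Centre the kernel matrix, i.e. centre the data in feature space.
    one_n = np.full((n, n), 1.0 / n)
    K_c = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_c)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Scores of the input points: eigenvector scaled by sqrt(eigenvalue).
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))


# Synthetic example: 6 series of T returns, two groups with a shared factor each.
rng = np.random.default_rng(0)
T = 750  # assumed number of 30-second intervals in one session
factor_a = rng.standard_normal(T)
factor_b = rng.standard_normal(T)
returns = np.vstack(
    [factor_a + 0.5 * rng.standard_normal(T) for _ in range(3)]
    + [factor_b + 0.5 * rng.standard_normal(T) for _ in range(3)]
)

# Standardise each series so the kernel compares co-movement, not scale.
returns = (returns - returns.mean(axis=1, keepdims=True)) / returns.std(axis=1, keepdims=True)

scores = gaussian_kernel_pca(returns, n_components=2)
print(scores)
```

In this toy setting the sign of the first component separates the two simulated groups, which is the kind of structure a cluster analysis would look for. Real tick data would need cleaning, alignment to a common 30-second grid, and a considered choice of `gamma`.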
Taxonomy
Topics: Complex Systems and Time Series Analysis · Blind Source Separation Techniques · Neural Networks and Applications
