Using PCA and Factor Analysis for Dimensionality Reduction of Bio-informatics Data

M. Usman Ali; Shahzad Ahmed; Javed Ferzund; Atif Mehmood; Abbas Rehman

arXiv:1707.07189·q-bio.OT·July 25, 2017

TL;DR

This paper explores the application of PCA and Factor Analysis to reduce the dimensionality of high-dimensional bioinformatics data, specifically leukemia datasets, to improve the efficiency of data analysis techniques.

Contribution

It demonstrates the effectiveness of PCA and Factor Analysis in reducing attributes of bioinformatics data, facilitating better data analysis performance.

Findings

01 Attributes reduced significantly, improving analysis efficiency

02 PCA and Factor Analysis effectively identify key features

03 Enhanced data interpretability for bioinformatics applications

Abstract

A large volume of genomics data is produced daily owing to advances in sequencing technology. This data has no value unless it is properly analysed. Different kinds of analytics are required to extract useful information from the raw data. Classification, prediction, clustering and pattern extraction are useful data mining techniques. These techniques require an appropriate selection of data attributes to give accurate results. However, bioinformatics data is high dimensional, usually having hundreds of attributes. Such a large number of attributes affects the performance of the machine learning algorithms used for classification and prediction. Dimensionality reduction techniques are therefore required to reduce the number of attributes passed on to further analysis. In this paper, Principal Component Analysis and Factor Analysis are used for dimensionality reduction of…
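
To make the idea of attribute reduction concrete, the following is a minimal sketch of PCA computed with NumPy's singular value decomposition. It is not the authors' pipeline: the data is synthetic (the sample count, attribute count, and noise level are arbitrary assumptions), the helper name `pca_reduce` is hypothetical, and Factor Analysis is not covered here.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (hypothetical helper)."""
    # Centre each attribute so the components describe variance, not the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance along each direction, and the share kept by the retained ones.
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, components, ratio

# Synthetic stand-in for a high-dimensional dataset (assumption, not the
# leukemia data): 60 samples, 200 attributes driven by 3 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 3))
mixing = rng.normal(size=(3, 200))
X = latent @ mixing + 0.1 * rng.normal(size=(60, 200))

Z, components, ratio = pca_reduce(X, n_components=3)
print("reduced shape:", Z.shape)
print("variance retained:", ratio.sum())
```

The reduced matrix `Z` would then replace the original attributes as input to a classifier or clustering step. In practice the number of components is usually chosen by inspecting the retained-variance ratio rather than fixed in advance.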
