An Empirical Evaluation of the t-SNE Algorithm for Data Visualization in Structural Engineering

Parisa Hajibabaee; Farhad Pourkamali-Anaraki; Mohammad Amin Hariri-Ardebili

arXiv:2109.08795·cs.LG·September 21, 2021

TL;DR

This paper evaluates the effectiveness of t-SNE combined with SMOTE for visualizing and classifying imbalanced earthquake engineering data, demonstrating improved neural network classifier performance in 2D space.

Contribution

It provides an empirical assessment of t-SNE and SMOTE for visualizing and classifying high-dimensional, imbalanced data in structural engineering applications.

Findings

01 t-SNE effectively reduces data to 2D for visualization

02 SMOTE improves classifier performance on imbalanced data

03 Neural networks perform well with combined t-SNE and SMOTE

Abstract

A fundamental task in machine learning is visualizing high-dimensional data sets that arise in high-impact application domains. The problem becomes much more challenging when the data are large and imbalanced. In this paper, the t-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm is used to reduce the dimensions of an earthquake engineering data set for visualization purposes. Since imbalanced data sets greatly affect the accuracy of classifiers, we employ the Synthetic Minority Oversampling Technique (SMOTE) to address the imbalance in this data set. We present the results obtained from t-SNE and SMOTE and compare them with the basic approaches in several respects. Considering four options and six classification algorithms, we show that using t-SNE on the imbalanced data and SMOTE on the training data set, neural network classifiers have…
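
The oversampling step described in the abstract can be illustrated with a minimal pure-Python sketch of the SMOTE idea: each synthetic minority point is placed at a random position on the line segment between a minority point and one of its nearest minority neighbours. This is not the authors' code. The function names (`smote`, `balance_training_set`), the toy 2D coordinates (standing in for t-SNE output), and the binary-label setup are all assumptions made for illustration. Only the training split is oversampled; a held-out test split would be left untouched.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority points.

    Hypothetical helper: a simplified sketch of the SMOTE idea, not a
    reference implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Sort the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance_training_set(X_train, y_train, minority_label, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts.

    Assumes binary labels. Returns new lists; the inputs are not modified.
    """
    minority = [x for x, y in zip(X_train, y_train) if y == minority_label]
    n_majority = len(y_train) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_points = smote(minority, n_new, k=k, seed=seed)
    return X_train + new_points, y_train + [minority_label] * n_new


# Toy 2D training split (made-up coordinates): six majority, two minority.
X_train = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4],
           [0.4, 0.1], [2.0, 2.0], [2.2, 2.1]]
y_train = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = balance_training_set(X_train, y_train, minority_label=1)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

Because every synthetic point is a convex combination of two existing minority points, the new samples stay inside the region the minority class already occupies. In practice a maintained library implementation would normally be preferred over a hand-written one.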

Taxonomy

Methods: Synthetic Minority Over-sampling Technique