Persistence Homology of TEDtalk: Do Sentence Embeddings Have a Topological Shape?
Shouman Das, Syed A. Haque, Md. Iftekhar Tanveer

TL;DR
This study explored whether topological data analysis of sentence embeddings from TEDtalks could enhance public speaking rating models, but found that it does not significantly improve accuracy and may sometimes worsen it.
Contribution
The paper applies topological data analysis to sentence embeddings for public speaking classification, revealing its limited effectiveness in this context.
Findings
Topological features did not significantly improve classification accuracy.
In some cases, topological features slightly decreased model performance.
TDA may not be beneficial for sentence embedding analysis in public speaking tasks.
Abstract
Topological data analysis (TDA) has recently emerged as a new technique for extracting meaningful discriminative features from high-dimensional data. In this paper, we investigate the possibility of applying TDA to improve the classification accuracy of public speaking rating. We calculated persistence image vectors for the sentence embeddings of TEDtalk data and fed these vectors as additional inputs to our machine learning models. We found a negative result: this topological information does not improve the model accuracy significantly. In some cases, it makes the accuracy slightly worse than the original. From our results, we could not conclude that the topological shapes of the sentence embeddings can help us train a better model for public speaking rating.
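To make the pipeline in the abstract concrete, the following is a minimal sketch of how a persistence image vector might be computed from a cloud of sentence embeddings and appended to an existing feature vector. It is not the authors' implementation. The abstract does not say which homology dimensions, library, or image parameters were used, so this sketch assumes 0-dimensional homology only (computed with NumPy via a minimum spanning tree), a linear persistence weighting, and an unnormalized Gaussian kernel. The names `h0_persistence`, `persistence_image`, `embeddings`, and `baseline_features` are hypothetical, and the random arrays are stand-ins for real data.

```python
import numpy as np


def h0_persistence(points):
    """Death times of the H0 classes of a Vietoris-Rips filtration.

    Every connected component is born at 0. With the convention that an
    edge enters the filtration at its length, the finite death times are
    the edge lengths of a minimum spanning tree (Prim's algorithm here).
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()  # cheapest known edge from the tree to each point
    deaths = []
    for _ in range(n - 1):
        best[in_tree] = np.inf  # never re-add a point already in the tree
        j = int(np.argmin(best))
        deaths.append(best[j])
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return np.sort(np.array(deaths))


def persistence_image(diagram, resolution=8, sigma=0.1, max_value=1.0):
    """Rasterize (birth, death) pairs into a flattened persistence image.

    Each pair is mapped to (birth, persistence) coordinates and contributes
    a Gaussian bump weighted linearly by its persistence.
    """
    diagram = np.asarray(diagram, dtype=float)
    births = diagram[:, 0]
    pers = diagram[:, 1] - diagram[:, 0]
    centers = (np.arange(resolution) + 0.5) * max_value / resolution
    gx, gy = np.meshgrid(centers, centers)  # gx: birth axis, gy: persistence axis
    image = np.zeros((resolution, resolution))
    for b, p in zip(births, pers):
        image += p * np.exp(-((gx - b) ** 2 + (gy - p) ** 2) / (2 * sigma ** 2))
    return image.ravel()


# Stand-in data (assumption): 20 sentences of one talk, 16-dim embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(20, 16))

deaths = h0_persistence(embeddings)
diagram = np.column_stack([np.zeros_like(deaths), deaths])
scale = deaths.max()
tda_vector = persistence_image(diagram, resolution=8, sigma=scale / 8, max_value=scale)

# Hypothetical baseline features for the same talk; the topological vector
# is concatenated as additional model input, as the abstract describes.
baseline_features = rng.normal(size=32)
combined = np.concatenate([baseline_features, tda_vector])
```

Because all H0 classes are born at 0, the resulting image carries information only along the persistence axis; higher-dimensional homology would need a dedicated TDA library and is left out of this sketch.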
Taxonomy
Topics: Topological and Geometric Data Analysis · Homotopy and Cohomology in Algebraic Topology · Advanced Neuroimaging Techniques and Applications
