voxel2vec: A Natural Language Processing Approach to Learning Distributed Representations for Scientific Data

Xiangyang He; Yubo Tao; Shuoliu Yang; Haoran Dai; Hai Lin

arXiv:2207.02565·cs.LG·July 25, 2022



Open Access

TL;DR

voxel2vec is an unsupervised NLP-inspired model that learns low-dimensional representations of scalar values in scientific data, enabling analysis of complex spatial and temporal relationships.

Contribution

The paper introduces voxel2vec, an approach that applies NLP techniques to learn distributed representations of features in scientific data and capture their contextual similarities.

Findings

01. Effective in representing scalar-value relationships

02. Improves feature classification accuracy

03. Enhances association analysis in scientific datasets

Abstract

Relationships in scientific data, such as the numerical and spatial distribution relations of features in univariate data, the scalar-value combinations' relations in multivariate data, and the association of volumes in time-varying and ensemble data, are intricate and complex. This paper presents voxel2vec, a novel unsupervised representation learning model, which is used to learn distributed representations of scalar values/scalar-value combinations in a low-dimensional vector space. Its basic assumption is that if two scalar values/scalar-value combinations have similar contexts, they usually have high similarity in terms of features. By representing scalar values/scalar-value combinations as symbols, voxel2vec learns the similarity between them in the context of spatial distribution and then allows us to explore the overall association between volumes by transfer prediction. We…
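
The idea in the abstract (treat scalar values as symbols, and call two symbols similar when they occur in similar spatial contexts) can be illustrated with a small sketch. This is not the paper's training procedure: it swaps in a classic count-based stand-in (co-occurrence counts, positive pointwise mutual information, truncated SVD) for whatever model voxel2vec actually trains. The function names `quantize`, `cooccurrence`, `embed`, and `cosine`, the parameters `n_bins` and `dim`, and the choice of face-adjacent voxels as the context are all assumptions made for illustration.

```python
import numpy as np

def quantize(volume, n_bins):
    """Map each scalar value to an integer symbol in [0, n_bins). (Hypothetical helper.)"""
    lo, hi = float(volume.min()), float(volume.max())
    if hi == lo:
        return np.zeros(volume.shape, dtype=np.int64)
    scaled = (volume - lo) / (hi - lo)
    return np.minimum((scaled * n_bins).astype(np.int64), n_bins - 1)

def cooccurrence(symbols, n_bins):
    """Count symmetric co-occurrences of symbols in face-adjacent voxels.

    Assumption: the "context" of a voxel is its axis-aligned neighbors.
    """
    counts = np.zeros((n_bins, n_bins), dtype=np.float64)
    for axis in range(symbols.ndim):
        moved = np.moveaxis(symbols, axis, 0)
        a = moved[:-1].ravel()   # each voxel ...
        b = moved[1:].ravel()    # ... and its next neighbor along this axis
        np.add.at(counts, (a, b), 1.0)
        np.add.at(counts, (b, a), 1.0)
    return counts

def embed(counts, dim):
    """Stand-in embedding: PPMI matrix followed by truncated SVD."""
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts * total / (row * col))
        ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)
    u, s, _ = np.linalg.svd(ppmi)
    return u[:, :dim] * np.sqrt(s[:dim])

def cosine(u, v):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

# Usage on a synthetic volume (random data, for illustration only).
rng = np.random.default_rng(0)
volume = rng.random((16, 16, 16))
symbols = quantize(volume, n_bins=8)
vectors = embed(cooccurrence(symbols, n_bins=8), dim=4)
print(vectors.shape)
print(cosine(vectors[0], vectors[1]))
```

On real data, symbols whose vectors have high cosine similarity would be those that tend to appear next to the same kinds of neighbors, which is the contextual notion of similarity the abstract describes.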


Taxonomy

Topics: Biomedical Text Mining and Ontologies