voxel2vec: A Natural Language Processing Approach to Learning Distributed Representations for Scientific Data
Xiangyang He, Yubo Tao, Shuoliu Yang, Haoran Dai, Hai Lin

TL;DR
voxel2vec is an unsupervised NLP-inspired model that learns low-dimensional representations of scalar values in scientific data, enabling analysis of complex spatial and temporal relationships.
Contribution
The paper introduces voxel2vec, an approach that applies NLP techniques to learn distributed representations of scalar values in scientific data, capturing their contextual similarities.
Findings
Effective in representing scalar-value relationships
Improves feature classification accuracy
Enhances association analysis in scientific datasets
Abstract
Relationships in scientific data, such as the numerical and spatial distribution relations of features in univariate data, the scalar-value combinations' relations in multivariate data, and the association of volumes in time-varying and ensemble data, are intricate and complex. This paper presents voxel2vec, a novel unsupervised representation learning model, which is used to learn distributed representations of scalar values/scalar-value combinations in a low-dimensional vector space. Its basic assumption is that if two scalar values/scalar-value combinations have similar contexts, they usually have high similarity in terms of features. By representing scalar values/scalar-value combinations as symbols, voxel2vec learns the similarity between them in the context of spatial distribution and then allows us to explore the overall association between volumes by transfer prediction. We…
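The core assumption in the abstract (scalar values with similar spatial contexts tend to be similar as features) can be illustrated with a small count-based sketch. This is not the voxel2vec model itself, which learns its vectors. It swaps in neighbor co-occurrence counts and cosine similarity as a simpler substitute. The function names, the uniform binning, and the face-adjacent neighborhood are all assumptions made for illustration.

```python
import numpy as np

def quantize(volume, n_bins):
    """Map each voxel's scalar value to an integer symbol in [0, n_bins).

    Uniform binning is an assumption; the paper may discretize differently.
    """
    lo, hi = float(volume.min()), float(volume.max())
    if hi == lo:
        return np.zeros(volume.shape, dtype=np.int64)
    scaled = (volume - lo) / (hi - lo) * n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return np.minimum(scaled.astype(np.int64), n_bins - 1)

def cooccurrence(symbols, n_bins):
    """Count how often symbol a has symbol b as a face-adjacent neighbor."""
    counts = np.zeros((n_bins, n_bins), dtype=np.int64)
    for axis in range(symbols.ndim):
        moved = np.moveaxis(symbols, axis, 0)
        a = moved[:-1].ravel()  # each voxel ...
        b = moved[1:].ravel()   # ... and its next neighbor along this axis
        np.add.at(counts, (a, b), 1)
        np.add.at(counts, (b, a), 1)  # count both directions (symmetric)
    return counts

def cosine_similarity(counts):
    """Cosine similarity between the context-count rows of each symbol."""
    rows = counts.astype(np.float64)
    norms = np.linalg.norm(rows, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave symbols that never occur as zero rows
    unit = rows / norms
    return unit @ unit.T

# Hypothetical usage on a synthetic 3D volume.
volume = np.random.default_rng(0).random((8, 8, 8))
symbols = quantize(volume, n_bins=4)
similarity = cosine_similarity(cooccurrence(symbols, n_bins=4))
```

Two symbols whose rows in the co-occurrence matrix point in similar directions appear in similar spatial contexts. A learned embedding, as described in the abstract, would replace these raw count rows with low-dimensional vectors.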
Taxonomy
Topics: Biomedical Text Mining and Ontologies
