Text-mining the NeuroSynth corpus using Deep Boltzmann Machines
Ricardo Pio Monti, Romy Lorenz, Robert Leech, Christoforos Anagnostopoulos, Giovanni Montana

TL;DR
This paper applies Deep Boltzmann Machines to analyze the NeuroSynth neuroimaging text corpus, producing meaningful embeddings that enhance traditional text analysis methods and improve understanding of brain function research.
Contribution
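It introduces an unsupervised deep learning approach to neuroimaging text mining based on DBMs. The approach yields both word and document embeddings in a high-dimensional vector space, which the authors present as its principal advantage over word counts and Latent Dirichlet Allocation.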
Findings
DBMs produce semantically meaningful embeddings
Embeddings facilitate machine learning on neuroimaging texts
Improved understanding of neuroimaging literature structure
Abstract
Large-scale automated meta-analysis of neuroimaging data has recently established itself as an important tool in advancing our understanding of human brain function. This research has been pioneered by NeuroSynth, a database collecting both brain activation coordinates and associated text across a large cohort of neuroimaging research papers. One of the fundamental aspects of such meta-analysis is text-mining. To date, word counts and more sophisticated methods such as Latent Dirichlet Allocation have been proposed. In this work we present an unsupervised study of the NeuroSynth text corpus using Deep Boltzmann Machines (DBMs). The use of DBMs yields several advantages over the aforementioned methods, principal among which is the fact that it yields both word and document embeddings in a high-dimensional vector space. Such embeddings serve to facilitate the use of traditional machine…
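To make the idea of joint word and document embeddings concrete, here is a minimal sketch, not the authors' implementation. It trains a single Restricted Boltzmann Machine (the building block that a DBM stacks into several hidden layers) with one-step contrastive divergence on a toy binary bag-of-words matrix. The vocabulary, documents, layer size, and hyperparameters are all hypothetical and chosen only for illustration. Under these assumptions, the hidden-unit activation probabilities act as a document embedding, and each word's row of the weight matrix acts as a word embedding.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b_hid):
    """P(h_j = 1 | v) for every hidden unit j; used here as the document embedding."""
    return [sigmoid(b_hid[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b_hid))]


def visible_probs(h, W, b_vis):
    """P(v_i = 1 | h) for every visible unit (word) i."""
    return [sigmoid(b_vis[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(b_vis))]


def sample(probs, rng):
    """Draw a binary vector from independent Bernoulli probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]


def train_rbm(docs, n_hidden=2, epochs=200, lr=0.1, seed=0):
    """Train a Bernoulli RBM with CD-1. All hyperparameters are illustrative."""
    rng = random.Random(seed)
    n_vis = len(docs[0])
    W = [[rng.gauss(0, 0.01) for _ in range(n_hidden)] for _ in range(n_vis)]
    b_vis = [0.0] * n_vis
    b_hid = [0.0] * n_hidden
    for _ in range(epochs):
        for v0 in docs:
            # Positive phase: hidden probabilities given the observed document.
            ph0 = hidden_probs(v0, W, b_hid)
            h0 = sample(ph0, rng)
            # Negative phase: one Gibbs step to get a reconstruction.
            v1 = sample(visible_probs(h0, W, b_vis), rng)
            ph1 = hidden_probs(v1, W, b_hid)
            # CD-1 update: data statistics minus reconstruction statistics.
            for i in range(n_vis):
                for j in range(n_hidden):
                    W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
                b_vis[i] += lr * (v0[i] - v1[i])
            for j in range(n_hidden):
                b_hid[j] += lr * (ph0[j] - ph1[j])
    return W, b_vis, b_hid


# Hypothetical toy corpus: rows are documents, columns are word-presence flags.
vocab = ["memory", "hippocampus", "pain", "insula"]
docs = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

W, b_vis, b_hid = train_rbm(docs)

# Document embedding: hidden activation probabilities for each document.
doc_embeddings = [hidden_probs(v, W, b_hid) for v in docs]
# Word embedding: the weights linking each word to the hidden units.
word_embeddings = {word: W[i] for i, word in enumerate(vocab)}
```

Both embeddings live in the same hidden-unit space, so they can be passed to ordinary vector-based tools such as clustering or nearest-neighbour search. A full DBM adds further hidden layers and trains them jointly, which this sketch does not attempt.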
