Cytoarchitecture in Words: Weakly Supervised Vision-Language Modeling for Human Brain Microscopy
Matthew Sutton, Katrin Amunts, Timo Dickscheid, Christian Schiffer

TL;DR
This paper introduces a weakly supervised method to generate natural language descriptions of human brain microscopy images by linking images to literature-derived labels, enabling vision-language modeling without paired data.
Contribution
It proposes a label-mediated approach that automatically creates captions from literature, coupling a vision model with a language model for brain microscopy analysis.
Findings
Achieves 90.6% accuracy in matching brain areas to labels.
Descriptions can recover brain areas with 68.6% accuracy in 8-way tests.
Supports open-set use by rejecting unseen areas.
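To put the 8-way figure in context, the sketch below computes the chance level for a k-way test and compares the reported accuracy against it. It assumes each trial offers 8 equally likely candidate areas, which the summary does not state; the function names are hypothetical and not taken from the paper.

```python
def chance_accuracy(k: int) -> float:
    """Expected accuracy of uniform random guessing among k candidates."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 / k


def k_way_accuracy(predicted, correct):
    """Fraction of trials in which the predicted candidate is the correct one."""
    if len(predicted) != len(correct) or len(correct) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(p == c for p, c in zip(predicted, correct))
    return hits / len(correct)


# Reported 8-way accuracy from the Findings list (68.6%).
reported = 0.686
# Assumption: candidates are balanced, so chance is 1/8.
baseline = chance_accuracy(8)
# How many times better than random guessing the reported figure is.
lift = reported / baseline
```

Under that balanced-candidate assumption, chance is 12.5%, so the reported accuracy sits well above random guessing.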
Abstract
Foundation models increasingly offer potential to support interactive, agentic workflows that assist researchers during analysis and interpretation of image data. Such workflows often require coupling vision to language to provide a natural-language interface. However, paired image-text data needed to learn this coupling are scarce and difficult to obtain in many research and clinical settings. One such setting is microscopic analysis of cell-body-stained histological human brain sections, which enables the study of cytoarchitecture: cell density and morphology and their laminar and areal organization. Here, we propose a label-mediated method that generates meaningful captions from images by linking images and text only through a label, without requiring curated paired image-text data. Given the label, we automatically mine area descriptions from related literature and use them as…
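The label-mediated idea in the abstract can be illustrated with a minimal sketch: an image never comes with its own caption, only with an area label, and the label is used to look up text mined from the literature. Everything below is hypothetical (the function name, the identifiers, and the placeholder strings); it is not the authors' pipeline and omits the actual text mining and model training.

```python
import random

# Placeholder literature snippets keyed by area label.
# Assumption: these strings stand in for descriptions mined from papers.
literature = {
    "area_A": [
        "Placeholder description 1 of area_A.",
        "Placeholder description 2 of area_A.",
    ],
    "area_B": ["Placeholder description of area_B."],
}


def make_weak_pairs(image_labels, literature, seed=0):
    """Pair each labeled image with a description sampled for its label.

    The image and the text are linked only through the shared label,
    so no curated image-text pair is required.
    """
    rng = random.Random(seed)
    pairs = []
    for image_id, label in image_labels:
        texts = literature.get(label)
        if not texts:
            # No mined text for this label: skip rather than invent a caption.
            continue
        pairs.append((image_id, label, rng.choice(texts)))
    return pairs


# Hypothetical image patches with area labels; area_C has no literature entry.
image_labels = [
    ("patch_001", "area_A"),
    ("patch_002", "area_B"),
    ("patch_003", "area_C"),
]
pairs = make_weak_pairs(image_labels, literature)
```

In this toy setup, images whose label has no associated text are dropped from the weakly paired set; how the paper handles such cases is not stated in the summary.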
Taxonomy
Topics: Cell Image Analysis Techniques · Multimodal Machine Learning Applications · Biomedical Text Mining and Ontologies
