ConnectomeBench: Can LLMs Proofread the Connectome?

Jeff Brown; Andrew Kirjner; Annika Vivekananthan; Ed Boyden

arXiv:2511.05542 · q-bio.NC · November 11, 2025

TL;DR

ConnectomeBench evaluates whether current large language models can automate the proofreading of neural connectome data, showing promising results on segment identification and split error correction but struggling with merge error detection and still lagging behind human experts.

Contribution

This paper introduces ConnectomeBench, a benchmark for assessing LLMs on connectome proofreading tasks, and provides the first comprehensive evaluation of multiple LLMs in this domain.

Findings

01. LLMs reach 52-82% accuracy on segment type identification.

02. LLMs reach 75-85% accuracy on split error correction.

03. Models struggle with merge error detection.

Abstract

Connectomics, the mapping of neural connections in an organism's brain, currently requires extraordinary human effort to proofread the data collected from imaging and machine-learning-assisted segmentation. With the growing excitement around using AI agents to automate important scientific tasks, we explore whether current AI systems can perform multiple tasks necessary for data proofreading. We introduce ConnectomeBench, a multimodal benchmark evaluating large language model (LLM) capabilities in three critical proofreading tasks: segment type identification, split error correction, and merge error detection. Using expert-annotated data from two large open-source datasets (a cubic millimeter of mouse visual cortex and the complete Drosophila brain), we evaluate proprietary multimodal LLMs including Claude 3.7/4 Sonnet, o4-mini, GPT-4.1, GPT-4o, as well as open-source models like…
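To make the accuracy figures above concrete, here is a minimal sketch of how per-task accuracy could be computed from model answers and expert labels. It is not the paper's evaluation code: the record layout, the task keys, the answer strings, and the `accuracy_by_task` helper are all hypothetical, and the sample rows are invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical records: (task, model_answer, expert_label).
# Every value here is made up for illustration; none comes from the benchmark.
records = [
    ("segment_identification", "neuron", "neuron"),
    ("segment_identification", "glia", "neuron"),
    ("split_error_correction", "B", "B"),
    ("split_error_correction", "A", "A"),
    ("merge_error_detection", "no", "yes"),
    ("merge_error_detection", "yes", "yes"),
]


def accuracy_by_task(rows):
    """Return the fraction of answers matching the expert label, per task."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, predicted, expected in rows:
        total[task] += 1
        correct[task] += int(predicted == expected)
    return {task: correct[task] / total[task] for task in total}


scores = accuracy_by_task(records)
for task, acc in scores.items():
    print(f"{task}: {acc:.0%}")
```

Plain accuracy is only informative relative to the chance level of each task, which depends on the number of answer options, so a fuller comparison would report both.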

Taxonomy

Topics: Multimodal Machine Learning Applications · Ferroelectric and Negative Capacitance Devices · Neurobiology of Language and Bilingualism