CNS-Obsidian: A Neurosurgical Vision-Language Model Built From Scientific Publications
Anton Alyakin, Jaden Stryker, Daniel Alexander Alber, Jin Vivian Lee, Karl L. Sangwon, Brandon Duderstadt, Akshay Save, David Kurland, Spencer Frome, Shrutika Singh, Jeff Zhang, Eunice Yang, Ki Yun Park, Cordelia Orillac, Aly A. Valliani, Sean Neifert, Albert Liu, Aneek Patel

TL;DR
CNS-Obsidian is a neurosurgical vision-language model (VLM) trained on peer-reviewed literature. It shows potential for clinical decision support, but it has performance limitations relative to GPT-4o in real-world neurosurgical consultations.
Contribution
This work introduces CNS-Obsidian, a specialized neurosurgical VLM trained on scientific publications, and describes its development and its evaluation in a clinical setting.
Findings
CNS-Obsidian matches GPT-4o on synthetic questions.
CNS-Obsidian achieves lower accuracy than GPT-4o on human-generated questions.
Both models include the correct diagnosis in about 60% of cases.
Abstract
General-purpose VLMs demonstrate impressive capabilities, but their opaque training on uncurated internet data poses critical limitations for high-stakes decision-making, such as in neurosurgery. We present CNS-Obsidian, a neurosurgical VLM trained on peer-reviewed literature, and demonstrate its clinical utility versus GPT-4o in a real-world setting. We compiled 23,984 articles from Neurosurgery Publications journals, yielding 78,853 figures and captions. Using GPT-4o and Claude Sonnet-3.5, we converted these into 263,064 training samples across three formats: instruction fine-tuning, multiple-choice questions, and differential diagnosis. We trained CNS-Obsidian, a fine-tune of the 34-billion parameter LLaVA-Next model. In a blinded, randomized trial at NYU Langone Health (Aug 30-Nov 30, 2024), neurosurgery consultations were assigned to either CNS-Obsidian or a HIPAA-compliant GPT-4o…
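The dataset counts in the abstract imply some rough averages that help convey the scale of the corpus. The short Python sketch below computes them from the three reported totals only. The variable and function names are illustrative, and the results are plain averages: the abstract does not say how samples were distributed across individual figures or across the three formats.

```python
# Totals reported in the abstract.
ARTICLES = 23_984
FIGURES = 78_853            # figures with captions
TRAINING_SAMPLES = 263_064  # across all three sample formats


def average_per(total: int, units: int) -> float:
    """Return the mean number of `total` items per one of `units`."""
    return total / units


# Simple averages only; they say nothing about the actual distribution.
figures_per_article = average_per(FIGURES, ARTICLES)
samples_per_figure = average_per(TRAINING_SAMPLES, FIGURES)
samples_per_article = average_per(TRAINING_SAMPLES, ARTICLES)

print(f"figures per article: {figures_per_article:.2f}")  # 3.29
print(f"samples per figure:  {samples_per_figure:.2f}")   # 3.34
print(f"samples per article: {samples_per_article:.2f}")  # 10.97
```

On average, each article contributed roughly three figures, and each figure yielded a little over three training samples.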
