Building Models of Neurological Language

Henry Watkins

arXiv:2506.06208·cs.CL·June 9, 2025

TL;DR

This paper details the development of specialized neurological language models, emphasizing dataset creation, retrieval-augmented techniques, and tools for secure local deployment, with promising initial results and future multimodal directions.

Contribution

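The paper introduces neurology-specific datasets (case reports, QA sets, and textbook-derived data), tools for multi-word expression extraction, and graph-based analyses of medical terminology. It also adapts to advances in open-source medical language models to support secure local deployment.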

Findings

01. Successful creation of neurology datasets

02. Effective retrieval-augmented generation performance

03. Graph community analysis results

Abstract

This report documents the development and evaluation of domain-specific language models for neurology. Initially focused on building a bespoke model, the project adapted to rapid advances in open-source and commercial medical LLMs, shifting toward retrieval-augmented generation (RAG) and representational models for secure, local deployment. Key contributions include the creation of neurology-specific datasets (case reports, QA sets, textbook-derived data), tools for multi-word expression extraction, and graph-based analyses of medical terminology. The project also produced scripts and Docker containers for local hosting. Performance metrics and graph community results are reported, and possible future work includes multimodal models built on open-source architectures such as phi-4.

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Machine Learning in Healthcare · Topic Modeling