Building Models of Neurological Language
Henry Watkins

TL;DR
This paper details the development of specialized neurological language models, emphasizing dataset creation, retrieval-augmented techniques, and tools for secure local deployment, with promising initial results and future multimodal directions.
Contribution
It introduces neurology-specific datasets, tools for multi-word expression extraction, and graph-based analyses of medical terminology, and it adapts to advances in open-source medical language models with an emphasis on secure, local deployment.
Findings
Successful creation of neurology datasets
Effective retrieval-augmented generation performance
Graph community analysis results
Abstract
This report documents the development and evaluation of domain-specific language models for neurology. Initially focused on building a bespoke model, the project adapted to rapid advances in open-source and commercial medical LLMs, shifting toward leveraging retrieval-augmented generation (RAG) and representational models for secure, local deployment. Key contributions include the creation of neurology-specific datasets (case reports, QA sets, textbook-derived data), tools for multi-word expression extraction, and graph-based analyses of medical terminology. The project also produced scripts and Docker containers for local hosting. Performance metrics and graph community results are reported. Possible future work includes multimodal models built on open-source architectures such as phi-4.
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Machine Learning in Healthcare · Topic Modeling
