Singpath-VL Technical Report
Zhen Qiu, Kaiwen Xiao, Zhengwei Lu, Xiangyu Liu, Lei Zhao, Hao Zhang

TL;DR
Singpath-VL is a specialized vision-language large model for cervical cytology, developed using a novel synthetic dataset and fine-tuning strategy, achieving improved cell morphology understanding and diagnostic accuracy.
Contribution
We created a large-scale synthetic dataset and fine-tuned a vision-language model specifically for cervical cytology, addressing data scarcity in computational pathology.
Findings
Superior performance in cell-level diagnostic classification
Enhanced fine-grained morphological perception
Open-sourced synthetic dataset and benchmark
Abstract
We present Singpath-VL, a vision-language large model that addresses the lack of an AI assistant for cervical cytology. Recent advances in multi-modal large language models (MLLMs) have significantly advanced the field of computational pathology. However, their application to cytopathology, and to cervical cytology in particular, remains underexplored, primarily because large-scale, high-quality annotated datasets are scarce. To bridge this gap, we first develop a novel three-stage pipeline to synthesize a million-scale image-description dataset. The pipeline uses multiple general-purpose MLLMs as weak annotators, refines their outputs through consensus fusion and expert knowledge injection, and produces high-fidelity descriptions of cell morphology. Using this dataset, we then fine-tune the Qwen3-VL-4B model via a multi-stage strategy to create a specialized cytopathology MLLM. The…
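The abstract names the pipeline's stages but not their mechanics, so the following is only a minimal sketch of one way "consensus fusion" and "expert knowledge injection" could work, not the authors' implementation. It assumes each weak annotator returns a dictionary of morphology attributes; the attribute names, the majority-vote rule, the `min_agreement` threshold, and the rule format are all hypothetical choices made for illustration.

```python
from collections import Counter


def fuse_by_consensus(annotations, min_agreement=2):
    """Keep, for each attribute, the value that at least `min_agreement`
    weak annotators agree on (assumed majority-vote fusion)."""
    fused = {}
    attributes = {attr for ann in annotations for attr in ann}
    for attr in sorted(attributes):
        votes = Counter(ann[attr] for ann in annotations if attr in ann)
        value, count = votes.most_common(1)[0]
        if count >= min_agreement:
            fused[attr] = value
    return fused


def inject_expert_knowledge(fused, expert_rules):
    """Apply hypothetical expert rules: when every key/value pair in a
    rule's condition matches, merge the rule's update into the result."""
    refined = dict(fused)
    for condition, update in expert_rules:
        if all(refined.get(key) == val for key, val in condition.items()):
            refined.update(update)
    return refined


def render_description(attrs):
    """Turn the fused attributes into a flat text description."""
    return "; ".join(f"{key}: {val}" for key, val in sorted(attrs.items()))


# Made-up outputs from three weak annotators for a single cell image.
annotations = [
    {"nuclear_size": "enlarged", "chromatin": "coarse", "nc_ratio": "high"},
    {"nuclear_size": "enlarged", "chromatin": "fine", "nc_ratio": "high"},
    {"nuclear_size": "enlarged", "chromatin": "coarse"},
]

# Placeholder rule; real rules would come from cytopathologists.
expert_rules = [
    ({"nuclear_size": "enlarged", "nc_ratio": "high"},
     {"needs_expert_review": "yes"}),
]

fused = fuse_by_consensus(annotations)
refined = inject_expert_knowledge(fused, expert_rules)
description = render_description(refined)
print(description)
```

In this sketch, an attribute on which the annotators disagree without reaching the threshold is dropped rather than guessed, which trades coverage for fidelity; a real pipeline might instead route such cases to a stronger model or a human reviewer.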
Taxonomy
Topics: AI in cancer detection · Cervical Cancer and HPV Research · COVID-19 diagnosis using AI
