Objectively Evaluating the Reliability of Cell Type Annotation Using LLM-Based Strategies
Wenjin Ye (1), Yuanchen Ma (1, 2), Junkai Xiang (3), Hongjie Liang (1), Tao Wang (1, 4), Qiuling Xiang (1, 5), Andy Peng Xiang (1, 4, 6), Wu Song (2), Weiqiang Li (1, 4), Weijun Huang (1, 4) ((1) Center for Stem Cell, Biology, Tissue Engineering, Key Laboratory for Stem Cells

TL;DR
This paper introduces LICT, an LLM-based software package that improves the reliability of cell type annotation in single-cell RNA-sequencing data by combining multi-model fusion with an objective 'talk-to-machine' evaluation strategy.
Contribution
The paper presents LICT, a new multi-model fusion approach with a 'talk-to-machine' strategy for objective reliability assessment in cell type annotation.
Findings
Significantly improved annotation reliability across datasets.
Established objective criteria for reliability assessment.
Enhanced annotation credibility without reference data.
Abstract
Reliability in cell type annotation is challenging in single-cell RNA-sequencing data analysis because both expert-driven and automated methods can be biased or constrained by their training data, especially for novel or rare cell types. Although large language models (LLMs) are useful, our evaluation found that only a few matched expert annotations due to biased data sources and inflexible training inputs. To overcome these limitations, we developed the LICT (Large language model-based Identifier for Cell Types) software package using a multi-model fusion and "talk-to-machine" strategy. Tested across various single-cell RNA sequencing datasets, our approach significantly improved annotation reliability, especially in datasets with low cellular heterogeneity. Notably, we established objective criteria to assess annotation reliability using the "talk-to-machine" approach, which addresses…
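The abstract names two ideas, multi-model fusion and an objective reliability check, without spelling out how either works. The sketch below is one assumed reading, not the LICT implementation: fusion is modeled as a majority vote over the labels that several LLMs propose for a cluster, and the reliability check is modeled as counting how many suggested marker genes are actually expressed in that cluster. The function names, the model names, and the thresholds `min_fraction` and `min_markers` are all hypothetical placeholders, not values taken from the paper.

```python
from collections import Counter


def fuse_annotations(labels_by_model):
    """Majority vote over the labels proposed by several models for one cluster.

    Assumed fusion rule (not taken from the paper): labels are compared
    case-insensitively, and a tie for first place yields None.
    """
    counts = Counter(label.strip().lower() for label in labels_by_model.values())
    ranked = counts.most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no clear winner
    return ranked[0][0]


def is_reliable(marker_genes, fraction_expressing, min_fraction=0.5, min_markers=2):
    """Check an annotation against expression data for the cluster.

    marker_genes: marker genes suggested for the proposed cell type.
    fraction_expressing: gene -> fraction of cells in the cluster expressing it.
    Both thresholds are illustrative placeholders.
    """
    supported = [
        gene for gene in marker_genes
        if fraction_expressing.get(gene, 0.0) >= min_fraction
    ]
    return len(supported) >= min_markers


# Hypothetical inputs for a single cluster.
proposals = {"model_a": "T cell", "model_b": "t cell", "model_c": "B cell"}
label = fuse_annotations(proposals)
expression = {"CD3D": 0.9, "CD3E": 0.7, "CD2": 0.1}
print(label, is_reliable(["CD3D", "CD3E", "CD2"], expression))
```

The point of the second function is that the verdict depends only on the dataset itself, which is consistent with the stated finding that credibility can be assessed without reference data.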
Taxonomy
Topics: Digital Imaging for Blood Diseases · Cell Image Analysis Techniques · Biomedical Text Mining and Ontologies
