HiCat: A Semi-Supervised Approach for Cell Type Annotation
Chang Bi, Kailun Bai, Xing Li, Xuekui Zhang

TL;DR
HiCat is a semi-supervised pipeline that combines supervised and unsupervised learning to improve cell type annotation in single-cell RNA sequencing data, especially for identifying novel cell types.
Contribution
It introduces a hybrid approach that fuses reference and query data for enhanced feature learning and transferability, outperforming existing methods on multiple datasets.
Findings
Outperforms existing methods in benchmarking tests
Accurately identifies multiple novel cell types
Enhances transferability of cell type predictions
Abstract
We introduce HiCat (Hybrid Cell Annotation using Transformative embeddings), a novel semi-supervised pipeline for annotating cell types from single-cell RNA sequencing data. HiCat fuses the strengths of supervised learning for known cell types with unsupervised learning to identify novel types. This hybrid approach incorporates both reference and query genomic data for feature engineering, enhancing the embedding learning process, increasing the effective sample size for unsupervised techniques, and improving the transferability of the supervised model trained on reference data when applied to query datasets. The pipeline follows six key steps: (1) removing batch effects using Harmony to generate a 50-dimensional principal component embedding; (2) applying UMAP for dimensionality reduction to two dimensions to capture crucial data patterns; (3) conducting unsupervised clustering of…
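The idea of building features from reference and query cells together can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pooled_pca`, the synthetic Poisson counts, and the plain SVD-based PCA are all assumptions made for illustration. Harmony's batch correction (which would typically operate on an embedding like this one, with the reference/query flag as the batch variable) and the UMAP step are not reproduced here.

```python
import numpy as np

def pooled_pca(reference, query, n_components=50):
    """Hypothetical helper: PCA on stacked reference and query cells.

    Both inputs are (cells x genes) arrays sharing the same gene columns.
    Returns the pooled embedding and a boolean mask marking reference rows.
    """
    pooled = np.vstack([reference, query])
    centered = pooled - pooled.mean(axis=0)
    # Thin SVD: rows of vt are principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, vt.shape[0])
    embedding = centered @ vt[:k].T
    # Keep track of which rows carry labels (reference) and which do not (query).
    is_reference = np.arange(pooled.shape[0]) < reference.shape[0]
    return embedding, is_reference

# Synthetic stand-in data (assumption): 120 reference cells, 80 query cells, 200 genes.
rng = np.random.default_rng(0)
reference = rng.poisson(2.0, size=(120, 200)).astype(float)
query = rng.poisson(2.0, size=(80, 200)).astype(float)

embedding, is_reference = pooled_pca(reference, query)
```

Because the embedding is fit on all cells at once, unsupervised steps see the combined sample, while a supervised model could be trained on `embedding[is_reference]` and applied to `embedding[~is_reference]` in the same coordinate system.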
Taxonomy
Topics: Cell Image Analysis Techniques
