Conditional Similarity Triplets Enable Covariate-Informed Representations of Single-Cell Data
Chi-Jane Chen, Haidong Yi, and Natalie Stanley

TL;DR
This paper introduces CytoCoSet, a set-based encoding method that incorporates clinical covariates into per-sample representations of single-cell data, improving the prediction of clinical outcomes.
Contribution
The paper presents a novel covariate-informed encoding approach for single-cell data, enhancing predictive accuracy by integrating clinical covariates into the representation learning process.
Findings
CytoCoSet improves clinical outcome prediction accuracy.
Incorporating covariates leads to more informative sample representations.
The method places samples with similar covariates close together in embedding space.
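As a rough illustration of how covariate-based triplets can pull similar samples together in embedding space, the sketch below uses a generic triplet margin loss. This page does not describe CytoCoSet's actual triplet construction or objective, so everything here is an assumption: the function names `covariate_triplets` and `triplet_margin_loss`, the single scalar covariate (for example age), the Euclidean distance, and the margin value are all hypothetical.

```python
import numpy as np

def covariate_triplets(covariates):
    """Enumerate (anchor, positive, negative) index triplets.

    Hypothetical rule: the anchor is strictly closer to the positive
    than to the negative in covariate value.
    """
    n = len(covariates)
    triplets = []
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if len({i, j, k}) < 3:
                    continue  # all three samples must be distinct
                d_pos = abs(covariates[i] - covariates[j])
                d_neg = abs(covariates[i] - covariates[k])
                if d_pos < d_neg:
                    triplets.append((i, j, k))
    return np.array(triplets, dtype=int).reshape(-1, 3)

def triplet_margin_loss(embeddings, triplets, margin=1.0):
    """Mean hinge loss max(0, d(a, p) - d(a, n) + margin) over all triplets."""
    if len(triplets) == 0:
        return 0.0
    a = embeddings[triplets[:, 0]]
    p = embeddings[triplets[:, 1]]
    n = embeddings[triplets[:, 2]]
    d_ap = np.linalg.norm(a - p, axis=1)
    d_an = np.linalg.norm(a - n, axis=1)
    return float(np.mean(np.maximum(0.0, d_ap - d_an + margin)))

# Toy data: four samples with one covariate each (hypothetical ages).
ages = np.array([30.0, 32.0, 60.0, 61.0])
triplets = covariate_triplets(ages)

# Embeddings whose layout follows the covariate ordering ...
aligned = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]])
# ... versus the same points with samples 1 and 2 swapped.
shuffled = aligned[[0, 2, 1, 3]]

aligned_loss = triplet_margin_loss(aligned, triplets)
shuffled_loss = triplet_margin_loss(shuffled, triplets)
print(aligned_loss < shuffled_loss)
```

In a covariate-informed encoder, a term of this kind would typically be added to the outcome-prediction loss, so that the learned per-sample encodings respect covariate similarity as well as the clinical label. The weighting between the two terms is a design choice not specified here.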
Abstract
Single-cell technologies enable comprehensive profiling of diverse immune cell types through the measurement of multiple genes or proteins per individual cell. To translate immune signatures assayed from blood or tissue into powerful diagnostics, machine learning approaches are often employed to compute immunological summaries or per-sample featurizations, which can be used as inputs to models for outcomes of interest. Current supervised learning approaches for computing per-sample representations are trained only to accurately predict a single outcome and do not take into account relevant additional clinical features or covariates that are likely to also be measured for each sample. Here, we introduce a novel approach for incorporating measured covariates in optimizing model parameters to ultimately specify per-sample encodings that accurately reflect both immune signatures and…
Taxonomy
Topics: Single-cell and spatial transcriptomics · Gene expression and cancer classification
