Leak Proof CMap; a framework for training and evaluation of cell line agnostic L1000 similarity methods
Steven Shave, Richard Kasprowicz, Abdullah M. Athar, Denise Vlachou, Neil O. Carragher, Cuong Q. Nguyen

TL;DR
This paper introduces 'Leak Proof CMap', a standardized benchmarking framework with rigorous data splits for evaluating cell line agnostic phenotypic similarity methods using L1000 transcriptomic data, facilitating personalized medicine and drug discovery.
Contribution
It provides a novel benchmark and data splitting protocol for unbiased evaluation of phenotypic similarity methods across unseen cell lines and treatments.
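To make the idea of a leak-free split concrete, the following is a minimal sketch and not the authors' protocol. It assumes each profile is a dictionary with hypothetical `cell_line` and `compound` keys, and it holds out whole cell lines and whole compounds so that no test profile shares either with the training set. Profiles that mix a held-out value with a training value are set aside rather than assigned to either side. The function name `leak_free_split` and the `test_fraction` parameter are illustrative assumptions.

```python
import random


def leak_free_split(profiles, test_fraction=0.25, seed=0):
    """Split profile records so the test set shares no cell line and
    no compound with the training set (illustrative sketch only)."""
    rng = random.Random(seed)

    # Collect the unique groups and shuffle them reproducibly.
    cell_lines = sorted({p["cell_line"] for p in profiles})
    compounds = sorted({p["compound"] for p in profiles})
    rng.shuffle(cell_lines)
    rng.shuffle(compounds)

    # Hold out a fraction of cell lines and of compounds (at least one each).
    n_test_cells = max(1, round(len(cell_lines) * test_fraction))
    n_test_cpds = max(1, round(len(compounds) * test_fraction))
    test_cells = set(cell_lines[:n_test_cells])
    test_cpds = set(compounds[:n_test_cpds])

    train, test, dropped = [], [], []
    for p in profiles:
        held_cell = p["cell_line"] in test_cells
        held_cpd = p["compound"] in test_cpds
        if held_cell and held_cpd:
            test.append(p)      # unseen cell line AND unseen compound
        elif not held_cell and not held_cpd:
            train.append(p)     # neither group is held out
        else:
            dropped.append(p)   # mixed case would leak information
    return train, test, dropped


# Toy data: 4 hypothetical cell lines crossed with 8 hypothetical compounds.
profiles = [
    {"cell_line": f"cell_{i}", "compound": f"cpd_{j}"}
    for i in range(4)
    for j in range(8)
]
train, test, dropped = leak_free_split(profiles)
```

Discarding the mixed profiles costs data but keeps the evaluation strict: a model scored on `test` has seen neither the cell line nor the treatment during training. A real benchmark might instead route the mixed profiles to separate "unseen cell line only" and "unseen compound only" evaluation sets.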
Findings
Benchmarking across three performance metrics shows improved model robustness.
The framework enables testing on unseen cell lines and treatments.
It supports development of personalized medicine tools.
Abstract
The Connectivity Map (CMap) is a large publicly available database of cellular transcriptomic responses to chemical and genetic perturbations built using a standardized acquisition protocol known as the L1000 technique. Databases such as CMap provide an exciting opportunity to enrich drug discovery efforts, providing a 'known' phenotypic landscape to explore and enabling the development of state of the art techniques for enhanced information extraction and better informed decisions. Whilst multiple methods for measuring phenotypic similarity and interrogating profiles have been developed, the field is severely lacking standardized benchmarks using appropriate data splitting for training and unbiased evaluation of machine learning methods. To address this, we have developed 'Leak Proof CMap' and exemplified its application to a set of common transcriptomic and generic phenotypic…
Taxonomy
Topics: AI in cancer detection · Cell Image Analysis Techniques
Methods: Sparse Evolutionary Training
