Comparing representations of biological data learned with different AI paradigms, augmenting and cropping strategies
Andrei Dmitrenko, Mauro M. Masiero, Nicola Zamboni

TL;DR
This study compares AI-based representation learning methods on biological images. Self-supervised models train faster and perform competitively, but no single strategy excels across all tasks.
Contribution
The paper systematically evaluates 16 deep learning setups, covering 4 representation learning approaches and different augmenting and cropping strategies, and measures their impact on biological feature extraction from images.
Findings
Self-supervised models train up to 11 times faster.
Multi-crops and random augmentations improve performance.
No single strategy outperforms others across all tasks.
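To make "multi-crops and random augmentations" concrete, here is a minimal sketch of how several randomly placed, randomly flipped and rotated crops could be drawn from one image. It is an illustration only, not the authors' pipeline: the function name `random_multi_crops`, the crop size, the number of crops, and the choice of flips and 90-degree rotations are all assumptions.

```python
import numpy as np


def random_multi_crops(image, n_crops=4, crop_size=32, rng=None):
    """Return n_crops random square crops of an (H, W, C) image.

    Hypothetical helper: the crop count, crop size, and the flip/rotation
    augmentations are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape[:2]
    if crop_size > min(height, width):
        raise ValueError("crop_size must not exceed the image dimensions")

    crops = []
    for _ in range(n_crops):
        # Pick the top-left corner so the crop fits inside the image.
        top = int(rng.integers(0, height - crop_size + 1))
        left = int(rng.integers(0, width - crop_size + 1))
        crop = image[top:top + crop_size, left:left + crop_size]

        # Random horizontal flip with probability 0.5.
        if rng.random() < 0.5:
            crop = np.flip(crop, axis=1)

        # Random rotation by a multiple of 90 degrees (keeps the crop square).
        crop = np.rot90(crop, k=int(rng.integers(0, 4)), axes=(0, 1))
        crops.append(crop)

    # Stack into an array of shape (n_crops, crop_size, crop_size, C).
    return np.stack(crops)


if __name__ == "__main__":
    demo_image = np.arange(64 * 64 * 3).reshape(64, 64, 3)
    demo_crops = random_multi_crops(demo_image, rng=np.random.default_rng(0))
    print(demo_crops.shape)
```

Passing a seeded `rng` makes the crops reproducible, which helps when different setups need to be compared under identical conditions.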
Abstract
Recent advances in computer vision and robotics enabled automated large-scale biological image analysis. Various machine learning approaches have been successfully applied to phenotypic profiling. However, it remains unclear how they compare in terms of biological feature extraction. In this study, we propose a simple CNN architecture and implement 4 different representation learning approaches. We train 16 deep learning setups on the 770k cancer cell images dataset under identical conditions, using different augmenting and cropping strategies. We compare the learned representations by evaluating multiple metrics for each of three downstream tasks: i) distance-based similarity analysis of known drugs, ii) classification of drugs versus controls, iii) clustering within cell lines. We also compare training times and memory usage. Among all tested setups, multi-crops and random…
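The first downstream task, distance-based similarity analysis, can be illustrated with a small sketch that compares two groups of learned embeddings. The abstract does not name the distance or the metrics used, so the cosine distance, the function name `mean_cosine_distance`, and the random stand-in embeddings below are assumptions made purely for illustration.

```python
import numpy as np


def mean_cosine_distance(group_a, group_b):
    """Mean pairwise cosine distance between two sets of embeddings.

    Hypothetical helper: cosine distance is an assumed choice, since the
    abstract does not specify which distance the authors used.
    Both inputs are arrays of shape (n_samples, n_features).
    """
    # Normalize each embedding to unit length.
    a = group_a / np.linalg.norm(group_a, axis=1, keepdims=True)
    b = group_b / np.linalg.norm(group_b, axis=1, keepdims=True)
    # a @ b.T holds all pairwise cosine similarities; distance = 1 - similarity.
    return float(np.mean(1.0 - a @ b.T))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in embeddings for images treated with two drugs (random data).
    drug_x = rng.normal(size=(10, 16))
    drug_y = rng.normal(size=(12, 16))
    print(mean_cosine_distance(drug_x, drug_y))
```

Under this assumed metric, a lower value for two drugs with a known shared mechanism than for unrelated drugs would suggest the representation captures biologically meaningful similarity.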
Taxonomy
Topics: Cell Image Analysis Techniques · Computational Drug Discovery Methods · Genetics, Bioinformatics, and Biomedical Research
