Synthetic Data Generation for Classifying Electrophysiological and Morpho-Electrophysiological Neurons from Mouse Visual Cortex
Xavier Vasques, Laura Cif

TL;DR
This study evaluates classical and deep generative synthetic data augmentation methods for classifying neuron types in the mouse visual cortex, finding SMOTE most effective and highlighting the importance of data quality and model tuning.
Contribution
It benchmarks various synthetic data generation techniques for neuronal classification, revealing SMOTE's superior performance and providing insights into model fidelity and optimization.
Findings
SMOTE achieves the highest classification accuracy gains (absolute gains of 0.16 for e-types and 0.12 for mee-types).
GANs perform well with optimized hyperparameters.
Synthetic data fidelity is comparable to natural phenotypic variability.
Abstract
The accurate classification of neuronal cell types is central to decoding brain function, yet remains hindered by data scarcity and cellular heterogeneity. Here, we benchmarked classical and deep generative synthetic data augmentation strategies -- including SMOTE, GANs, VAEs, Normalizing Flows, and DDPMs -- for supervised classification of both electrophysiological (e-type) and morpho-electrophysiological (mee-type) neuron types from the mouse visual cortex. Using a curated dataset annotated with 48 electrophysiological and 24 morphological features, we established baseline classifiers and introduced synthetic data generated by each method. Our results demonstrate that SMOTE-based augmentation yields the highest classification accuracies (absolute gains of 0.16 for e-types, 0.12 for mee-types), outperforming deep generative models. GANs approached similar performance when…
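As a rough illustration of the interpolation idea behind SMOTE, the augmentation method the abstract reports as most effective, here is a minimal pure-Python sketch. It is not the authors' pipeline: the function name `smote`, its parameters, and the toy two-dimensional feature vectors are assumptions made for illustration only (the dataset described above has 48 electrophysiological and 24 morphological features).

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority-class neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy, made-up feature vectors for an under-represented class (not data from the paper)
rare_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(rare_class, n_new=6, k=2)
```

Each synthetic point lies on a line segment between two real minority-class samples, so it stays inside the region those samples already span. In practice a maintained implementation such as `SMOTE` from the imbalanced-learn package would normally be used, and synthetic samples are typically generated from the training split only so that held-out evaluation data stays untouched.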
