SCENIC: A JAX Library for Computer Vision Research and Beyond
Mostafa Dehghani, Alexey Gritsenko, Anurag Arnab, Matthias Minderer, Yi Tay

TL;DR
Scenic is an open-source JAX library designed to accelerate research and development of Transformer-based models for computer vision and multi-modal tasks, supporting rapid experimentation and large-scale training.
Contribution
The paper introduces Scenic, an optimized JAX toolkit that simplifies prototyping, supports multiple vision tasks, and enables multi-host, multi-device training.
Findings
Supported diverse vision tasks, including classification, segmentation, and detection
Facilitated rapid experimentation and prototyping of new models
Enabled large-scale multi-device training on GPU/TPU
Abstract
Scenic is an open-source JAX library with a focus on Transformer-based models for computer vision research and beyond. The goal of this toolkit is to facilitate rapid experimentation, prototyping, and research of new vision architectures and models. Scenic supports a diverse range of vision tasks (e.g., classification, segmentation, detection) and facilitates working on multi-modal problems, along with GPU/TPU support for multi-host, multi-device large-scale training. Scenic also offers optimized implementations of state-of-the-art research models spanning a wide range of modalities. Scenic has been successfully used for numerous projects and published papers and continues serving as the library of choice for quick prototyping and publication of new research ideas.
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition
