KerasCV and KerasNLP: Vision and Language Power-Ups
Matthew Watson, Divyashree Shivakumar Sreepathihalli, Francois Chollet, Martin Gorner, Kiranbir Sodhia, Ramesh Sampath, Tirth Patel, Haifeng Jin, Neel Kovelamudi, Gabriel Rasskin, Samaneh Saadat, Luke Wood, Chen Qian, Jonathan Bischof, Ian Stenbit, Abheesht Sharma

TL;DR
KerasCV and KerasNLP are open-source extensions of Keras that facilitate flexible, high-performance computer vision and NLP workflows across multiple frameworks, with pretrained models and efficient training capabilities.
Contribution
Introduction of modular, layered Keras domain packages for vision and language tasks, supporting multiple backends and pretrained models with optimized training pipelines.
Findings
Support for JAX, TensorFlow, PyTorch
Pretrained models for popular architectures
Efficient training with XLA and tf.data
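The multi-backend finding can be made concrete with a short sketch. Keras 3 reads the KERAS_BACKEND environment variable the first time keras is imported, and the PyTorch backend is named "torch". The helper select_backend below is a hypothetical convenience wrapper written for this illustration, not part of Keras, KerasCV, or KerasNLP.

```python
import os

# Backend names Keras 3 accepts for the three frameworks listed above.
SUPPORTED_BACKENDS = ("jax", "tensorflow", "torch")


def select_backend(name: str) -> str:
    """Hypothetical helper: set KERAS_BACKEND for the current process.

    This must run before the first `import keras`, because the backend
    is fixed at import time.
    """
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(f"unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}")
    os.environ["KERAS_BACKEND"] = name
    return name


select_backend("jax")
# import keras  # would now use JAX, assuming keras and jax are installed
```

Because the backend is chosen at import time, switching frameworks typically means restarting the process with a different value rather than changing it mid-session.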
Abstract
We present the Keras domain packages KerasCV and KerasNLP, extensions of the Keras API for Computer Vision and Natural Language Processing workflows, capable of running on JAX, TensorFlow, or PyTorch. These domain packages are designed to enable fast experimentation, with a focus on ease-of-use and performance. We adopt a modular, layered design: at the library's lowest level of abstraction, we provide building blocks for creating models and data preprocessing pipelines, and at the library's highest level of abstraction, we provide pretrained "task" models for popular architectures such as Stable Diffusion, YOLOv8, GPT2, BERT, Mistral, CLIP, Gemma, and T5. Task models have built-in preprocessing, pretrained weights, and can be fine-tuned on raw inputs. To enable efficient training, we support XLA compilation for all models, and run all preprocessing via a compiled graph of…
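The layered design described in the abstract can be sketched in plain Python. This is a toy illustration only: every class name here (ToyTokenizer, ToyPreprocessor, ToyBackbone, ToyClassifier) is hypothetical, the "weights" are hand-written rather than pretrained, and none of it is the KerasCV or KerasNLP API. It shows only the structural idea that low-level building blocks compose into a task model that accepts raw inputs.

```python
class ToyTokenizer:
    """Lowest layer: map raw text to integer ids by whitespace splitting."""

    def __init__(self, vocabulary):
        self.token_to_id = {token: i for i, token in enumerate(vocabulary)}
        self.unk_id = len(vocabulary)  # one extra id for unknown words

    def __call__(self, text):
        return [self.token_to_id.get(w, self.unk_id) for w in text.lower().split()]


class ToyPreprocessor:
    """Wraps a tokenizer and pads or truncates to a fixed sequence length."""

    def __init__(self, tokenizer, sequence_length):
        self.tokenizer = tokenizer
        self.sequence_length = sequence_length
        self.pad_id = tokenizer.unk_id + 1

    def __call__(self, text):
        ids = self.tokenizer(text)[: self.sequence_length]
        return ids + [self.pad_id] * (self.sequence_length - len(ids))


class ToyBackbone:
    """Turns ids into features: a count per vocabulary entry."""

    def __init__(self, vocabulary_size):
        self.vocabulary_size = vocabulary_size

    def __call__(self, ids):
        features = [0] * self.vocabulary_size
        for i in ids:
            if i < self.vocabulary_size:  # skip unknown and padding ids
                features[i] += 1
        return features


class ToyClassifier:
    """Highest layer: a 'task' object that bundles preprocessing and a backbone."""

    def __init__(self, preprocessor, backbone, class_weights):
        self.preprocessor = preprocessor
        self.backbone = backbone
        self.class_weights = class_weights  # one weight row per class

    def predict(self, texts):
        labels = []
        for text in texts:
            features = self.backbone(self.preprocessor(text))
            scores = [sum(w * f for w, f in zip(row, features)) for row in self.class_weights]
            labels.append(scores.index(max(scores)))
        return labels


vocabulary = ["good", "great", "bad", "awful"]
preprocessor = ToyPreprocessor(ToyTokenizer(vocabulary), sequence_length=8)
classifier = ToyClassifier(
    preprocessor,
    ToyBackbone(len(vocabulary)),
    class_weights=[[0, 0, 1, 1], [1, 1, 0, 0]],  # class 0 = negative, class 1 = positive
)
predictions = classifier.predict(["A great film", "Awful and bad"])
print(predictions)  # [1, 0]
```

The caller passes raw strings to the task object and never touches token ids, while each lower layer remains usable on its own, which is the separation the abstract attributes to the two libraries.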
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Gated Linear Unit · Attention Is All You Need · You Only Look Once · Byte Pair Encoding · WordPiece · SentencePiece · Linear Warmup With Linear Decay · Weight Decay · Attention Dropout
