LLaVaOLMoBitnet1B: Ternary LLM goes Multimodal!

Jainaveen Sundaram; Ravi Iyer

arXiv:2408.13402·cs.LG·September 4, 2024

TL;DR

This paper introduces LLaVaOLMoBitnet1B, a fully open-source ternary multimodal large language model capable of processing images and text, aiming to democratize AI with efficient, accessible multimodal capabilities.

Contribution

It presents the first ternary multimodal LLM that accepts image and text inputs, along with open-source training scripts to foster further research.

Findings

1. Successfully trained a ternary multimodal LLM with competitive performance.
2. Demonstrated the model's ability to handle image and text inputs coherently.
3. Provided open-source resources to accelerate multimodal AI research.

Abstract

Multimodal Large Language Models (MM-LLMs) have seen significant advancements in the last year, demonstrating impressive performance across tasks. However, to truly democratize AI, models must both exhibit strong capabilities and run efficiently on small compute footprints accessible to most. As part of this quest, we introduce LLaVaOLMoBitnet1B, the first Ternary Multimodal LLM capable of accepting Image(s)+Text inputs to produce coherent textual responses. The model is fully open-sourced along with training scripts to encourage further research in this space. This accompanying technical report highlights the training process, evaluation details, challenges associated with ternary models, and future opportunities. Link to the model: https://huggingface.co/IntelLabs/LlavaOLMoBitnet1B

Code & Models

Models

IntelLabs/LlavaOLMoBitnet1B (model, 23 downloads, 29 likes)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling