Empirical Recipes for Efficient and Compact Vision-Language Models