Vision Foundation Models as Generalist Tokenizers for Image Generation