Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations