G$^2$TR: Generation-Guided Visual Token Reduction for Separate-Encoder Unified Multimodal Models