Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation