CoF: Coarse to Fine-Grained Image Understanding for Multi-modal Large Language Models