Generate to Ground: Multimodal Text Conditioning Boosts Phrase Grounding in Medical Vision-Language Models