MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment