An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding