Object Affordance Recognition and Grounding via Multi-scale Cross-modal Representation Learning