Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval