Unsupervised vs. transfer learning for multimodal one-shot matching of speech and images