Visually grounded learning of keyword prediction from untranscribed speech