Loading paper
A Multimodal Framework for Aligning Human Linguistic Descriptions with Visual Perceptual Data | Tomesphere