Loading paper
Learning multimodal representations for sample-efficient recognition of human actions | Tomesphere