Co-training Transformer with Videos and Images Improves Action Recognition