Leveraging Vision-Language Pre-training for Human Activity Recognition in Still Images