Can Vision Language Models Judge Action Quality? An Empirical Evaluation