VideoCap-R1: Enhancing MLLMs for Video Captioning via Structured Thinking