SurgVidLM: Towards Multi-grained Surgical Video Understanding with Large Language Model