HPE-CogVLM: Advancing Vision Language Models with a Head Pose Grounding Task