Loading paper
HiSpatial: Taming Hierarchical 3D Spatial Understanding in Vision-Language Models | Tomesphere