R-VLM: Region-Aware Vision Language Model for Precise GUI Grounding