Loading paper
Referring Multiple Regions with Large Multimodal Models via Contextual Latent Steering | Tomesphere