Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning