Loading paper
CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | Tomesphere