Char-SAM: Turning Segment Anything Model into Scene Text Segmentation Annotator with Character-level Visual Prompts
Enze Xie, Jiaho Lyu, Daiqing Wu, Huawen Shen, Yu Zhou

TL;DR
Char-SAM is a novel pipeline that transforms the Segment Anything Model into an automatic, character-level scene text segmentation annotator using visual prompts, improving accuracy and reducing annotation costs.
Contribution
It introduces a character-level prompt refinement framework that enhances SAM's scene text segmentation performance without additional training.
Findings
Effective in generating high-quality scene text segmentation annotations
Addresses over- and under-segmentation issues in character-level prompts
Enables dataset creation from real-world datasets without training
Abstract
The recent emergence of the Segment Anything Model (SAM) enables various domain-specific segmentation tasks to be tackled cost-effectively by using bounding boxes as prompts. However, in scene text segmentation, SAM cannot achieve desirable performance. Word-level bounding boxes are too coarse as prompts for characters, while character-level bounding boxes suffer from over-segmentation and under-segmentation issues. In this paper, we propose an automatic annotation pipeline named Char-SAM that turns SAM into a low-cost segmentation annotator with a character-level visual prompt. Specifically, leveraging some existing text detection datasets with word-level bounding box annotations, we first generate finer-grained character-level bounding box prompts using the Character Bounding-box Refinement (CBR) module. Next, we employ glyph information corresponding to text character…
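To make the step from word-level to character-level box prompts concrete, here is a minimal sketch of a naive baseline: cutting a word box into equal-width character boxes. It is not the paper's CBR module, whose details the abstract does not give. The function name `split_word_box`, the `(x_min, y_min, x_max, y_max)` box layout, and the availability of a text transcription for each word box are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def split_word_box(word_box: Box, text: str) -> List[Tuple[str, Box]]:
    """Naively split an axis-aligned word box into equal-width character boxes.

    Hypothetical helper for illustration only; it ignores character width
    variation, rotation, and curved text, which a real refinement step
    would have to handle.
    """
    x_min, y_min, x_max, y_max = word_box
    # Whitespace carries no glyph, so it gets no box.
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return []
    if x_max <= x_min or y_max <= y_min:
        raise ValueError("word_box must have positive width and height")
    step = (x_max - x_min) / len(chars)
    return [
        (ch, (x_min + i * step, y_min, x_min + (i + 1) * step, y_max))
        for i, ch in enumerate(chars)
    ]


if __name__ == "__main__":
    for ch, box in split_word_box((0, 0, 40, 10), "STOP"):
        print(ch, box)
```

Each resulting box could then be passed to a promptable segmenter as a separate box prompt. Equal-width splitting is exactly the kind of coarse character box that can crop a wide glyph or include part of a neighboring one, which is one way the over- and under-segmentation described above can arise.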
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Text and Document Classification Technologies
Methods: Segment Anything Model
