CLIPSym: Delving into Symmetry Detection with CLIP
Tinghan Yang, Md Ashiqur Rahman, Raymond A. Yeh

TL;DR
CLIPSym leverages CLIP's vision-language capabilities with a novel symmetry detection model and prompting technique, achieving state-of-the-art results in geometric symmetry detection tasks.
Contribution
The paper introduces CLIPSym, a symmetry detection method that combines CLIP's pre-trained encoders with a rotation-equivariant decoder and a new prompting technique, SAPG.
Findings
Outperforms the current state of the art on three datasets
Benefits from CLIP's pre-training and semantic prompts
Effective rotation and reflection symmetry detection
Abstract
Symmetry is one of the most fundamental geometric cues in computer vision, and detecting it has been an ongoing challenge. With the recent advances in vision-language models, i.e., CLIP, we investigate whether a pre-trained CLIP model can aid symmetry detection by leveraging the additional symmetry cues found in the natural image descriptions. We propose CLIPSym, which leverages CLIP's image and language encoders and a rotation-equivariant decoder based on a hybrid of Transformer and G-Convolution to detect rotation and reflection symmetries. To fully utilize CLIP's language encoder, we have developed a novel prompting technique called Semantic-Aware Prompt Grouping (SAPG), which aggregates a diverse set of frequent object-based prompts to better integrate the semantic cues for symmetry detection. Empirically, we show that CLIPSym outperforms the current state-of-the-art on three…
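To make the term "rotation-equivariant" concrete, here is a minimal NumPy sketch of a group (G-) convolution lifting layer. It is an illustration of the general property only, not the CLIPSym decoder: it assumes the group of four 90-degree rotations (C4), which the abstract does not specify, and the function names `correlate_valid` and `c4_lifting_conv` are hypothetical. The check at the end shows that rotating the input rotates the feature maps and cyclically shifts them along the group axis, so a heatmap pooled over that axis rotates together with the image.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate_valid(image, kernel):
    """Plain 2D cross-correlation without padding ('valid' mode)."""
    windows = sliding_window_view(image, kernel.shape)  # (H-k+1, W-k+1, k, k)
    return np.einsum("ijab,ab->ij", windows, kernel)


def c4_lifting_conv(image, kernel):
    """Correlate the image with the kernel rotated by 0, 90, 180, 270 degrees.

    Returns an array of shape (4, H-k+1, W-k+1); axis 0 indexes the rotation.
    Assumption: the symmetry group is C4 (illustrative choice only).
    """
    return np.stack(
        [correlate_valid(image, np.rot90(kernel, r)) for r in range(4)]
    )


rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))

features = c4_lifting_conv(image, kernel)
features_of_rotated = c4_lifting_conv(np.rot90(image), kernel)

# Equivariance: rotating the input rotates each feature map spatially
# and shifts the maps by one step along the group axis.
expected = np.rot90(np.roll(features, 1, axis=0), axes=(1, 2))
is_equivariant = np.allclose(features_of_rotated, expected)

# Max-pooling over the group axis removes the shift, so the pooled
# heatmap simply rotates with the image.
heatmap = features.max(axis=0)
heatmap_of_rotated = features_of_rotated.max(axis=0)
pooled_is_equivariant = np.allclose(heatmap_of_rotated, np.rot90(heatmap))

print(is_equivariant, pooled_is_equivariant)
```

A decoder built from layers with this property predicts a symmetry map that turns with the input image instead of having to learn each orientation separately, which is the usual motivation for equivariant designs.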
Taxonomy
Topics: Photosynthetic Processes and Mechanisms · Advanced Fluorescence Microscopy Techniques · Fractal and DNA sequence analysis
