Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes
Alloy Das, Sanket Biswas, Umapada Pal, Josep Lladós

TL;DR
This paper introduces a domain-agnostic scene text spotting approach that generalizes across multiple noisy environments, featuring a new underwater text benchmark and an efficient transformer-based model that matches or outperforms existing methods.
Contribution
Proposes a domain-agnostic training framework for scene text spotting, introduces the UWT benchmark for underwater scenes, and develops the DA-TextSpotter transformer model, which performs on par with or better than existing models.
Findings
DA-TextSpotter achieves accuracy comparable to or better than that of existing models.
The UWT benchmark provides a new challenging dataset for underwater text detection.
The proposed method demonstrates strong generalization across multiple noisy domains.
Abstract
When used in a real-world noisy environment, the capacity to generalize to multiple domains is essential for any autonomous scene text spotting system. However, existing state-of-the-art methods employ pretraining and fine-tuning strategies on natural scene datasets, which do not exploit the feature interaction across other complex domains. In this work, we explore and investigate the problem of domain-agnostic scene text spotting, i.e., training a model on multi-domain source data such that it can directly generalize to target domains rather than being specialized for a specific domain or scenario. In this regard, we present the community a text spotting validation benchmark called Under-Water Text (UWT) for noisy underwater scenes to establish an important case study. Moreover, we also design an efficient super-resolution based end-to-end transformer baseline called DA-TextSpotter…
Taxonomy
Topics: Multimodal Machine Learning Applications · Fluid Dynamics Simulations and Interactions
